Excursion into analytic combinatorics and network calculus


Compatibility constraints

Consider a queue with S servers and let S = {1, . . . , S} denote the set of servers. The customers are divided into a set I = {1, . . . , I} of I classes. For each i ∈ I, class-i customers enter the queue according to an independent Poisson process with a positive rate λi. The service requirements are independent and exponentially distributed with unit mean, and each customer leaves the queue immediately after service completion. For each s ∈ S, server s has a positive service capacity denoted by μs. As in Chapter 3, we consider two state descriptors called the microstate and the macrostate. The former retains the arrival order of customers in the queue while the latter does not. More precisely, the microstate is the sequence c = (c1, . . . , cn) ∈ I*, where n is the number of customers in the queue and cp is the class of the p-th oldest customer, for each p = 1, . . . , n, and the macrostate is the vector x = (x1, . . . , xI) ∈ ℕ^I, where xi is the number of class-i customers, for each i ∈ I. For each c ∈ I*, we use the notation |c| = (|c|1, . . . , |c|I) to denote the macrostate associated with microstate c. Appendix A gives more details on these notations.

Compatibility graph. The class of a customer defines the set of servers that can be pooled to process this customer in parallel. Specifically, for each i ∈ I, a class-i customer can be processed in parallel by any subset of servers within the set Si ⊆ S, which is assumed non-empty. This defines a bipartite graph between classes and servers, called the compatibility graph of the queue, in which there is an edge between a class and a server if the customers of this class can be processed by this server. An example is shown in Figure 4.1.
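The two state descriptors can be sketched in a few lines of code. The toy compatibility graph below is hypothetical (chosen only in the spirit of Figure 4.1, which is not reproduced here); the function computes the macrostate |c| from a microstate c by forgetting the arrival order.

```python
from collections import Counter

# Hypothetical instance: 3 classes, 2 servers.
# COMPAT[i] is the set S_i of servers compatible with class i.
NUM_CLASSES = 3
COMPAT = {1: {1}, 2: {1, 2}, 3: {2}}

def macrostate(c, num_classes=NUM_CLASSES):
    """|c|: keep only the number of customers of each class."""
    counts = Counter(c)
    return tuple(counts.get(i, 0) for i in range(1, num_classes + 1))

# Microstate: c_p is the class of the p-th oldest customer.
c = (2, 1, 2, 3)
print(macrostate(c))            # (1, 2, 1)
print(COMPAT[c[0]])             # servers that can pool to serve the oldest customer
```

Many microstates map to the same macrostate (here, any reordering of c), which is precisely the information the macrostate discards.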
When a customer is in service on several servers at the same time, its service rate is the sum of the rates allocated by each server to this customer. For instance, if the servers represent computers that can cooperate to process a job, this assumption amounts to neglecting the overhead due to parallelization. This example will be considered in Chapter 7. Another, less straightforward, interpretation of the parallel processing will be developed in Chapters 8 and 9.

Service policy. In Sections 4.2 and 4.3, we will consider two service policies called balanced fairness and first-come-first-served. Balanced fairness is a resource-sharing policy. Like processor-sharing, it assumes that the capacity of each server is infinitely divisible among its customers. Chapter 7 will propose a scheduling algorithm that implements this resource-sharing policy but, for now, we will only specify the service rate received by each customer under balanced fairness. The first-come-first-served policy of Section 4.3, on the other hand, is a time-sharing policy. Under the assumption that the full capacity of each server is allocated to a single customer at a time, this policy shares the time of each server among its compatible customers. In either case, we let φ = (φ1, . . . , φI) denote the vector of per-class service rates, so that, for each i ∈ I, φi is the overall service rate received by all class-i customers taken together. Depending on the service policy, this vector will be a function of the microstate or of the macrostate only. Such a vector does not specify how the server resources are effectively assigned to customers in the queue and, in general, there are several ways of achieving a given vector of per-class service rates. It can also happen that a vector of per-class service rates is not achievable at all.
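The last remark can be made concrete. Under the polymatroid characterization of the capacity region invoked later in this chapter, a rate vector φ is achievable exactly when, for every non-empty set A of classes, the aggregate rate of A does not exceed the total capacity of the servers compatible with A. The sketch below checks these constraints on a hypothetical two-class, two-server instance (the names MU, COMPAT and achievable are not from the text).

```python
from itertools import combinations

# Hypothetical instance: server capacities and per-class server sets S_i.
MU = {1: 1.0, 2: 2.0}
COMPAT = {1: {1}, 2: {1, 2}}

def achievable(phi):
    """Polymatroid constraints: for every non-empty subset A of classes,
    sum of phi_i over A must not exceed the capacity of union of S_i over A."""
    classes = list(phi)
    for r in range(1, len(classes) + 1):
        for A in combinations(classes, r):
            servers = set().union(*(COMPAT[i] for i in A))
            if sum(phi[i] for i in A) > sum(MU[s] for s in servers) + 1e-12:
                return False
    return True

print(achievable({1: 0.5, 2: 2.0}))  # True: every subset constraint holds
print(achievable({1: 1.5, 2: 0.0}))  # False: class 1 alone exceeds mu_1
```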

Balanced fairness

Similarly to processor-sharing, balanced fairness is a resource-sharing policy that assumes that the capacity of each server is infinitely divisible among its customers. However, while processor-sharing imposes that the capacity of each server is equally shared among its customers, balanced fairness imposes that all customers of the same class receive the same service rate, over all servers taken together. Also, the resource allocation depends on the number of customers of each class in the queue but not on their arrival order, that is, it is a function of the macrostate only.

Divisible server capacity. For each s ∈ S, the capacity of server s is infinitely divisible among the customers of the classes i ∈ I such that s ∈ Si. In particular, all customers can, and will, be in service at the same time. This implies that each vector of the capacity region is feasible. More precisely, given a macrostate x = (x1, . . . , xI) ∈ ℕ^I and a vector of per-class service rates φ = (φ1, . . . , φI) in the capacity region, there is a way of dividing the capacities of the servers among their compatible customers such that, for each i ∈ I with xi > 0, each class-i customer is served at rate φi/xi. The two main arguments of the proof, based on the polymatroidal nature of the capacity region, can be summarized as follows. First, each vector of the capacity region is componentwise less than or equal to a vector that belongs to a face of this capacity region, so that we just need to show the result for the vectors of the faces. The second argument is that each vector of a face is a convex combination of its corners. But we observed earlier that the corners correspond to feasible resource allocations, in which each server is either idle or dedicated to the customers of a single class. The coefficients of the convex combination then give a way of deriving the desired resource allocation. A complete and slightly different proof can be found in Theorem 1 of [90].
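As an illustration of how balanced fairness can be computed, the per-class rates are commonly obtained from a balance function Φ via φi(x) = Φ(x − ei)/Φ(x). The sketch below assumes the standard recursion for polymatroid capacity regions, where Φ(x) maximizes, over non-empty subsets A of active classes, the ratio of Σ_{i∈A} Φ(x − ei) to the capacity of the servers compatible with A. The two-class, two-server instance is hypothetical and not taken from the text.

```python
from functools import lru_cache
from itertools import combinations

# Hypothetical instance: class 0 can use server 0 only; class 1 can use both.
MU = (1.0, 2.0)             # server capacities mu_s
COMPAT = ((0,), (0, 1))     # S_i: servers compatible with class i

def cap(classes):
    """Total capacity of the servers compatible with a set of classes."""
    servers = set()
    for i in classes:
        servers.update(COMPAT[i])
    return sum(MU[s] for s in servers)

@lru_cache(maxsize=None)
def phi(x):
    """Balance function of balanced fairness (polymatroid recursion)."""
    active = [i for i, xi in enumerate(x) if xi > 0]
    if not active:
        return 1.0
    best = 0.0
    for r in range(1, len(active) + 1):
        for A in combinations(active, r):
            num = sum(phi(x[:i] + (x[i] - 1,) + x[i + 1:]) for i in A)
            best = max(best, num / cap(A))
    return best

def service_rate(x, i):
    """Per-class rate phi_i(x) = Phi(x - e_i) / Phi(x)."""
    xm = x[:i] + (x[i] - 1,) + x[i + 1:]
    return phi(xm) / phi(x)

print(service_rate((1, 0), 0))  # 1.0: a lone class-0 customer gets mu_0
print(service_rate((0, 1), 1))  # 3.0: a lone class-1 customer pools both servers
```

With one customer of each class, the rates sum to 3.0, the full capacity: on this instance balanced fairness wastes no capacity whenever all constraints can be saturated.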


Stationary distribution and service rates

Under balanced fairness, the macrostate of the multi-server queue defines an irreducible Markov process on ℕ^I. Under the first-come-first-served policy, the stochastic process defined by the macrostate does not have the Markov property in general, but the stochastic process defined by the microstate does. In either case, the queue is said to be stable if the relevant Markov process is ergodic and stationary if this Markov process is stationary. The following theorem shows the relation between these processes.

Table of contents :

Introduction
Multiplexing users on a single computer
The single-server queue
Sharing multiple computers
The multi-server queue
Contributions and structure of the manuscript
I Methodology 
1 Whittle networks
1.1 Definition
1.2 Stationary analysis
1.3 Reversibility
1.4 Insensitivity
1.5 Concluding remarks
2 Order-independent queues
2.1 Definition
2.2 Stationary analysis
2.3 Quasi-reversibility
2.4 Concluding remarks
3 Equivalence of Whittle networks and order-independent queues
3.1 Definition
3.2 Imbedding
3.3 Stability condition
3.4 Performance metrics
3.5 Concluding remarks
II Analysis of the multi-server queue 
4 The multi-server queue
4.1 Definition
4.2 Balanced fairness
4.3 First-come-first-served
4.4 Stationary analysis
4.5 Concluding remarks
5 Performance analysis by state aggregation
5.1 Generic formulas
5.2 Poly-symmetry
5.3 Random class assignment
5.4 Numerical results
5.5 Concluding remarks
Appendix 5.A Proof of Theorem 5.14
Appendix 5.B Proof of the lemmas for Theorem 5.14
6 Performance analysis by server elimination
6.1 Generic formulas
6.2 Random customer assignment
6.3 Local assignment
6.4 Numerical results
6.5 Concluding remarks
III Applications in algorithm design 
7 Job scheduling
7.1 Scheduling algorithm
7.2 Queueing analysis
7.3 Numerical results
7.4 Related work
7.5 Concluding remarks
8 Load balancing
8.1 Load-balancing algorithm
8.2 Queueing analysis
8.3 Numerical results
8.4 Related works
8.5 Concluding remarks
9 Extensions
9.1 Combining job scheduling and load balancing
9.2 Load balancing with multiple dispatchers
9.3 Concluding remarks
Appendix 9.A Proof of irreducibility
Conclusion 
Better understand the performance of insensitive algorithms
A general framework for insensitivity
Integrate data
Appendix A Notations 
A.1 General notations
A.2 Macrostate and microstate
A.3 Index of notations
Appendix B Useful probability notions 
B.1 Random variables
B.2 Markov chains and processes
Appendix C Excursion into analytic combinatorics and network calculus 
C.1 Introduction
C.2 Reminder on generating functions
C.3 The single-server queue
C.4 Extensions
C.5 Numerical results
C.6 Concluding remarks
Publications
International publications
French publications
Source code to generate the numerical results
Bibliography
