In this article, we explore how to optimize dispatching policies in a multi-server system under heavy load. We study size- and state-aware dispatching policies, which take into account both the amount of work remaining at each server and the overall system state. Our main result shows that mean response time can be minimized by assigning a prescribed fraction of the load to each server. Specifically, we define several key quantities related to the drift of each server's workload: 𝜀, 𝜌𝑠, 𝜌𝑚, 𝛼, and 𝛽. When 𝑊𝑠 > 𝑐, 𝑊𝑠 has a drift of −𝛼 (work tends to drain), and when 0 < 𝑊𝑠 ⩽ 𝑐, 𝑊𝑠 has a drift of +𝛽 (work tends to build up).
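To make the two-sided drift concrete, here is a minimal simulation sketch of a single server's workload that drains on average (drift −𝛼) above the threshold 𝑐 and builds up on average (drift +𝛽) below it. The parameter values and the noise model are arbitrary assumptions for illustration, not values from the article.

```python
import random

# Illustrative sketch (assumed parameters, not from the article): a workload
# W_s that drifts by -alpha above the threshold c and +beta at or below it,
# plus symmetric noise. The workload is kept nonnegative.
def simulate_workload(alpha=0.3, beta=0.2, c=5.0, steps=100_000, seed=0):
    rng = random.Random(seed)
    w = c  # start at the threshold
    total = 0.0
    for _ in range(steps):
        drift = -alpha if w > c else beta
        w = max(0.0, w + drift + rng.uniform(-1.0, 1.0))
        total += w
    return total / steps  # time-average workload

avg = simulate_workload()
```

Because the drift pushes the workload back toward 𝑐 from both sides, the time-average workload settles near the threshold rather than growing without bound.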
To demystify these concepts, consider an analogy: imagine a busy kitchen with multiple cooks (servers) preparing dishes (jobs). Each cook has their own workstation, just as each server handles its own queue of work. Now suppose the kitchen is busier than usual, and the cooks must work efficiently to keep up with demand. That's where size- and state-aware dispatching policies come in: they route each incoming dish based on how much work it requires (size) and how busy each station currently is (state).
Our analysis shows that assigning a specific fraction of the load to each server leads to better performance. In other words, we want to balance the workload across the servers, much as a skilled chef allocates tasks among the cooks in a kitchen. Doing so minimizes the mean response time and improves the overall efficiency of the system.
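The balancing idea above can be sketched with a classic size- and state-aware heuristic, Least-Work-Left: each arriving job goes to the server with the least remaining work. This is a simplified stand-in for illustration, not the article's exact policy, and it ignores service over time (no departures are modeled).

```python
# Illustrative sketch (a stand-in heuristic, not the article's policy):
# Least-Work-Left dispatching uses each job's size and every server's
# current backlog to pick the least-loaded server.
def dispatch_least_work_left(job_sizes, num_servers=3):
    work = [0.0] * num_servers          # remaining work at each server
    assignments = []
    for size in job_sizes:
        # pick the server with the smallest backlog (ties go to the lowest index)
        target = min(range(num_servers), key=lambda s: work[s])
        work[target] += size
        assignments.append(target)
    return assignments, work

jobs = [4.0, 1.0, 2.0, 3.0]
assignments, backlog = dispatch_least_work_left(jobs)
```

With these four jobs, the first three land on distinct idle servers, and the fourth goes to the server whose backlog is smallest at that moment, keeping the final backlogs close together.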
In summary, this article provides insight into optimizing dispatching policies in a heavy-traffic system with multiple servers. By accounting for the amount of work remaining at each server and the overall system state, we can achieve better performance and minimize the mean response time. The key takeaway is that assigning an appropriate fraction of the load to each server improves efficiency, much as a skilled chef allocates tasks in a busy kitchen.