Client Scheduling in X
Keith Packard
SuSE
10/28/99

History:

Since the original X server was written at Digital in 1987, the OS and
DIX layers have shared responsibility for scheduling the order in which
client requests are serviced.  The original design was simplistic; under
the maxim "first make it work, then make it work well", this was a good
idea.  Now that we have a bit more experience with X applications, it's
time to rethink the design.

The basic dispatch loop in DIX looks like:

    for (;;)
    {
        nready = WaitForSomething (...);        /* wait for client input */
        while (nready--)
        {
            isItTimeToYield = FALSE;
            while (!isItTimeToYield)
            {
                if (!ReadRequestFromClient (...))
                    break;              /* no complete request buffered */
                (execute request);
            }
        }
    }

WaitForSomething looks like:

    for (;;)
    {
        if (ANYSET (ClientsWithInput))
            return popcount (ClientsWithInput);
        select (...);
        compute clientsReadable from select result;
        return popcount (clientsReadable);
    }

ReadRequestFromClient looks like:

    if (!fullRequestQueued)
    {
        read ();
        if (!fullRequestQueued)
        {
            /* out of data; yield until select reports this client again */
            remove from ClientsWithInput;
            timesThisConnection = 0;
            return 0;
        }
    }
    if (twoFullRequestsQueued)
        add to ClientsWithInput;

    if (++timesThisConnection >= 10)
    {
        /* ran 10 requests in a row; give other clients a turn */
        isItTimeToYield = TRUE;
        timesThisConnection = 0;
    }
    return 1;

Here's what happens in this code:

With a single client executing a stream of requests:

    A client sends a packet of requests to the server.

    WaitForSomething wakes up from select and returns that client
    to Dispatch.

    Dispatch calls ReadRequestFromClient, which reads a buffer (4K)
    full of requests from the client.

    The server executes requests from this buffer until it empties,
    in two stages -- 10 requests at a time are executed in the
    inner Dispatch loop, and the whole buffer of requests is executed
    because WaitForSomething returns immediately if any clients
    have complete requests pending in their input queues.

    When the buffer finally empties, the next call to
    ReadRequestFromClient will return zero and Dispatch will go back to
    WaitForSomething; now that the client has no requests pending,
    WaitForSomething will block in select again.  If the client
    is active, this select will immediately return that client
    as ready to read.

With multiple clients sending streams of requests, the sequence
of operations is similar, except that ReadRequestFromClient will
set isItTimeToYield after each 10 requests executed, causing the
server to round-robin among the clients with available requests.

It's important to realize here that any complete requests which have been
read from clients will be executed before the server will use select again
to discover input from other clients.  A single busy client can easily
monopolize the X server.

So, the X server doesn't share well with more interactive clients.

At the same time, the X server executes at most a buffer full of requests
before heading back into select; ReadRequestFromClient causes the server
to yield whenever the client request buffer doesn't contain a complete
request.  When that buffer is executed quickly, the server spends a lot
of time in select discovering that the same client again has input ready.
Thus the server also runs busy clients less efficiently than would be
possible.

What to do:

There are several things evident from the above discussion:

    1.  The server has a poor metric for deciding how much work it
        should do at one time on behalf of a particular client.

    2.  The server doesn't call select often enough to detect less
        aggressive clients in the face of busy clients, especially
        when those clients are executing slow requests.

    3.  The server calls select too often when executing fast requests.

    4.  Some priority scheme is needed to keep interactive clients
        responding to the user.

And there are some assumptions about how X applications work:

    1.  Each X request is executed relatively quickly; request
        granularity is good enough for interactive response almost
        all of the time.

    2.  X applications receiving mouse/keyboard events are likely to
        warrant additional attention from the X server.

Instead of a request-count metric for work, a time-based metric should be
used.  The server should select a reasonable time slice for each client
and execute requests for the entire time slice before yielding to
another client.
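
A rough sketch of this metric, in the style of the fragments above
(elapsed_ticks, SLICE_TICKS and RunClientSlice are illustrative names,
not the server's actual symbols):

    #define SLICE_TICKS 1       /* e.g. one 20ms timer tick per slice */

    extern volatile unsigned long elapsed_ticks;  /* bumped by a timer signal */

    void
    RunClientSlice (ClientPtr client)
    {
        unsigned long start = elapsed_ticks;

        /* Run until the slice expires or the client has no complete
         * request buffered, rather than stopping after 10 requests. */
        while (elapsed_ticks - start < SLICE_TICKS)
        {
            if (!ReadRequestFromClient (client))
                break;
            (execute request);
        }
    }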

Instead of returning immediately from WaitForSomething when clients have
complete requests queued, the server should go through select each
time and gather as many ready clients as possible.  This means polling
instead of blocking when requests are already queued, and adding the
ClientsWithInput to clientsReadable after the select returns.
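
In the style of the earlier fragment, the reworked WaitForSomething might
be sketched like this (the zero-timeout poll is one plausible way to
express the polling, not necessarily how the server spells it):

    for (;;)
    {
        if (ANYSET (ClientsWithInput))
            timeout = 0;        /* poll: requests are already queued */
        else
            timeout = forever;  /* block: nothing to do yet */
        select (..., timeout);
        compute clientsReadable from select result;
        add ClientsWithInput to clientsReadable;
        if (ANYSET (clientsReadable))
            return popcount (clientsReadable);
    }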

Instead of yielding when the request buffer is empty for a particular
client, leave the yielding to the upper-level scheduling and allow
the server to try another read from the socket.  If the client
is busy, another buffer full of requests will already be waiting
to be delivered, thus avoiding the call through select and the
additional overhead in WaitForSomething.
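
Again as a sketch, ReadRequestFromClient loses both its early yield and
its request counting:

    if (!fullRequestQueued)
    {
        read ();                        /* try the socket again first */
        if (!fullRequestQueued)
        {
            remove from ClientsWithInput;
            return 0;                   /* really out of data */
        }
    }
    if (twoFullRequestsQueued)
        add to ClientsWithInput;
    return 1;   /* yielding is now driven by the time slice, not a count */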

Finally, the dispatch loop should not simply execute requests from the
first available client; instead, each client should be prioritized, with
busy clients penalized and clients receiving user events praised.

How it's done:

Polling the current time of day from the OS is too expensive to
be done at each request boundary, so instead an interval timer is
set, allowing the server to track the passage of time by counting
invocations of the related signal handler.  Rather than wall time,
the process CPU time is used.  This serves two purposes -- first,
it allows the server to consume no CPU cycles when idle; second, it
avoids conflicts with SIGALRM usage in other parts of the server
code.  It's not without problems, though; other CPU-intensive
processes on the same machine can reduce interactive response time
within the X server.  The dispatch loop can now calculate an
approximate time value from the number of signals received.  The
granularity of the timer sets the scheduling jitter; at 20ms it's
only occasionally noticeable.
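
A self-contained sketch of that mechanism in C (the names here are mine,
not the server's):

    #include <string.h>
    #include <signal.h>
    #include <sys/time.h>

    #define TICK_MS 20                  /* timer granularity */

    static volatile unsigned long elapsed_ticks;

    static void
    TickHandler (int sig)
    {
        (void) sig;
        elapsed_ticks++;        /* approx. CPU time = elapsed_ticks * TICK_MS */
    }

    static void
    StartSchedulerTimer (void)
    {
        struct sigaction act;
        struct itimerval timer;

        memset (&act, 0, sizeof act);
        act.sa_handler = TickHandler;
        sigemptyset (&act.sa_mask);
        act.sa_flags = SA_RESTART;      /* don't abort interrupted syscalls */
        sigaction (SIGPROF, &act, NULL);

        /* ITIMER_PROF counts process CPU time, so an idle server receives
         * no ticks and SIGALRM remains free for other parts of the server. */
        timer.it_interval.tv_sec = 0;
        timer.it_interval.tv_usec = TICK_MS * 1000;
        timer.it_value = timer.it_interval;
        setitimer (ITIMER_PROF, &timer, NULL);
    }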

The changes to WaitForSomething and ReadRequestFromClient are
straightforward, adjusting when select is called and avoiding
setting isItTimeToYield too often.

The dispatch loop changes are more extensive: instead of executing
requests from all available clients, a single client is chosen after
each call to WaitForSomething, requests are executed for that client,
and WaitForSomething is called again.
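
In the style of the original Dispatch fragment, the new loop might be
sketched as:

    for (;;)
    {
        nready = WaitForSomething (...);
        client = highest-priority client in clientsReadable;
        start = elapsed_ticks;
        while (elapsed_ticks - start < SLICE_TICKS)
        {
            if (!ReadRequestFromClient (client))
                break;
            (execute request);
        }
        update client priorities;
    }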

Each client is assigned a priority; the dispatch loop chooses the
client with the highest priority to execute.  Priorities are
updated in three ways, as sketched in the code following this list:

    1.  Clients which consume their entire slice are penalized
        by having their priority reduced by one until they
        reach some minimum value.

    2.  Clients which have executed no requests for some time
        are praised by having their priority raised until they
        return to normal priority.

    3.  Clients which receive user input are praised by having
        their priority raised until they reach some maximal
        value, above normal priority.
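
A minimal sketch of those three rules in C (the constants, thresholds
and field names are assumptions for illustration, not the server's
actual values):

    #define PRIO_MIN    -3      /* floor for penalized clients */
    #define PRIO_NORMAL  0
    #define PRIO_MAX     2      /* ceiling, above normal priority */
    #define IDLE_TICKS  10      /* "some time" without requests */

    typedef struct {
        int prio;               /* current scheduling priority */
        int usedFullSlice;      /* consumed its whole time slice */
        int idleTicks;          /* ticks since it last ran a request */
        int gotUserInput;       /* received mouse/keyboard events */
    } SchedClient;

    static void
    UpdatePriority (SchedClient *c)
    {
        if (c->usedFullSlice && c->prio > PRIO_MIN)
            c->prio--;          /* 1: penalize clients that stay busy */
        else if (c->idleTicks >= IDLE_TICKS && c->prio < PRIO_NORMAL)
            c->prio++;          /* 2: idle clients drift back to normal */
        if (c->gotUserInput && c->prio < PRIO_MAX)
            c->prio++;          /* 3: user input raises priority */
    }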

The effect of these changes is to improve both interactive application
response and benchmark numbers at the same time.

$XFree86: $