I’m in the midst of a BizTalk project where we’re load testing our solution. Our receive adapters are WCF-based, so tuning WCF is a critical aspect of the overall solution performance. I have gone through the same tuning exercise in every WCF project before, but I was always too lazy to document the information and ended up looking up the bits and pieces time and time again to refresh my memory. So I decided to capture this information in a post for myself and others.
However, I will not simply list the keys to adjust and leave it at that. I will explain why we adjust these keys and what they really mean. So bear with me and follow along.
ASP.NET Threading in IIS 6
Understanding threading for ASP.NET is important to properly scale your applications.
When a request comes to IIS 6, it hands the request to ASP.NET on an I/O CLR Thread Pool thread. ASP.NET then posts the request to the CLR Thread Pool and returns HSE_STATUS_PENDING to IIS. This frees the IIS thread which is now available to serve other requests.
So now the ASP.NET request is “waiting” at the CLR Thread Pool gate to be served. The CLR Thread Pool is a queue that adjusts its thread handling based on the type of load and request execution time (i.e. latency). For example, if there are a lot of concurrent requests that take a short time to execute, the Thread Pool will adjust itself to spawn few threads and reuse them to process these requests. Because the requests are fast, waiting time will be small.
If, however, the requests take a long time to execute, the Thread Pool will spawn new threads in order to serve incoming requests.
The CLR Thread Pool is controlled by the <processModel> configuration settings (maxWorkerThreads, maxIoThreads, minWorkerThreads, and minIoThreads):
- maxWorkerThreads: default is 20. However, this is implicitly multiplied by the number of processors (or cores). So, for example, on a dual-core computer there’ll be 40 worker threads allocated for ASP.NET. Following our discussion, when a request is waiting at the CLR Thread Pool gate and all 40 threads are busy, the request will be queued until a thread becomes free.
- maxIoThreads: default is 20. Also implicitly multiplied by the number of processors/cores. This specifies the maximum number of I/O threads available for ASP.NET to perform I/O operations, such as file operations, database calls, web service calls, or network socket calls.
- minWorkerThreads: default is 1. This indicates the minimum number of worker threads that are pre-created and kept ready to serve requests. The idea of this setting is to prevent a sudden burst of load from exhausting the thread pool while it is still busy creating threads; here you set the minimum number of threads that are kept ready.
o When the number of free ASP.NET worker threads falls below this number, ASP.NET starts putting incoming requests into a queue. So you can set this value to a low number in order to increase the number of concurrent requests. However, do not set it to a very low number, because Web application code might need to do some background or parallel processing for which it will need some free worker threads.
- minIoThreads: default is 1. This is the same as minWorkerThreads but for I/O threads. Remember that each ASP.NET request needs one I/O thread to be executed (recall: IIS hands the request to ASP.NET on an I/O thread).
- memoryLimit: default is 60% of total system memory. Specifies the maximum allowed memory size, as a percentage of total system memory, that the worker process can consume before ASP.NET launches a new process and reassigns existing requests.
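To make this concrete, here is a sketch of what these settings look like in machine.config (where <processModel> must be set). The numbers are purely illustrative; note that on .NET 2.0 and later autoConfig="true" is the default and overrides these values, so you would set autoConfig="false" for them to take effect:

```xml
<!-- machine.config (illustrative values only) -->
<system.web>
  <processModel autoConfig="false"
                maxWorkerThreads="100"
                maxIoThreads="100"
                minWorkerThreads="40"
                minIoThreads="30"
                memoryLimit="60" />
</system.web>
```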
Now – and this is very important – ASP.NET on IIS 6 has no mechanism to throttle or limit the number of incoming requests. So if requests take a long time to execute, the CLR Thread Pool will keep spawning new threads to serve new requests (again, because IIS 6 has no mechanism to limit the number of concurrent requests) until there are not enough threads left or memory consumption grows too high.
To prevent this situation, ASP.NET can limit the number of concurrent threads. To control this, you would use the <httpRuntime> minFreeThreads and minLocalRequestFreeThreads settings.
- minFreeThreads: default is 8. Specifies the minimum number of threads that must remain free before new requests are queued.
- minLocalRequestFreeThreads: default is 4. Specifies the minimum number of free threads reserved for local requests; these are requests originating from the local host.
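As a sketch, the corresponding <httpRuntime> settings would look like this in web.config or machine.config. The values are illustrative; a common recommendation is to scale them with the number of CPUs:

```xml
<!-- web.config / machine.config (illustrative values only) -->
<system.web>
  <httpRuntime minFreeThreads="88"
               minLocalRequestFreeThreads="76" />
</system.web>
```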
So this means – combining the <processModel> and <httpRuntime> configurations – that ASP.NET can have at most (maxWorkerThreads * number of CPUs) - minFreeThreads concurrent threads running, which under default settings is 12 for a single-CPU machine (20 * 1 - 8). To scale your application you will likely have to increase maxWorkerThreads.
Now if this limit is exceeded, requests are queued at another level of queuing, done at the AppDomain, and executed later when concurrency falls back below the limit.
This way, because ASP.NET sets a limit on the number of concurrent threads, it saves the CLR Thread Pool from creating an unlimited number of threads.
IIS 7.0 Integrated Mode and IIS 7.5
When ASP.NET is hosted on IIS 7.0 in integrated mode, or on IIS 7.5, the use of threads is different.
First, the AppDomain queues are removed. As reported by Microsoft, these queues had really bad performance: first, because having a queue on the AppDomain side means we have entered managed code, so there is managed memory associated with it (remember – even though the request is just sitting there queued). Second, these queues did not implement the first-in-first-out behavior of a queue.
Second, ASP.NET now limits the number of concurrent requests rather than the number of concurrent threads. Remember that in IIS 6, ASP.NET could only limit the number of concurrent threads using the <httpRuntime> settings. The origin of the problem was that it was not able to limit the number of requests. Well, in IIS 7 integrated mode and IIS 7.5 it can. It does this via maxConcurrentRequestsPerCPU (more in a moment on how to configure this).
But wait. How is limiting the number of concurrent requests different from limiting the number of concurrent threads? Isn’t it the case that each request needs a thread, so shouldn’t the relation be 1:1? It is, in the case of synchronous execution, where each request needs one thread. However, this makes a huge difference when the execution mode is asynchronous, in which case far fewer threads can handle a much bigger number of requests.
So this means that the <httpRuntime> settings (minFreeThreads and minLocalRequestFreeThreads) have no effect in IIS 7 integrated mode and IIS 7.5. Instead, you configure the number of concurrent requests via the <applicationPool> element in aspnet.config:
<applicationPool maxConcurrentRequestsPerCPU="5000" maxConcurrentThreadsPerCPU="0" requestQueueLimit="5000" />
- requestQueueLimit restricts the overall total number of requests in the system (i.e. the CLR Thread Pool queue). Its goal is to prevent the server from going down due to lack of resources. When the limit is reached, incoming requests receive a quick 503 “Server Too Busy” response. It has the same meaning as requestQueueLimit in <processModel>, but overrides it.
- maxConcurrentRequestsPerCPU has the same meaning as the corresponding registry key (which, remember, does not exist by default) and overrides it.
- maxConcurrentThreadsPerCPU allows controlling concurrency at the thread level. By default it’s disabled (set to 0), and in most situations it’s recommended to keep it that way.
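For reference, the registry equivalent mentioned above is a DWORD value you create yourself under the ASP.NET key, since it does not exist by default. A sketch – verify the exact path for your framework version:

```
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\ASP.NET\2.0.50727.0
    MaxConcurrentRequestsPerCPU = 5000 (DWORD)
```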
So to summarize the full interaction:
- All HTTP requests are handled by HTTP.sys and are queued in its queue; HTTP.sys maintains a queue for each worker process. The limit of this queue is set using the application pool’s queueLength property – which is set to 1000 by default. When the limit is reached a 503 will be returned.
- The HTTP.sys hands the request to ASP.NET on an I/O CLR Thread Pool thread
- ASP.NET posts the request to the CLR Thread Pool queue. CLR Thread Pool is still managed by the <processModel> settings.
- If requestQueueLimit is exceeded, the request is rejected and the caller gets back a 503 error code.
- If however requestQueueLimit is not exceeded, a CLR worker thread picks up the request.
- ASP.NET compares maxConcurrentRequestsPerCPU * CPUCount to the total number of active requests. If the limit is exceeded, the request is queued in a worker-process native queue. This queue is global to all applications inside the process and, as its name indicates, it is native, meaning there is no managed memory associated with queued requests. Plus, FIFO order is respected in this queue.
- Finally if the limit (maxConcurrentRequestsPerCPU * CPUCount) is not exceeded, the request will be executed.
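For the HTTP.sys queue mentioned in the first step, the application pool’s queueLength can be raised in applicationHost.config (or via IIS Manager / appcmd). A sketch – the pool name here is hypothetical:

```xml
<!-- applicationHost.config (pool name is hypothetical) -->
<system.applicationHost>
  <applicationPools>
    <add name="MyWcfReceivePool" queueLength="5000" />
  </applicationPools>
</system.applicationHost>
```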
WCF (Synchronous) Request Processing in IIS 6
So now that you understand how IIS 6 and IIS 7 (integrated mode) / IIS 7.5 handle threads differently, you can see how this affects WCF.
So one more time, let’s recap what you already know: IIS hands the request to ASP.NET on an I/O thread, and ASP.NET uses CLR ThreadPool worker threads to handle requests. What happens next with WCF differs between IIS 6 and IIS 7:
By default messages sent to a WCF service hosted under IIS 6.0 and earlier are processed in a synchronous manner:
- Now the request is heading to a WCF service, not an .aspx page. So ASP.NET must hand the request over to WCF. Therefore, ASP.NET calls into WCF on its own thread (the worker thread from the CLR ThreadPool).
- Now WCF uses another I/O thread from the CLR ThreadPool to process the request. In synchronous mode (System.ServiceModel.Activation.HttpModule and System.ServiceModel.Activation.HttpHandler), WCF holds onto the ASP.NET worker thread until it completes its processing. As a side note, notice that this means that even if you use the WCF async programming model, in reality WCF will still be blocking the ASP.NET worker thread. This was the behavior of WCF up until version 3.5. It’s not until .NET 3.5 SP1 that asynchronous support was added through an asynchronous HTTP module and handler.
Now you should link things together. You know why WCF embraces this synchronous model – even though WCF has the required asynchronous handlers – right? It does so because, as I explained in detail, IIS 6 cannot limit the number of concurrent requests. Had WCF embraced the asynchronous mode, it would have ended up with an unlimited number of in-flight requests, which would subject it to DoS attacks.
WCF (Asynchronous) Request Processing in IIS 7
But we know that processing requests asynchronously enables greater scalability because it reduces the number of threads required to process a request – WCF does not hold on to the ASP.NET thread while processing the request.
But again, as I explained, under IIS 7 the number of incoming requests can be controlled using maxConcurrentRequestsPerCPU. In this case, WCF can safely embrace the asynchronous model, and it does exactly that: by default under IIS 7, WCF uses the asynchronous request handlers (System.ServiceModel.Activation.ServiceHttpHandlerFactory and System.ServiceModel.Activation.ServiceHttpModule).
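If you need to verify (or, on .NET 3.5 SP1, opt into) the asynchronous handler, the web.config registration looks roughly like this – a sketch only; double-check the exact type names and assembly version against your framework installation:

```xml
<!-- web.config, IIS 7 integrated mode (verify assembly version for your framework) -->
<system.webServer>
  <handlers>
    <add name="svc-Integrated" path="*.svc" verb="*"
         type="System.ServiceModel.Activation.ServiceHttpHandlerFactory, System.ServiceModel, Version=3.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089"
         preCondition="integratedMode" />
  </handlers>
</system.webServer>
```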
Now that you understand how threads are throttled in their journey to serve WCF requests, note that the way you throttle WCF itself does not differ between the sync and async modes – although the default values changed between WCF 4 and previous versions.
WCF 4 throttling settings go into <serviceThrottling> under <behavior> under <serviceBehaviors>. There are three settings to know:
- MaxConcurrentSessions: default is 100 * number of processors. Sets the maximum number of sessions a ServiceHost object can accept. This setting only makes sense if the selected binding supports sessions.
- MaxConcurrentCalls: default is 16 * number of processors. Specifies the maximum number of messages actively processing across a ServiceHost object.
- MaxConcurrentInstances: default is the sum of the previous two. Defines the maximum number of InstanceContext objects in the service at one time.
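Putting the three settings together, a <serviceThrottling> behavior in web.config would look like the sketch below. The numbers are illustrative – they happen to match the WCF 4 defaults on a 4-core machine (16 * 4 calls, 100 * 4 sessions, and their sum for instances):

```xml
<!-- web.config (illustrative values; here matching WCF 4 defaults for 4 cores) -->
<system.serviceModel>
  <behaviors>
    <serviceBehaviors>
      <behavior>
        <serviceThrottling maxConcurrentCalls="64"
                           maxConcurrentSessions="400"
                           maxConcurrentInstances="464" />
      </behavior>
    </serviceBehaviors>
  </behaviors>
</system.serviceModel>
```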