NGINX Uses Thread Pools to Boost Performance 9x (Part 1)

Posted by 运维帮


Introduction

It’s well known that NGINX uses an asynchronous, event-driven approach to handling connections. This means that instead of creating another dedicated process or thread for each request (as servers with a traditional architecture do), it handles multiple connections and requests in one worker process. To achieve this, NGINX works with sockets in a non-blocking mode and uses efficient methods such as epoll and kqueue.
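To make the idea concrete, here is a minimal Python sketch (not NGINX code) of one process watching several non-blocking sockets through a readiness API. It assumes Python's standard selectors module, which picks epoll or kqueue where the platform offers them. The connection names and the use of socketpair() to stand in for client connections are illustrative assumptions.

```python
import selectors
import socket

# DefaultSelector chooses the most efficient mechanism available
# (epoll on Linux, kqueue on BSD/macOS, otherwise a fallback).
sel = selectors.DefaultSelector()

# Hypothetical "connections": each socketpair stands in for a client/server link.
pairs = {name: socket.socketpair() for name in ("conn-1", "conn-2")}
for name, (client, server) in pairs.items():
    server.setblocking(False)                      # never block on this socket
    sel.register(server, selectors.EVENT_READ, data=name)
    client.sendall(name.encode())                  # simulate a client request

# One thread waits for readiness events and serves whichever sockets are ready.
received = {}
while len(received) < len(pairs):
    for key, _mask in sel.select(timeout=1):
        received[key.data] = key.fileobj.recv(1024)

# Clean up the selector and all sockets.
for client, server in pairs.values():
    sel.unregister(server)
    client.close()
    server.close()
sel.close()
```

A single thread serves both connections here; no per-connection process or thread is created.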

Because the number of full-weight processes is small (usually only one per CPU core) and constant, much less memory is consumed and CPU cycles aren’t wasted on task switching. The advantages of such an approach are well-known through the example of NGINX itself. It successfully handles millions of simultaneous requests and scales very well.


Each process consumes additional memory, and each switch between them consumes CPU cycles and thrashes the CPU caches

But the asynchronous, event-driven approach still has a problem. Or, as I like to think of it, an “enemy”. And the name of the enemy is: blocking. Unfortunately, many third-party modules use blocking calls, and users (and sometimes even the developers of the modules) aren’t aware of the drawbacks. Blocking operations can ruin NGINX performance and must be avoided at all costs.

Even in the current official NGINX code it’s not possible to avoid blocking operations in every case, and to solve this problem the new “thread pools” mechanism was implemented in NGINX version 1.7.11. What it is and how it is supposed to be used, we will cover later. Now let’s meet face to face with our enemy.

The Problem

First, to better understand the problem, a few words about how NGINX works.

In general, NGINX is an event handler, a controller that receives information from the kernel about all events occurring on connections and then gives commands to the operating system about what to do. In fact, NGINX does all the hard work by orchestrating the operating system, while the operating system does the routine work of reading and sending bytes. So it’s very important for NGINX to respond fast and in a timely manner.


The worker process listens for and processes events from the kernel

The events can be timeouts, notifications about sockets ready to read or to write, or notifications about an error that occurred. NGINX receives a bunch of events and then processes them one by one, doing the necessary actions. Thus all the processing is done in a simple loop over a queue in one thread. NGINX dequeues an event from the queue and then reacts to it by, for example, writing or reading a socket. In most cases, this is extremely quick (perhaps just requiring a few CPU cycles to copy some data into memory) and NGINX proceeds through all of the events in the queue in an instant.


All processing is done in a simple loop by one thread
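The loop described above can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the NGINX implementation: the event kinds, the handler table, and the function name run_event_loop are all hypothetical.

```python
from collections import deque

def run_event_loop(events, handlers):
    """Process queued events one at a time in a single thread."""
    queue = deque(events)
    log = []
    while queue:
        kind, conn = queue.popleft()       # dequeue the next event
        handler = handlers[kind]           # choose the reaction for this event type
        log.append(handler(conn))          # react; every handler must return quickly
    return log

# Hypothetical event types and reactions, for illustration only.
handlers = {
    "readable": lambda conn: f"read from {conn}",
    "writable": lambda conn: f"write to {conn}",
    "timeout": lambda conn: f"close {conn}",
}
events = [("readable", "c1"), ("writable", "c2"), ("timeout", "c3")]
result = run_event_loop(events, handlers)
```

Because there is only one thread, the loop is only as fast as its slowest handler.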

But what happens if handling an event requires some long and heavy operation? The whole cycle of event processing gets stuck waiting for this operation to finish.

So, by “a blocking operation” we mean any operation that stops the event-handling cycle for a significant amount of time. Operations can block for various reasons. For example, NGINX might be busy with lengthy, CPU-intensive processing, or it might have to wait for access to a resource (such as a hard drive or a mutex) or for a library function call that fetches responses from a database synchronously. The key point is that while processing such operations, the worker process cannot do anything else and cannot handle other events, even if there are more system resources available and some events in the queue could utilize those resources.

Imagine a salesperson in a store with a long queue in front of him. The first person in the queue asks for something that is not in the store but is in the warehouse. The salesperson goes to the warehouse to fetch the goods. Now the entire queue must wait a couple of hours for this delivery and everyone in the queue is unhappy. Can you imagine the reaction of the people? The waiting time of every person in the queue is increased by these hours, even though the items they intend to buy might be right there in the shop.


Everyone in the queue has to wait for the first person’s order

Nearly the same situation happens with NGINX when it asks to read a file that isn’t cached in memory, but needs to be read from disk. Hard drives are slow (especially the spinning ones), and while the other requests waiting in the queue might not need access to the drive, they are forced to wait anyway. As a result, latencies increase and system resources are not fully utilized.


Just one blocking operation can delay all following operations for a significant time
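A small back-of-the-envelope model shows the effect. The sketch below assumes made-up costs (10 microseconds for an in-memory event, 10,000 microseconds for a read that goes to disk); these are illustrative numbers, not measurements, and completion_times is a hypothetical helper.

```python
def completion_times(costs_us):
    """Return when each event finishes if one thread handles them in order."""
    clock = 0
    finished = []
    for cost in costs_us:
        clock += cost            # the next event cannot start until this one ends
        finished.append(clock)
    return finished

# Three fast events, each assumed to take 10 microseconds.
fast_only = completion_times([10, 10, 10])

# Same queue, but the first event blocks on an assumed 10,000-microsecond disk read.
with_disk_read = completion_times([10_000, 10, 10])
```

In this model the two fast events behind the disk read finish at 10,010 and 10,020 microseconds instead of 20 and 30, although neither of them touches the disk.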

Some operating systems provide an asynchronous interface for reading and sending files and NGINX can use this interface (see the aio directive). A good example here is FreeBSD. Unfortunately, we can’t say the same about Linux. Although Linux provides a kind of asynchronous interface for reading files, it has a couple of significant drawbacks. One of them is alignment requirements for file access and buffers, but NGINX handles that well. But the second problem is worse. The asynchronous interface requires the O_DIRECT flag to be set on the file descriptor, which means that any access to the file will bypass the cache in memory and increase load on the hard disks. That definitely doesn’t make it optimal for many cases.

To solve this problem in particular, thread pools were introduced in NGINX 1.7.11. They are not included by default in NGINX Plus yet, but contact sales if you’d like to try a build of NGINX Plus R6 that has thread pools enabled.

Now let’s dive into what thread pools are about and how they work.

Thread Pools

Let’s return to our poor sales assistant who delivers goods from a faraway warehouse. But he has become smarter (or maybe he became smarter after being beaten by the crowd of angry clients?) and hired a delivery service. Now when somebody asks for something from the faraway warehouse, instead of going to the warehouse himself, he just drops an order to a delivery service and they will handle the order while our sales assistant will continue serving other customers. Thus only those clients whose goods aren’t in the store are waiting for delivery, while others can be served immediately.


Passing an order to the delivery service unblocks the queue

In terms of NGINX, the thread pool is performing the functions of the delivery service. It consists of a task queue and a number of threads that handle the queue. When a worker process needs to do a potentially long operation, instead of processing the operation by itself it puts a task in the pool’s queue, from which it can be taken and processed by any free thread.


The worker process offloads blocking operations to the thread pool
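The structure of a task queue served by a fixed set of threads can be sketched as follows. This is a minimal Python illustration of the general pattern, not the NGINX implementation: the ThreadPool class, the slow_read function, the file name, and the "done" queue used to hand results back are all assumptions made for the example.

```python
import queue
import threading
import time

class ThreadPool:
    """A task queue plus a fixed number of threads that drain it."""

    def __init__(self, num_threads):
        self.tasks = queue.Queue()     # work waiting to be picked up
        self.done = queue.Queue()      # finished results handed back to the caller
        self.threads = [
            threading.Thread(target=self._worker, daemon=True)
            for _ in range(num_threads)
        ]
        for t in self.threads:
            t.start()

    def _worker(self):
        while True:
            item = self.tasks.get()    # any free thread takes the next task
            if item is None:           # sentinel: stop this thread
                break
            task_id, func, args = item
            self.done.put((task_id, func(*args)))

    def submit(self, task_id, func, *args):
        self.tasks.put((task_id, func, args))   # returns immediately

    def shutdown(self):
        for _ in self.threads:
            self.tasks.put(None)       # one sentinel per thread
        for t in self.threads:
            t.join()

def slow_read(name):
    time.sleep(0.05)                   # stands in for a blocking disk read
    return f"contents of {name}"

pool = ThreadPool(num_threads=2)
pool.submit("req-1", slow_read, "big.iso")          # offload the slow operation
served = [f"served {c}" for c in ("c2", "c3")]      # keep handling other clients
pool.shutdown()                                     # wait for offloaded work to finish

results = {}
while not pool.done.empty():
    task_id, value = pool.done.get()
    results[task_id] = value
```

The submitting thread never waits on slow_read; it only queues the task and later collects the result, which mirrors the salesperson handing the order to the delivery service.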

It seems then we have another queue. Right. But in this case the queue is limited by a specific resource. We can’t read from a drive faster than the drive is capable of producing data. Now at least the drive doesn’t delay processing of other events and only the requests that need to access files are waiting.

The “reading from disk” operation is often used as the most common example of a blocking operation, but the thread pool implementation in NGINX can actually be used for any task that isn’t appropriate to process in the main working cycle.

At the moment, offloading to thread pools is implemented only for two essential operations: the read() syscall on most operating systems and sendfile() on Linux. We will continue to test and benchmark the implementation, and we may offload other operations to the thread pools in future releases if there’s a clear benefit.

