Introduction to Rate Limiting and Traffic Shaping with NGINX

Posted By : Lokesh Singh | 28-Jun-2021

1. Introduction


Rate Limiting - Rate limiting means limiting the amount of traffic on a network. We can categorize rate limiting into two types.

  1. Traffic Policing - It measures the rate of incoming traffic and drops the network packets that exceed the maximum allowed rate.
  2. Traffic Shaping - It queues network packets and controls the rate at which they are output. It increases packet latency by introducing delays, and drops packets only if the queue overflows.
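
To make the difference concrete, here is a minimal sketch in Python that runs the same traffic through a policer and a shaper. It is only a toy model over discrete time ticks; the function names police and shape and the parameters max_per_tick and queue_size are made up for this illustration and do not correspond to any NGINX setting.

```python
def police(arrivals_per_tick, max_per_tick):
    """Policing: forward up to max_per_tick packets per tick, drop the rest."""
    forwarded = [min(a, max_per_tick) for a in arrivals_per_tick]
    dropped = [a - f for a, f in zip(arrivals_per_tick, forwarded)]
    return forwarded, dropped


def shape(arrivals_per_tick, max_per_tick, queue_size):
    """Shaping: queue excess packets and drop only when the queue overflows."""
    forwarded, dropped = [], []
    queued = 0
    for arrivals in arrivals_per_tick:
        queued += arrivals
        sent = min(queued, max_per_tick)         # output is capped per tick
        queued -= sent
        overflow = max(0, queued - queue_size)   # what does not fit is dropped
        queued -= overflow
        forwarded.append(sent)
        dropped.append(overflow)
    return forwarded, dropped


# Four packets arrive at once, then nothing for three ticks.
arrivals = [4, 0, 0, 0]
policed = police(arrivals, max_per_tick=1)
shaped = shape(arrivals, max_per_tick=1, queue_size=2)
print("policing:", policed)
print("shaping: ", shaped)
```

In this run the policer forwards one packet and drops three, while the shaper drops only one and spreads the other three over the following ticks, at the cost of extra latency.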

NGINX rate limiting allows you to limit the number of HTTP requests a user can make in a given period of time. A request can be any HTTP request (GET, POST, PUT, etc.). Rate limiting can also be used for security: it slows down brute-force and password-guessing attacks. More generally, it is used to prevent traffic spikes and to protect upstream application servers by capping the number of requests each user can make.

In this blog, we will discuss basic rate limiting with NGINX, covering traffic shaping and the burst parameter.


2. How Rate Limiting in NGINX Works


NGINX uses the leaky bucket algorithm, which is widely used in packet-switched computer networks. In the analogy, water is poured into a bucket from the top and leaks out of the bottom at a fixed rate; if water is poured in faster than it leaks out, the bucket overflows. The bucket represents the request queue, the incoming water represents the requests arriving from clients, which are queued in first-in-first-out (FIFO) order, and the leaking water represents the requests being processed by the server. Requests that arrive when the queue is full are rejected.
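
The analogy can be sketched in a few lines of Python. This is a generic leaky bucket over discrete time ticks, not NGINX's actual implementation; the function name simulate_leaky_bucket and its parameters are assumptions made for the example.

```python
from collections import deque


def simulate_leaky_bucket(arrivals_per_tick, capacity, leak_per_tick):
    """Simulate a leaky bucket over discrete ticks.

    arrivals_per_tick: number of requests poured in during each tick
    capacity:          how many requests the bucket (queue) can hold
    leak_per_tick:     how many requests are processed per tick
    Returns (processed, dropped) as lists of request ids.
    """
    queue = deque()
    processed, dropped = [], []
    next_id = 0
    for arrivals in arrivals_per_tick:
        # Pour: enqueue new requests; anything over capacity overflows.
        for _ in range(arrivals):
            if len(queue) < capacity:
                queue.append(next_id)
            else:
                dropped.append(next_id)
            next_id += 1
        # Leak: process the oldest requests first (FIFO).
        for _ in range(min(leak_per_tick, len(queue))):
            processed.append(queue.popleft())
    return processed, dropped


# Five requests arrive in the first tick; the bucket holds three.
processed, dropped = simulate_leaky_bucket([5, 0, 0], capacity=3, leak_per_tick=1)
print("processed:", processed)
print("dropped:  ", dropped)
```

Requests 0, 1 and 2 fit in the bucket and are processed one per tick in arrival order; requests 3 and 4 overflow and are dropped.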


3. NGINX Configuration for Rate Limiting


Rate limiting is configured with two NGINX directives, limit_req_zone and limit_req.


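# Defined in the http block: one shared zone keyed by client IP address.
limit_req_zone $binary_remote_addr zone=mylimit:15m rate=15r/s;
server {
    location /login/ {
        # Apply the limit from zone "mylimit" to this location.
        limit_req zone=mylimit;
        proxy_pass http://my_upstream;
    }
}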


The limit_req_zone directive sets the parameters for rate limiting. It is defined in the http block, so the same zone can be used in multiple contexts. The limit_req directive enables rate limiting in the context where it appears (here, all requests to /login/).

The limit_req_zone directive takes the following parameters:

  • Key - It defines the request characteristic against which the limit is applied. Here we use the $binary_remote_addr variable, which holds a binary representation of the client's IP address, so each unique IP address is limited separately. In the same way, we can limit requests per virtual server by using $server_name as the key.
  • Zone - It defines the shared memory zone used to store the state of each IP address and how often it has accessed a request-limited URL. Because this memory is shared, it can be used by every NGINX worker process. Here the zone is named mylimit and is 15 megabytes in size.
  • Rate - It sets the maximum request rate. Here a user cannot exceed 15 requests per second. NGINX tracks requests at millisecond granularity, so this corresponds to one request every 1/15 s (about 67 ms). If a user exceeds the limit, NGINX returns 503 (Service Temporarily Unavailable) by default.
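
The way the key and the rate interact can be illustrated with a short Python sketch. It is a simplified model loosely based on how NGINX accounts for requests, not its real code: the helper make_limiter, the client addresses and the timestamps are all hypothetical, and no burst is allowed here.

```python
def make_limiter(rate_per_sec):
    """Return check(key, now_ms), which allows rate_per_sec requests per
    second for each key separately, with no burst."""
    # key -> (excess in 1/1000 of a request, time of last accepted request in ms)
    state = {}

    def check(key, now_ms):
        if key not in state:
            state[key] = (0, now_ms)   # first request from this key
            return 200
        excess, last_ms = state[key]
        # Drain rate_per_sec thousandths per millisecond, then add one
        # whole request (1000 thousandths); never go below zero.
        excess = max(0, excess - rate_per_sec * (now_ms - last_ms) + 1000)
        if excess > 0:                 # over the rate and no burst allowed
            return 503
        state[key] = (excess, now_ms)
        return 200

    return check


check = make_limiter(rate_per_sec=15)      # like rate=15r/s
client_a = [check("203.0.113.10", t) for t in (0, 30, 70)]
client_b = check("203.0.113.20", 30)       # a different key has its own state
print(client_a, client_b)
```

The second request from the first client arrives only 30 ms after the first one, which is sooner than the roughly 67 ms spacing that 15 requests per second allows, so it gets a 503. The request from the second client at the same moment is accepted because each key is tracked separately.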

The limit_req_zone directive only sets the parameters for rate limiting; it does not itself limit any requests. For the limit to take effect, we have to apply it with limit_req in the relevant location or server block.


Handling Bursts

The burst parameter defines how many requests a user can make in excess of the rate specified by the zone. Without it, a request that arrives too soon after the previous one is rejected with 503. Since applications tend to be bursty, we usually want to buffer the excess requests and serve them in a timely manner instead; this is what the burst parameter of limit_req does.


location /signup/ {
    limit_req zone=mylimit burst=20;
    proxy_pass http://my_upstream;
}
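
To get a feel for what burst=20 means at 15 requests per second, the following Python sketch estimates which requests are accepted and how long each one is held back. It is a simplified model under the assumption that the queue drains at exactly the configured rate; the function limit_req_delays and the traffic pattern are invented for the example.

```python
def limit_req_delays(arrivals_ms, rate_per_sec, burst):
    """Return a (status, delay_ms) pair for each request from one client."""
    results = []
    excess = 0        # backlog in 1/1000 of a request
    last_ms = None    # arrival time of the last accepted request
    for now_ms in arrivals_ms:
        if last_ms is None:
            new_excess = 0
        else:
            # Drain at the configured rate, then add this request.
            new_excess = max(0, excess - rate_per_sec * (now_ms - last_ms) + 1000)
        if new_excess > burst * 1000:
            results.append((503, None))    # queue is full: reject
            continue
        excess, last_ms = new_excess, now_ms
        # The request waits until the backlog ahead of it has drained.
        results.append((200, new_excess // rate_per_sec))
    return results


# 25 requests from the same client arrive at the same instant.
results = limit_req_delays([0] * 25, rate_per_sec=15, burst=20)
accepted = [r for r in results if r[0] == 200]
print(len(accepted), "accepted; last one delayed", accepted[-1][1], "ms")
```

In this model the first request is forwarded immediately, the next 20 are queued and released about 67 ms apart, and the remaining 4 are rejected. The last queued request waits about 1.3 seconds.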


Queueing with No Delay

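Configuring burst results in a smooth flow of traffic, but it is not very practical on its own, because queued requests are delayed and the site appears slow. With rate=15r/s and burst=20, for example, the 20th queued request waits about 1.3 seconds (20 × 1/15 s) before it is forwarded. To solve this problem, we add the nodelay parameter alongside burst.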


location /signup/ {
    limit_req zone=mylimit burst=20 nodelay;
    proxy_pass http://my_upstream;
}


With the nodelay parameter, when a request arrives too soon, NGINX forwards it immediately as long as a slot is available in the queue, and marks that slot as taken. Slots are still freed at the configured rate, so the rate limit is enforced without adding a wait between requests. If no slot is free, the request is rejected. In most cases, it is recommended to use burst together with nodelay in the limit_req directive.
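
The slot behaviour can be sketched in Python as follows. As before, this is a simplified model rather than NGINX's own code, and the function limit_req_nodelay and the arrival times are assumptions chosen for the illustration.

```python
def limit_req_nodelay(arrivals_ms, rate_per_sec, burst):
    """Return the status of each request from one client when burst is
    combined with nodelay: accepted requests are never held back."""
    statuses = []
    excess = 0        # taken slots, in 1/1000 of a request
    last_ms = None    # arrival time of the last accepted request
    for now_ms in arrivals_ms:
        if last_ms is None:
            new_excess = 0
        else:
            # Slots are freed at the configured rate; this request takes one.
            new_excess = max(0, excess - rate_per_sec * (now_ms - last_ms) + 1000)
        if new_excess > burst * 1000:
            statuses.append(503)           # no free slot: reject
            continue
        excess, last_ms = new_excess, now_ms
        statuses.append(200)               # forwarded immediately
    return statuses


# 21 requests arrive at once, then 3 more arrive 70 ms later.
statuses = limit_req_nodelay([0] * 21 + [70] * 3, rate_per_sec=15, burst=20)
print(statuses[:21].count(200), "forwarded at once; later arrivals:", statuses[21:])
```

All 21 initial requests are forwarded without delay. After 70 ms only one slot has been freed (one slot frees up roughly every 67 ms at 15 requests per second), so one of the three later requests is forwarded and the other two are rejected.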


4. Conclusion


In this blog, we covered how to implement basic rate limiting with NGINX and how to apply limits to different HTTP locations.

About Author

Lokesh Singh

He is hard-working, punctual, and keen to learn new technologies. He works as both a Full Stack Developer and a MEAN Stack Developer, on frontend and backend technologies.