Rate Limiter

Objective:

Understand what a rate limiter is and implement one with Flask.

Introduction to Rate Limiting:

Rate limiting is used to control the amount of incoming and outgoing traffic to or from a network. For example, let’s say you are using a particular service’s API that is configured to allow 100 requests/minute. If the number of requests you make exceeds that limit, then an error will be triggered. The reasoning behind implementing rate limits is to allow for a better flow of data and to increase security by mitigating attacks such as DoS.

Rate limiting is also useful if a particular user on the network makes a mistake in their request and asks the server to retrieve so much information that it may overload the network for everyone. With rate limiting in place, however, these types of errors or attacks are much more manageable.

Types of Rate Limits

There are various methods and parameters that can be defined when setting rate limits. The rate limit method that should be used will depend on what you want to achieve as well as how restrictive you want to be. The following are different types of rate limiting methods that you can implement.

  1. User rate limiting: The most popular type of rate limiting is user rate limiting. This associates the number of requests a user is making to their API key or IP (depending on which method you use). Therefore, if the user exceeds the rate limit, then any further requests will be denied until they reach out to the developer to increase the limit or wait until the rate limit timeframe resets.
  2. Geographic rate limiting: To further increase security in certain geographic regions, developers can set rate limits for particular regions and particular time periods. For instance, if a developer knows that from midnight to 8:00 am users in a particular region won’t be as active, then they can define lower rate limits for that time period. This can be used as a preventative measure to help further reduce the risk of attacks or suspicious activity.
  3. Server rate limiting: If a developer has defined certain servers to handle certain aspects of their application then they can define rate limits on a server-level basis. This gives developers the freedom to decrease traffic limits on server A while increasing it on server B (a more commonly used server).
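
The types above differ mainly in what a limit is keyed on and which limit applies at a given moment. The following is a minimal sketch of user and geographic rate limiting. The names (`limit_key`, `limit_for`, `REGION_LIMITS`) and all numbers are hypothetical and only illustrate the idea.

```python
from typing import Optional

# Hypothetical limits in requests per minute; the numbers are illustrative only.
REGION_LIMITS = {"eu": 100, "us": 100}
QUIET_HOURS_LIMIT = 20      # lower cap from midnight to 8:00 am
DEFAULT_LIMIT = 60


def limit_key(api_key: Optional[str], ip: str) -> str:
    """User rate limiting: count requests per API key, falling back to the IP."""
    return f"key:{api_key}" if api_key else f"ip:{ip}"


def limit_for(region: str, hour: int) -> int:
    """Geographic rate limiting: a lower limit during a region's quiet hours."""
    if 0 <= hour < 8:
        return QUIET_HOURS_LIMIT
    return REGION_LIMITS.get(region, DEFAULT_LIMIT)
```

A server-level limit could be looked up in the same way, keyed on the server name instead of the region.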

Scaling your API with rate limiters:

Rate limiting can help make your API more reliable in the following scenarios:

  • One of your users is responsible for a spike in traffic, and you need to stay up for everyone else.
  • One of your users has a misbehaving script which is accidentally sending you a lot of requests. Or, even worse, one of your users is intentionally trying to overwhelm your servers.
  • A user is sending you a lot of lower-priority requests, and you want to make sure that it doesn’t affect your high-priority traffic. For example, users sending a high volume of requests for analytics data could affect critical transactions for other users.
  • Something in your system has gone wrong internally, and as a result you can’t serve all of your regular traffic and need to drop low-priority requests.

Using different kinds of rate limiters:

We run several different types of limiters in production. The first one, the Request Rate Limiter, is by far the most important one. We recommend you start here if you want to improve the robustness of your API.

Request rate limiter

This rate limiter restricts each user to N requests per second. Request rate limiters are the first tool most APIs can use to effectively manage a high volume of traffic.

Our request rate limiter is constantly triggered. It has rejected millions of requests this month alone, especially for test mode requests where a user inadvertently runs a script that’s gotten out of hand.

Our API provides the same rate limiting behavior in both test and live modes. This makes for a good developer experience: scripts won’t encounter side effects due to a particular rate limit when moving from development to production.

After analyzing our traffic patterns, we added the ability to briefly burst above the cap for sudden spikes in usage during real-time events (e.g. a flash sale).
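
The text does not say which algorithm sits behind this limiter. A token bucket is one common way to get "N requests per second with brief bursts", so the sketch below assumes it. The class name, the per-user dictionary and the default numbers are hypothetical, and state is kept in process memory rather than in a shared store.

```python
import time


class TokenBucket:
    """Allow `rate` requests per second on average, with bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = capacity          # start full so an initial burst is allowed
        self.updated = clock()

    def allow(self) -> bool:
        now = self.clock()
        # Refill in proportion to the elapsed time, never beyond the capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False                    # the caller would answer with HTTP 429


buckets = {}                            # one bucket per user (in-memory assumption)


def allow_request(user_id: str, rate: float = 5, capacity: float = 10) -> bool:
    bucket = buckets.get(user_id)
    if bucket is None:
        bucket = buckets[user_id] = TokenBucket(rate, capacity)
    return bucket.allow()
```

The `capacity` parameter controls how far a user can burst above the steady rate, and `rate` controls how quickly that allowance is restored.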

Concurrent requests limiter

Instead of “You can use our API 1000 times a second”, this rate limiter says “You can only have 20 API requests in progress at the same time”. Some endpoints are much more resource-intensive than others, and users often get frustrated waiting for the endpoint to return and then retry. These retries add more demand to the already overloaded resource, slowing things down even more. The concurrent rate limiter helps address this nicely.

Our concurrent request limiter is triggered much less often (12,000 requests this month), and helps us keep control of our CPU-intensive API endpoints. Before we started using a concurrent requests limiter, we regularly dealt with resource contention on our most expensive endpoints caused by users making too many requests at one time. The concurrent request limiter totally solved this.

It is completely reasonable to tune this limiter up so it rejects more often than the Request Rate Limiter. It asks your users to use a different programming model of “Fork off X jobs and have them process the queue” compared to “Hammer the API and back off when I get an HTTP 429”. Some APIs fit better into one of those two patterns, so feel free to use whichever one is most suitable for the users of your API.
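
A concurrent requests limiter can be sketched as a per-user counter of in-flight requests. The version below is an in-memory illustration with hypothetical names. A deployment with several machines would need a shared store for the counters.

```python
import threading


class ConcurrencyLimiter:
    """Cap the number of requests a user may have in progress at the same time."""

    def __init__(self, max_in_flight: int = 20):
        self.max_in_flight = max_in_flight
        self.in_flight = {}             # user_id -> requests currently in progress
        self.lock = threading.Lock()

    def try_acquire(self, user_id: str) -> bool:
        with self.lock:
            current = self.in_flight.get(user_id, 0)
            if current >= self.max_in_flight:
                return False            # the caller would answer with HTTP 429
            self.in_flight[user_id] = current + 1
            return True

    def release(self, user_id: str) -> None:
        with self.lock:
            remaining = self.in_flight.get(user_id, 0) - 1
            if remaining > 0:
                self.in_flight[user_id] = remaining
            else:
                self.in_flight.pop(user_id, None)
```

A request handler would call `try_acquire` before doing any work and call `release` in a `finally` block, so that failed requests also free their slot.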

Fleet usage load shedder

Using this type of load shedder ensures that a certain percentage of your fleet will always be available for your most important API requests.

We divide up our traffic into two types: critical API methods (e.g. creating charges) and non-critical methods (e.g. listing charges). We have a Redis cluster that counts how many requests we currently have of each type.

We always reserve a fraction of our infrastructure for critical requests. If our reservation number is 20%, then any non-critical request over the 80% allocation would be rejected with status code 503.

We triggered this load shedder for a very small fraction of requests this month. By itself, this isn’t a big deal—we definitely had the ability to handle those extra requests. But we’ve had other months where this has prevented outages.
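
The reservation rule can be sketched as follows. This is an assumption-laden illustration: it keeps the counters in a local dictionary instead of a Redis cluster, and the class and method names are hypothetical.

```python
class FleetLoadShedder:
    """Keep a share of the fleet free for critical requests."""

    def __init__(self, fleet_capacity: int, reserved_percent: int = 20):
        # Non-critical traffic may use at most the unreserved share of the fleet.
        self.non_critical_cap = fleet_capacity * (100 - reserved_percent) // 100
        self.in_progress = {"critical": 0, "non_critical": 0}

    def admit(self, kind: str) -> bool:
        """Return False when the request should be rejected with status code 503."""
        if kind == "non_critical" and self.in_progress[kind] >= self.non_critical_cap:
            return False
        self.in_progress[kind] += 1
        return True

    def done(self, kind: str) -> None:
        """Call when a previously admitted request has finished."""
        self.in_progress[kind] -= 1
```

With a capacity of 10 and a 20% reservation, the ninth simultaneous non-critical request is rejected while critical requests are still admitted.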

Worker utilization load shedder

Most API services use a set of workers to independently respond to incoming requests in a parallel fashion. This load shedder is the final line of defense. If your workers start getting backed up with requests, then this will shed lower-priority traffic.

This one gets triggered very rarely, only during major incidents.

We divide our traffic into 4 categories:

  1. Critical methods
  2. POSTs
  3. GETs
  4. Test mode traffic

We track the number of workers with available capacity at all times. If a box is too busy to handle its request volume, it will slowly start shedding less-critical requests, starting with test mode traffic. If shedding test mode traffic gets it back into a good state, great! We can start to slowly bring traffic back. Otherwise, it’ll escalate and start shedding even more traffic.
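
The escalation order can be sketched with one utilization threshold per category. The thresholds below are hypothetical, and the sketch uses a hard cut-off per category rather than the gradual shedding described above.

```python
# Hypothetical thresholds: the worker utilization above which a category is shed.
# Test mode traffic goes first, then GETs, then POSTs. Critical methods are never shed.
SHED_ABOVE = {"test_mode": 0.80, "get": 0.90, "post": 0.95}


def should_shed(category: str, busy_workers: int, total_workers: int) -> bool:
    """Return True if a request of this category should be dropped."""
    utilization = busy_workers / total_workers
    threshold = SHED_ABOVE.get(category)    # None for critical methods
    return threshold is not None and utilization > threshold
```

As utilization falls back below a threshold, the corresponding category is automatically admitted again.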

API Rate Limiting / API Throttling

If you look at an end-to-end scenario, you first have an overall limit on the number of calls your backend can process per unit of time. This is often measured in TPS (transactions per second). In some cases, systems also have a physical limit on the amount of data that can be transferred, measured in bytes. For example, your backend might be able to process 2000 TPS. This is called backend rate limiting.
Very often, multiple clients each get an overall rate limit on the requests they are allowed to send; this is called application rate limiting. If a client starts to send too many requests, its connection gets throttled, which means that processing slows down but the client is not disconnected. This keeps the connection open and helps keep errors down. However, connections may time out, and keeping connections open for longer can open a vector for denial-of-service attacks.

API Burst

Sometimes you want to enable a single client to send more than its actual limit because your system has spare bandwidth or is idle. This is called API peak. Sometimes clients cannot control the API calls that are emitted, and this is where API burst can help. It allows your client to send a certain amount of traffic above its usual limit. For example, you allow your client to send 20 TPS, but they send 30 transactions that are processed very quickly, and your systems may well be able to absorb the load. From an implementation perspective, the leaky bucket is a well-known algorithm for this.
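
A leaky bucket can be sketched as a counter that fills by one per request and drains at the allowed rate. The sketch below uses the 20 TPS figure from the example above. The bucket capacity of 30 and the class name are assumptions made for illustration.

```python
import time


class LeakyBucket:
    """Accept bursts up to `capacity` while draining at `leak_rate` per second."""

    def __init__(self, leak_rate: float, capacity: float, clock=time.monotonic):
        self.leak_rate = leak_rate      # requests drained per second, e.g. 20 TPS
        self.capacity = capacity        # how large a burst the bucket can absorb
        self.clock = clock
        self.level = 0.0
        self.updated = clock()

    def allow(self) -> bool:
        now = self.clock()
        # Drain the bucket for the time that has passed since the last call.
        self.level = max(0.0, self.level - (now - self.updated) * self.leak_rate)
        self.updated = now
        if self.level + 1 <= self.capacity:
            self.level += 1
            return True
        return False                    # bucket is full: reject or delay the call
```

With `LeakyBucket(leak_rate=20, capacity=30)`, a client can send 30 calls at once, after which it is held to the 20 TPS drain rate.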

API Quota

Looking more at the commercial aspect and the long-term consumption of calls and data, API quotas are used a lot. API quotas usually describe a certain number of calls over longer intervals. For example, your API quota might be 5,000 calls per month. Remember that this can be combined with a rate limit or throttling setup, e.g. 20 TPS (transactions per second).
To enforce an API quota, you need to identify the client or consumer, which is why the term user quota (aka organization quota) is used. Usually this is where API management solutions help: consumers come in and select a certain plan that has a quota attached. Quite often you will also find an SLA attached, which defines the response times and availability of the service. This is important from the consumer side, but it is also important for the provider to keep an eye on when the API is your product.
Looking at API quotas in more detail, you can set a limit not only for a client or consumer overall but also per consuming application (e.g. per API key); this is what we call an application quota. You could go even further and limit certain methods or calls, for example because those calls consume more compute power on your backend.
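
An application quota can be sketched as a call counter per API key and calendar month. The class below is a hypothetical in-memory illustration. An API management product would persist these counters and tie them to a plan.

```python
from collections import defaultdict
from datetime import date


class MonthlyQuota:
    """Count calls per API key and calendar month."""

    def __init__(self, calls_per_month: int = 5000):
        self.calls_per_month = calls_per_month
        self.used = defaultdict(int)    # (api_key, year, month) -> calls used

    def consume(self, api_key: str, today: date) -> bool:
        period = (api_key, today.year, today.month)
        if self.used[period] >= self.calls_per_month:
            return False                # quota exhausted until the next month
        self.used[period] += 1
        return True
```

A quota check like this would run in addition to a per-second rate limit, not instead of it.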

Rate Limit Implementation

There are various ways to go about actually implementing rate limits. This can be done at the server level, via a programming language, or even via a caching mechanism. The two implementation examples below show how to integrate rate limiting via either Nginx or Apache.

Nginx

If you’re using Nginx as your web server and would like to implement rate limiting at the server-level then you can take advantage of the module ngx_http_limit_req_module. This can be implemented directly within your Nginx configuration file. Using this method, Nginx rate limits based on the user’s IP address.

http {
    limit_req_zone $binary_remote_addr zone=one:10m rate=2r/s;
    ...

    server {
        ...
        location /promotion/ {
            limit_req zone=one burst=5;
        }
    }
}

The snippet above allows no more than 2 requests per second on average, with bursts not exceeding 5 requests.

Apache

Similarly, Apache users can also implement rate limiting within the Apache configuration file using more or less the same method as Nginx users. With Apache, the mod_ratelimit module must be used in order to limit client bandwidth. Throttling is applied to each HTTP response instead of being aggregated at the IP/client level.

<Location "/promotion">
    SetOutputFilter RATE_LIMIT
    SetEnv rate-limit 400
    SetEnv rate-initial-burst 512
</Location>
 

The rate-limit value in the snippet above is defined in KiB/s, so the connection speed to be simulated is 400 KiB/s. The rate-initial-burst value is defined in KiB, so the first 512 KiB of each response are sent at full speed before throttling starts.

Implementation of Rate Limiter Using Flask:

INSTALL PYTHON3:

Download the Python 3 package from www.python.org.

Step-1: After downloading the Python 3 package, run it. On the first screen, check the “Add Python 3.6 to PATH” checkbox at the bottom, and then click “Install Now”.

Step-2: The next screen offers “Disable path length limit”. Click on it. This change won’t break anything; it allows Python to use long path names, which can help smooth over any path-related issues you might have while working on Windows.

Step-3: Once the installer shows its final screen, the installation is complete.

Congratulations! Now you have Python 3 on your computer.

Installation of Flask:

To install Flask, you can follow the official Flask installation guide or just follow the steps below:

Step 1: Install virtual environment

If you are using Python 3, then you don’t have to install a virtual environment tool, because Python 3 already comes with the venv module for creating virtual environments.

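Step 2: Create an environment

Create a project folder and a venv folder within it, for example by running python3 -m venv venv inside the project folder (py -3 -m venv venv on Windows). Then activate the environment with . venv/bin/activate (venv\Scripts\activate on Windows).

Your shell prompt will change to show the name of the activated environment.

Step 3:

Within the activated environment, use the following command to install Flask:

    pip install Flask

Step 4:

Within the activated environment, use the following command to install the limiter (assumed here to be the Flask-Limiter extension):

    pip install Flask-Limiter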

Implementation of Rate Limiter Using Python Scripts:

a.) Implementation of a rate limiter of 4 requests per day.

b.) Implementation of a rate limiter of 1 request per minute.

An alternative approach to rate limiting:

Figma:

Figma’s rate limiter combines a few standard techniques to control the rate at which people issue requests to your server, and it’s relatively accurate, simple, and space-efficient. If you’re a company building web applications at consumer scale, our rate limiter can prevent users from harming your website’s availability with a spate of requests. It also happens to be great at stopping spam attacks, as we discovered.

For those unfamiliar, a rate limiter caps how many requests a sender — this could be a user or an IP address — can issue in a specific window of time (e.g. 25 requests per minute). In Figma’s case, it also needed to:

  • store data externally so that the multiple machines running our web application could share it
  • not appreciably slow down web requests
  • efficiently eject outdated data
  • accurately limit excessive use of our web application
  • use as little memory as possible

The first three requirements were easy to meet. At Figma, we use two external data stores — Redis and PostgreSQL — and Redis was the better location to save tracking data. Redis is an in-memory data store that offers extremely quick reads and writes relative to PostgreSQL, an on-disk relational database. Even better, it’s simple to specify when Redis should delete expired data.

Finding a way to satisfy the last two requirements — accurately controlling web traffic and minimizing memory usage — was more of a challenge. Here are the existing rate limiter implementations I considered:

  • Token bucket
  • Fixed window counters
  • Sliding window log
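
Two of the listed approaches, fixed window counters and the sliding window log, can be sketched as follows. This is an in-memory illustration with hypothetical class names, not Figma's implementation, which stores its tracking data in Redis.

```python
import time
from collections import defaultdict, deque


class FixedWindowCounter:
    """One counter per sender and window; cheap, but allows bursts at window edges."""

    def __init__(self, limit: int, window: float, clock=time.time):
        self.limit, self.window, self.clock = limit, window, clock
        # (sender, window index) -> request count. Old windows are never deleted
        # in this sketch; a store with key expiry would eject them.
        self.counts = defaultdict(int)

    def allow(self, sender: str) -> bool:
        key = (sender, int(self.clock() // self.window))
        if self.counts[key] >= self.limit:
            return False
        self.counts[key] += 1
        return True


class SlidingWindowLog:
    """One timestamp per accepted request; accurate, but uses more memory."""

    def __init__(self, limit: int, window: float, clock=time.time):
        self.limit, self.window, self.clock = limit, window, clock
        self.logs = defaultdict(deque)  # sender -> timestamps of accepted requests

    def allow(self, sender: str) -> bool:
        now = self.clock()
        log = self.logs[sender]
        # Eject timestamps that have fallen out of the window.
        while log and log[0] <= now - self.window:
            log.popleft()
        if len(log) >= self.limit:
            return False
        log.append(now)
        return True
```

The trade-off between the two matches the last two requirements above: the fixed window counter stores a single number per sender but lets a sender spend two windows' worth of requests around a window boundary, while the sliding window log is accurate at the cost of storing one timestamp per request.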
