We use different forms of rate limiting throughout DomainTools. Common use cases are detecting abusive actors trying to scrape our data and limiting requests to external services to avoid placing any undue load.
I have seen and used many different algorithms for rate limiting requests but none of which met the requirements for a project I was recently working on. These requirements being strict limits avoiding time boundary issues, multiple limit rules, manual blocks, and a distributed architecture so that multiple systems can use it concurrently.
The first problem I see with a lot of algorithms is how they handle time boundaries. For example, lets say we wanted to limit to 20 requests per minute. We could increment a counter when we make a request and clear this count at the start of every new minute. If the counter is at 20 and we want to make another request, this would be over our limit. This appears to do the right thing as we would average only 20 requests per minute. The problem with this simple algorithm is that it is possible to make 20 requests in the last second of a given minute, and another 20 in the first second of the next. 40 requests in 2 seconds. I need something where we never make more than 20 requests in a sliding 60 second window.
To minimize this effect, you can use more time buckets for a higher resolution but then counting requests back days or longer will require a large number of buckets.
I also needed more sophisticated limit rules. I may want to limit to 800 requests daily, but I do not want a burst of 800 requests in the first minute of that day. What I want was to define multiple limit rules which work together. Say let a maximum of 1 request through per second, 20 per minute, 200 per hour, and 800 per day.
Rate limiting solution: Redis
The Redis datastore was an obvious choice to implement my own version of a rate limiter:
- It is extremely fast, persistent, and is a lot more than just a simple key/value store.
- The algorithm and data structure I chose was straightforward.
- I keep a timestamp log of requests in a Redis list data type which allows fast appending and trimming of elements.
As an example, I will use the table below to show the timestamps of five previous requests.
If I have two rules of 1 per second and 5 per minute, I grab the 1st and 5th timestamps representing the last request and 5 requests ago.
If the 1st timestamp is less than 1 second old the first rate limit rule will fail. Likewise if the 5th timestamp is less than 1 minute old the second rule fails.
At 12:34:31 the first timestamp is 3 seconds old, but the 5th timestamp is only 56 seconds old triggering the second rule. We could calculate we need to wait 4 seconds before trying again.
At 12:34:40 the first rule passes as it is 12 seconds old, the second rule also passes as it is now 65 seconds old.
We can then allow the request and append the current time to the timestamp log.
As we never look past the 5th request to check our two rules, we can trim the list to 5 elements. The maximum time we need to keep requests is 1 minute from the longest limit rule, so this is what I set the time to live. If we never use this list again, it will drop out of the database after 60 seconds trimming the dataset of unnecessary entries.
After inserting our new timestamp the redis list now looks like this.
We store a large number of these timestamp lists in our redis store, all referenced by user definable keys. This allows us to track a large number of requests independently.
To give an indication on how this will scale, I use this to track over 100,000 timestamp logs averaging 60 requests in each list. Timestamp logs are kept for one day and the redis server sits easily in a few hundred MB of RAM.
I have made a Python implementation of this algorithm available on GitHub at https://github.com/DomainTools/rate-limit