Building a Distributed Rate Limiter in Go
Recently at Cloudscape, we needed to implement rate limiting across our microservices cluster. While simple rate limiters work fine for single-instance applications, distributed systems require a different approach. Let’s build a Redis-based distributed rate limiter in Go that’s both efficient and accurate.
The Challenge
When running multiple service instances, a traditional in-memory rate limiter won’t work because each instance only sees its own traffic. We need a centralized way to track and enforce rate limits across all instances.
Building the Rate Limiter
Let’s create a distributed rate limiter using the token bucket algorithm. Here’s the core implementation:
import (
    "context"
    "time"

    "github.com/redis/go-redis/v9"
)

type DistributedRateLimiter struct {
    client     *redis.Client
    maxTokens  int64         // bucket capacity
    refillRate time.Duration // time for an empty bucket to refill completely
    keyPrefix  string        // namespace for Redis keys, e.g. "ratelimit:"
}

func (rl *DistributedRateLimiter) Allow(ctx context.Context, key string) (bool, error) {
    // The Lua script checks and refills the bucket atomically on the Redis
    // server, so concurrent instances cannot race between read and update.
    now := float64(time.Now().UnixMicro()) / 1e6
    result, err := rl.client.Eval(ctx, tokenBucketScript, []string{rl.keyPrefix + key},
        rl.maxTokens, rl.refillRate.Seconds(), now).Int()
    return result == 1, err
}
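The Allow method references a tokenBucketScript constant that isn't shown above. Here's one possible sketch matching the KEYS/ARGV layout used in the call; treat it as illustrative, not the exact original script:

const tokenBucketScript = `
local capacity = tonumber(ARGV[1])
local refill_seconds = tonumber(ARGV[2])
local now = tonumber(ARGV[3])

local bucket = redis.call('HMGET', KEYS[1], 'tokens', 'ts')
local tokens = tonumber(bucket[1]) or capacity
local ts = tonumber(bucket[2]) or now

-- Refill in proportion to the time elapsed since the last request.
-- math.max guards against clock skew between service instances.
local elapsed = math.max(0, now - ts)
tokens = math.min(capacity, tokens + elapsed * (capacity / refill_seconds))

local allowed = 0
if tokens >= 1 then
    tokens = tokens - 1
    allowed = 1
end

redis.call('HSET', KEYS[1], 'tokens', tokens, 'ts', now)
-- Expire idle buckets so abandoned keys don't pile up in Redis.
redis.call('EXPIRE', KEYS[1], math.ceil(refill_seconds * 2))
return allowed
`

The usage example below also calls a NewDistributedRateLimiter constructor that isn't defined above; a minimal version could look like this:

// NewDistributedRateLimiter is a minimal constructor sketch for the
// struct above (the original snippet doesn't include one).
func NewDistributedRateLimiter(client *redis.Client, maxTokens int64,
    refillRate time.Duration, keyPrefix string) *DistributedRateLimiter {
    return &DistributedRateLimiter{
        client:     client,
        maxTokens:  maxTokens,
        refillRate: refillRate,
        keyPrefix:  keyPrefix,
    }
}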
Usage Example
Here’s how to use our rate limiter in a web service:
limiter := ratelimit.NewDistributedRateLimiter(
    redisClient,
    100,          // max tokens: bucket capacity, and the maximum burst size
    time.Minute,  // refill rate: an empty bucket refills fully in one minute
    "ratelimit:", // key prefix
)

http.HandleFunc("/api", func(w http.ResponseWriter, r *http.Request) {
    // r.RemoteAddr is "ip:port"; strip the port so each client gets one
    // bucket rather than one bucket per TCP connection.
    ip, _, _ := net.SplitHostPort(r.RemoteAddr)

    allowed, err := limiter.Allow(r.Context(), ip)
    if err != nil {
        allowed = true // fail open: don't drop traffic just because Redis is down
    }
    if !allowed {
        http.Error(w, "Rate limit exceeded", http.StatusTooManyRequests)
        return
    }
    // Handle request...
})
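Two choices in that handler are worth making explicit. First, it fails open: if the Redis check errors, the request is served anyway, which keeps a Redis outage from taking down the API at the cost of unenforced limits during the outage; fail closed instead if the limiter protects a scarce or billable resource. Second, keying by client IP undercounts users behind a shared NAT and overcounts users behind rotating proxies, so prefer keying by API token or user ID when authentication is available.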
Performance Considerations
- Redis Connection Pool: Use a properly sized connection pool to handle concurrent requests (see the sketch after this list).
- Key Design: Include enough information in the key to differentiate between users/resources while avoiding key explosion.
- Monitoring: Track rate limiter metrics like rejection rate and Redis latency.
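To make the connection-pool point concrete, go-redis exposes pool sizing on redis.Options. The values below are illustrative starting points to tune under your own load, not recommendations:

import (
    "runtime"
    "time"

    "github.com/redis/go-redis/v9"
)

redisClient := redis.NewClient(&redis.Options{
    Addr:         "localhost:6379",
    PoolSize:     10 * runtime.GOMAXPROCS(0), // go-redis's default; raise it if checks queue up
    MinIdleConns: 5,                          // keep a few warm connections for burst traffic
    PoolTimeout:  time.Second,                // fail fast instead of waiting indefinitely for a connection
})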
Load Testing Results
In our production environment, this implementation handles:
- 10,000 requests/second across 5 service instances
- Average latency of 2ms for rate limit checks
- Redis memory usage of ~100MB for 1M active buckets
Next Steps
You could extend this implementation to support:
- Different bucket types, such as leaky bucket or fixed window (a fixed-window sketch follows this list)
- Multiple rate limits per request
- Rate limit sharing across service groups
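As an example of the first item, a fixed-window limiter needs only INCR and EXPIRE rather than a Lua script, because INCR is already atomic. This sketch reuses the struct from earlier; AllowFixedWindow is a hypothetical name, and it assumes "fmt" is imported and a refill rate of at least one second:

// AllowFixedWindow counts requests per fixed time window. It is simpler
// than the token bucket but permits up to 2x the limit in bursts that
// straddle a window boundary.
func (rl *DistributedRateLimiter) AllowFixedWindow(ctx context.Context, key string) (bool, error) {
    window := time.Now().Unix() / int64(rl.refillRate.Seconds())
    windowKey := fmt.Sprintf("%s%s:%d", rl.keyPrefix, key, window)

    count, err := rl.client.Incr(ctx, windowKey).Result()
    if err != nil {
        return false, err
    }
    if count == 1 {
        // First request in this window: expire the counter when the window ends.
        rl.client.Expire(ctx, windowKey, rl.refillRate)
    }
    return count <= rl.maxTokens, nil
}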
The complete code is available in my GitHub repository.
Conclusion
Distributed rate limiting is a crucial component of modern microservices. This implementation provides a solid foundation that you can build upon based on your specific needs.