Written by Poogle
on December 25, 2022

SDI Example - Design a Rate Limiter

Reference links

Book - 가상 면접 사례로 배우는 대규모 시스템 설계 기초 (Korean edition of System Design Interview)

Example - Design a Rate Limiter

Q. What kind of rate limiter: client-side or server-side API rate limiter?
- A. Server-side.
Q. Does the rate limiter throttle API requests based on IP, user ID, or other properties?
- A. It should be flexible enough to support different sets of throttle rules.
Q. What is the scale of the system? (startup / big company)
- A. It must be able to handle a large number of requests.
Q. Will the system work in a distributed environment?
- A. Yes.
Q. Is the rate limiter a separate service, or should it be implemented in application code?
- A. Up to you.
Q. Do we need to inform users who are throttled?
- A. Yes.

Accurately limit excessive requests.
Low latency. The rate limiter should not slow down HTTP response time.
Use as little memory as possible.
Distributed rate limiting. The rate limiter can be shared across multiple servers or processes.
Exception handling. Show clear exceptions to users when their requests are throttled.
High fault tolerance. If the rate limiter has a problem (e.g. a cache server goes offline), it should not affect the entire system.

Need a counter to keep track of how many requests are sent from the same user, IP address, etc.
If the counter is larger than the limit, the request is disallowed.
Where to store the counter?
- DB: No -> disk access is slow
- In-memory cache (Redis): Yes -> fast and supports a time-based expiration strategy
- 2 commands: INCR (increase the counter by 1) / EXPIRE (set a timeout on the counter; it is deleted automatically when the timeout expires)
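
The counter logic above can be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: `InMemoryStore` is a hypothetical stand-in for a Redis client that models only `incr` and `expire`, and `is_allowed`, the `rate:` key prefix, and the parameter names are assumptions made for illustration.

```python
import time


class InMemoryStore:
    """Hypothetical stand-in for a Redis client; only incr/expire are modeled."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._data = {}  # key -> (value, expires_at or None)

    def _purge(self, key):
        # Drop the key if its timeout has passed, as EXPIRE would.
        entry = self._data.get(key)
        if entry and entry[1] is not None and self._clock() >= entry[1]:
            del self._data[key]

    def incr(self, key):
        self._purge(key)
        value, expires_at = self._data.get(key, (0, None))
        self._data[key] = (value + 1, expires_at)
        return value + 1

    def expire(self, key, seconds):
        self._purge(key)
        if key in self._data:
            self._data[key] = (self._data[key][0], self._clock() + seconds)


def is_allowed(store, client_id, limit, window_seconds):
    """Return True if this request fits within `limit` requests per window."""
    key = f"rate:{client_id}"  # assumed key naming scheme
    count = store.incr(key)
    if count == 1:
        # First request of the window: start the timeout.
        store.expire(key, window_seconds)
    return count <= limit
```

Because `is_allowed` only calls `incr(key)` and `expire(key, seconds)`, a redis-py client could probably be passed in place of `InMemoryStore`. Note that the two calls are separate steps, so this sketch by itself is not atomic.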

Race Condition
- If two requests concurrently read the counter value before either of them writes the value back,
- each will increment the counter by one and write it back without checking the other thread, so one increment is lost.
- => Solutions
  - Lock -> but it slows down the system
  - Lua Script
  - Redis Sorted Set
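
As an in-process illustration of the lock option, the sketch below makes the check-and-increment step atomic with a `threading.Lock`. It does not use Redis; `LockedCounter` and `run_concurrent_requests` are hypothetical names for this example. A Lua script reaches the same goal on the Redis side, since Redis runs each script atomically.

```python
import threading


class LockedCounter:
    """In-process counter whose check-and-increment step is atomic."""

    def __init__(self, limit):
        self.limit = limit
        self.count = 0
        self._lock = threading.Lock()

    def try_acquire(self):
        # Without the lock, two threads could both read the same count,
        # both pass the check, and both write count + 1 (a lost update).
        with self._lock:
            if self.count >= self.limit:
                return False
            self.count += 1
            return True


def run_concurrent_requests(counter, n_threads):
    """Fire n_threads simultaneous requests; return how many were allowed."""
    results = []
    results_lock = threading.Lock()
    barrier = threading.Barrier(n_threads)

    def worker():
        barrier.wait()  # release all threads at once to maximize contention
        allowed = counter.try_acquire()
        with results_lock:
            results.append(allowed)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(results)
```

With a limit of 10 and 50 concurrent requests, exactly 10 should be allowed. The cost is that every request waits on the same lock, which is the slowdown noted in the list above.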
Synchronization Issue
- When multiple rate limiters are used -> synchronization is needed
- => Solutions
  - Sticky Session: allows a client to always send traffic to the same rate limiter -> not advisable (neither scalable nor flexible)
  - Use centralized data stores (Redis)
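
To see why a centralized store helps, here is a small sketch comparing two rate limiters that each keep their own counters with two that share one store. `CounterStore`, `RateLimiter`, and `count_allowed` are hypothetical names; the store is a plain dictionary standing in for something like Redis, and window expiry is left out to keep the example short.

```python
class CounterStore:
    """Hypothetical minimal counter store (stands in for a centralized store)."""

    def __init__(self):
        self.counts = {}

    def incr(self, key):
        self.counts[key] = self.counts.get(key, 0) + 1
        return self.counts[key]


class RateLimiter:
    """Allows a request while the client's count in `store` is within `limit`."""

    def __init__(self, store, limit):
        self.store = store
        self.limit = limit

    def allow(self, client_id):
        return self.store.incr(client_id) <= self.limit


def count_allowed(limiters, client_id, n_requests):
    """Send requests round-robin across limiters; return how many pass."""
    allowed = 0
    for i in range(n_requests):
        limiter = limiters[i % len(limiters)]
        if limiter.allow(client_id):
            allowed += 1
    return allowed


LIMIT = 5

# Each limiter has its own store, so neither sees the other's traffic.
separate = [RateLimiter(CounterStore(), LIMIT), RateLimiter(CounterStore(), LIMIT)]

# Both limiters read and write the same store.
shared_store = CounterStore()
shared = [RateLimiter(shared_store, LIMIT), RateLimiter(shared_store, LIMIT)]

separate_allowed = count_allowed(separate, "client-a", 20)
shared_allowed = count_allowed(shared, "client-a", 20)
```

In this sketch the unsynchronized pair lets through twice the limit (5 per limiter), while the pair backed by the shared store enforces the limit once for the client.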

Multi-data center setup
- latency is high for users located far away from the data center -> multiple data centers reduce latency
Synchronize data with an eventual consistency model

Checklist
- The rate limiting algorithm is effective.
  - check that it still works well under burst traffic
- The rate limiting rules are effective.
  - check that they do not drop too many valid requests

Hard vs Soft rate limiting
- hard: the number of requests cannot exceed the threshold
- soft: requests can exceed the threshold for a short period
Rate Limiting at different levels
- Application Layer(7) -> HTTP
- Network Layer(3) -> iptables (by IP address)
Avoid being rate limited
- Use client cache to avoid making frequent API calls.
- Understand the limit and do not send too many requests in a short time frame.
- Include code to catch exceptions or errors so your client can gracefully recover from exceptions.
- Add sufficient back-off time to retry logic.
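
The last two points can be sketched on the client side as a retry helper with exponential back-off. This is one possible shape under stated assumptions: `RateLimitedError` is a hypothetical exception the client raises when it is throttled (for example on an HTTP 429 response), `send` is any zero-argument callable that performs the request, and the random jitter is an addition not mentioned in the notes above.

```python
import random
import time


class RateLimitedError(Exception):
    """Hypothetical error raised by the client when the server throttles it."""


def call_with_backoff(send, max_retries=5, base_delay=0.5, max_delay=30.0,
                      sleep=time.sleep):
    """Call send(); on RateLimitedError, wait and retry with exponential back-off."""
    for attempt in range(max_retries + 1):
        try:
            return send()
        except RateLimitedError:
            if attempt == max_retries:
                raise  # give up and let the caller handle it
            # Double the wait on every failed attempt, capped at max_delay.
            delay = min(max_delay, base_delay * (2 ** attempt))
            # Random jitter so many clients do not retry in lockstep.
            sleep(delay + random.uniform(0, delay / 2))
```

The `sleep` parameter exists only so the wait can be replaced in tests; in normal use the default `time.sleep` applies. After `max_retries` failed retries the error is re-raised so the caller can recover gracefully instead of looping forever.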
