Written by Poogle
Tips for System Design Interview
References
Powers of 2
2^n | estimate | name | size |
---|---|---|---|
2^10 | 1,000 | thousand | 1 KB |
2^20 | 1,000,000 | million | 1 MB |
2^30 | 1,000,000,000 | billion | 1 GB |
2^40 | 1,000,000,000,000 | trillion | 1 TB |
2^50 | 1,000,000,000,000,000 | quadrillion | 1 PB |
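The table rests on the approximation 2^10 ≈ 10^3, which lets you convert between binary and decimal sizes in your head. A quick sanity check:

```python
# Sanity check for the table above: 2^(10n) is roughly 10^(3n),
# and the error of the approximation stays small.
units = {10: "KB", 20: "MB", 30: "GB", 40: "TB", 50: "PB"}

for exp, unit in units.items():
    exact = 2 ** exp
    approx = 10 ** (3 * exp // 10)
    error = exact / approx - 1
    print(f"2^{exp} = {exact:,} ≈ {approx:,} ({unit}, off by {error:.1%})")
```

The error grows from 2.4% at 2^10 to about 12.6% at 2^50, which is fine for back-of-the-envelope work.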
4 Steps for interview
Step 1. Understand the problem and establish design scope (3 ~ 10min)
Good questions to ask
- Main features in detail
- How many users?
- How fast will the service grow?
- What are the main technology stacks for the company?
Step 2. Propose a high-level design and get buy-in (10 ~ 15min)
- Make a blueprint -> ask
- Draw a diagram
- Clients(Mobile/Web)
- API
- Web Server
- Database
- Cache
- CDN
- Message queue…
- Do back-of-the-envelope estimation (ask first) and think out loud
- Consider concrete use cases
- API endpoints or DB Schema(ask)
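A back-of-the-envelope estimation like the one mentioned above might look like this sketch. All the numbers (10M DAU, request rates) are made-up assumptions for illustration, not figures from these notes:

```python
# Hypothetical back-of-envelope estimate: 10M daily active users,
# 10 reads and 1 write per user per day (assumed numbers).
DAU = 10_000_000
reads_per_user, writes_per_user = 10, 1
SECONDS_PER_DAY = 86_400

read_qps = DAU * reads_per_user / SECONDS_PER_DAY
write_qps = DAU * writes_per_user / SECONDS_PER_DAY
peak_read_qps = read_qps * 2          # common rule of thumb: peak ≈ 2x average

print(f"avg read QPS ≈ {read_qps:,.0f}, peak ≈ {peak_read_qps:,.0f}")
print(f"avg write QPS ≈ {write_qps:,.0f}")
```

Stating the assumptions out loud and rounding aggressively is the point; the interviewer cares about the reasoning, not the exact numbers.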
Step 3. Design deep dive (10 ~ 25min)
Goals accomplished so far
- Main features
- Blueprint of total system
- Interviewer’s opinion
- Parts to deep dive
Step 4. Wrap up (3 ~ 5min)
Follow-up Questions
- System Bottleneck
- Refactoring Points
- Summary
- Error(Server Error, Network Problems…) -> What will happen?
- Operation Issues
- Metrics, logs, roll-out (deployment)
- How to expand service
- What if the number of users grows 10x?
TODO
- Clarification w. Questions
- Understand Requirements
- There is no perfect answer(Design could be different - start-up / big company)
- Communication -> Interviewer should understand Interviewee
- Suggest various answers together
- Move on to the detail only after Interviewer agreed to the blueprint
- Focus on main components
- Draw out the interviewer's ideas
- Be prepared
- Question all the requirements and assumptions
- Don't hesitate to ask for hints!
Key Concepts
Single Server Setup
- Users - access -> websites (through domain names)
- Domain Name System (DNS): paid service provided by 3rd parties and not hosted by our servers
- Internet Protocol (IP) address - returned -> browser / mobile app
- Once IP obtained, Hypertext Transfer Protocol (HTTP) requests - are sent directly -> web server
- Web server - returns -> HTML pages or JSON response for rendering.
- In Single Server design
- users are connected to the web server directly
- X access the website if the web server is offline
- 🤔 if many users access the web server simultaneously
- reaches the web server’s load limit
- users generally experience slower responses or fail to connect to the server
- => ✨ load balancer is the best technique to address these problems
Horizontal vs Vertical Scaling
Horizontal = Scale Out
- scale by adding more servers into your pool of resources
- desirable for large scale applications due to the limitations of vertical scaling
- DB Scale Out: Sharding
- Sharding separates large databases into smaller, more easily managed parts called shards
- Each shard shares the same schema, though the actual data on each shard is unique to the shard.
- Vertical Partitioning
- partitioning by feature
- drawback: if one of these tables gets very large, you might need to repartition that database (possibly using a different partitioning scheme)
- Key-Based (or Hash-Based) Partitioning
- uses some part of the data (ex. ID) to partition
- allocate N servers and place each record on server mod(key, n)
- the number of servers you have is effectively fixed
- Adding additional servers means reallocating all the data, a very expensive task
- Directory-Based Partitioning
- maintain a lookup table for where the data can be found
- easy
- drawbacks: (1) lookup table can be a single point of failure (2) constantly accessing this table impacts performance
- Choice of the Sharding Key (partition key)
- consists of one or more columns that determine how data is distributed
- retrieve and modify data efficiently by routing database queries to the correct database
- choose a key that distributes data evenly
- introduces complexities and new challenges to the system:
- Resharding data: Resharding data is needed when
- 1) a single shard could no longer hold more data due to rapid growth.
- 2) Certain shards might experience shard exhaustion faster than others due to uneven data distribution. When shard exhaustion happens, it requires updating the sharding function and moving data around.
- Celebrity problem: hotspot key problem
- Excessive access to a specific shard could cause server overload
- need to allocate a shard for each celebrity -> Each shard might even require further partition.
- Join and de-normalization:
- Once a database has been sharded across multiple servers, it is hard to perform join operations across database shards.
- de-normalize the database -> queries can be performed in a single table.
- adding redundant information into a database to speed up reads
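The key-based partitioning scheme above, and why resharding is so expensive, can be sketched in a few lines. This is a minimal illustration, not a production sharding layer:

```python
# Key-based sharding sketch: shard = key mod N. Also shows why adding
# a server is expensive: most keys map to a different shard afterwards.
def shard_for(key: int, num_shards: int) -> int:
    return key % num_shards

keys = range(10_000)
before = {k: shard_for(k, 4) for k in keys}
after = {k: shard_for(k, 5) for k in keys}   # one server added
moved = sum(1 for k in keys if before[k] != after[k])
print(f"{moved / len(keys):.0%} of keys must move after adding a shard")
```

Going from 4 to 5 shards relocates 80% of the keys, which is exactly the "very expensive task" the notes warn about (and why techniques such as consistent hashing exist).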
Vertical = Scale Up
- scale by adding more power (CPU, RAM, etc.) to an existing server
- When traffic is low -> vertical scaling is a great option
- simplicity
- ⚠️ -> hard limit
- impossible to add unlimited CPU and memory to a single server
- does not have failover and redundancy
- 🥲 If one server goes down, the website/app goes down with it completely
Load Balancer
- evenly distributes incoming traffic among web servers that are defined in a load-balanced set
- web servers are no longer directly reachable by clients
- load balancer communicates with web servers through private IPs
- 🙌 When web servers are added (behind the load balancer)
- => successfully solved no failover issue and improved the availability of the web tier
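A minimal sketch of the even-distribution idea, using round-robin (one common load-balancing strategy; the server IPs are made up):

```python
import itertools

# Round-robin load balancer sketch: requests are spread evenly across
# a load-balanced set of web servers (private IPs here are assumptions).
servers = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
next_server = itertools.cycle(servers)

def route(request_id: int) -> str:
    # Pure round-robin ignores the request contents entirely.
    return next(next_server)

assignments = [route(i) for i in range(6)]
print(assignments)  # each server receives every third request
```

Real load balancers support more strategies (least connections, weighted, hash-based), but round-robin captures the core idea.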
Database
- With the growth of the user base -> need multiple servers
- Separating web/mobile traffic (web tier) and database (data tier) servers allows them to be scaled independently.
Which DB to use?
Relational databases, Relational database management system (RDBMS), SQL database
- MySQL, Oracle database, PostgreSQL, etc.
- represent and store data in tables and rows
- Perform join operations using SQL across different database tables
Non-Relational databases, NoSQL databases
- CouchDB, Neo4j, Cassandra, HBase, Amazon DynamoDB, etc
- 4 categories:
- key-value stores
- graph stores
- column stores
- document stores
- Join operations -> generally X supported
- When to use?
- Application requires super-low latency
- Data is unstructured, or has no relational structure
- Need to serialize and deserialize data (JSON, XML, YAML, etc.)
- Need to store a massive amount of data
Database replication
- master/slave relationship between the original (master) and the copies (slaves)
- master database
- generally only supports write operations
- data-modifying commands (insert, delete, or update) must be sent to the master database
- slave database
- gets copies of the data from the master database
- only supports read operations (higher ratio of reads to writes)
- => usually more slave databases than masters
- 👍 Advantages
- Better performance
- all writes & updates -> master nodes
- read operations -> slave nodes
- improves performance (=> allows more queries to be processed in parallel)
- Reliability
- If one of your database servers is destroyed by a natural disaster(typhoon, earthquake…) -> data is still preserved
- X worry about data loss -> data is replicated across multiple locations
- High availability
- By replicating data across different locations -> website remains in operation
- even if offline -> can access data stored in another database server
- What if one of the databases goes offline?
- if) only 1 slave DB is available & goes offline
- read operations -> master DB temporarily
- after issue found -> a new slave DB will replace the old one
- if) multiple slave DBs are available
- read operations are redirected to other healthy slave DB
- new database server will replace the old one
- if) master DB goes offline
- slave DB will be promoted to be the new master
- all operations -> temporarily executed on the new master DB
- new slave database will replace the old one for data replication immediately
- if) production systems -> more complicated: data in a slave DB X up to date
- missing data needs to be updated by running data recovery scripts
- +) multi-masters and circular replication
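The routing rules above (writes to the master, reads spread over slaves, reads falling back to the master when no slave is healthy) can be sketched like this. The server names are placeholders:

```python
import random

# Read/write splitting sketch for master/slave replication: writes go
# to the master, reads are spread over healthy replicas, and reads fall
# back to the master when no replica is available.
MASTER = "db-master"
replicas = ["db-slave-1", "db-slave-2"]

def route(query: str, healthy_replicas: list[str]) -> str:
    is_write = query.lstrip().upper().startswith(("INSERT", "UPDATE", "DELETE"))
    if is_write or not healthy_replicas:
        return MASTER
    return random.choice(healthy_replicas)

print(route("INSERT INTO users VALUES (1)", replicas))  # db-master
print(route("SELECT * FROM users", []))                 # db-master (fallback)
```

In practice this logic lives in a database driver or proxy rather than application code, but the decision tree is the same.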
Cache
- temporary storage area
- stores the result of expensive responses or frequently accessed data in memory
- -> subsequent requests are served more quickly
Cache Tier
- temporary data store layer, much faster than the database
- better system performance
- ability to reduce database workloads
- ability to scale the cache tier independently
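The read path described above is commonly called the cache-aside (read-through) pattern: check the cache, fall back to the database on a miss, then populate the cache. A minimal sketch, with a dict standing in for the cache and a stub for the database:

```python
# Cache-aside read path sketch: try the cache first, fall back to the
# database on a miss, then populate the cache for the next request.
cache: dict[str, str] = {}

def db_query(key: str) -> str:          # stand-in for a slow database read
    return f"value-for-{key}"

def get(key: str) -> str:
    if key in cache:                    # cache hit: served from memory
        return cache[key]
    value = db_query(key)               # cache miss: hit the database
    cache[key] = value                  # populate for subsequent requests
    return value

get("user:42")
print("user:42" in cache)  # True — the next read skips the database
```

A real cache tier (Memcached, Redis) adds expiration and eviction, covered in the considerations below.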
Considerations
- Decide when to use cache
- Consider using cache when data is read frequently but modified infrequently
- volatile memory -> X ideal for persisting data
- Expiration policy
- Once expired -> removed from the cache
- X make the expiration date too short -> the system to reload data from the database too frequently
- X make the expiration date too long -> data can become stale
- Consistency
- involves keeping the data store and the cache in sync
- Inconsistency <- data-modifying operations on the data store and cache are not in a single transaction
- when scaling across multiple regions -> maintaining consistency: difficult
- Mitigating failures
- single cache server represents a potential single point of failure (SPOF)
- if it fails, will stop the entire system from working => avoid SPOF by multiple cache servers across different data centers
- overprovision the required memory by certain percentages
- provides a buffer as the memory usage increases
- Eviction Policy
- Once the cache is full, adding new items causes existing items to be removed
- Least-recently-used (LRU)
- Least Frequently Used (LFU)
- First in First Out (FIFO)
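LRU, the first policy in the list, can be sketched with an ordered dictionary: every access moves an entry to the "recent" end, and eviction removes from the "old" end. A minimal illustration, not a production cache:

```python
from collections import OrderedDict

# LRU eviction sketch: when the cache is full, the least-recently-used
# entry is removed to make room for the new one.
class LRUCache:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.data: OrderedDict[str, str] = OrderedDict()

    def get(self, key: str):
        if key not in self.data:
            return None
        self.data.move_to_end(key)          # mark as most recently used
        return self.data[key]

    def put(self, key: str, value: str):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)   # evict least recently used

lru = LRUCache(2)
lru.put("a", "1"); lru.put("b", "2")
lru.get("a")                # "a" is now the most recently used
lru.put("c", "3")           # evicts "b", the least recently used
print(lru.get("b"))  # None
```

LFU would instead track access counts, and FIFO would evict purely by insertion order.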
CDN
- network of geographically dispersed servers used to deliver static content
- ex. CDN servers cache static content like images, videos, CSS, JavaScript files, etc.
Considerations
- Cost
- run by third-party providers -> charged for data transfers in and out of the CDN
- Setting an appropriate cache expiry
- time-sensitive content -> setting a cache expiry time is important
- CDN fallback
- Invalidating files
- remove a file from the CDN before it expires
- serve a different version by using object versioning
Stateless
- consider scaling the web tier horizontally -> move state (for instance user session data) out of the web tier
- store session data in the persistent storage (ex. relational database or NoSQL)
- each web server in the cluster -> access state data from databases: stateless web tier
Stateful Server vs Stateless Server
- stateful server: remembers client data (state) from one request to the next
- every request from the same client must be routed to the same server
- sticky sessions in most load balancers -> ⚠️ however, this adds overhead
- Adding or removing servers is much more difficult
- challenging to handle server failures
- stateless server: keeps no state information
- HTTP requests from users can be sent to any web servers
- fetch state data from a shared data store
- State data is stored in a shared data store and kept out of web servers
- simpler, more robust, and scalable
- move session data -> out of the web tier & store them in the persistent data store -> auto-scaling of the web tier is easily achieved by adding or removing servers based on traffic load
- autoscaling: adding or removing web servers automatically based on the traffic load
- shared data store: ex. relational database, Memcached/Redis, NoSQL(easy to scale), etc.
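A minimal sketch of the stateless web tier described above, with a plain dict standing in for the shared session store (Redis/Memcached/NoSQL in a real system):

```python
# Stateless web tier sketch: session state lives in a shared store,
# so any server can handle any request from the same client.
shared_session_store: dict[str, dict] = {}   # stand-in for Redis/NoSQL

def handle_request(server_name: str, session_id: str) -> str:
    session = shared_session_store.setdefault(session_id, {"visits": 0})
    session["visits"] += 1                   # server itself keeps no state
    return f"{server_name} served visit #{session['visits']}"

# The same client can bounce between servers without sticky sessions.
print(handle_request("web-1", "sess-abc"))
print(handle_request("web-2", "sess-abc"))  # web-2 sees the same session
```

Because no server holds client state, any server can be added or removed at will, which is what makes autoscaling straightforward.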
Data center
- To improve availability and provide a better user experience across wider geographical areas -> supporting multiple data centers is crucial.
- in normal operation: users are geoDNS-routed(geo-routed), to the closest data center, with a split traffic of x% in US-East and (100 – x)% in US-West
- geoDNS: DNS service that allows domain names to be resolved to IP addresses based on the location of a user
Technical challenges to achieve multi-data center
- Traffic redirection
- GeoDNS can be used to direct traffic to the nearest data center depending on where a user is located
- Data synchronization
- Users from different regions could use different local databases or caches
- Failover -> traffic might be routed to a data center where data is unavailable
- replicate data across multiple data centers
- Test and deployment
- test website/application at different locations
- automated deployment tools are vital to keep services consistent through all the data centers
Message Queue, Asynchronous
- Message Queue
- durable component, stored in memory, that supports asynchronous communication
- serves as a buffer and distributes asynchronous requests
- Structure
- Input services(producers/publishers): create messages, and publish them to a message queue
- consumers/subscribers: connect to the queue, and perform actions defined by the messages
- Decoupling makes the message queue a preferred architecture for building a scalable and reliable application
- producer can post a message to the queue when the consumer is unavailable to process it
- consumer can read messages from the queue even when the producer is unavailable
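The decoupling above can be sketched with Python's in-memory `queue` module standing in for a real broker (RabbitMQ, Kafka, SQS, etc.):

```python
import queue

# Producer/consumer decoupling sketch: the queue acts as a buffer, so
# the producer can publish even while no consumer is running.
message_queue: queue.Queue = queue.Queue()

def produce(msg: str):
    message_queue.put(msg)               # publisher never waits on a consumer

def consume() -> str:
    return message_queue.get()           # worker pulls when it is ready

produce("resize photo 1")                # posted while the consumer is "offline"
produce("resize photo 2")
print(consume())  # resize photo 1 — FIFO order
```

A real message queue adds durability (messages survive broker restarts) and distribution across machines; the buffering and decoupling behavior is the same.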
Log, Metric, Automation
Logging
- Monitoring error logs is important -> helps to identify errors and problems in the system
- monitor error logs at per server level
- use tools to aggregate them to a centralized service for easy search and viewing
Metrics
- Collecting different types of metrics helps us gain business insights and understand the health status of the system
- Host level metrics: CPU, Memory, disk I/O, etc.
- Aggregated level metrics: the performance of the entire database tier, cache tier, etc.
- Key business metrics: daily active users, retention, revenue, etc.
Automation
- build or leverage automation tools to improve productivity
- Continuous integration: each code check-in is verified through automation, allowing teams to detect problems early
- automating your build, test, deploy process, etc. -> could improve developer productivity significantly
Network
Bandwidth
- maximum amount of data that can be transferred in a unit of time
- typically expressed in bits per second (or GB per second)
Throughput
- the actual amount of data that is transferred per unit of time
Latency
- how long it takes data to go from one end to the other
- delay between the sender sending information (even a very small chunk of data) and the receiver receiving it
Example - a conveyor belt that transfers items across a factory
- Latency: time it takes an item to go from one side to another
- Throughput: the number of items that roll off the conveyor belt per second
- Building a fatter conveyor belt …
- will not change latency
- however will change throughput and bandwidth
- get more items on the belt, thus transferring more in a given unit of time
- Shortening the belt…
- will decrease latency, since items spend less time in transit
- will not change the throughput or bandwidth
- the same number of items will roll off the belt per unit of time.
- Making a faster conveyor belt…
- will change all three
- time it takes an item to travel across the factory decreases
- more items will also roll off the conveyor belt per unit of time
- Bandwidth…
- is the number of items that can be transferred per unit of time, in the best possible conditions
- Throughput…
- is the rate actually achieved, when the machines perhaps aren’t operating smoothly.
Map Reduce
- associated with Google + used broadly
- typically used to process large amounts of data.
- requires you to write a Map step and a Reduce step.
- The rest is handled by the system.
- Map takes in some data and emits a <key, value> pair.
- Reduce takes a key and a set of associated values and “reduces” them in some way, emitting a new key and value.
- The results of this might be fed back into the Reduce program for more reducing.
- MapReduce allows us to do a lot of processing in parallel, which makes processing huge amounts of data more scalable.
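The canonical MapReduce example is word count: Map emits a <word, 1> pair per word, the framework groups pairs by key (the shuffle phase, simulated here with a dict), and Reduce sums each group. A single-process sketch of the model:

```python
from collections import defaultdict

# MapReduce illustration: word count. Map emits <word, 1> pairs; the
# framework groups pairs by key; Reduce sums each group of values.
def map_step(document: str):
    for word in document.split():
        yield (word, 1)

def reduce_step(key: str, values: list) -> tuple:
    return (key, sum(values))

docs = ["the quick fox", "the lazy dog", "the fox"]

grouped = defaultdict(list)                 # the shuffle/sort phase
for doc in docs:                            # map phase (parallelizable)
    for key, value in map_step(doc):
        grouped[key].append(value)

counts = dict(reduce_step(k, v) for k, v in grouped.items())  # reduce phase
print(counts["the"])  # 3
```

The parallelism comes from the fact that each document can be mapped on a different machine and each key can be reduced on a different machine; only the shuffle requires coordination.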