System design interview: how to design a feeds system (e.g., Twitter, Instagram and Facebook news fe
Posted baozitraining
Methodology: READ MF!
Please use the "READ MF!" framework when working through system design interviews for software engineers.
Key designs and terms
- So far, the best detailed explanation of how Twitter is designed is the QCon talk by Raffi (formerly a VP at Twitter). Presentation, Slides (all credits go to QCon). Raffi is very smart and articulate, a really solid guy!
- It is a read-heavy system, not a write-heavy one, so it is optimized for reading timelines. To be clear, there are two timelines: the user timeline, which holds the user's own tweets (easy to do), and the main timeline, which aggregates all the tweets from the people the user follows.
- Pre-calculate all the timelines. This is the interesting part of the design, compared with using MySQL and an index to query in real time, which would not scale. When a tweet is posted, the tweets service would:
  - Store the tweet in memory, to be flushed to the main DB later.
  - Call the fanout delivery service to publish the tweet to the timeline of every user who follows the author. It could store just the tweet ID (the content is retrieved later from the tweets cache), or it could hydrate the entire text content, which is also fine (note that either way we need to decide how to handle edits and deletes; Twitter lets users delete tweets and historically did not offer editing).
  - Call the search service to index the tweet (Lucene). The search index (Earlybird, Twitter's modified Lucene) is also held in memory. Note that a search query has to fan out to all the search clusters, but because the index is in memory, this is acceptable.
- Always remember that disk access is orders of magnitude slower than memory access, e.g., a disk seek takes about 10 ms while a main memory reference takes about 100 ns, a gap of roughly 100,000x. https://gist.github.com/jboner/2841832
- With the pre-calculated timeline design, race conditions can appear when celebrities (people with millions of followers) start replying to each other. E.g., celebrity A tweets something and it takes 30 seconds to deliver to all the followers; celebrity B replies before delivery finishes, so some followers who follow both A and B might see B's reply before A's original post. One quick workaround is to sort by timestamp or tweet ID, but they are also experimenting with pre-calculating only non-celebrity tweets and fetching the celebrity tweets in real time when the timeline is generated. Which way is better depends on the tradeoffs and on the user experience. In a real interview, this is a good point to stop and discuss the tradeoffs with your interviewers.
- Since Twitter relies heavily on caching, you might want to check out how they optimized their caching with twemproxy.
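To make the pre-calculated timeline idea more concrete, here is a minimal in-memory sketch in Python. It is not Twitter's actual implementation: the `FeedService` class, its method names, `CELEBRITY_THRESHOLD` and `TIMELINE_LIMIT` are all hypothetical, and plain dictionaries stand in for the tweets cache and the timeline cache. It shows fanout on write for ordinary authors (only tweet IDs are stored), and the hybrid idea from the celebrity discussion: celebrity tweets are skipped at write time and merged in at read time, with the result sorted by tweet ID.

```python
from collections import defaultdict, deque
from itertools import count

CELEBRITY_THRESHOLD = 3  # hypothetical cutoff; a real system would use a far larger number
TIMELINE_LIMIT = 800     # hypothetical cap on cached entries per main timeline


class FeedService:
    """Toy feeds service: fanout on write, plus read-time merge for celebrities."""

    def __init__(self):
        self._ids = count(1)                    # monotonically increasing tweet IDs
        self.tweets = {}                        # tweet_id -> (author, text); stands in for the tweets cache
        self.followers = defaultdict(set)       # author -> users who follow the author
        self.following = defaultdict(set)       # user -> authors the user follows
        self.user_timeline = defaultdict(list)  # author -> own tweet IDs, oldest first
        # user -> pre-calculated tweet IDs; the oldest entries fall off when the cap is hit
        self.main_timeline = defaultdict(lambda: deque(maxlen=TIMELINE_LIMIT))

    def follow(self, user, author):
        self.followers[author].add(user)
        self.following[user].add(author)

    def is_celebrity(self, author):
        return len(self.followers[author]) >= CELEBRITY_THRESHOLD

    def post_tweet(self, author, text):
        tweet_id = next(self._ids)
        self.tweets[tweet_id] = (author, text)
        self.user_timeline[author].append(tweet_id)
        # Fanout on write: push only the tweet ID to each follower's timeline.
        # Celebrity authors are skipped here and merged in at read time instead.
        if not self.is_celebrity(author):
            for user in self.followers[author]:
                self.main_timeline[user].append(tweet_id)
        return tweet_id

    def read_timeline(self, user, limit=10):
        # Start from the pre-calculated IDs; a set also removes duplicates.
        ids = set(self.main_timeline[user])
        # Fetch recent tweets of followed celebrities at read time.
        for author in self.following[user]:
            if self.is_celebrity(author):
                ids.update(self.user_timeline[author][-limit:])
        # Sorting by tweet ID (newest first) avoids showing a reply before
        # the original tweet when deliveries arrive out of order.
        newest_first = sorted(ids, reverse=True)[:limit]
        # Hydrate the IDs into (tweet_id, author, text) tuples.
        return [(tid, *self.tweets[tid]) for tid in newest_first]
```

A caller would create one `FeedService`, register follow relationships with `follow`, then call `post_tweet` and `read_timeline`. The sketch leaves out persistence, backfilling a timeline after a new follow, and edit or delete handling, all of which are worth raising as tradeoffs in an interview.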
Baozi Youtube Video