11. Design A Chat System
Design a chat system like WhatsApp, Facebook Messenger, WeChat, Line, or Discord.
A chat app performs different functions for different people. It is extremely important to nail down the exact requirements. For example, you do not want to design a system that focuses on group chat when the interviewer has one-on-one chat in mind. It is important to explore the feature requirements.
Step 1 - Understand the problem and establish design scope
It is vital to agree on the type of chat app to design. The marketplace includes:
- One-on-one chat apps like Facebook Messenger, WeChat, and WhatsApp
- Office chat apps that focus on group chat like Slack
- Game chat apps, like Discord, that focus on large group interaction and low voice chat latency.
The first set of clarification questions should nail down what the interviewer has in mind exactly when she asks you to design a chat system. At the very least, figure out if you should focus on a one-on-one chat or group chat app.
In this chapter, we focus on designing a chat app like Facebook Messenger, with an emphasis on the following features:
- A one-on-one chat with low delivery latency.
- Small group chat (max of 100 people).
- Online presence.
- Multiple device support. The same account can be logged in on multiple devices at the same time.
- Push notifications.
It is also important to agree on the design scale. We will design a system that supports 50 million DAU.
Step 2 - Propose high-level design and get buy-in
Clients do not communicate directly with each other. Instead, each client connects to a chat service, which supports all the features mentioned above. Let us focus on fundamental operations. The chat service must support the following functions:
- Receive messages from other clients.
- Find the right recipients for each message and relay the message to the recipients.
- If a recipient is not online, hold the messages for that recipient on the server until she is online.
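The three functions above can be sketched as a toy in-memory relay. This is only an illustration under stated assumptions: the `ChatService` class, its method names, and the use of a plain list as a stand-in for a live client connection are all hypothetical, and a real service would persist held messages rather than keep them in memory.

```python
from collections import defaultdict, deque


class ChatService:
    """Toy in-memory relay; every name here is hypothetical."""

    def __init__(self):
        # user_id -> inbox list, standing in for a live client connection
        self.online = {}
        # user_id -> messages held on the server while the user is offline
        self.pending = defaultdict(deque)

    def connect(self, user_id):
        """Mark a user online and deliver anything held for them."""
        inbox = self.online.setdefault(user_id, [])
        while self.pending[user_id]:
            inbox.append(self.pending[user_id].popleft())
        return inbox

    def disconnect(self, user_id):
        self.online.pop(user_id, None)

    def send(self, sender, recipients, text):
        """Receive a message, then relay it or hold it, per recipient."""
        message = (sender, text)
        for user_id in recipients:
            if user_id in self.online:
                self.online[user_id].append(message)
            else:
                self.pending[user_id].append(message)


service = ChatService()
alice_inbox = service.connect("alice")
service.send("bob", ["alice", "carol"], "hi")   # carol is offline, so her copy is held
carol_inbox = service.connect("carol")          # held message is delivered on connect
```

Here alice receives the message immediately, while carol's copy waits on the server until she connects.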
When a client intends to start a chat, it connects to the chat service using one or more network protocols. For a chat service, the choice of network protocols is important. Let us discuss this with the interviewer.
Requests are initiated by the client for most client/server applications. This is also true for the sender side of a chat application.
In Figure 2, when the sender sends a message to the receiver via the chat service, it uses the time-tested HTTP protocol, which is the most common web protocol. In this scenario, the client opens an HTTP connection with the chat service and sends the message, informing the service to send the message to the receiver. Keep-alive is efficient here because the keep-alive header allows a client to maintain a persistent connection with the chat service, which also reduces the number of TCP handshakes. HTTP is a fine option on the sender side, and many popular chat applications such as Facebook [1] used HTTP initially to send messages.
However, the receiver side is a bit more complicated. Since HTTP is client-initiated, it is not trivial to send messages from the server. Over the years, many techniques have been used to simulate a server-initiated connection: polling, long polling, and WebSocket. These are important techniques widely used in system design interviews, so let us examine each of them.
Polling
As shown in Figure 3, polling is a technique in which the client periodically asks the server if there are messages available. Depending on the polling frequency, polling could be costly. It could consume precious server resources to answer a question whose answer is "no" most of the time.
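To make the cost concrete, here is a minimal simulation of polling. The `MessageServer` class and `poll` function are hypothetical names used only for this sketch, and the sleep between requests that a real client would perform is omitted so the example runs instantly.

```python
class MessageServer:
    """Hypothetical server that holds messages until the client asks."""

    def __init__(self):
        self.mailbox = []
        self.requests_served = 0

    def fetch(self):
        """Return and clear any waiting messages."""
        self.requests_served += 1
        messages, self.mailbox = self.mailbox, []
        return messages


def poll(server, rounds):
    """Ask the server `rounds` times; return (messages, empty_replies).

    A real client would sleep for the polling interval between requests.
    """
    received, empty = [], 0
    for _ in range(rounds):
        batch = server.fetch()
        if batch:
            received.extend(batch)
        else:
            empty += 1
    return received, empty


server = MessageServer()
server.mailbox.append("hello")
received, empty = poll(server, rounds=10)
print(received, empty, server.requests_served)  # ['hello'] 9 10
```

Ten requests were served to deliver a single message; the other nine replies carried nothing.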
Long polling
In long polling, a client holds the connection open until there are actually new messages available or a timeout threshold has been reached. Once the client receives new messages, it immediately sends another request to the server, restarting the process. Long polling has a few drawbacks:
- The sender and receiver may not connect to the same chat server. HTTP-based servers are usually stateless. If you use round robin for load balancing, the server that receives the message might not have a long-polling connection with the client who should receive it.
- A server has no good way to tell if a client is disconnected.
- It is inefficient. If a user does not chat much, long polling still makes periodic connections after timeouts.
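The hold-until-message-or-timeout behavior can be sketched with a blocking queue from the Python standard library. The `long_poll` function and the timings are assumptions chosen for illustration; a real server would hold an open HTTP request rather than a local function call.

```python
import queue
import threading


def long_poll(mailbox, timeout):
    """Server-side handler: block until a message arrives or the timeout passes."""
    try:
        return [mailbox.get(timeout=timeout)]
    except queue.Empty:
        return []  # timed out; the client would immediately send a new request


mailbox = queue.Queue()

# A message arrives 0.1 s after the client starts waiting.
threading.Timer(0.1, mailbox.put, args=["hello"]).start()
first = long_poll(mailbox, timeout=2.0)   # returns as soon as the message arrives

# Nobody is chatting: the request is held, then answered empty after the timeout.
second = long_poll(mailbox, timeout=0.2)
print(first, second)
```

The second call illustrates the inefficiency noted in the list above: an idle user still causes a request to be held open and then repeated after every timeout.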
WebSocket
WebSocket is the most common solution for sending asynchronous updates from server to client. Figure 5 shows how it works.
A WebSocket connection is initiated by the client. It is bi-directional and persistent.
It starts its life as an HTTP connection and can be “upgraded” via a well-defined handshake to a WebSocket connection. Through this persistent connection, a server can send updates to a client. WebSocket connections generally work even if a firewall is in place, because they use port 80 or 443, which are also used by HTTP/HTTPS connections.
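As a rough sketch of the server side of that upgrade handshake: per RFC 6455, the client sends a `Sec-WebSocket-Key` header in its HTTP request, and the server proves it understood the upgrade by returning a `Sec-WebSocket-Accept` value derived from it. The function names below are hypothetical; the GUID constant and the sample key come from the RFC itself.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for computing the accept value.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def accept_key(sec_websocket_key):
    """Derive Sec-WebSocket-Accept: base64(SHA-1(key + GUID))."""
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def upgrade_response(sec_websocket_key):
    """Build the HTTP 101 response that switches the connection to WebSocket."""
    return (
        "HTTP/1.1 101 Switching Protocols\r\n"
        "Upgrade: websocket\r\n"
        "Connection: Upgrade\r\n"
        f"Sec-WebSocket-Accept: {accept_key(sec_websocket_key)}\r\n"
        "\r\n"
    )


# Sample key from RFC 6455, section 1.3.
print(accept_key("dGhlIHNhbXBsZSBub25jZQ=="))  # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

After this response, both sides stop speaking HTTP on the connection and exchange WebSocket frames instead.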
Earlier we said that HTTP is a fine protocol to use on the sender side, but since WebSocket is bidirectional, there is no strong technical reason not to use it for sending as well. Figure 6 shows how WebSocket (ws) is used for both the sender and receiver sides.
Using WebSocket for both sending and receiving simplifies the design and makes implementation on both client and server more straightforward. Since WebSocket connections are persistent, efficient connection management is critical on the server side.
Step 3 - High-level design
Although WebSocket was chosen as the main communication protocol between the client and server for its bidirectional communication, everything else does not have to use WebSocket. In fact, most features (sign up, login, user profile, etc.) of a chat application could use the traditional request/response method over HTTP. Let us drill in a bit and look at the high-level components of the system.
As shown in Figure 7, the chat system is broken down into three major categories: stateless services, stateful services, and third-party integration.
Stateless Services
Stateless services are traditional public-facing request/response services, used to manage login, signup, user profile, etc.
Stateless services sit behind a load balancer whose job is to route requests to the correct services based on the request paths. These services can be monolithic or individual microservices. The one service we will discuss further in the deep dive is service discovery. Its primary job is to give the client a list of DNS host names of chat servers that the client could connect to.
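Path-based routing at the load balancer might look like the sketch below. The route table, the paths, and the service names are all made up for illustration; in practice this logic lives in load balancer or API gateway configuration rather than in application code.

```python
# Hypothetical route table: request path prefix -> backend service name.
ROUTES = {
    "/login": "auth-service",
    "/signup": "auth-service",
    "/profile": "user-service",
    "/discovery": "service-discovery",
}


def route(path):
    """Return the backend for a request path, preferring the longest matching prefix."""
    matches = [p for p in ROUTES if path == p or path.startswith(p + "/")]
    if not matches:
        return None  # no backend handles this path
    return ROUTES[max(matches, key=len)]


print(route("/login"))        # auth-service
print(route("/profile/42"))   # user-service
print(route("/unknown"))      # None
```

Because each request carries everything needed to route and serve it, any instance of the chosen backend can handle it, which is what makes these services stateless.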