An Overview of Load Balancers: System Design
These days it has become the norm to just vibe code a project using Claude or Codex, and what I have noticed is that not many people think about the core fundamentals of software development. These fundamentals matter even when only a small peer group uses your app, and they become extremely important once the project gains traction and you want to scale it. A project whose infrastructure was never built to handle a sudden burst of requests can collapse under the load, which can leave many of your backend services vulnerable to DDoS attacks, or leave you without access to your environment variables if the system breaks down.
During my sophomore year I took a subject called Software Design, and one of the most striking things my professor said was this: before building any project or solution, devote most of your time to designing the system, because the design is the core that holds all the other components together. So for whatever project I build from now on, I want to make sure these system design fundamentals are covered. That will not only help me scale the project in the future, but also build good habits while building applications.
While working on my distributed systems project I came across a topic called the Load Balancer. It is also an often-asked question in system design interview rounds, so I decided to learn it and apply it in my upcoming projects.
Let's start with the Load Balancer concept:
Scenario:
Consider clients sending requests to a single server. If your application gets 20k requests/second and your server has a capacity of 50k requests/second, the requests are easily accommodated. Now let's say the total load is 70k requests/second and the server capacity is still 50k. The extra 20k requests/second cannot be served, so to solve this we add another server. Once the second server is added, something has to decide which server each request goes to, and that decision maker is the load balancer. It works on a set of algorithms that decide which server each client request is sent to.
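To make the arithmetic in the scenario concrete, here is a tiny sketch. The function name `servers_needed` is my own, and it assumes every server has the same capacity and that load splits evenly, which real traffic rarely does.

```python
import math

def servers_needed(incoming_rps: int, capacity_per_server_rps: int) -> int:
    """Smallest number of identical servers that can absorb the incoming load.

    Assumption: every server has the same capacity and the load balancer
    spreads requests evenly across them.
    """
    return math.ceil(incoming_rps / capacity_per_server_rps)

print(servers_needed(20_000, 50_000))  # 1
print(servers_needed(70_000, 50_000))  # 2
```

In practice you would keep some headroom above this minimum, since traffic is bursty.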
Let's say data needs to be sent between the client, which is the browser, and the server, which is the backend. For a successful data transfer there must be a TCP connection, and a TCP connection is set up with a three-way handshake. If we send the data with an HTTP POST request over plain TCP, the message content can be read by anyone who intercepts it in transit. So we add a TLS connection on top, which encrypts the data (TLS is usually placed at the Presentation Layer when mapped onto the OSI model). The encryption method is agreed between client and server during the TLS handshake, so the server knows exactly how to decrypt the data on its side.
There are two types of Load Balancer, L4 and L7:
L4: Transport Layer (NAT Mode)
The Transport Layer ensures that the data being sent is delivered reliably. In this setup there is a client with a message to be delivered to the servers, and the servers sit in a private network.
One of the main things about the load balancer is that it acts as a gateway, so the client and server cannot communicate directly; everything has to go through the load balancer. When the client sends a request to the load balancer, it runs a hashing algorithm to pick a server, rewrites the destination IP address to that server's address, and forwards the request to the server, where the data is decrypted. This setup keeps the server IPs hidden, because all traffic goes through the load balancer.
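The hashing step described above could look roughly like this. It is only a sketch of the server selection: the backend IPs and the name `pick_backend` are made up for illustration, and the choice of SHA-256 over the client IP and port is my assumption, not what any specific load balancer does. The actual address rewriting happens at the packet level and is not shown.

```python
import hashlib

# Hypothetical private IPs of the servers behind the load balancer.
BACKENDS = ["10.0.0.11", "10.0.0.12", "10.0.0.13"]

def pick_backend(client_ip: str, client_port: int, backends=BACKENDS) -> str:
    """Map a client connection to one backend using a hash of its address.

    The same (ip, port) pair always hashes to the same backend, so every
    packet of one TCP connection lands on the same server.
    """
    key = f"{client_ip}:{client_port}".encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]

# The load balancer would rewrite the destination IP to this address.
print(pick_backend("203.0.113.7", 51000))
```

One thing to keep in mind with a plain modulo like this is that adding or removing a server changes the mapping for most clients.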
L4: Transport Layer (Proxy Mode)
In this mode the setup remains similar, except that once the message reaches the load balancer, the client's TCP connection terminates there. Terminating here does not mean closing the connection; it means the load balancer is the endpoint of the client's connection. The load balancer then opens a second TCP connection to the chosen server, gets the result from the server, and returns it to the client. This mode allows more complex algorithms than NAT.
NAT is faster than proxy mode because the load balancer does not have to manage extra TCP connections, which adds overhead and can cause latency issues. Encryption works the same way as in NAT mode: the data is still decrypted at the server.
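To show what "two TCP connections" means in proxy mode, here is a minimal sketch using Python's asyncio. The function names (`pipe`, `handle_client`, `serve`) and the listen address are my own choices for illustration; a real L4 load balancer also needs server selection, timeouts and error handling, which are left out.

```python
import asyncio

async def pipe(reader: asyncio.StreamReader, writer: asyncio.StreamWriter) -> None:
    """Copy bytes from reader to writer until EOF, then close the writer."""
    try:
        while data := await reader.read(4096):
            writer.write(data)
            await writer.drain()
    finally:
        writer.close()

async def handle_client(client_reader, client_writer, backend_host, backend_port):
    # Connection 1 (client <-> load balancer) ends here.
    # Connection 2 (load balancer <-> backend) is opened for this client.
    backend_reader, backend_writer = await asyncio.open_connection(
        backend_host, backend_port
    )
    # Relay bytes in both directions without looking at the content.
    await asyncio.gather(
        pipe(client_reader, backend_writer),
        pipe(backend_reader, client_writer),
    )

async def serve(listen_port: int, backend_host: str, backend_port: int) -> None:
    """Accept clients on listen_port and proxy each one to a single backend."""
    server = await asyncio.start_server(
        lambda r, w: handle_client(r, w, backend_host, backend_port),
        "127.0.0.1",
        listen_port,
    )
    async with server:
        await server.serve_forever()
```

The extra connection per client is exactly the overhead that makes this mode slower than NAT.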
L7: Application Layer - Load Balancer
In this type of load balancing, both the TCP and the TLS connection are terminated at the load balancer. It works in a similar fashion to proxy mode, but the data is decrypted at the LB rather than on the server side. This lets the load balancer look at the content of the request and decide where to send it based on the compute it needs.
Example: Let's say you are a premium user of LeetCode. When you submit code to compile, the client sends a request whose JSON data mentions that you are a premium user. The load balancer can then route the request to a special server allocated to premium users, which gives them faster compute.
Also, since all the servers are inside the private network, most of the communication between the load balancer and the servers can use HTTP rather than HTTPS.
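The premium user example can be sketched like this. The field name `plan`, the pool names and the function `choose_pool` are hypothetical; I do not know how LeetCode actually marks premium requests, so treat this purely as an illustration of routing on decrypted content.

```python
import json

# Hypothetical server pools behind the L7 load balancer.
PREMIUM_POOL = ["premium-1", "premium-2"]
STANDARD_POOL = ["standard-1", "standard-2", "standard-3"]

def choose_pool(request_body: bytes) -> list[str]:
    """Pick a server pool by inspecting the already decrypted JSON body.

    Assumption: premium requests carry {"plan": "premium"} in the body.
    Anything unparseable falls back to the standard pool.
    """
    try:
        payload = json.loads(request_body)
    except ValueError:  # covers invalid JSON and invalid byte encodings
        return STANDARD_POOL
    if isinstance(payload, dict) and payload.get("plan") == "premium":
        return PREMIUM_POOL
    return STANDARD_POOL

print(choose_pool(b'{"plan": "premium", "code": "print(1)"}'))  # ['premium-1', 'premium-2']
print(choose_pool(b'{"plan": "free"}'))  # ['standard-1', 'standard-2', 'standard-3']
```

An L4 load balancer could not do this, because it never sees the decrypted body.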
Load Balancer Algorithms:
Round Robin Algorithm:
This algorithm hands out requests to the available servers one by one, in order, and then starts again from the first server. One major drawback: imagine one request is computationally heavy compared to the others. The server handling it will take a long time, and the next request assigned to that same server either has to wait or gets bounced back. The immediate fix many would think of is a queue, where requests are fetched and run in order, but the same problem arises: if a computationally heavy request comes in, everything queued behind it is delayed and latency goes up.
A common solution is Weighted Round Robin, where each server gets a weight based on its capacity, so higher-end servers receive proportionally more requests than lower-end ones. If the load balancer can also estimate how heavy a request is, it can send heavy requests to the high-end servers and light requests to the lower-end ones. This can be automated by working out the request weight and the server weight and allocating accordingly.
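Here is a minimal sketch of plain Round Robin. The class name `RoundRobinBalancer` and the server labels are just for illustration.

```python
from itertools import cycle

class RoundRobinBalancer:
    """Hand out servers one by one, wrapping around at the end of the list."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self._servers = cycle(servers)

    def next_server(self):
        return next(self._servers)

lb = RoundRobinBalancer(["A", "B", "C"])
print([lb.next_server() for _ in range(5)])  # ['A', 'B', 'C', 'A', 'B']
```

Notice that the balancer never asks how busy a server is, which is exactly why one heavy request can hurt the requests that follow it onto the same server.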
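A simple way to sketch Weighted Round Robin is to repeat each server in the rotation as many times as its weight. The server names and weights below are made up, and this naive version sends consecutive requests to the same server; production balancers usually use a smoother interleaving, which I have not shown here.

```python
from itertools import cycle

def weighted_cycle(weights: dict[str, int]):
    """Build an endless rotation where each server appears `weight` times.

    Assumption: weights are small non-negative integers chosen by hand
    from the relative capacity of each server.
    """
    schedule = [server for server, weight in weights.items() for _ in range(weight)]
    if not schedule:
        raise ValueError("at least one server needs a positive weight")
    return cycle(schedule)

picker = weighted_cycle({"big": 3, "small": 1})
print([next(picker) for _ in range(8)])
# ['big', 'big', 'big', 'small', 'big', 'big', 'big', 'small']
```

With these weights the high-end server takes three out of every four requests.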
Least Connection Algorithm:
As the name says, each new request goes to the server that currently has the fewest active connections, so busy servers get a break while idle ones pick up the new clients.
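A minimal sketch of Least Connection, assuming the load balancer itself counts how many connections it has open to each server. The class and method names are my own.

```python
class LeastConnectionsBalancer:
    """Send each new request to the server with the fewest active connections."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def acquire(self) -> str:
        # On a tie, min() returns the first server in insertion order.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server: str) -> None:
        # Call this when the request on that server has finished.
        if self.active[server] > 0:
            self.active[server] -= 1

lb = LeastConnectionsBalancer(["A", "B"])
first = lb.acquire()
second = lb.acquire()
lb.release(first)
third = lb.acquire()
print(first, second, third)  # A B A
```

Unlike Round Robin, a server stuck on a heavy request keeps its count high and is skipped until it finishes.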
A few important concepts that are necessary while building a Load Balancer are:
1. Health Check: This means checking whether each server in the private network is actually running, so that requests do not go to a server that is currently stuck. The load balancer polls every server at a regular interval to check its status, stops sending traffic to a server that fails the checks, and brings the server back into rotation once it is functioning again.
2. Session Persistence: This also plays a key role. Consider a scenario where one server in the private network handles the authentication, and the next request from the same user is routed to another server. The new server would not know any details of the authenticated user. There are two ways to fix this. The first is to send all requests of a particular user to the same server, but this is not ideal: the load on that one server can get heavy, and scaling it vertically is not worth it. The better way is to keep all the servers connected to a shared store such as Redis, which saves data like the session token, so a request can be rerouted to any server easily.
3. Forward Proxy: A proxy is something that lives between a client and a server. A forward proxy is one the client puts in front of itself, so the request appears to originate from the proxy and not the actual client. Example: if a forward proxy sits between a company's employee and a Google server, Google would see an IP address belonging to the company, not the employee.
4. Reverse Proxy: Here the proxy sits in front of the servers. When a request comes from a client, the reverse proxy decides which server it goes to based on its algorithm. The client only sees the endpoint of the reverse proxy, not the servers behind it. A load balancer is this kind of proxy.
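Two of the concepts above, health checks and a shared session store, can be sketched together. Everything here is a stand-in: `probe` is whatever check you choose (for example an HTTP ping), the failure threshold of 2 is arbitrary, and the plain dict `session_store` only plays the role that Redis would play in a real setup.

```python
from typing import Callable, Optional

class HealthChecker:
    """Track consecutive failed probes and hide servers that look stuck."""

    def __init__(self, servers, probe: Callable[[str], bool], fail_threshold: int = 3):
        self.probe = probe
        self.fail_threshold = fail_threshold
        self.failures = {server: 0 for server in servers}

    def poll_once(self) -> None:
        # Meant to be called on a regular interval, e.g. every few seconds.
        for server in self.failures:
            if self.probe(server):
                self.failures[server] = 0  # server recovered: back in rotation
            else:
                self.failures[server] += 1

    def healthy_servers(self) -> list[str]:
        return [s for s, n in self.failures.items() if n < self.fail_threshold]

# Stand-in for Redis: one store that every backend server can reach.
session_store: dict[str, dict] = {}

def handle_login(server: str, token: str, user: str) -> None:
    session_store[token] = {"user": user, "authenticated_by": server}

def handle_request(server: str, token: str) -> Optional[dict]:
    # Any server can look up the session, not only the one that created it.
    return session_store.get(token)

down = {"B"}
checker = HealthChecker(["A", "B"], probe=lambda s: s not in down, fail_threshold=2)
checker.poll_once()
checker.poll_once()
print(checker.healthy_servers())  # ['A']
down.clear()
checker.poll_once()
print(checker.healthy_servers())  # ['A', 'B']

handle_login("A", "token-123", "alice")
print(handle_request("B", "token-123")["user"])  # alice
```

Because the session lives outside the servers, the load balancer is free to route each request to any healthy server.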
There are many load balancer providers, such as AWS, Azure and Nginx, but as long as the fundamentals are clear, it becomes much easier to set up and maintain a load balancer with any of them.
That's all for the day. The implementation of a load balancer will be demonstrated in detail in my distributed systems project.
Note: None of the content in this blog is written by AI, so there might be some grammatical mistakes; do ignore those.