Load balancing is a technique for spreading workloads evenly across backend services or other computational resources. Today, a high-traffic server handles thousands of requests from users or customers each minute and returns images, videos, data, messages, and other content. Users also expect fast, reliable responses, so we add more servers to handle these volumes and keep response times low.
However, if incoming requests from clients or users are not routed correctly, one server may become overburdened while another handles far fewer requests. This results in slow responses and can be costly for the service provider, which has to absorb the excess traffic. A miniOrange API gateway acting as a load balancer can address this case: it distributes requests among the servers using one of several algorithms.
The gateway sits between the client interface and the backend services. When a request arrives at the gateway, it applies a load balancing algorithm to determine which server is available for processing and sends the request to that server. By routing requests across all the servers in this way, the gateway can keep responses fast, use backend capacity reliably, and prevent any single server from being overburdened.
For example, suppose you have two servers that perform facial recognition and respond with images in which the detected face is marked. When the API gateway is used as a load balancer, it tracks the current load on each backend server and routes each request to a server that can handle it. It can also send traffic to a newly added server and divert incoming requests to another server if one goes offline. Periodic health checks can be run as well.
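The routing behaviour in this example can be illustrated with a minimal Python sketch. It is not the miniOrange implementation: the `Backend` class, the `pick_backend` function, and the server names are hypothetical, and the sketch assumes a "fewest active requests" rule as one possible way to judge which server can handle a request.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    # Hypothetical record the gateway might keep per backend server.
    name: str
    active_requests: int = 0   # current load on this server
    healthy: bool = True       # outcome of the latest health check


def pick_backend(backends):
    """Return the healthy backend with the fewest active requests."""
    candidates = [b for b in backends if b.healthy]
    if not candidates:
        raise RuntimeError("no healthy backend available")
    return min(candidates, key=lambda b: b.active_requests)


# Two facial recognition servers with different load.
servers = [
    Backend("face-1", active_requests=8),
    Backend("face-2", active_requests=3),
]
first_choice = pick_backend(servers).name      # face-2 carries less load

servers[1].healthy = False                     # face-2 fails a health check
after_failure = pick_backend(servers).name     # traffic is diverted to face-1

servers.append(Backend("face-3"))              # a new, idle server is added
after_scale_out = pick_backend(servers).name   # face-3 is now the least loaded
```

In a real gateway, the `healthy` flag would be updated by the periodic health checks and `active_requests` by the proxy itself; here both are set by hand to keep the sketch short.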
Round robin is a simple method of load balancing. It is useful when multiple identical servers are configured to provide the same service, and all of them share the same domain name but have different IP addresses.
In the round-robin algorithm, requests are assigned to the available servers in cyclic order. With two servers, as in the figure above, the first request goes to server 1, the second to server 2, the third back to server 1, and so on. This method spreads requests equally among the servers, which reduces the load on each one.
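The cyclic assignment can be sketched in a few lines of Python. This is an illustration only, assuming a fixed list of servers; the `RoundRobinBalancer` class and the server names are hypothetical and do not come from the miniOrange gateway.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hand out servers in a repeating cycle (hypothetical helper)."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        # cycle() repeats the list endlessly in its original order.
        self._servers = cycle(list(servers))

    def next_server(self):
        """Return the server that should receive the next request."""
        return next(self._servers)


balancer = RoundRobinBalancer(["server-1", "server-2"])

# Assign five consecutive requests.
assignments = [balancer.next_server() for _ in range(5)]
# server-1, server-2, server-1, server-2, server-1
```

Plain round robin ignores how busy each server is, which is why it suits the case of identical servers providing the same service.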