Scalability is the ability of a system to grow or shrink to match changing user demand. A scalable system has an advantage because it can adapt as the needs of its users change. Scaling can be achieved by adding more resources to the current system, by adding new systems to the existing pool, or by doing both.
In system design, we generally talk about two types of scaling:
- Vertical Scaling
- Horizontal Scaling
In vertical scaling, we add more resources to the same system, i.e., we increase the amount of RAM, CPU, GPU and other resources to meet the increased computing requirement. It is easy to accomplish because the application usually does not need to change. It can also consume less power than running several separate machines.
However, vertical scaling does not make the system fault-tolerant: if the application runs on a single server and that server goes down, the entire system goes down. It is also limited by the maximum capacity of a single machine, and upgrading the machine often involves downtime.
In horizontal scaling, we scale by adding more systems to the existing pool. Since every server works independently and is equally capable of handling a request, this decreases the load on each server and also makes the entire system fault-tolerant, i.e., even if one server goes down, the other servers will handle the incoming requests.
Also, there is in principle no fixed upper limit: we can keep adding new servers as demand grows.
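To make the effect on per-server load concrete, here is a small arithmetic sketch. It assumes traffic is split perfectly evenly across servers, and both the function name `load_per_server` and the traffic figure are made up for illustration.

```python
def load_per_server(total_rps: float, servers: int) -> float:
    """Requests per second each server handles under a perfectly even split."""
    if servers < 1:
        raise ValueError("need at least one server")
    return total_rps / servers


total_rps = 1200.0  # hypothetical total traffic, not a measured value

# Adding servers lowers the load each one has to carry.
for count in (1, 2, 4):
    print(count, "server(s):", load_per_server(total_rps, count), "req/s each")

# If one of four servers fails, the remaining three absorb its share.
print("after one failure:", load_per_server(total_rps, 4 - 1), "req/s each")
```

Real traffic is rarely split this evenly, so treat the numbers as an upper bound on how much adding a server helps.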
Horizontal scaling is harder to implement, because all of the systems must be kept synchronized with each other. We also need a load balancer to distribute the load evenly among the systems, which is an additional overhead.
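As a rough illustration of what a load balancer does, the sketch below rotates requests across a pool in round-robin order and skips servers that have been marked as down. Round-robin is only one possible strategy, chosen here for simplicity. The class `RoundRobinBalancer` and the server names are hypothetical, and a real load balancer would also perform health checks and forward actual network traffic.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Toy load balancer: picks servers in rotation, skipping unhealthy ones."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("need at least one server")
        self._servers = list(servers)
        self._healthy = set(self._servers)
        self._ring = cycle(self._servers)  # endless rotation over the pool

    def mark_down(self, server):
        """Stop routing requests to this server."""
        self._healthy.discard(server)

    def mark_up(self, server):
        """Resume routing requests to a known server."""
        if server in self._servers:
            self._healthy.add(server)

    def next_server(self):
        """Return the next healthy server in rotation."""
        # Look at each server at most once per call.
        for _ in range(len(self._servers)):
            candidate = next(self._ring)
            if candidate in self._healthy:
                return candidate
        raise RuntimeError("no healthy servers available")


balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])  # hypothetical names
before_failure = [balancer.next_server() for _ in range(6)]

balancer.mark_down("app-2")  # simulate one server going down
after_failure = [balancer.next_server() for _ in range(4)]

print(before_failure)
print(after_failure)
```

With all three servers healthy, each one receives every third request. Once `app-2` is marked down, requests alternate between the remaining two, which is the fault tolerance described above.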