Elements of a Successful System and Network Load Balancing Design
The big-picture considerations essential to developing a server load balancing system that will enable your business objectives today and tomorrow.
Server Load Balancing (SLB) technologies are not new to the technology market. Since the early days of computing, network engineers have known that stand-alone computing platforms cannot provide 100% uptime or consistent performance. Individual servers are susceptible to hardware failures and require periodic downtime for maintenance tasks and operating system updates. It’s virtually impossible to provide a highly available service when the backend application runs on a single server or virtual machine.
Load balancing solutions distribute traffic intelligently across a cluster of identically configured systems. The vast majority of modern mission-critical applications use some form of backend clustering or automated failover to improve system availability.
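To make the idea of traffic distribution with failover concrete, here is a minimal sketch in Python. It assumes the simplest possible policy, round-robin rotation that skips backends marked unhealthy. The class name, method names and backend labels are illustrative only and do not correspond to any vendor's product.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Rotate requests across backends, skipping any marked unhealthy."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)
        self._rotation = cycle(self.backends)

    def mark_down(self, backend):
        # Take a backend out of rotation (failure or planned maintenance).
        self.healthy.discard(backend)

    def mark_up(self, backend):
        # Return a known backend to rotation.
        if backend in self.backends:
            self.healthy.add(backend)

    def next_backend(self):
        # Examine each backend at most once per call.
        for _ in range(len(self.backends)):
            candidate = next(self._rotation)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy backends available")


# Hypothetical three-node cluster with one node down for maintenance.
balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])
balancer.mark_down("app-2")
picks = [balancer.next_backend() for _ in range(4)]
print(picks)  # ['app-1', 'app-3', 'app-1', 'app-3']
```

Real SLB products typically offer additional policies such as least-connections or weighted distribution. Round-robin is used here only because it is the easiest to follow.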
It’s no secret, however, that product vendors influence the design of many load-balancing projects, whether due to a customer’s familiarity with a specific SLB product or guidance from a procurement team to work with a preferred vendor. Unfortunately, in such cases the detailed design of the proposed SLB deployment is rarely informed by specific business-level objectives, which leads to significant challenges during the implementation phases. While business objectives can vary widely, there are three important technical considerations (independent of vendor) for the design of a robust network load balancing solution: scalability, availability, and manageability.
Scalability is the ability to dynamically expand service capacity as network traffic varies, so that application responsiveness and backend application performance stay at an acceptable level. User traffic rarely remains constant throughout the day; spikes of traffic are far more common. When designing a network load balancing solution, it’s vital to identify and document any limitations of the underlying application architecture to avoid unexpected performance issues.
Use application load testing and benchmarking to measure how an application performs under a variety of user traffic conditions. Based on the results, you can anticipate the range of capacity the system must scale across and design the architecture in accordance with the company’s business and technology requirements.
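As a rough illustration of turning load-test results into a capacity figure, the sketch below finds the highest tested load level whose 95th-percentile latency stays within a target. The function names, the p95 criterion, the 150 ms target and the sample numbers are all assumptions made for the example, not measurements from a real system.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(samples)
    rank = math.ceil(pct / 100 * len(ordered))
    return ordered[max(rank, 1) - 1]


def max_sustainable_load(results, p95_target_ms):
    """Return the highest tested load level whose p95 latency meets the target.

    `results` maps a load level (e.g. concurrent users) to latency samples
    in milliseconds. Returns None if no tested level meets the target.
    """
    passing = [load for load, samples in results.items()
               if percentile(samples, 95) <= p95_target_ms]
    return max(passing) if passing else None


# Hypothetical benchmark output: concurrent users -> response times (ms).
results = {
    100: [40, 42, 45, 47, 50, 52, 55, 58, 60, 65],
    200: [60, 65, 70, 72, 75, 80, 85, 90, 95, 110],
    400: [120, 150, 180, 200, 240, 300, 350, 420, 500, 800],
}
capacity = max_sustainable_load(results, p95_target_ms=150)
print(capacity)  # 200
```

In this made-up data set a single backend configuration holds up to 200 concurrent users, so demand beyond that point is where additional capacity would need to come online.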
Availability, or uptime, is defined by the ability of the application architecture to withstand intermittent faults in its subsystems while still providing the committed levels of service. In other words, system resilience. One of the initial steps in any solution design process is to define approved target levels of system availability. The cost of downtime, including its impact on the reputation of the business, should closely inform the overall design of a service. While additional investment in internal component redundancy can be justified by the service availability requirements, multiple layers of local system redundancy are frequently not enough when the cost of downtime is too great.
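An availability target becomes easier to discuss once it is translated into a downtime budget. The sketch below does that arithmetic for a 365-day year. The function names and the per-minute cost figure are hypothetical and used only to show the calculation.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def allowed_downtime_minutes(availability_pct):
    """Minutes of downtime per year permitted by an availability target."""
    return MINUTES_PER_YEAR * (1 - availability_pct / 100)


def annual_downtime_cost(availability_pct, cost_per_minute):
    """Worst-case yearly downtime cost if the whole budget is consumed."""
    return allowed_downtime_minutes(availability_pct) * cost_per_minute


for target in (99.0, 99.9, 99.99):
    minutes = allowed_downtime_minutes(target)
    # cost_per_minute=500 is an invented figure for illustration only.
    cost = annual_downtime_cost(target, cost_per_minute=500)
    print(f"{target}% -> {minutes:.1f} min/year, up to {cost:,.0f} in losses")
```

A 99.9% target, for example, allows roughly 525.6 minutes (just under nine hours) of downtime per year. Comparing that budget against the cost of each minute of outage is one way to justify, or rule out, further redundancy.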
A disaster-recovery site capable of handling all user traffic when a primary site is unavailable is often a must for a mission-critical or revenue-generating business application. Network fiber cuts, human errors, power outages and acts of nature remain leading factors that could disrupt even the best system designs. While a geographically dispersed disaster-recovery site can effectively double the system investment, a thoughtfully designed system architecture, produced by an experienced team of network engineers and system architects, can greatly reduce initial expenditures and ongoing maintenance costs while meeting long-term application availability requirements.
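The availability gain from a second site can be estimated with a standard reliability formula, under two strong assumptions: the sites fail independently of each other, and failover between them is instantaneous and always succeeds. Neither holds perfectly in practice, so treat the result as an upper bound. The function name is illustrative.

```python
def combined_availability(*site_availabilities):
    """Availability of redundant sites where any one site can carry all traffic.

    Each argument is a fraction between 0 and 1. Assumes independent
    failures and perfect failover, so the service is down only when
    every site is down at the same time.
    """
    all_down = 1.0
    for availability in site_availabilities:
        all_down *= (1 - availability)
    return 1 - all_down


single_site = 0.99
with_dr_site = combined_availability(0.99, 0.99)
print(f"one site: {single_site:.2%}, two sites: {with_dr_site:.2%}")
```

Under these assumptions, two sites that are each available 99% of the time yield about 99.99% combined. Shared dependencies such as a common DNS provider or a common operations team reduce that figure, which is one reason the design of the failover path matters as much as the second site itself.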
Scalable and highly available systems rely on many moving parts designed to work in concert. The cumulative complexity of such a system frequently leads to excessive management overhead just to keep it in working order. Front-end network traffic load-balancing products only get you so far. In real-world scenarios, unplanned software and hardware maintenance, security changes and misbehaving application code changes can negate the benefit of the load-balancing products and go undetected. Consider implementing an application monitoring and real-time alerting platform capable of detecting discrepancies in overall system health before the system is put into production to serve live customer traffic.
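One common building block of such monitoring is a health check that alerts only after several consecutive failures, which avoids paging someone for a single dropped probe. The sketch below is one possible shape for it. The class name, the threshold of three and the injected `probe` callable are assumptions for the example; in practice the probe might issue an HTTP request to an application health endpoint.

```python
class HealthMonitor:
    """Track consecutive probe failures per backend and signal when to alert."""

    def __init__(self, probe, failure_threshold=3):
        self.probe = probe  # callable(backend) -> truthy if healthy
        self.failure_threshold = failure_threshold
        self.consecutive_failures = {}

    def check(self, backend):
        """Probe one backend; return True exactly when an alert should fire."""
        try:
            ok = bool(self.probe(backend))
        except Exception:
            # A probe that raises (timeout, refused connection) counts as a failure.
            ok = False
        if ok:
            self.consecutive_failures[backend] = 0
            return False
        count = self.consecutive_failures.get(backend, 0) + 1
        self.consecutive_failures[backend] = count
        # Fire once when the threshold is reached, not on every later failure.
        return count == self.failure_threshold


# Simulated probe results for a hypothetical backend "app-1".
responses = iter([True, False, False, False, False, True])
monitor = HealthMonitor(probe=lambda backend: next(responses), failure_threshold=3)
alerts = [monitor.check("app-1") for _ in range(6)]
print(alerts)  # [False, False, False, True, False, False]
```

Running checks like this against a staging environment before go-live is one way to surface configuration drift or misbehaving code before customers do.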
Keeping the system in top working order requires close collaboration among the engineering teams responsible for maintaining it. Clearly documented system management processes and procedures provide an essential roadmap for avoiding serious incidents. In the end, it often comes down to people and their expertise in using the available technologies to produce a highly available system that meets business requirements.
Looking for platform-agnostic, collaborative network load balancing? Talk to a VIMRO engineer about your options.