
Automation with Ansible — Setting up Loadbalancers using HAproxy and Ansible Roles

Akshaya Balaji
9 min read · Feb 16, 2021

This is the fifth article in the Automation with Ansible series. For the fourth article, please refer to this link.

In this series, we will be looking at different ways in which Ansible can be used to implement automation in the IT industry.

In this article, we will be exploring the basics of load balancers and how we can set up a load balancer using HAproxy and Ansible.

What are Load Balancers?

Figure: Diagrammatic scenario of deploying service on one server. Image Source: Author

Consider a scenario where you are launching a service or application on a server. Every server can handle only a certain number of client requests; if the number of requests exceeds that capacity, the server crashes. This is not a good sign, and you can even lose clients if the service keeps going down repeatedly.

Figure: Initial solution — Deploying service on multiple servers. Image Source: Author

A possible solution is to deploy the same service on multiple servers and let the clients know which servers they can interact with. But this is not a feasible model either, because the clients have no way of knowing which of the servers can currently handle their requests.

Figure: Final Solution — Load Balancer as the front-end for the servers. Image Source: Author

To hide the complexity of choosing the server that can handle a client's request, we introduce the concept of a load balancer. The load balancer acts as a front-end or interface to which the clients send their requests. It then forwards each request to one of the servers registered with it that can handle the request. The load balancer listens for client requests on a designated port called the front-end port and routes the requests to capable servers.

Load balancers have many more uses than what has been described above, but this is one of the simplest.

There are many products available in the industry today for implementing load balancers. We will work with a popular one named HAProxy. You can learn more about HAProxy itself here.

Now, let's start with a simple example to see how we can set up an RHEL8 VM as a load balancer for two or more RHEL8 VMs configured as web servers. The entire setup will be completed using Ansible playbooks and a new Ansible concept called roles.

Why do we need Ansible Roles?

So far, we have seen how to define multiple tasks within a single playbook and execute them with one command on the terminal. This approach served us well because we did not have many tasks, and all the static files, templates, inventory, playbooks, and variable files fit within a single directory. But as our requirements become more complex, we need a better way to organize the files we use for our tasks. Good organization also makes it easy to copy the entire main directory to another controller node and reproduce the same configuration there. This brings us to Ansible roles.

As defined in the official docs,

Roles let you automatically load related vars_files, tasks, handlers, and other Ansible artifacts based on a known file structure. Once you group your content in roles, you can easily reuse them and share them with other users.

Ansible's community hub, Galaxy, already hosts many roles shared by its members. We can simply search for a desired role and download it to our controller node. Ansible also gives us the option of creating our own roles using the command

ansible-galaxy init <role-name>

Effectively, roles are simply a predefined set of directories that demarcate where we store the different parts of our automation: tasks, templates, static files, variables, and so on. When we create a role using the command above, Ansible creates a directory with the name <role-name> and populates it with a default structure of subdirectories. It is not compulsory to use all of the directories, but at least one of them must be present for Ansible to recognize the directory as a role. Each default subdirectory stores a specific kind of file. Some of them include:

(i) tasks stores a file named main.yml where all the tasks to be performed by the role are defined. The hosts on which the tasks are to be performed are not mentioned.

(ii) handlers stores a file named main.yml where all the handlers that may be notified by the tasks of the role are present.

(iii) files stores all the files which must be copied to the target nodes statically (using the copy module).

(iv) templates stores all the files which must be copied to the target nodes after parsing the special keywords (using the template module).

(v) vars stores a file with all the variables defined for this role. Host-specific or group-specific variables, on the other hand, belong in files named after the host or group inside the host_vars or group_vars directories, which live next to the inventory or playbook rather than inside the role.

To learn more about these directories, please check out the official docs for Ansible Roles.
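For reference, the default layout that ansible-galaxy init generates looks roughly like the sketch below; the exact set of files can vary slightly between Ansible versions.

```
webserver_role/
├── README.md
├── defaults/main.yml    # lowest-precedence variable defaults
├── files/               # static files for the copy module
├── handlers/main.yml    # handlers notified by tasks
├── meta/main.yml        # role metadata and dependencies
├── tasks/main.yml       # the role's task list
├── templates/           # Jinja2 templates for the template module
├── tests/               # a sample inventory and test playbook
└── vars/main.yml        # role variables
```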

With the basics of both load balancers and Ansible Roles, we can now start with the setup of the complete example.

Setting up the webservers

We are setting up two services: a web server and a load balancer. For this, we create two roles within our workspace, named webserver_role and lb_role.

mkdir setup-haproxy        # Create the workspace directory
mkdir setup-haproxy/roles  # Create a directory to store all the roles
cd setup-haproxy/roles     # Create the roles inside the roles directory
ansible-galaxy init webserver_role
ansible-galaxy init lb_role

Now, we begin defining the tasks to set up a web server in the tasks/main.yml file of webserver_role.

Figure: Contents of the main.yml in the tasks subdirectory of the role webserver_role

In the tasks/main.yml file of the webserver_role, we list out the tasks in the order we want them to be executed. The tasks are:

(i) Install httpd on the target nodes

(ii) Install the PHP package on the target nodes. Note that this step does not contribute to setting up the web server itself; it is used to deploy a test webpage that helps us see the behavior of the load balancer in real time.

(iii) Copy the desired webpage to the target nodes using the copy module. You can find the webpage I have used at this link, along with all the files and directories needed for this example. If the test page is changed at any point, the task will also notify the restart httpd handler.

(iv) Allow traffic through the default HTTP port (port 80)

(v) Start the webserver

Note that the tasks assume the RHEL8 system has its YUM repositories configured. If you have not set up YUM, you can use Ansible to do so by following this link.
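Since the figure is an image, here is a sketch of what tasks/main.yml of webserver_role might contain for the five steps above. The module names (package, copy, firewalld, service) are standard Ansible modules, but the file and handler names are assumptions, not the author's exact file.

```yaml
# tasks/main.yml of webserver_role -- a hedged sketch, not the exact file
- name: Install httpd
  package:
    name: httpd
    state: present

- name: Install PHP
  package:
    name: php
    state: present

- name: Copy the test web page
  copy:
    src: test.php
    dest: /var/www/html/test.php
  notify: restart httpd

- name: Allow HTTP traffic through the firewall
  firewalld:
    port: 80/tcp
    permanent: yes
    immediate: yes
    state: enabled

- name: Start and enable the web server
  service:
    name: httpd
    state: started
    enabled: yes
```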

Next, we write a handler to restart the web server if the test webpage is changed at any point. The handler goes in the handlers/main.yml file.
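A minimal version of that handler could look like this; the handler's name must match the name used in the notify directive of the copy task (assumed here to be "restart httpd"):

```yaml
# handlers/main.yml of webserver_role -- a sketch
- name: restart httpd
  service:
    name: httpd
    state: restarted
```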

Setting up the Load Balancers

Now, we can move on to the lb_role directory to fill in the tasks, variables, and handlers. First, we begin by writing the tasks we want to perform on the load balancer target nodes in the tasks/main.yml file of the lb_role directory. The tasks are:

(i) Install HAproxy on the RHEL8 VM

(ii) Install the PHP package on the RHEL8 VM. Note that this step does not contribute to setting up the load balancer itself; it is used for the test webpage that helps us see the behavior of the load balancer in real time.

(iii) Copy the configuration file using the template module. If the configuration file is changed at any point, the task will also notify the restart haproxy handler.

(iv) Allow traffic through the default HTTP port (port 80) and the front-end bind port (port 8081, as defined in the vars directory)

(v) Start the HAproxy service.

There are two variables used in these tasks: bind_port and httpd_port. Their values are defined in the vars/main.yml file of the lb_role directory, and they are also used when the HAProxy configuration file is dynamically parsed and transferred to the target nodes with the template module.
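The figures showing these files are images, so here is a sketch of what the tasks/main.yml and vars/main.yml of lb_role might contain. The modules are standard, but the variable values and handler name are assumptions based on the article's description.

```yaml
# tasks/main.yml of lb_role -- a hedged sketch, not the exact file
- name: Install HAProxy
  package:
    name: haproxy
    state: present

- name: Install PHP
  package:
    name: php
    state: present

- name: Copy the HAProxy configuration from a template
  template:
    src: haproxy.cfg.j2
    dest: /etc/haproxy/haproxy.cfg
  notify: restart haproxy

- name: Allow traffic on the HTTP and front-end bind ports
  firewalld:
    port: "{{ item }}/tcp"
    permanent: yes
    immediate: yes
    state: enabled
  loop:
    - "{{ httpd_port }}"
    - "{{ bind_port }}"

- name: Start and enable HAProxy
  service:
    name: haproxy
    state: started
    enabled: yes

# vars/main.yml of lb_role -- values as described in the article
# bind_port: 8081
# httpd_port: 80
```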

The HAProxy configuration file (/etc/haproxy/haproxy.cfg) is installed along with the package when we set up HAProxy manually. Using that file as a basis, we can create a Jinja2 template in which the front-end bind port, the IP addresses of the backend web servers, and the port the backend servers use for HTTP traffic are all taken from variables defined in the vars/main.yml file of the lb_role directory.

Figure: Setting up the template file for the haproxy.cfg.j2 file — Part I
Figure: Setting up the template file for the haproxy.cfg.j2 file — Part II

In the image above, we use a loop to iterate over all the IP addresses of the hosts in the backend_webservers group. For every IP, we define a server with a unique name (app<number>), followed by the IP address and the port number the host uses for HTTP traffic. To generate the unique names, we use Ansible's built-in loop variables, which track the current iteration of the loop.
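Since the template figures are images, the relevant frontend and backend sections of haproxy.cfg.j2 might look like the sketch below (the rest of the file stays as shipped with HAProxy). The variable names bind_port and httpd_port and the group name backend_webservers are the ones assumed in this article; loop.index provides the unique server names.

```
# haproxy.cfg.j2 -- frontend/backend sections only, a sketch
frontend main
    bind *:{{ bind_port }}
    default_backend app

backend app
    balance roundrobin
{% for host in groups['backend_webservers'] %}
    server app{{ loop.index }} {{ host }}:{{ httpd_port }} check
{% endfor %}
```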

Next, we write the handler restart haproxy to restart the HAProxy service if there is any change in the configuration file when the tasks are executed.
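Mirroring the web server role, the handler might be written as follows (a sketch; the handler name must match the notify in the template task):

```yaml
# handlers/main.yml of lb_role -- a sketch
- name: restart haproxy
  service:
    name: haproxy
    state: restarted
```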

Configuring the Ansible Configuration File and Inventory

Now, we configure the ansible.cfg file which is customized for our current example.

One part of this configuration file is new compared to the ones used in the previous articles: the privilege_escalation section. Privilege escalation is an operating-system concept whereby an ordinary user can gain the powers (privileges) of another user through certain methods. Most often, it is used to give ordinary users the powers of the root user (admin), which are unrestricted.

We do the same in our configuration file, so that the user we log in as on the target nodes is granted privilege escalation to run commands as the root user via the sudo method, without being prompted for a password each time a command is executed.
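The ansible.cfg itself is shown only as an image in the article; a minimal version with the keys described might look like this (the settings are standard Ansible options, but the paths are assumptions based on the workspace layout):

```ini
# ansible.cfg -- a minimal sketch; paths are assumed
[defaults]
inventory  = ./inventory_dir/inventory
roles_path = ./roles

[privilege_escalation]
become          = True
become_method   = sudo
become_user     = root
become_ask_pass = False
```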

Next, we create a new directory for the inventory and create the inventory file with the name inventory. Here, we define all the webserver targets under the backend_webservers group and the load balancer target under the lb group.
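The inventory described above might look like the sketch below; the IP addresses are placeholders, not the author's actual machines.

```ini
# inventory_dir/inventory -- IP addresses are placeholders
[backend_webservers]
192.168.1.101
192.168.1.102

[lb]
192.168.1.100
```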

Writing the Final Playbook to invoke the roles

After defining all the roles, configuration files, and inventory, we can finally write a main Playbook that will invoke the roles we have defined on the respective groups.
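A main playbook that maps each role to its host group can be as short as the sketch below, assuming the playbook is named setup_lb.yml as in the commands later in the article:

```yaml
# setup_lb.yml -- invoke each role on its respective group
- hosts: backend_webservers
  roles:
    - webserver_role

- hosts: lb
  roles:
    - lb_role
```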

Once all the files have been created, we see the following directory structure:

Figure: The directory structure of the workspace used for the complete example. Image source: Author
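Since the figure is an image, here is an approximate text version of the final layout, showing only the files actually used in the steps above:

```
setup-haproxy/
├── ansible.cfg
├── setup_lb.yml
├── inventory_dir/
│   └── inventory
└── roles/
    ├── webserver_role/
    │   ├── tasks/main.yml
    │   ├── handlers/main.yml
    │   └── files/test.php
    └── lb_role/
        ├── tasks/main.yml
        ├── handlers/main.yml
        ├── templates/haproxy.cfg.j2
        └── vars/main.yml
```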

Let’s test out our load balancer!

To complete the entire setup in one step, we can simply run the main playbook on the controller node. Before running it, make sure the syntax of all the files is correct so that the playbook does not fail partway through execution.

ansible-playbook --syntax-check setup_lb.yml
ansible-playbook -vv setup_lb.yml  # For more verbose output

If the plays have run successfully, then we will be able to access the test page using the URL <loadbalancer IP>:<bind_port>/test.php.

The page displays the IP address of the backend server that served the request, which also helps us observe how the load balancer works.

Figure: Testing the Load Balancer (Part I). Image Source: Author
Figure: Testing the Load Balancer (Part II). Image Source: Author

Every time we refresh the web page, the displayed IP address changes. Hence, we know that the requests to view this page are being routed to different backend servers at different points in time.
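The article does not reproduce test.php itself; one way to build a page that behaves this way is to print the address of the machine handling the request. This is a sketch, not the author's exact file:

```php
<?php
// test.php -- prints the IP address of the server that handled
// the request, so refreshing reveals which backend responded.
echo "Served by: " . $_SERVER['SERVER_ADDR'];
?>
```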

Conclusion

We have seen how we can use Ansible to perform different kinds of configurations in this series of articles. Ansible offers many more features, like dynamic inventories and exception handling, which can be used to set up more complex configurations. In fact, Ansible is often not used directly in many real-world applications. Instead, Ansible Tower, which runs on the Ansible Engine, is used.

Ansible is a very versatile tool, which is capable of tackling many requirements, some of which I have discussed here.

