Ansible limitations for network automation
Ansible can be a powerful tool for network automation, but users should be aware of its limitations with debugging, performance, complex data structures and control flow.
Ansible is a good place to start exploring network automation. It’s easy to learn and provides valuable functionality. However, certain limitations mean it isn’t suitable for every network automation task.
Before we dig into the limitations of Ansible, let’s start with a quick review of Ansible’s strengths. It is easy to learn. Ansible playbooks are a sequence of tasks, defined in a file using YAML syntax.
Ansible modules are libraries of functions that provide additional functionality, such as executing commands on a network device and collecting the output of those commands. Other modules aid in making device configuration changes.
Another benefit is that the Jinja2 templating engine enables the use of variables in Ansible playbooks. Do not underestimate Ansible’s combination of easy-to-learn YAML syntax, an extensive module library and templating capability for getting started quickly with network automation.
Debugging. Ansible has a simple playbook debugger that supports five debugging triggers:
always — always trigger the debugger;
never — never trigger the debugger;
on_failed — trigger the debugger only if the task failed;
on_unreachable — trigger the debugger if the host is unreachable; and
on_skipped — trigger the debugger if the task is skipped.
The debugger lets users print or set variables and redo a task’s execution. But diagnosing complex playbooks can get tedious because that is all it does. The internal operation of the modules isn’t visible, and many of the modules have a lot going on within them. I once spent several hours trying to diagnose why a playbook wasn’t working, only to discover that my DNS server wasn’t configured correctly. It can be discouraging to have a playbook fail and not be able to pin down why efficiently.
Performance. Another Ansible limitation is that it handles large volumes of data poorly. Ansible serializes and deserializes JSON data internally between tasks, and this processing consumes a lot of CPU when large amounts of data are present. Performance isn’t a problem with simple playbooks and smaller data volumes.
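To get a feel for why repeated JSON handling adds up, here is a minimal Python sketch that times a serialize/deserialize round trip at two data sizes. It is not Ansible's actual code path. The record layout, the function names and the number of passes are assumptions chosen only for illustration.

```python
import json
import time


def round_trip(data, passes):
    """Serialize and deserialize `data` `passes` times.

    A rough stand-in for results being handed from one task to the next.
    """
    for _ in range(passes):
        data = json.loads(json.dumps(data))
    return data


def time_round_trips(record_count, passes=10):
    """Return the seconds spent round-tripping `record_count` sample records."""
    # Hypothetical per-interface records, roughly the shape of collected counters.
    data = [
        {"interface": f"Gi0/{i}", "in_octets": i, "out_octets": i * 2}
        for i in range(record_count)
    ]
    start = time.perf_counter()
    result = round_trip(data, passes)
    elapsed = time.perf_counter() - start
    # The round trip must not change the data, only cost time.
    assert result == data
    return elapsed


if __name__ == "__main__":
    for count in (1_000, 100_000):
        print(f"{count} records: {time_round_trips(count):.3f} s")
```

Running it on your own machine shows how the cost grows with the number of records. The absolute numbers will vary by hardware and say nothing precise about any particular playbook.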
Complex data structures. Many network automation tasks require complex data structures. One of the first things I considered when learning Ansible was using it for network discovery. I immediately encountered two problems. The first was creating a table of discovered devices and a table of newly discovered devices still to be checked. I needed to store each device’s neighbors and use that data to learn the neighbors of the neighboring devices. It quickly became complicated.
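The two tables described above map naturally onto a dictionary and a queue in a general-purpose language. The Python sketch below is one possible shape for that logic, not a finished discovery tool. The sample topology and the `get_neighbors` function are hypothetical stand-ins for querying real devices.

```python
from collections import deque

# Hypothetical topology standing in for live neighbor queries.
SAMPLE_NEIGHBORS = {
    "core1": ["dist1", "dist2"],
    "dist1": ["core1", "access1"],
    "dist2": ["core1", "access2"],
    "access1": ["dist1"],
    "access2": ["dist2"],
}


def get_neighbors(device):
    """Hypothetical stand-in for asking a device for its neighbor list."""
    return SAMPLE_NEIGHBORS.get(device, [])


def discover(seed):
    """Walk outward from a seed device and return {device: [neighbors]}."""
    discovered = {}           # table of devices already queried
    to_check = deque([seed])  # table of newly discovered devices to check
    while to_check:
        device = to_check.popleft()
        if device in discovered:
            continue  # already queried via another path
        neighbors = get_neighbors(device)
        discovered[device] = neighbors
        for neighbor in neighbors:
            if neighbor not in discovered:
                to_check.append(neighbor)
    return discovered


if __name__ == "__main__":
    for device, neighbors in discover("core1").items():
        print(device, "->", ", ".join(neighbors))
```

The loop keeps running until no unchecked devices remain, which is the kind of open-ended iteration over a growing data set that is awkward to express as a fixed sequence of playbook tasks.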
The second problem involved network drawing maintenance — another common function. The list of connected neighbors enables users to construct a Graphviz DOT language file to show physical device connectivity. But, while it’s easy to get the list of Cisco Discovery Protocol neighbors, the data structures needed to create the DOT file get complicated.
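As a rough illustration of the drawing problem, the following Python sketch turns a neighbor table into a DOT graph. It assumes the table is a plain dictionary of device names to neighbor names and that names contain no double quotes. The function name `neighbors_to_dot` and the sample devices are made up for this example. A real map would also need interface labels, which adds further structure.

```python
def neighbors_to_dot(neighbors, graph_name="network"):
    """Render {device: [neighbors]} as an undirected Graphviz DOT graph."""
    edges = set()
    for device, peers in neighbors.items():
        for peer in peers:
            # Both ends usually report the same link, so store each
            # pair once, in sorted order, to avoid duplicate edges.
            edges.add(tuple(sorted((device, peer))))

    lines = [f"graph {graph_name} {{"]
    for a, b in sorted(edges):
        lines.append(f'    "{a}" -- "{b}";')
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical three-router chain.
    sample = {"r1": ["r2"], "r2": ["r1", "r3"], "r3": ["r2"]}
    print(neighbors_to_dot(sample))
```

The deduplication step is the part that needs a set of device pairs, a structure that correlates data from more than one device at a time.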
Both of the problems above require complex data structures to correlate the relationships between multiple devices.
Control flow. Some automation tasks require good programmatic flow control with looping and conditionals, such as if-then-else logic. The network discovery task quickly ran into Ansible’s limitations in this area. While Ansible offers looping and conditionals, the resulting playbook gets complicated. The combination of complex data structures and challenging control flow convinced me that Ansible is not the right tool for building a network discovery system.
Don’t avoid Ansible
Ansible is a great tool for many network device automation tasks; it’s just not well suited for certain classes of problems. Every tool has its limitations.
As a playbook’s complexity grows, its readability goes down. A playbook eventually becomes difficult to follow, especially one that includes Jinja2 variables. For example, consider the task of revising a complex playbook that has been in production for six months. Would you recall all the tricks you incorporated in order to make the playbook work?
Do these limitations mean users should avoid Ansible? Absolutely not. Ansible is a great tool for network engineers to start learning automation. The YAML configuration language is easy to learn for anyone who has had to learn a device command-line interface — think quality of service, access control lists and route maps, for example. Use Ansible when the job requires a sequence of reasonably well-defined tasks to be executed on many devices, like collecting troubleshooting data or reporting on statistics that aren’t available via other methods.
The important takeaway is to recognize when you should replace the tool you’ve chosen with a more suitable tool. The only way to develop that level of understanding is through experience. Knowing Ansible will ease the transition to other tools.