In the DevOps model, moving software from development to production is much faster. But we need to be able to identify what is happening inside our increasingly complicated applications. We now have many chefs working at the same time, and visibility suffers.
Logging helps us better understand applications at each stage of the process, so that we can catch problems, and the larger issues behind them, early in development. This is particularly true in CI/CD environments, where changes are regularly pushed to production. We need not only methods for detecting critical errors, but also the ability to attribute those errors to specific versions throughout the cycle.
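One simple way to make errors attributable to a specific version is to stamp every log entry with the release identifier at the application level. Here is a minimal sketch using Python's standard logging module; the version string and logger name are hypothetical, and in practice the version would be injected at build or deploy time:

```python
import json
import logging

# Hypothetical release identifier; in a real pipeline this would be
# injected at build time (e.g. from a CI environment variable).
APP_VERSION = "1.4.2"

class VersionedJsonFormatter(logging.Formatter):
    """Format each record as JSON, stamped with the release version."""
    def format(self, record):
        return json.dumps({
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            "version": APP_VERSION,
        })

handler = logging.StreamHandler()
handler.setFormatter(VersionedJsonFormatter())
logger = logging.getLogger("checkout")  # hypothetical service name
logger.addHandler(handler)
logger.setLevel(logging.INFO)

# Every entry now carries the version, so a spike in errors can be
# traced back to the release that introduced it.
logger.error("payment gateway timeout")
```

With the version embedded in every record, a log aggregator can group error counts by release rather than only by time.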
To deal with this increasing complexity and decreasing visibility, DevOps teams use logs to keep track of services, APIs, containers, infrastructure, network activity, and security events.
Log files provide a tremendous amount of useful information about what goes on beyond the CI/CD pipeline, and can surface issues that directly impact customers. Good log analysis tools make it possible for DevOps teams to get much closer to the end user, and open source is the preferred starting point for many organizations.
Why Open Source?
Lack of Lock-in
Many DevOps teams use a wide range of tools and find that they work much better without the vendor lock-in that comes from relying exclusively on proprietary stacks. This makes it far easier to change tools as needs arise rather than remaining dependent on individual vendors, and it avoids risks such as vendors disappearing, ending support, or changing pricing models.
Within CI/CD, many of the commonly used tools are open source, which makes it easier to onboard new team members. There is already a widely adopted suite of tools that almost all DevOps teams use; it’s hard to imagine working without Jenkins, Docker, Kubernetes, Git, and the like.
Regular Skill Development
Open source work keeps DevOps teams actively learning and directly interacting with software and systems, ensuring that their skills stay current. This can translate into better troubleshooting and problem-solving. Closed systems offer considerably less opportunity for knowledge advancement, as they tend to encourage rote work with existing software rather than skill development.
Fears about subpar open source software are largely unfounded, particularly for projects that are widely used and have a large community. Popular open source projects often have far more developers reviewing and contributing code than comparable proprietary products. Closed systems can lag behind, and bad code may go unnoticed precisely because no one outside the vendor can see it. The ability to modify the software and add functionality beyond what ships out of the box makes open source extremely appealing to many DevOps personnel.
It’s important to note there are a few drawbacks when attempting to work entirely within open source environments. Many of the resources used are not open by nature; the leading clouds, for example, are decidedly not open source, and individual users (outside those companies) have much less control over how they actually operate. However, looking at the toolkits described above, we can think of open source tools as the necessary resources for working in and around these environments.
Another drawback to consider is the actual costs of managing open source software versus a fully-managed service such as Coralogix, for example. Many times, teams will start with the self-managed approach and move to a managed service when faced with scaling and complexity issues as the organization grows.
Types of Features to Look for in Open Source Logging Models
When setting up to work with open source logging tools, it’s important to understand that a range of different models is available, each of which can work better or worse depending on your needs and capacities.
OpenAPI: provides a useful interface into many different systems. It is language agnostic and can generate clear documentation of methods, parameters, and models. It is designed to describe HTTP-based, typically RESTful, interfaces.
Open Standards: even if a tool is not open source, it should typically follow a set of open standards so that different pieces can talk to each other. OpenAPI is one example, but not the only one.
Federated Model: a broader open source model in which aggregation, processing, and control remain local. Individual users can provide input and maintain control over their own development areas, while a central organization still collects summarized or complete code. The advantage of a federated model is that it increases flexibility for individual teams while still letting them contribute to the project as a whole.
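To make the OpenAPI item above concrete, here is a minimal, entirely hypothetical spec describing a single log-query endpoint; the path, parameter, and schema are invented for illustration only:

```yaml
openapi: 3.0.3
info:
  title: Log Query API   # hypothetical service
  version: 1.0.0
paths:
  /logs:
    get:
      summary: Search stored log entries
      parameters:
        - name: level
          in: query
          schema:
            type: string
            enum: [DEBUG, INFO, WARN, ERROR]
      responses:
        "200":
          description: Matching log entries
          content:
            application/json:
              schema:
                type: array
                items:
                  type: object
                  properties:
                    timestamp: { type: string }
                    message: { type: string }
```

From a description like this, standard tooling can generate documentation and client code in any language, which is what makes the format language agnostic.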
Elasticsearch, Logstash, Kibana (ELK)
One of the most popular open source logging stacks is known as ELK. It is particularly powerful for teams that want to aggregate and understand their log files, with the ability to find both problems and solutions quickly.
ELK is a combination of three tools.
Elasticsearch: a popular NoSQL search engine built on Apache Lucene.
Logstash: a pipeline tool that ingests data from logs, transforms it, and sends it to a data store (such as Elasticsearch) so that it can be searched and analyzed.
Kibana: a tool that creates clear visualizations of Elasticsearch data in a human-readable format.
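The Logstash step in the list above, parsing a raw line into a structured document that Elasticsearch can index, can be sketched in a few lines of Python. The log format and field names here are hypothetical, standing in for what a real Logstash grok filter would do:

```python
import re

# Hypothetical log format: "<timestamp> <LEVEL> <service> <message>"
LINE_RE = re.compile(
    r"(?P<ts>\S+) (?P<level>[A-Z]+) (?P<service>\S+) (?P<message>.*)"
)

def to_document(raw_line):
    """Turn one raw log line into an Elasticsearch-style JSON document."""
    match = LINE_RE.match(raw_line)
    if match is None:
        # Logstash tags unparseable events rather than dropping them.
        return {"message": raw_line, "tags": ["_parsefailure"]}
    doc = match.groupdict()
    doc["@timestamp"] = doc.pop("ts")  # rename to the conventional field
    return doc

doc = to_document("2024-05-01T12:00:00Z ERROR checkout payment gateway timeout")
```

Once every line is broken into fields like this, Elasticsearch can index each field separately, which is what makes searching and aggregating in Kibana fast.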
What ELK provides is a single place where all of your data can be stored, searched, and analyzed. It is useful for DevOps teams wishing to make use of the vast amounts of data stored in log files.
Upfront expenses for open source tools like ELK are minimal; the software itself is free. However, this does come with a caveat. Professional management of ELK tools is required, otherwise costs can grow quickly out of control.
Generally, for low volumes of data, one can manage ELK for as little as a few hundred dollars a month on a platform such as AWS. However, without careful management, the cost can grow exponentially for large amounts of data.
Work required for ELK involves:
Setup, which may take some time but is typically not a problem for ops teams, and is a project many will enjoy.
Maintenance of Elasticsearch clusters can be quite a bit of work: several hours per week simply solving issues or resolving downtime. This varies with the size and growth of your clusters; the more activity you have, the more time you will need to spend maintaining your logs.
FluentD

FluentD is an open source alternative to Logstash that can collect, parse, and transform data for further analysis. It has features that are particularly helpful if you are collecting logs in a Kubernetes environment.
One of the drawbacks of Logstash is that it is written in JRuby, so each instance requires a Java runtime. If you are running many different microservices in Kubernetes, this can consume a large amount of memory.
Because FluentD is developed in CRuby, it requires much less memory for each new pod. This can help avoid the memory allocation issues that could cause your logging to slow down your applications.
FluentD also has a wide range of plugins, which can cover pretty much any use case that you come up with. Another advantage is that its tag-based routing makes it slightly easier for developers to work with than Logstash.
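To illustrate the tag-based routing mentioned above, here is a small Fluentd configuration sketch: events collected from container logs get a tag, and the match block routes only those tagged events to Elasticsearch. The file paths and the Elasticsearch hostname are hypothetical:

```
# Tail container logs and tag every event (path and tag are examples).
<source>
  @type tail
  path /var/log/containers/app-*.log
  tag kube.app
  <parse>
    @type json
  </parse>
</source>

# Route by tag: only events tagged kube.app.** go to Elasticsearch.
<match kube.app.**>
  @type elasticsearch
  host elasticsearch.logging.svc
  port 9200
  logstash_format true
</match>
```

Because routing decisions hang off the tag rather than per-event conditionals, adding a new destination for a new class of logs usually means adding one more match block.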
What to Log
With continuous integration, there are many things that merit monitoring, from the frequency of merge conflicts in Git to whether developers pull the latest changes before pushing.
Are you receiving a high number of compiler warnings? If so, you can identify code that is likely to break builds, change your unit testing procedures, and implement better automated checks in Jenkins.
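A first pass at this kind of analysis can be very simple. The sketch below scans captured build output for gcc/clang-style warning lines and counts which source files produce them most often; the sample log and file names are invented for illustration:

```python
from collections import Counter

def count_warnings(build_log):
    """Count gcc/clang-style warnings per source file in a build log."""
    counts = Counter()
    for line in build_log.splitlines():
        if ": warning:" in line:
            # Compiler diagnostics start with "path:line:col:".
            source_file = line.split(":", 1)[0]
            counts[source_file] += 1
    return counts

# Hypothetical excerpt from a captured build log.
log = """\
src/cart.c:42:10: warning: unused variable 'total'
src/cart.c:77:3: warning: implicit declaration of function 'tax'
src/api.c:12:5: warning: comparison between signed and unsigned
"""
hotspots = count_warnings(log)
```

Tracked over many builds, counts like these point to the files most likely to break a build, and those are good candidates for stricter automated checks.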
Much of this data is worth visualizing in Kibana for analysis. Deciding what specific activities to log can be a bit involved. Areas that are a good idea to log include:
Diagnostics: For example, understanding the causes of repeated errors.
Auditing: Track whether fixes to previously logged errors actually hold, or whether the same errors recur.
Profiling: For example, get an idea of how long it takes for certain parts of code to execute, for the purpose of identifying whether this has any impact on customer experience, and identify areas for improvement.
Statistics: Code performance, such as execution time or memory consumption.
Gaining complete visibility of your infrastructure will go a long way toward making sure that your operation runs smoothly and that your CI/CD pipelines function at optimum efficiency. Open source tools remain an attractive option for DevOps teams that understand their advantages and know how to work around their drawbacks.
Finding the right tools is, of course, only the first step, but if you’ve picked good architectures and implemented them properly, you are already moving toward a more streamlined operation.