Habitat application portability and understanding dynamic linking of ELF binaries
Source – blog.chef.io
I do not come from a classical computer science background and have spent the vast majority of my career working with Java, C# and Ruby – mostly on Windows. So I have managed to evade the details of exactly how native binaries find their dependencies at compile time and runtime on Linux. It just has not been a concern in the work that I do. If my app complains about missing low level dependencies, I find a binary distribution for Windows (99% of the time these exist and work across all modern Windows platforms) and install the MSI. Hopefully when the app is deployed, those same binary dependencies have been deployed on the production nodes and it would be just super if its the same version.
Recently I joined the Habitat team at Chef and one of the first things I did to get the feel of using Habitat to build software was to start creating Habitat build plans. The first plan I set out to create was .NET Core. I would soon find out that building .NET Core from source on Linux was probably a bad choice for a first plan. It uses clang instead of GCC, it has lots of cmake files that expect binaries to live in /usr/lib and it downloads built executables that do not link to Habitat packaged dependencies. Right out the gate, I got all sorts of various build errors as I plodded forward. Most of these errors centered around a common theme: “I can’t find X.” There were all sorts of issues beyond linking too that I won’t get into here but I’m convinced that if I knew the basics of what this post will attempt to explain, I would have had a MUCH easier time with all the errors and pitfalls I faced.
What is linking and what are ELF binaries?
First lets define our terms:
There are no “Lord of the Rings” references to be had here. ELF is the Extensible and linkable format and defines how binary files are structured on Linux/Unix. This can include executable files, shared libraries, object files and more. An ELF file contains a set of headers and a number of sections for things like text, data, etc. One of the key roles of an ELF binary is to inform the operating system how to load a program into memory including all of the symbols it must link to.
Linking is a key part of the process of building an executable. The other key part is compiling. Often we refer to both jointly as “compiling” but they are really two distinct operations. First the compiler takes source code files and turn them into machine language instructions in the form of object files. These object files alone are not very useful to running a program.
Linking takes the object files (some might be from source code you wrote) and links them together with external library files to create a functioning program. If your source code calls a function from an external library, the compiler gleefully assumes that function exists and moves on. If it doesn’t exist, don’t worry, the linker will let you know.
Often when we hear about linking, two types are mentioned: static and dynamic. Static linking takes the external machine instructions and embeds them directly into the built executable. If all external dependencies of a program were statically linked, there would be only one executable file and no need for any dependent shared object files to be referenced.
However, we usually dynamically link our external dependencies. Dynamic linking does not embed the external code into the final executable. Instead it just points to an external shared object (.so) file (or .dll file on Windows) and loads that code into the running process at runtime. This has the benefit of being able to update external dependencies without having to ship and package your application each time a dependency is updated. Dynamic linking also results in a smaller application binary since it does not contain the external code.
On Unix/Linux systems, the ELF format specifies the metadata that governs what libraries will be linked. These libraries can be in many places on the machine and may exist in more than one place. The metadata in the ELF binary will help determine exactly what files are linked when that binary is executed.
Habitat + dynamic linking = portability
Habitat leverages dynamic linking to provide true application portability. It might not be immediately obvious what this means or why it is important or if it is even a good thing. So lets start by describing how applications typically load their dependencies in a normal environment and the role that configuration management systems like Chef play in these environments.
How you manage dependencies today
Lets say you have written an application that depends on the ZeroMQ library. You might use apt-get or yum to install ZeroMQ and its binaries are likely dropped somewhere into /usr. Now you can build and run your application and it will consume the ZeroMQ libraries installed. Unless it is told otherwise, the linker will scan the trusted Linux library locations for shared object files to link.
Now the time comes to move to production and just like you needed to install the ZeroMQ libraries in your dev environment, you will need to do the same on your production nodes. We all know this drill and we have probably all been burned at some point – something new is deployed to production and either its dependencies were not there or they were but they were the wrong version.
Configuration Management as solution
Chef fixes this right? Kind of…it’s complicated.
You can absolutely have Chef make sure that your application’s dependencies are installed with the correct versions. But what if you have different applications or services on the same node that depend on a different version of the same dependency? It may not be possible to have multiple versions coexist in /usr/lib. Maybe your new version will work or maybe it won’t. Especially for some of the lower level dependencies, there is simply no guarantee that compatible versions will exist. If anything, there is one guarantee: different distros will have different versions.
Keeping the automation with the application
Even more important – you want these dependencies to travel with your application. Ideally I want to install my application and know by virtue of installing it, everything it needs is there and has not stomped over the dependencies of anything else. I do not want to delegate the installation of its dependencies and the knowledge of which version to install to a separate management layer. Instead, Habitat binds dependencies with the application so that there is no question what your application needs and installing your application includes the installation of all of its dependencies. Lets look at how this works and see how dynamic linking is at play.
At install time, Habitat installs your application package and also the packages included in its dependency manifest (the DEPS file shown above) in the pkgs folder under Habitat’s root location. Here it will not conflict with any previously installed binaries on the node that might live in /usr. Further, the Habitat build process links your application to these exact package dependencies and ensures that at runtime, these are the exact binaries your application will load.
Habitat guarantees that the same binaries that were linked at build time, will be linked at run time. Even better, it just happens and you don’t need a separate management layer to enforce this.
This is how a Habitat package provides portability. Installing and running a Habitat package brings all of its dependencies with it. They do not all live in the same .hart package, but your application’s .hart package includes the necessary metadata to let Habitat know what other packages to download and install from the depot. These dependencies may or may not already exist on the node with varying versions, but it doesn’t matter because a Habitat application only relies on the packages that reside within Habitat. And even within the Habitat environment, you can have multiple applications that rely on the same dependency but different versions, and these applications can run side by side.
The challenge of portability and the Habitat studio
So when you are building a Habitat plan into a hart package, what keeps that build from pulling dependencies from the default Linux lib directories? What if you do not specify these dependencies in your plan and the build links them from elsewhere? That could break our portability. If your application builds magically from a non-Habitat controlled location, then there is no guarantee that those dependencies will land when you install your application elsewhere. Habitat constructs a build environment called a “studio” to protect against this exact scenario.
The rules of dependency scanning
So before we go any further lets take a look at how the linker works and how Habitat configures its build environment to influence where it finds dependencies at both build and run time. The linker looks at a combination of environment variables, cli options and well known directory paths and in a strict order of precedence. Here is a direct quote from the ld (the linker binary) man page:
The linker uses the following search paths to locate required shared libraries:
1. Any directories specified by -rpath-link options.
2. Any directories specified by -rpath options. The difference between -rpath and -rpath-link is that directories specified by -rpath options are included in the executable and used at runtime, whereas the -rpath-link option is only effective at link time. Searching -rpath in this way is only supported by native linkers and cross linkers which have been configured with the –with-sysroot option.
3. On an ELF system, for native linkers, if the -rpath and -rpath-link options were not used, search the contents of the environment variable “LD_RUN_PATH”.
4. On SunOS, if the -rpath option was not used, search any directories specified using -L options.
5. For a native linker, search the contents of the environment variable “LD_LIBRARY_PATH”.
6. For a native ELF linker, the directories in “DT_RUNPATH” or “DT_RPATH” of a shared library are searched for shared libraries needed by it. The “DT_RPATH” entries are ignored if “DT_RUNPATH” entries exist.
7. The default directories, normally /lib and /usr/lib.
8. For a native linker on an ELF system, if the file /etc/ld.so.conf exists, the list of directories found in that file.