Three Overlooked Lessons about Container Security
I’ve just joined container security specialists Aqua Security and spent a couple of days in Tel Aviv getting to know the team and the product. I’m sure I’m learning things that might be obvious to the seasoned security veteran, but perhaps aren’t so obvious to the rest of us! Here are three aspects I found interesting and hope you will too, even if you’ve never really thought about the security of your containerized deployment before:
#1: Email Addresses in Container Images
A lot of us put contact email information inside our container images. Even though the MAINTAINER instruction in Dockerfiles is deprecated in favor of the more generic LABEL, it’s natural to think that users would find it helpful to be able to contact the image author.
If you work for a large corporation or you’ve worked in security before, it might be completely obvious to you that you shouldn’t publish individual company email addresses, and you might be as horrified as the Aqua team to find them in public container images. But it wasn’t something I had thought about. In the open source world, we put our names (or at least our online identities) on the work we do, and it seems natural to do so.
Here’s the problem: if John Doe leaves an individual company email address like email@example.com inside a public container image, anyone can easily find out that there is someone at (fictional) Giant Corp called John Doe and that he is responsible for that particular component. That creates a potential opening for a spear-phishing attack.
Of course, we’re still vulnerable to spear-phishing if we work on open source projects, but there’s a compromise to be reached between security, trust and reputation. Users have greater trust in a given project if they can see the individual developers behind it, and individuals build their reputation by publicly putting their name to the code they’ve written; these concerns can outweigh the security risks. In contrast, it’s Giant Corp’s brand reputation that inspires trust in its customers and looks good on its employees’ CVs, so there is less reason to put individuals’ names on the code or the container image. If you’re working on open source code you’re probably using a personal email address rather than a company one anyway, right?
Hackers can, of course, look up Giant Corp on LinkedIn and find out lots about John Doe there, so removing the individual email address doesn’t eliminate the risk, but it does reduce the attack surface.
So if you’re working for a corporation, you should think about using a generic email address like hello@, engineering@ or your-container-name@ in public container images, rather than your individual company email address.
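As a rough illustration of that advice, here is a minimal Python sketch that checks a set of image labels for individual-looking email addresses. Everything in it is an assumption made for the example: the label names, the giantcorp.example addresses, and the allowlist of generic mailbox names are hypothetical, and the labels are passed in as a plain dictionary rather than read from a real image.

```python
import re

# Hypothetical allowlist of generic mailbox names; adjust to your own policy.
GENERIC_LOCAL_PARTS = {"hello", "engineering", "support", "security"}

# Deliberately simple pattern; good enough to spot addresses in label text.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def individual_emails(labels):
    """Return (label key, address) pairs whose mailbox is not on the allowlist."""
    findings = []
    for key, value in labels.items():
        for address in EMAIL_RE.findall(value):
            local_part = address.split("@", 1)[0].lower()
            if local_part not in GENERIC_LOCAL_PARTS:
                findings.append((key, address))
    return findings


# Made-up labels for a fictional Giant Corp image.
labels = {
    "maintainer": "John Doe <john.doe@giantcorp.example>",
    "contact": "engineering@giantcorp.example",
}

for key, address in individual_emails(labels):
    print(f"label '{key}' exposes an individual address: {address}")
```

A check like this could run before an image is pushed to a public registry, so that the generic address gets through and the individual one is caught.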
#2: There’s More to Image Scanning Than Meets the Eye
There are lots of products that will scan your container images to let you know if they include known vulnerabilities. Aqua does this, but so do Docker, CoreOS’s Clair, and several others. They’re scanning the image looking for known vulnerabilities documented in databases such as the U.S. National Institute of Standards and Technology’s National Vulnerability Database. Sounds simple, right?
But it turns out there is quite a lot of subtlety to the way vulnerabilities are detected in container images.
Let’s say that there’s a fictional package mypkg version 1.2.7 with a vulnerability (that I’m making up) called CVE-999. The maintainers of mypkg fixed the vulnerability, and the fix made it into the release of mypkg 1.3.0. So the vulnerability database might say that CVE-999 is known to exist in any version of mypkg before 1.3.0.
Now, several other fixes and features went into 1.3.0 upstream, and let’s imagine that the owner of the image being scanned didn’t want all of them. Instead, she decided to cherry-pick just the fix for CVE-999. She rebuilt her own version of mypkg and called it 1.2.7.12345.
According to the database, it would seem that the vulnerability is still there, because 1.2.7.12345 is lower than 1.3.0, where the fix is known to have been applied. But it’s a false positive.
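To make the version problem concrete, here is a small Python sketch of the comparison a scanner might make. It is an assumption-laden toy, not how any particular product works: the function names are invented, versions are treated as plain dotted integers, the custom build string 1.2.7.12345 is a stand-in for the owner’s rebuilt package, and the set of "known patched builds" is a hypothetical piece of extra knowledge the scanner would need from somewhere.

```python
def parse_version(text):
    """Split a dotted version such as "1.2.7" into a tuple of integers."""
    return tuple(int(part) for part in text.split("."))


def naive_is_vulnerable(installed, fixed_in):
    """Flag any version that sorts below the upstream release with the fix."""
    return parse_version(installed) < parse_version(fixed_in)


def is_vulnerable(installed, fixed_in, patched_builds):
    """Same check, but trust builds known to carry a backported fix."""
    if installed in patched_builds:
        return False
    return naive_is_vulnerable(installed, fixed_in)


FIXED_IN = "1.3.0"             # upstream release containing the CVE-999 fix
CUSTOM_BUILD = "1.2.7.12345"   # placeholder for the owner's rebuilt package
PATCHED_BUILDS = {CUSTOM_BUILD}  # hypothetical list of backported builds

print(naive_is_vulnerable(CUSTOM_BUILD, FIXED_IN))            # flagged
print(is_vulnerable(CUSTOM_BUILD, FIXED_IN, PATCHED_BUILDS))  # not flagged
print(is_vulnerable("1.2.7", FIXED_IN, PATCHED_BUILDS))       # still flagged
```

The point of the sketch is that a version number alone can’t tell the scanner about a backported fix; it needs additional information about which builds carry the patch.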
The fact that it’s a container image makes no difference to this kind of false positive — but there is a similar issue when we think about container layers.
Suppose that you have a base image that includes mypkg 1.2.7 (without the fix), so we know it is vulnerable. If the scanner simply reports vulnerabilities layer by layer, it will appear that the issue is present in any child image, even if the package gets replaced with version 1.3.0 in a later layer. Another false positive.
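The difference between the two approaches can be sketched in a few lines of Python. This is a heavy simplification under stated assumptions: real image layers are filesystem diffs rather than package dictionaries, and the layer contents, the libfoo package, and the function names are all made up for the example.

```python
# Toy vulnerability database: (package, version) -> identifier.
VULNERABLE = {("mypkg", "1.2.7"): "CVE-999"}


def findings_for(packages):
    """Return the identifiers that match a package -> version mapping."""
    return [
        VULNERABLE[(name, version)]
        for name, version in packages.items()
        if (name, version) in VULNERABLE
    ]


def per_layer_findings(layers):
    """Scan each layer in isolation, as a naive scanner might."""
    found = []
    for layer in layers:
        found.extend(findings_for(layer))
    return found


def effective_packages(layers):
    """Merge layers from base to top; later layers override earlier ones."""
    installed = {}
    for layer in layers:
        installed.update(layer)
    return installed


layers = [
    {"mypkg": "1.2.7", "libfoo": "2.0"},  # base image
    {"mypkg": "1.3.0"},                   # child layer upgrades mypkg
]

print(per_layer_findings(layers))                # reports CVE-999
print(findings_for(effective_packages(layers)))  # reports nothing
```

Scanning the merged view reflects what actually ends up in the running container, which is why it avoids the false positive here.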
A third type of false positive can occur when the vulnerability is present in the code you deploy, but only comes into play when interacting with another component. If you don’t use that component, the vulnerability can’t actually affect your deployment. As an example, consider a vulnerability that is exploited through network access. If the container doesn’t have any network interfaces, it doesn’t matter that the vulnerability is present.
These false positives are more than just annoying — they make it confusing for people to understand whether the code in production is genuinely problematic. The more people get used to ignoring false positives, the less likely they are to notice and react when a significant real positive shows up.
#3: Detection as Well as Prevention
By definition, there’s a time gap between the introduction of a zero-day exploit and its detection. In that gap, while the vulnerability is unknown, image scanning can’t help you avoid deploying that vulnerability. To paraphrase Donald Rumsfeld, we have the known unknowns under control, but it’s the unknown unknowns we really need to worry about.
One day, even with the strictest of policies, you’ll have a vulnerability deployed to production, because it’s impossible to prevent the deployment of something you don’t know is a problem. If that vulnerability is exploited, it will change the behavior of your code in some way — perhaps writing to an unusual file, or sending unexpected network traffic.
Figuring out what constitutes “normal” behavior can be non-trivial, but a containerized microservice architecture actually helps here. The smaller the amount of functionality in the container, the easier it is to define limits on what it’s supposed to do — and this gives us opportunities to enhance security.
For example, a microservice running inside a container is typically only going to communicate with a small number of other services, using only a very few ports. If you started to observe that the same code was also trying to send requests on a totally different port, you might think this is suspicious and needs to be stopped and investigated. Similarly, as a general rule, you could predict the set of files or executables that a microservice should be accessing. Anything else would be a smoking gun.
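Here is a minimal Python sketch of that idea: a per-service profile of expected ports and file paths, and a check that flags anything outside it. The profile contents, the event format, and the names Profile and check_event are assumptions invented for illustration; a real runtime would observe these events from the container itself rather than from a hard-coded list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    """What a microservice is expected to do (hypothetical example profile)."""
    allowed_ports: frozenset
    allowed_path_prefixes: tuple


def check_event(profile, event):
    """Return a reason string if the event is outside the profile, else None."""
    kind, value = event
    if kind == "connect" and value not in profile.allowed_ports:
        return f"unexpected outbound port {value}"
    if kind == "open" and not value.startswith(profile.allowed_path_prefixes):
        return f"unexpected file access {value}"
    return None


profile = Profile(
    allowed_ports=frozenset({443, 5432}),
    allowed_path_prefixes=("/app/", "/tmp/"),
)

# Made-up observations: (event kind, port number or file path).
events = [
    ("connect", 5432),
    ("connect", 6667),
    ("open", "/app/config.yaml"),
    ("open", "/etc/shadow"),
]

alerts = []
for event in events:
    reason = check_event(profile, event)
    if reason is not None:
        alerts.append(reason)

print(alerts)
```

The smaller the service, the shorter this profile is, which is the sense in which a microservice architecture makes "normal" easier to define.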
Detecting unusual behavior means you can spot that a problem exists, and you can take it a step further by blocking unexpected behavior outright.
I’m just starting to scratch the surface of the interesting concepts in container security, and the important things we need to think about to make sure our containerized deployments are as secure as they can be. If there are angles you’d say I need to know, or questions you’d like to learn more about, I’d love to hear from you.