LEARNING TO SKATE WITH DEVOPS

Source: builtin.com

I found that learning DevOps was a lot like learning to skate, but, with enough practice, I eventually stopped falling.

I spent most of my childhood growing up in Canada, and one pastime Canadians enjoy is ice skating. I tried it out for myself in fifth grade, and the first time I went out on the ice, I fell so many times I ended up as bruised as the piece of fruit you pass over at the grocery store. It was so painful I cried, and my mom said I didn’t need to go back if I didn’t want to. But I insisted on going again the next week, and I stopped falling so much. My skills started to get better, and as I kept practicing, I eventually learned to skate fast and even do some simple tricks.

My current experience with DevOps reminds me of learning to skate. DevOps is a lot like the ice rink of my youth. On the ice, you can play a bunch of different games, just like you can deploy different stacks in DevOps. You can skate in different directions, just like assuming different roles. Whether you’re chasing a hockey puck or typing on a command line, you choose your direction, and the world is your oyster. You can also end up falling down quite a bit, in ways that aren’t really fathomable in other software engineering disciplines. In this article, I want to talk about my DevOps experience so far, and what I learned about this expansive and open setting.

I had a flawed mindset when I first approached DevOps around two months ago. I was arrogant and believed DevOps was just glue for existing services, which would be a couple of bash scripts. I’m also fiercely independent and stubborn. When there’s a stack I like, I go full steam ahead, and I have a hard time pivoting away. I like sticking to services that I believe will render great underlying value for their price point. All this was exacerbated by the fact that I’m on sabbatical right now and focusing on learning new things as opposed to shipping products.

This experience also proved harsh because of the project I wanted to build. You can find the proof of concept here, but long story short, it’s PostgreSQL sprouting custom REST endpoints and WebSockets. If I wanted to build this project my way, deployment and management would remain custom, hands-on tasks.

Firstly, I wanted to install my own PostgreSQL extensions. I found an extension called pg_cron that was perfect for what I wanted to create. It schedules cron jobs on the database server and exposes those jobs as a database table. It also lets me generate server-side events from SQL queries, describe those events in SQL, and co-locate the scheduling process with the database.
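To give a feel for it, here’s a minimal sketch of how pg_cron is used, assuming the extension is already installed and loaded via shared_preload_libraries; the job name and schedule are just examples, and the named-job form of cron.schedule assumes a reasonably recent pg_cron release.

    -- Enable the extension (requires pg_cron in shared_preload_libraries).
    CREATE EXTENSION IF NOT EXISTS pg_cron;

    -- Schedule an illustrative nightly job; cron.schedule returns the job id.
    SELECT cron.schedule('nightly-vacuum', '0 3 * * *', 'VACUUM');

    -- Scheduled jobs show up as rows in an ordinary table you can query.
    SELECT jobid, schedule, command FROM cron.job;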

Unfortunately, pg_cron isn’t an officially supported extension on the AWS Relational Database Service for PostgreSQL. It’s built by Citus Data, and, after their acquisition by Microsoft, they pulled their offerings from AWS and exist solely on Azure now. Citus Data doesn’t maintain a build/test pipeline for pg_cron, so beyond a specific OS, architecture, and database version, you need to compile the extension from source.

This led to the second requirement I had: root access to the database instance. If you want to install your own PostgreSQL extensions, you need at least SSH access to copy the extension’s build artifacts to the server and sufficient permissions to install the extension into PostgreSQL. AWS RDS doesn’t allow SSH access into its instances, and it doesn’t grant the sudo permissions needed to install extensions outside of the officially supported ones. At this point, without an IT department to lean on, you may as well obtain root access to the server yourself.

This isn’t necessarily a bad thing. Over the long term, having root access grants flexibility and agency. It provides a great deal of insight into database performance and availability that RDS would otherwise abstract away. For example, the team behind this great PostgreSQL configuration dashboard mentioned the performance benefits of tuning your own configuration, gains you can’t get through query tuning alone.

The first two requirements pretty much dictate the need to deploy a custom database, but the need to communicate server-side events to clients presents additional challenges. This Hacker News post described how to implement pub/sub via PostgreSQL’s CREATE TRIGGER and LISTEN/NOTIFY features, where you can also run your own stored procedures using PL/pgSQL or even PL/Python. This may present additional challenges from a security perspective, as you may need to implement granular permissioning within PostgreSQL, on a per-table, per-row, or even per-column basis, to ensure those stored procedures cannot access data outside their problem domain.
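A minimal sketch of that pub/sub pattern, assuming PostgreSQL 11 or newer and a hypothetical events table (the table, channel, and role names are illustrative):

    -- Trigger function that publishes each new row as a JSON payload.
    CREATE OR REPLACE FUNCTION notify_new_event() RETURNS trigger AS $$
    BEGIN
      PERFORM pg_notify('events_channel', row_to_json(NEW)::text);
      RETURN NEW;
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER events_notify
    AFTER INSERT ON events
    FOR EACH ROW EXECUTE FUNCTION notify_new_event();

    -- Any connected session (e.g. the WebSocket backend) subscribes with:
    LISTEN events_channel;

    -- Granular permissioning along the lines described above, e.g. a role
    -- that can only read this one table:
    CREATE ROLE event_reader;
    GRANT SELECT ON events TO event_reader;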

If you’re looking to monetize a software product based in the cloud, AWS RDS makes total sense. It’s scalable, maintainable, and AWS Support turnaround times are likely much lower. If you’re like me, though, and you want your own independent, personal data lake architecture all on PostgreSQL, RDS makes less sense.

I didn’t immediately see anything off-the-shelf that satisfied my requirements, so I figured I should learn to build it myself. With the aforementioned constraints in mind, I went ahead and explored various options for what my backend tech stack would look like.

Initially, I decided to table the database conundrum and see whether I could just get a backend process up and running with AWS Elastic Beanstalk, so that I could keep my momentum up. Elastic Beanstalk initially appealed to me for a number of reasons, namely keeping operations overhead to a minimum and having access to underlying resources.

After a week of playing around with it, though, I ultimately decided that Elastic Beanstalk wasn’t the right fit for me and moved on. I found that Elastic Beanstalk provides a great deal of subjective safety, such as a user-friendly CLI, at the cost of objective safety: understanding what’s under the hood when things go wrong. Elastic Beanstalk provides a sandbox to iterate on one specific portion of the entire deployment stack. Since my requirements didn’t fit Elastic Beanstalk’s vision, I eventually worked myself into a corner. I realized I didn’t want a sandbox. I wanted an ice rink.

My experience with Elastic Beanstalk helped me better appreciate accessible and descriptive system logs; an open-source, locally available, and reproducible deployment framework; and an independent, self-managed dependency suite. Given this updated understanding of my requirements, I reached for Docker.

What I particularly like about Docker is its multi-stage builds. You can define build, test, and production stages in one Dockerfile. This keeps the final image slim, because only the artifacts you copy into the production stage end up in it, while the intermediate stages are reused from the Docker build cache. This was especially relevant for me after I realized I would need to build some dependencies from source. Docker also provides a rich ecosystem of tools. I wanted to deploy everything on a single server but compartmentalize resources on a per-process basis. For example, your database might require block storage, while your web server might require static file storage. I found that, for single-host, multi-container local deployments, Docker Compose fits quite well.
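As a concrete, simplified illustration of the multi-stage point, here’s a sketch that compiles pg_cron from source in a builder stage and copies only the built artifacts into the final image; the PostgreSQL version, package names, and install paths are assumptions, not the exact file from my project.

    # Build stage: compile pg_cron against the server headers.
    FROM postgres:14 AS builder
    RUN apt-get update && apt-get install -y --no-install-recommends \
            build-essential git ca-certificates postgresql-server-dev-14
    RUN git clone https://github.com/citusdata/pg_cron.git /pg_cron \
        && make -C /pg_cron && make -C /pg_cron install

    # Production stage: same base image; only the compiled artifacts are copied over.
    FROM postgres:14
    COPY --from=builder /usr/lib/postgresql/14/lib/pg_cron.so /usr/lib/postgresql/14/lib/
    COPY --from=builder /usr/share/postgresql/14/extension/pg_cron* /usr/share/postgresql/14/extension/
    CMD ["postgres", "-c", "shared_preload_libraries=pg_cron"]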

Now, how do you deploy a locally defined Docker Compose stack to AWS? The natural answer is AWS Elastic Container Service (ECS), an AWS-native method for deploying containers. I found my experience with ECS frustrating at first. There’s some level of compatibility with Docker Compose, but tutorials were hard to find, and there’s a bit of a learning curve. One particular sticking point is persistent volumes. Every Docker container filesystem is ephemeral, since containers are expected to be disposable, so you need to attach an AWS Elastic Block Store (EBS) volume to your container if you wish to persist files, like database data.
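For contrast, in a local Compose file persistence is just a named volume; the hard part is mapping that idea onto AWS. A sketch of the single-host setup I had in mind, with placeholder image names, ports, and credentials:

    # docker-compose.yml (sketch; image names, ports, and credentials are placeholders)
    version: "3.8"
    services:
      db:
        image: my-postgres-pg-cron:14          # image built from the Dockerfile above
        environment:
          POSTGRES_PASSWORD: example           # use proper secrets management in a real deployment
        volumes:
          - db-data:/var/lib/postgresql/data   # named volume: the container is ephemeral, the data isn't
      api:
        image: my-api:latest                   # hypothetical REST/WebSocket front end
        ports:
          - "8080:8080"
        depends_on:
          - db
    volumes:
      db-data: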

Two problems exist with the EBS approach. First, AWS Fargate, a serverless container orchestration service, doesn’t support attaching EBS volumes. You must use Elastic Compute Cloud (EC2) task definitions to do so, which can make deployments stateful and difficult to manage. Second, EBS volumes are tightly coupled to EC2 instances, and if you want to reproducibly deploy an EC2 instance with an EBS volume provisioned at startup, you might need a framework like Packer to define an Amazon Machine Image (AMI).

I was pretty incredulous at this point. How can infrastructure operations be this complicated? I just want an MVP! I definitely felt like a bruised fruit just like I did after wiping out on the ice. I even weighed whether to move forward or give up.

As with skating, though, I decided to keep going. There had to be a way to deploy a database using some kind of configuration tool; it’s not like RDS is magic. Eventually, I found this blog post by AWS describing EC2 task definition support for Docker volume drivers, which bind a Docker volume to an underlying state store. Implementing this solution requires a keen understanding of AWS CloudFormation, an AWS-native infrastructure-as-code solution. CloudFormation has an even steeper learning curve than ECS, but there are far more tutorials on it, and I think it’s the single source of truth behind all AWS deployments. I picked up a book, Docker on AWS by Justin Menga, and reading through it confirmed for me that ECS goes very well with CloudFormation, and showed me just how much I don’t know. After reading the book, I much prefer the objective safety of defining my own CloudFormation templates, which I know stick closely to the underlying structure of AWS and grant me the flexibility to do what I want.
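To give a flavor of what that looks like, here’s a fragment of a CloudFormation template for an EC2 task definition with a Docker volume; the REX-Ray EBS driver, resource names, and sizes are assumptions rather than a drop-in configuration.

    # CloudFormation fragment (YAML); names, driver, and sizes are illustrative
    Resources:
      DatabaseTaskDefinition:
        Type: AWS::ECS::TaskDefinition
        Properties:
          Family: postgres
          RequiresCompatibilities:
            - EC2                              # Docker volume drivers require the EC2 launch type
          Volumes:
            - Name: db-data
              DockerVolumeConfiguration:
                Driver: rexray/ebs             # assumes the REX-Ray EBS plugin is installed on the host
                Scope: shared
                Autoprovision: true
                DriverOpts:
                  volumetype: gp2
                  size: "20"
          ContainerDefinitions:
            - Name: db
              Image: my-postgres-pg-cron:14    # hypothetical image from the earlier Dockerfile
              Memory: 512
              MountPoints:
                - SourceVolume: db-data
                  ContainerPath: /var/lib/postgresql/data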

After these past two months, I think I’ve finally figured out what kind of stack I want: Docker Compose, deployed via AWS Elastic Container Service using an EC2 task definition, with Elastic Block Store mounted through a custom AMI, all templated using AWS CloudFormation. So what did I learn? Well, like I said, my experience with DevOps was a lot like learning to skate.

Rinks come in all shapes and sizes. So do environments! Your neighborhood’s frozen pond is a very different experience than the skating rink in Rockefeller Center. And deploying apps on your local dev machine is very different than deploying apps into production, or even deploying the same app to a cloud-based development stage. You have to adapt your approach for the environment you’re in.

It takes time and effort to get good! It’s hard to learn to play hockey if you don’t practice skating for a long time. Similarly, you can’t get good at DevOps if you don’t keep updating your knowledge base with new tools and new paradigms.

Ultimately, I’m really grateful I gave myself the time to learn, unpack, and internalize these DevOps lessons. I think being able to deploy a stack from first principles, quickly and confidently, goes a long way toward changing my BATNA (best alternative to a negotiated agreement) from a technical perspective, because shipping quality software products is so important to standing on your own two feet.
