Using Ansible to Manage Jet Infrastructure

Or, Taming the Cloud with Copious Amounts of YAML

Ansible is a configuration management tool. It treats your infrastructure and configuration as simple text files in YAML. You describe what you want in text, without caring about the implementation details. In other words, we can ask Ansible, “Make me a pizza,” and it will make us one without our having to describe how to obtain each ingredient and assemble them. Our team uses it to create machines and set up software.

Scale necessitates automation. Without it, you’ll quickly find yourself overwhelmed by the torrent of requests for new servers and updates to existing ones. We also like to treat our infrastructure as code. Code can be reviewed, compared, refactored, and re-deployed. Jet runs more than 1500 VMs in the Azure Cloud. We’re not at a massive scale, but we’re too large to manage manually.

Take a Look at Ansible

Tasks make up the core of our infrastructure configuration. Each task runs a Module. A module knows how to get that task done in an idempotent way. This means a module is like a function, and a task is like calling that function with arguments.

Here are some example tasks: One that creates an nginx group, and then another that creates an nginx user.

- name: create nginx group
  group:
    name: nginx
    system: yes
 

- name: create nginx user
  user:
    name: nginx
    group: 'nginx'
    system: yes
    shell: '/usr/sbin/nologin'

A Play is a set of tasks that achieves some goal on a set of hosts. A Playbook can execute any number of plays. You should also gather related tasks into Roles that can be referenced in plays, instead of writing tasks into the plays directly. You can think of a role as a super-task: it has a goal that requires several tasks to complete.
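As a rough sketch of what gathering tasks into a role can look like, the two tasks above could live in a role's task file. The path follows Ansible's standard role layout. The final package task is an assumption added for illustration, not necessarily how the real nginx role is written.

```yaml
# roles/nginx/tasks/main.yml
# Sketch only: the two tasks shown earlier, collected into a role.
- name: create nginx group
  group:
    name: nginx
    system: yes

- name: create nginx user
  user:
    name: nginx
    group: nginx
    system: yes
    shell: /usr/sbin/nologin

# Assumed step, for illustration: install the package itself.
- name: install nginx
  package:
    name: nginx
    state: present
```

A play can then list nginx under roles and pick up all three tasks at once.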

Ansible Playbook Diagram

Below is an example of a playbook that contains a single play. The play runs on every host in the my_websites group and installs common stuff, nginx, and the nginx website config. Then it uses a task directly to reload nginx.

- hosts: 'my_websites'
  roles:
  - common
  - nginx
  - nginx_website_config
  post_tasks:
  - name: reload nginx
    service:
      name: nginx
      state: reloaded
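The my_websites group comes from the inventory. Here is a minimal sketch of a static inventory that would define it, assuming a version of Ansible that supports YAML inventory files. The host names are hypothetical placeholders, and a cloud setup may well use a dynamic inventory instead.

```yaml
# inventory.yml (hypothetical file name and hosts)
all:
  children:
    my_websites:
      hosts:
        web01.example.com:
        web02.example.com:
```

Every host listed under my_websites would receive the roles and the reload task from the play above.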

Why Do We Like Ansible?

Agentless

Sorry 007, but Ansible uses the standard remote capabilities of the OS, so there is no agent to install. We can configure Linux machines directly over SSH and Windows machines using PowerShell Remoting (WinRM). This means that when you harden your servers’ SSH and WinRM access, you harden Ansible as well.
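To make the connection side concrete, here is a sketch of group variables that tell Ansible to reach a group of Windows hosts over WinRM rather than SSH. The group name, account name, and transport are assumptions for illustration. Linux hosts need nothing comparable, since SSH is the default connection.

```yaml
# group_vars/windows.yml (hypothetical group name)
ansible_connection: winrm
ansible_port: 5986             # WinRM over HTTPS
ansible_winrm_transport: ntlm  # assumed; pick what your environment allows
ansible_user: deploy           # hypothetical account; keep the password in a vault
```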

Batteries Included

The Ansible distribution contains core modules and extra modules. Between these two sets, it is rare to find a gap in functionality. There are modules for interacting with the Azure API, modules for creating users and groups, and even modules for interacting with big iron switches and networking equipment. Since Windows support is newer, fewer modules are available for it than for Linux, but the number is still growing.

Easy to Extend

You can extend Ansible in many ways. We use custom Jinja template filters to extend the template language. Custom PowerShell modules can fill the gap while Windows modules are still scarce. If you need more power, you can write plugins to hook into the Ansible playbook lifecycle. And, if you really want, you can hack the core; it is open source after all.

Try It!

If you’re using a bundle of ad hoc scripts to provision and configure your VMs, I’d urge you to give Ansible a try. Since you don’t need to install anything on the server, there is very little commitment upfront.

I’ll be giving a talk at AnsibleFest Brooklyn 2016 this Oct 11th, and I’ll cover some of the ways we use it in more detail. If you can’t make it, we’ll link the video here. (Update 10/28/2016: It’s up! Go view it, and definitely leave comments.)