AWS DevOps Blog

Using Capistrano to run arbitrary commands on AWS OpsWorks instances

AWS OpsWorks customers frequently request the ability to run arbitrary commands. While OpsWorks sets up and manages the Amazon EC2 instances your application runs on, and controls users’ access to those instances, it doesn’t natively support running arbitrary commands across them. Let’s say, for example, that you wanted to run uptime across your fleet. You could create a custom Chef recipe that executes uptime, create an execute_recipes deployment, and check the log files for the output. However, that’s fairly heavyweight for something as simple as uptime.

While OpsWorks doesn’t natively support running arbitrary commands, it’s not too difficult to come up with a decent solution using existing tools. In this post, I’m going to show how you can use Capistrano to run arbitrary commands on OpsWorks instances.

Capistrano, according to its website, is “a remote server automation and deployment tool written in Ruby.” It can run commands on remote machines in parallel, collecting their return status and outputs. The commands are organized into tasks that are described with a simple Ruby Domain Specific Language (DSL) — much like Chef recipes. If you haven’t heard of Capistrano, feel free to get familiar with it before reading on.

At the time of this post, the Capistrano website actually includes an example that executes the uptime command:

role :demo, %w{example.com example.org example.net}

task :uptime do
  on roles(:demo), in: :parallel do |host|
    uptime = capture(:uptime)
    puts "#{host.hostname} reports: #{uptime}"
  end
end

As explained on the Capistrano website, you set up Capistrano by running cap install in your project’s root directory. This will create the following files and directories, as needed.

.
├── Capfile
├── config
│   ├── deploy
│   │   ├── production.rb
│   │   └── staging.rb
│   └── deploy.rb
└── lib
    └── capistrano
        └── tasks
My example files are located in a GitHub repository. Instead of using cap install, we will use the files from this repository.

The repository’s root directory structure looks like this:

.
├── config
│   ├── deploy
│   │   └── meta.rb
│   └── deploy.rb
└── lib
    └── capistrano
        └── tasks
            └── run.rake

Capistrano uses the concepts of stages, servers, and roles. In the generated example files, the stages are called production and staging. Each stage has a set of servers that can have many roles. Technically, each stage doesn’t need to have its own set of servers, but that’s what is commonly done in practice.

Here’s how OpsWorks concepts translate into their Capistrano counterparts.

OpsWorks    Capistrano
--------    ----------
Stack       Stage
Layer       Role
Instance    Server

To do anything on your servers, you first need to let Capistrano know about them. That’s what the stage-specific config files are for. Here are the contents of the generated staging.rb file.

set :stage, :staging

# Simple Role Syntax
# ==================
# Supports bulk-adding hosts to roles, the primary
# server in each group is considered to be the first
# unless any hosts have the primary property set.
role :app, %w{example.com}
role :web, %w{example.com}
role :db,  %w{example.com}

# Extended Server Syntax
# ======================
# This can be used to drop a more detailed server
# definition into the server list. The second argument
# is something that quacks like a hash and can be used
# to set extended properties on the server.

server 'example.com', roles: %w{web app}, my_property: :my_value

With Capistrano, each role corresponds to one or more servers, and a server can have multiple roles. When you run a command, you specify the roles, and then Capistrano runs the command on the associated servers. You set up the roles for each server in the stage config files. The last line in this example tells Capistrano about your servers and roles.

In this example, I use the AWS SDK for Ruby to generate Capistrano config files that reflect the stacks and running instances in OpsWorks. The example can be made more or less dynamic: by making API calls to refresh the list of servers right before running commands, you will never miss an instance that just launched, but the additional API calls take time, and having the list of servers change while you’re executing commands might lead to unexpected results. Keep in mind that the approach shown here is just one of many ways to use Capistrano with OpsWorks.

To model your stacks, layers, and instances in Capistrano, you just need to iterate over all of them and use Capistrano’s DSL to declare each instance as a server. What makes this slightly more complex than just three nested loops is the fact that in OpsWorks an instance can be in multiple layers. To ensure that we have all of the layers we need for the server definition, we need to determine which layers each instance belongs to.
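The layer-membership step can be sketched as a small helper. The record shapes and names below are hypothetical stand-ins for what you would assemble from the AWS SDK’s describe_layers and describe_instances calls, not the repository’s actual code:

```ruby
# Hypothetical sketch: given layer records (each carrying its shortname and the
# IDs of the instances in it), invert them into an instance-id => [layer
# shortnames] mapping, which is what each server declaration needs.
def layer_membership(layers)
  layers.each_with_object(Hash.new { |h, k| h[k] = [] }) do |layer, map|
    layer[:instance_ids].each { |id| map[id] << layer[:shortname] }
  end
end

layers = [
  { shortname: "rails-app", instance_ids: %w[i-1 i-2] },
  { shortname: "memcached", instance_ids: %w[i-2] },
]
layer_membership(layers)
# => {"i-1"=>["rails-app"], "i-2"=>["rails-app", "memcached"]}
```

An instance such as i-2 above sits in two layers, so it ends up with both roles in its server declaration.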

Capistrano uses the config/deploy directory for stage configuration and holds task definitions in lib/capistrano/tasks. The files that do the heavy lifting in my example are meta.rb and run.rake. deploy.rb can be empty, but Capistrano complains if it’s missing, so we will just let it sit there.

In meta.rb we iterate over all stacks and generate a stage config file for each stack. The meta.rb file defines two commands:

- populate creates a set of stage files.

- extinguish removes the created stage files.

You run these commands with cap meta commandname. After making sure you have Capistrano installed, the first command you run is cap meta populate. As described earlier, Capistrano’s top-level organizational unit is the stage. Commands are executed on stages, so populate iterates over stacks, layers, and instances, and then writes one stage file per stack, with one server entry per instance.
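In outline, the rendering step of populate looks something like the helper below. This is a simplified sketch with hypothetical names and record shapes, not the repository’s actual implementation:

```ruby
# Hypothetical sketch of how populate could render the body of one stage file
# per stack, with one server entry per instance. The output format mirrors the
# generated files shown later in this post.
def render_stage(servers)
  lines = servers.map do |s|
    %(server "#{s[:ip]}", {:user=>#{s[:user].inspect}, :roles=>#{s[:roles].inspect}})
  end
  lines.join("\n") + "\n"
end

servers = [
  { ip: "54.188.203.46", user: "ec2-user", roles: ["blank"] },
  { ip: "54.70.104.224", user: "ubuntu",   roles: ["blank"] },
]
puts render_stage(servers)
```

The real task additionally fetches stacks, layers, and instances from the OpsWorks API and writes each rendered stage into config/deploy.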

extinguish simply removes all files created by populate. I will not use it in this example.

Before we generate stage files, let’s have a look at the stacks I have in my account.

aws opsworks describe-stacks --output table --query 'Stacks[*].[StackId,Name]'
----------------------------------------------------------
|                     DescribeStacks                     |
+---------------------------------------+----------------+
|  b23dd487-1469-42bd-8d87-8f9e7aabdbc7 |  interview     |
|  122caab8-e407-4b44-8709-72c255b1fef2 |  demo          |
|  d9af7eb2-aeb6-4290-9522-f4f85793ed25 |  os-benchmark  |
+---------------------------------------+----------------+

Now run bundle exec cap meta populate to generate stages in config/deploy.

As I just explained, cap meta populate should have generated some stage config files. Let’s have a look at our project directory again.

.
├── config
│   ├── deploy
│   │   ├── demo.rb
│   │   ├── interview.rb
│   │   ├── meta.rb
│   │   └── os-benchmark.rb
│   └── deploy.rb
└── lib
    └── capistrano
        └── tasks
            └── run.rake

Now let’s look at the instances in one of those stacks:

aws opsworks describe-instances --stack-id d9af7eb2-aeb6-4290-9522-f4f85793ed25 --output table --query 'Instances[*].[Hostname,Os,Status]'                                                              
-------------------------------------------------------------
|                     DescribeInstances                     |
+------------------------+------------------------+---------+
|  amazon-linux-2014-03i |  Amazon Linux          |  online |
|  amazon-linux-2014-09i |  Amazon Linux 2014.09  |  online |
|  ubuntu-12-04-ltsi     |  Ubuntu 12.04 LTS      |  online |
|  ubuntu-14-04-ltsi     |  Ubuntu 14.04 LTS      |  online |
+------------------------+------------------------+---------+

Let’s look at the stage config file for that stack.

role "blank", []
server "54.188.203.46", {:user=>"ec2-user", :roles=>["blank"]}
server "54.190.72.211", {:user=>"ec2-user", :roles=>["blank"]}
server "54.70.104.224", {:user=>"ubuntu", :roles=>["blank"]}
server "54.70.157.58", {:user=>"ubuntu", :roles=>["blank"]}

This means that Capistrano now knows about that stack’s instances, and we should be able to run commands on those instances.

Hint: I have set up private key authentication for all my instances, so there won’t be any password prompts.
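One way to do that from Capistrano itself is the ssh_options setting, which is passed through to the underlying SSH library. The key path below is an assumption; substitute your own key, and make sure its public half is authorized on the instances (for example, through OpsWorks user management):

```ruby
# config/deploy.rb — point Capistrano at a private key so no password prompt
# appears. The key path is an assumption; use your own key file.
set :ssh_options, {
  keys: %w[~/.ssh/my-opsworks-key.pem],
  auth_methods: %w[publickey],
}
```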

In Capistrano, there are two modes for running commands: interactive and non-interactive. The interactive mode is helpful when you need immediate feedback, like when you don’t know yet which command you want to run on your servers. Use non-interactive mode when you know which command you want to use and just want to see the result.

Here’s an example of the output for interactive mode.

cap os-benchmark console
capistrano console - enter command to execute on os-benchmark
os-benchmark> uptime
INFO[477ac70f] Running /usr/bin/env uptime on 54.70.104.224
DEBUG[477ac70f] Command: /usr/bin/env uptime
INFO[496f9726] Running /usr/bin/env uptime on 54.190.72.211
DEBUG[496f9726] Command: /usr/bin/env uptime
INFO[92222ac7] Running /usr/bin/env uptime on 54.188.203.46
DEBUG[92222ac7] Command: /usr/bin/env uptime
INFO[8391c30a] Running /usr/bin/env uptime on 54.70.157.58
DEBUG[8391c30a] Command: /usr/bin/env uptime
DEBUG[496f9726]       16:03:37 up 21 min,  0 users,  load average: 0.00, 0.01, 0.05
INFO[496f9726] Finished in 3.021 seconds with exit status 0 (successful).
DEBUG[92222ac7]       16:03:37 up 20 min,  0 users,  load average: 0.02, 0.02, 0.05
INFO[92222ac7] Finished in 3.079 seconds with exit status 0 (successful).
DEBUG[8391c30a]       16:03:37 up  8:48,  0 users,  load average: 0.00, 0.01, 0.05
INFO[8391c30a] Finished in 3.076 seconds with exit status 0 (successful).
DEBUG[477ac70f]       16:03:37 up 20 min,  0 users,  load average: 0.02, 0.02, 0.05
INFO[477ac70f] Finished in 3.336 seconds with exit status 0 (successful).

This shows that we executed the uptime command on all running servers. Let’s say that we want to run uptime again, but in non-interactive mode, specifying the command with an environment variable. Here’s what that output looks like.

COMMAND=uptime cap os-benchmark run
INFO[6967488c] Running /usr/bin/env uptime on 54.70.157.58
DEBUG[6967488c] Command: /usr/bin/env uptime
INFO[b68600cf] Running /usr/bin/env uptime on 54.190.72.211
DEBUG[b68600cf] Command: /usr/bin/env uptime
INFO[a108126d] Running /usr/bin/env uptime on 54.70.104.224
DEBUG[a108126d] Command: /usr/bin/env uptime
INFO[e5e1af8f] Running /usr/bin/env uptime on 54.188.203.46
DEBUG[e5e1af8f] Command: /usr/bin/env uptime
DEBUG[b68600cf]       16:05:09 up 22 min,  0 users,  load average: 0.00, 0.01, 0.05
INFO[b68600cf] Finished in 3.054 seconds with exit status 0 (successful).
DEBUG[6967488c]       16:05:09 up  8:50,  0 users,  load average: 0.00, 0.01, 0.05
INFO[6967488c] Finished in 3.094 seconds with exit status 0 (successful).
DEBUG[e5e1af8f]       16:05:09 up 22 min,  0 users,  load average: 0.00, 0.01, 0.05
INFO[e5e1af8f] Finished in 3.055 seconds with exit status 0 (successful).
DEBUG[a108126d]       16:05:09 up 22 min,  0 users,  load average: 0.00, 0.01, 0.05
INFO[a108126d] Finished in 3.128 seconds with exit status 0 (successful).

It’s very simple to pass any command through Capistrano. console is one of Capistrano’s built-in commands. run is a custom command I created for this example; it’s located in run.rake.

desc "Run arbitrary command on hosts"
task :run do
  on roles(:all) do |host|
    # COMMAND comes from the environment, e.g. COMMAND=uptime cap os-benchmark run
    execute(ENV["COMMAND"])
  end
end

The roles(:all) part doesn’t mean that each command has to run on every instance in every role in your stack. It just means that the list of roles isn’t constrained in the source code. By default, Capistrano runs against all roles that appear in the stage config files; you can narrow this down with Capistrano’s host and role filters.
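For example, Capistrano honors a ROLES environment variable for role filtering, so an invocation like the following would constrain the custom run task to servers carrying a given role (blank here, to match the generated stage file shown earlier; with a single-role stack the filter is moot, but with several layers it becomes useful):

```shell
# Restrict the run task to servers whose stage-file entry includes the blank role.
COMMAND=uptime ROLES=blank cap os-benchmark run
```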

This should get you started with Capistrano and OpsWorks. For more information, see the Capistrano website and the GitHub repository that contains the complete source code mentioned in this example.