AWS DevOps Blog

Quickly Explore the Chef Environment in AWS OpsWorks

AWS OpsWorks recently launched support for Chef 12 Linux. This release changes how the information that OpsWorks provides about stacks, layers, and instances is made available during a Chef run. In this post, I show how to explore this information interactively using the OpsWorks agent command line interface (CLI) and Pry, a shell for Ruby. Our documentation shows you what’s available; this post shows you how to explore that data yourself.

OpsWorks manages EC2 or on-premises instances by triggering Chef runs. Before running your Chef recipes, OpsWorks prepares an environment. This environment includes a number of data bags that provide information about your stack, its layers and instances, and other resources in the stack. You can use these data bags to write cookbooks that adapt to changes in your infrastructure.

When an instance has finished its setup or when it leaves the online state, OpsWorks triggers a Configure event. You can register your own custom recipes to run during Configure events, and use a custom recipe as a lightweight service discovery mechanism. For example, a custom recipe could grant an app server access to a database after the server has started, revoke that access after the server has stopped, or discover the IP address of the database server within your stack.

Typically, you access data about stacks, layers, and instances through Chef search. For earlier supported versions of Chef on Linux, this data was made available as attributes. In Chef 12 Linux, the data is available in data bags.
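
To make the service discovery idea more concrete, here is a minimal sketch in plain Ruby. It picks the private IP address of an online database instance from a list of instance records. The sample records, the layer ID db-layer-id, and the helper name database_ip are hypothetical, and I’m assuming the hostname, status, and private_ip keys match what the instance data bag provides on your stack. In a recipe, the list would come from Chef search rather than from a literal array.

```ruby
# Hypothetical instance records, shaped like items in the
# aws_opsworks_instance data bag. In a recipe this list would come
# from search(:aws_opsworks_instance) instead of a literal array.
INSTANCES = [
  { 'hostname' => 'nodejs-server1', 'layer_ids' => ['app-layer-id'],
    'status' => 'online', 'private_ip' => '10.0.0.11' },
  { 'hostname' => 'db-master1', 'layer_ids' => ['db-layer-id'],
    'status' => 'online', 'private_ip' => '10.0.0.21' },
  { 'hostname' => 'db-master2', 'layer_ids' => ['db-layer-id'],
    'status' => 'stopped', 'private_ip' => nil }
].freeze

# Return the private IP of the first online instance in the given layer,
# or nil when no such instance exists.
def database_ip(instances, db_layer_id)
  db = instances.find do |instance|
    instance['status'] == 'online' &&
      Array(instance['layer_ids']).include?(db_layer_id)
  end
  db && db['private_ip']
end

puts database_ip(INSTANCES, 'db-layer-id')
```

Because the helper returns nil when no matching instance is online, a recipe that runs on every Configure event can skip its database setup until the database server has actually come up.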

To access this data, I’m going to use only tools that are already present on OpsWorks instances: the OpsWorks agent CLI and Pry. Here’s the elevator pitch for Pry, taken from the Pry website:

Pry is a powerful alternative to the standard IRB shell for Ruby. It features syntax highlighting, a flexible plugin architecture, runtime invocation and source and documentation browsing.

Because Pry is already present on OpsWorks instances, there’s no need to install it.

I execute all terminal commands shown in the rest of this post as the root user.

How Do You Use Pry with OpsWorks?

First, let’s take a look at the OpsWorks agent CLI. The agent CLI lets you explore and repeat Chef runs on an instance.

To see a list of completed runs, use opsworks-agent-cli list:

[root@nodejs-server1 ~]# opsworks-agent-cli list
2015-12-16T13:37:2        setup
2015-12-16T13:40:56       configure

For an instance that has just finished booting, you should see a successful Setup event, followed by a successful Configure event.

Let’s repeat the Chef run for the Configure event. To repeat the last run, use opsworks-agent-cli run:

[root@nodejs-server1 ~]# opsworks-agent-cli run
[2015-12-16 13:44:55]  INFO [opsworks-agent(26261)]: About to re-run 'configure' from 2015-12-16T13:40:56
...
[2015-12-16 13:45:01]  INFO [opsworks-agent(26261)]: Finished Chef run with exitcode 0

Because the agent CLI can only repeat Chef runs, it doesn’t allow me to execute arbitrary recipes. I can do that in the OpsWorks console with the Run command. For demo purposes, I’ll use a custom cookbook named explore-opsworks-data to trigger a Chef run so I can then execute a recipe during the run.

That Chef run failed because the recipe I tried to execute doesn’t exist yet. Let’s create the recipe and run it in a way that opens a Pry session.

[root@nodejs-server1 ~]# mkdir -p /var/chef/cookbooks/explore-opsworks-data/recipes
[root@nodejs-server1 ~]# echo 'require "pry"; binding.pry' > /var/chef/cookbooks/explore-opsworks-data/recipes/default.rb
[root@nodejs-server1 ~]# opsworks-agent-cli run
...
[2015-12-16T13:55:32+00:00] INFO: Storing updated cookbooks/explore-opsworks-data/recipes/default.rb in the cache.
From: /var/chef/runs/35e8a98a-c81e-46a9-84e3-1bbd105f07dd/local-mode-cache/cache/cookbooks/explore-opsworks-data/recipes/default.rb @ line 1 Chef::Mixin::FromFile#from_file:
 => 1: require "pry"; binding.pry

That doesn’t look very good. In fact, the output appears truncated. That’s because the Chef run has paused and I’m now in an interactive shell, Pry, right in the middle of it. I can use Pry to run arbitrary Ruby code in the context of the recipe I created. I’ll try searching the data bags for the stack, layer, and instance.

The aws_opsworks_stack data bag contains details about the stack, like the region and the custom cookbook source, as shown in the following example:

search(:aws_opsworks_stack)
=> [{"data_bag_item('aws_opsworks_stack', '8bd5b1e5-6f45-4d3d-9eb1-5cdaecaf77b8')"=>
   {"arn"=>"arn:aws:opsworks:us-west-2:153700967203:stack/8bd5b1e5-6f45-4d3d-9eb1-5cdaecaf77b8/",
    "custom_cookbooks_source"=>{"type"=>"archive", "url"=>"https://s3.amazonaws.com/opsworks-demo-assets/opsworks-linux-demo-cookbooks-nodejs.tar.gz", "username"=>nil, "password"=>nil, "ssh_key"=>nil, "revision"=>nil},
    "name"=>"My Sample Stack (Linux)",
    …
"data_bag"=>"aws_opsworks_stack"}}]

The aws_opsworks_layer data bag contains details about layers, like the layer name and Amazon Elastic Block Store (Amazon EBS) volume configurations:

search(:aws_opsworks_layer)
=> [{"data_bag_item('aws_opsworks_layer', 'nodejs-server')"=>
   {"layer_id"=>"a8127c0d-749a-4192-aad7-8e512c8942b4", "name"=>"Node.js App Server", "packages"=>[], "shortname"=>"nodejs-server", "type"=>"custom", "volume_configurations"=>[], "id"=>"nodejs-server", "chef_type"=>"data_bag_item", "data_bag"=>"aws_opsworks_layer"}}]

The aws_opsworks_instance data bag contains details about instances, like the operating system and IP addresses:

search(:aws_opsworks_instance)
=> [{"data_bag_item('aws_opsworks_instance', 'nodejs-server1')"=>
   {"ami_id"=>"ami-d93622b8",
    "architecture"=>"x86_64",
    …
    "id"=>"nodejs-server1",
    "chef_type"=>"data_bag_item",
    "data_bag"=>"aws_opsworks_instance"}}]

Now I’ll access a data bag directly. As the following example shows, the data I get this way is identical to the data that search returns:

data_bag("aws_opsworks_stack")
=> ["8bd5b1e5-6f45-4d3d-9eb1-5cdaecaf77b8"]
data_bag_item("aws_opsworks_stack", "8bd5b1e5-6f45-4d3d-9eb1-5cdaecaf77b8")
=> {"data_bag_item('aws_opsworks_stack', '8bd5b1e5-6f45-4d3d-9eb1-5cdaecaf77b8')"=>
  {"arn"=>"arn:aws:opsworks:us-west-2:153700967203:stack/8bd5b1e5-6f45-4d3d-9eb1-5cdaecaf77b8/",
   "custom_cookbooks_source"=>{"type"=>"archive", "url"=>"https://s3.amazonaws.com/opsworks-demo-assets/opsworks-linux-demo-cookbooks-nodejs.tar.gz", "username"=>nil, "password"=>nil, "ssh_key"=>nil, "revision"=>nil},
   "name"=>"My Sample Stack (Linux)",
   …
    "data_bag"=>"aws_opsworks_stack"}}

As a practical example of how I would use search in one of my recipes, I’ll look up the current instance’s root device type and layer ID:

myself = search(:aws_opsworks_instance, "self:true").first
...
Chef::Log.info "My root device type is #{myself['root_device_type']}"
[2015-12-16T18:19:55+00:00] INFO: My root device type is ebs
...
Chef::Log.info "I am a member of layer #{myself['layer_ids'].first}"
[2015-12-16T18:20:17+00:00] INFO: I am a member of layer a8127c0d-749a-4192-aad7-8e512c8942b4
...
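
A layer ID is not very readable, so a natural next step is to join the instance data with the layer data to get the layer name. The following is a sketch in plain Ruby, using sample records trimmed from the output shown earlier in this post. The constants MYSELF and LAYERS and the helper layer_names are names I made up for illustration; in a recipe, the records would come from search(:aws_opsworks_instance, "self:true").first and search(:aws_opsworks_layer).

```ruby
# Sample records trimmed from the search output shown earlier.
MYSELF = {
  'id' => 'nodejs-server1',
  'layer_ids' => ['a8127c0d-749a-4192-aad7-8e512c8942b4']
}.freeze

LAYERS = [
  { 'layer_id' => 'a8127c0d-749a-4192-aad7-8e512c8942b4',
    'name' => 'Node.js App Server',
    'shortname' => 'nodejs-server' }
].freeze

# Map each layer ID of the instance to the layer's display name.
# IDs without a matching layer record are reported as unknown.
def layer_names(instance, layers)
  names_by_id = layers.each_with_object({}) do |layer, names|
    names[layer['layer_id']] = layer['name']
  end
  instance['layer_ids'].map do |id|
    names_by_id.fetch(id, "unknown layer #{id}")
  end
end

puts "I am a member of: #{layer_names(MYSELF, LAYERS).join(', ')}"
```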

And to make it clear that this is a general Ruby shell and not something limited to Chef, here’s a Ruby snippet that lists the files and directories directly under /tmp, without using Chef:

Dir.glob("/tmp/*")
=> ["/tmp/npm-1967-e4f411bc", "/tmp/hsperfdata_root"]

After I’m done exploring, I can leave the shell by typing exit or by pressing Ctrl+D.

Summary

By using Pry in the middle of a Chef run, you can inspect the data that’s available during the run. If you usually troubleshoot a failed run by making a change on your workstation, updating the cookbooks on your instance, and triggering another deployment, this approach can save you a significant amount of time.

There’s no need to limit yourself to a single Pry session. If there are more areas in your code you need to explore, just put binding.pry in the appropriate place in your cookbook. Keep in mind, though, that you don’t want to include this permanently in your recipe, so don’t put this kind of change under version control.
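
One way to catch a forgotten breakpoint before it reaches version control is to scan your cookbooks for it. The following is only a sketch using the Ruby standard library. The helper name leftover_pry_calls is hypothetical, and the path is the cookbook directory used earlier in this post; point it at your own working copy as needed.

```ruby
require 'find'

# Return "path:line" for every line in a .rb file under dir
# that mentions binding.pry.
def leftover_pry_calls(dir)
  hits = []
  Find.find(dir) do |path|
    next unless File.file?(path) && path.end_with?('.rb')

    File.foreach(path).with_index(1) do |line, number|
      hits << "#{path}:#{number}" if line.include?('binding.pry')
    end
  end
  hits
end

# Cookbook directory used earlier in this post; skipped if it is absent.
cookbooks = '/var/chef/cookbooks'
puts leftover_pry_calls(cookbooks) if Dir.exist?(cookbooks)
```

This is a plain substring check, so it also reports lines where binding.pry appears only in a comment or a string. For a quick check before a commit, that is usually acceptable.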