Wednesday, November 5, 2014

Parsing JSON with Ansible ...

Recently, I have been doing a heap of work around the automation
and provisioning of resources with AWS cloud services. This entails working
frequently with the AWS API.

A filter we have found useful in Ansible is `from_json`, which takes the output
of a shell action and turns it into something we can consume via variables.

An example playbook is below:

- shell: |
    export AWS_ACCESS_KEY_ID=$AWS_ACCESS_KEY
    export AWS_SECRET_ACCESS_KEY=$AWS_SECRET_KEY
    lib/ec2.py --list --refresh-cache
  register: output
- set_fact:
    ec2_output: "{{ output.stdout|from_json }}"
- shell: "rm -f addresses.txt"
- shell: "echo '{{ item.value['ec2_private_ip_address'] }}' >> addresses.txt"
  when: item.value["ec2_tag_Name"] is defined and item.value["ec2_tag_Name"] == "deis-" ~ environment_unique_id
  with_dict: "{{ ec2_output._meta.hostvars }}"
- shell: "cat addresses.txt"
  register: output
- add_host:
    hostname: "{{ output.stdout_lines.0 }}"
    groupname: "deis_node"
- debug:
    msg: "{{ groups['deis_node'].0 }}"

This example uses Ansible's EC2 dynamic inventory script, ec2.py, to retrieve a
list of instances running in an AWS account. It then selects the ones we want via
the `ec2_tag_Name` key and appends each private IP address to a file.

After running the following shell action, `output` holds a heap of information
about what the command did, along with anything it wrote to stdout and stderr.

- shell: |
    export AWS_ACCESS_KEY_ID=$AWS_ACCESS_KEY
    export AWS_SECRET_ACCESS_KEY=$AWS_SECRET_KEY
    lib/ec2.py --list --refresh-cache
  register: output

The Ansible output for the registered variable `output` looks something like this:

TASK: [debug var=output] ******************************************************
ok: [localhost] => {
    "output": {
        "changed": true,
        "cmd": "export AWS_ACCESS_KEY_ID=$AWS_ACCESS_KEY \nexport AWS_SECRET_ACCESS_KEY=$AWS_SECRET_KEY \nlib/ec2.py --list --refresh-cache",
        "delta": "0:00:05.268682",
        "end": "2013-10-05 11:43:05.235594",
        "invocation": {
            "module_args": "export AWS_ACCESS_KEY_ID=$AWS_ACCESS_KEY \nexport AWS_SECRET_ACCESS_KEY=$AWS_SECRET_KEY \nlib/ec2.py --list --refresh-cache",
            "module_name": "shell"
        },
        "rc": 0,
        "start": "2013-10-05 11:42:59.966912",
        "stderr": "",
        "stdout": "<suppressed to keep it short>",
        "stdout_lines": [
            "{",
            "  \"_meta\": {",
            "    \"hostvars\": {",
            "      \"xx.xxx.xxx.xxx\": {",
            "        \"ec2__in_monitoring_element\": false, ",
            "        \"ec2_ami_launch_index\": \"0\", ",
            "        \"ec2_architecture\": \"x86_64\", ",
            "        \"ec2_client_token\": \"xxxxxxxx-xxxx-xxxx-xxxx-c55977fc0029_us-east-1a_1\", ",
            "        \"ec2_dns_name\": \"ec2-xx-xxx-xxx-xxx.compute-1.amazonaws.com\", ",
            "        \"ec2_ebs_optimized\": false, ",
            "        \"ec2_eventsSet\": \"\", ",
            "        \"ec2_group_name\": \"\", ",
            "        \"ec2_hypervisor\": \"xen\", ",
            "        \"ec2_id\": \"i-xxxxxx\", ",
            "        \"ec2_image_id\": \"ami-xxxxxx\", ",
            "        \"ec2_instance_profile\": \"\", ",
            "        \"ec2_instance_type\": \"m1.large\", ",
            "        \"ec2_ip_address\": \"xx.xxx.xxx.xxx\", ",
            "        \"ec2_item\": \"\", ",
            "        \"ec2_kernel\": \"aki-xxxxxxxx\", ",
            "        \"ec2_key_name\": \"ambari\", ",
            "        \"ec2_launch_time\": \"2013-10-01T04:53:00.000Z\", ",
            "        \"ec2_monitored\": true, ",
            "        \"ec2_monitoring\": \"\", ",
            "        \"ec2_monitoring_state\": \"enabled\", ",
            "        \"ec2_persistent\": false, ",
            "        \"ec2_placement\": \"us-east-1a\", ",
            "        \"ec2_platform\": \"\", ",
            "        \"ec2_previous_state\": \"\", ",
            "        \"ec2_previous_state_code\": 0, ",
            "        \"ec2_private_dns_name\": \"ip-xx-xx-x-xxx.ec2.internal\", ",
            "        \"ec2_private_ip_address\": \"xx.xx.x.xxx\", ",
            "        \"ec2_public_dns_name\": \"ec2-xx-xxx-xxx-xxx.compute-1.amazonaws.com\", ",
            "        \"ec2_ramdisk\": \"\", ",
            "        \"ec2_reason\": \"\", ",
            "        \"ec2_region\": \"us-east-1\", ",
            "        \"ec2_requester_id\": \"\", ",
            "        \"ec2_root_device_name\": \"/dev/sda1\", ",
            "        \"ec2_root_device_type\": \"ebs\", ",
            "        \"ec2_security_group_ids\": \"sg-xxxxxxxx\", ",
            "        \"ec2_security_group_names\": \"ambari\", ",
            "        \"ec2_sourceDestCheck\": \"true\", ",
            "        \"ec2_spot_instance_request_id\": \"\", ",
            "        \"ec2_state\": \"running\", ",
            "        \"ec2_state_code\": 16, ",
            "        \"ec2_state_reason\": \"\", ",
            "        \"ec2_subnet_id\": \"subnet-xxxxxxxx\", ",
            "        \"ec2_tag_Name\": \"hdpmaster1\", ",
            "        \"ec2_tag_aws_autoscaling_groupName\": \"ambari-LargeClusterGroup-148W3OVR9LSSE\", ",
            "        \"ec2_tag_aws_cloudformation_logical-id\": \"LargeClusterGroup\", ",
            "        \"ec2_tag_aws_cloudformation_stack-id\": \"arn:aws:cloudformation:us-east-1:167609138788:stack/ambari/xxxxxxxx-xxxx-xxxx-xxxx-50fa526be49c\", ",
            "        \"ec2_tag_aws_cloudformation_stack-name\": \"ambari\", ",
            "        \"ec2_tag_long_hostname\": \"hdpmaster1.domain.com\", ",
            "        \"ec2_virtualization_type\": \"paravirtual\", ",
            "        \"ec2_vpc_id\": \"vpc-xxxxxxxx\"",
            "      }, ",
            "}"
        ]
    }
}

As you can see, stdout and stdout_lines aren't in a great format to do much with:
one is a single long string and the other is a list of raw text lines. This is
where from_json comes in:

- set_fact:
    ec2_output: "{{ output.stdout|from_json }}"
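
To make the difference concrete, here is a minimal Python sketch of what the filter is doing. It assumes `from_json` behaves roughly like Python's `json.loads`; the host address and field values are made-up placeholders, not real ec2.py output.

```python
import json

# Hypothetical, heavily trimmed stand-in for what `lib/ec2.py --list` prints.
stdout = """{
  "_meta": {
    "hostvars": {
      "203.0.113.10": {
        "ec2_private_ip_address": "10.0.0.5",
        "ec2_tag_Name": "hdpmaster1"
      }
    }
  }
}"""

# stdout_lines is the same text split on newlines: a flat list of strings.
stdout_lines = stdout.splitlines()

# Parsing the whole string yields nested dicts that can be indexed by key.
ec2_output = json.loads(stdout)
hostvars = ec2_output["_meta"]["hostvars"]
print(hostvars["203.0.113.10"]["ec2_tag_Name"])
```

The list of lines can only be searched as text, whereas the parsed structure can be walked key by key, which is what the playbook relies on next.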

`ec2_output` is now a parsed data structure rather than a string, so Ansible
can iterate over it like so:

- shell: "echo '{{ item.value['ec2_private_ip_address'] }}' >> addresses.txt"
  when: item.value["ec2_tag_Name"] is defined and item.value["ec2_tag_Name"] == "deis-" ~ environment_unique_id
  with_dict: "{{ ec2_output._meta.hostvars }}"

This Ansible task creates a file that contains the private IP addresses of all
EC2 instances that match the `ec2_tag_Name` criterion.
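
The selection logic in that task can be sketched in plain Python as follows. This is an illustration only: the function name `matching_private_ips`, the sample hosts, and the `demo` environment id are all hypothetical.

```python
def matching_private_ips(hostvars, tag_name):
    """Return the private IPs of hosts whose ec2_tag_Name equals tag_name."""
    return [
        host["ec2_private_ip_address"]
        for host in hostvars.values()
        # .get() skips instances with no Name tag, like the `is defined` check.
        if host.get("ec2_tag_Name") == tag_name
    ]


# Hypothetical inventory data in the same shape as ec2_output._meta.hostvars.
hostvars = {
    "203.0.113.10": {"ec2_private_ip_address": "10.0.0.5", "ec2_tag_Name": "deis-demo"},
    "203.0.113.11": {"ec2_private_ip_address": "10.0.0.6", "ec2_tag_Name": "hdpmaster1"},
    "203.0.113.12": {"ec2_private_ip_address": "10.0.0.7"},  # no Name tag
}

environment_unique_id = "demo"
addresses = matching_private_ips(hostvars, "deis-" + environment_unique_id)
print(addresses)
```

Only the first host matches here, so `addresses` ends up with a single private IP; the untagged host is skipped rather than raising an error.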

We could then take the first of these IP addresses and use it later in our playbook:

- shell: "cat addresses.txt"
  register: output
- debug: var=output.stdout_lines.0

This becomes a very powerful pattern when you mix in RESTful APIs: call them,
parse the JSON they return, and act on the result later in the playbook.
