How to handle JSON

2019-07-22

JSON data is ubiquitous, constantly flowing between web services. But when you have a largish blob of the stuff, how do you inspect its structure or quickly extract the piece you need?

Enter the handy little utility jq. jq is a command-line tool that can query, filter, reshape, and otherwise act as your JSON Swiss Army knife.

Let's dive into how it works. As our example JSON data we'll use the list of IP address ranges for AWS services, which is published at: ip-ranges.amazonaws.com/ip-ranges.json

Here is a sample from the head of the ip-ranges.json file:

{
  "syncToken": "1563369545",
  "createDate": "2019-07-17-13-19-05",
  "prefixes": [
    {
      "ip_prefix": "18.208.0.0/13",
      "region": "us-east-1",
      "service": "AMAZON"
    },
    {
      "ip_prefix": "52.95.245.0/24",
      "region": "us-east-1",
      "service": "AMAZON"
    },

One of the simplest things we can do with jq is access a property using the dot (.) operator. Let's get the creation date of the file:

jq .createDate ip-ranges.json
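To try this without downloading anything, here is a minimal sketch that runs the same filter on a tiny inline stand-in for ip-ranges.json. The variable names (sample, quoted, raw) are made up for illustration. The -r flag is a standard jq option that prints strings without their JSON quotes.

```shell
# A tiny stand-in for ip-ranges.json, using values from the sample above.
sample='{"syncToken":"1563369545","createDate":"2019-07-17-13-19-05"}'

# Default output is valid JSON, so the string keeps its double quotes.
quoted=$(printf '%s' "$sample" | jq '.createDate')

# -r ("raw output") drops the quotes, which is handy in shell scripts.
raw=$(printf '%s' "$sample" | jq -r '.createDate')

echo "$quoted"
echo "$raw"
```

Quoting the filter in single quotes is optional for something as simple as .createDate, but it becomes necessary as soon as the filter contains characters the shell would interpret.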

Another handy feature is that jq pretty-prints its output by default, so the identity filter . can be used alone to prettify a JSON file.
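A small sketch of prettifying, again on an inline sample rather than the real file. The variable names are illustrative. The -c flag is jq's compact-output option and undoes the pretty printing.

```shell
# One-line JSON, as a web service might return it.
compact='{"region":"us-east-1","service":"AMAZON"}'

# The identity filter changes nothing; jq's default formatting does the work.
pretty=$(printf '%s' "$compact" | jq '.')

# -c asks for compact output, one JSON value per line.
roundtrip=$(printf '%s' "$pretty" | jq -c '.')

echo "$pretty"
echo "$roundtrip"
```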

Let's try something more fun and more "query"-like, and create a list of all AWS regions:

jq '[.prefixes[].region, .ipv6_prefixes[].region] | unique' ip-ranges.json

Let's break that down. The .prefixes part is another property access, like in the first sample. `.prefixes` is an array of objects, which we iterate over with [], which should remind one of JSON's own array syntax. For each object in the array we then pull out the .region key. Then things get cooler. We have two streams of results that we would like to combine, which can be done with the comma (,) operator: it emits everything from the first expression, then everything from the second. That leaves us with a stream of separate values, which can be collected back into a single array by surrounding the entire query with another [].
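The breakdown can be followed one step at a time. This is a sketch on a three-entry inline sample (the data and variable names are made up for illustration), building the query up piece by piece.

```shell
# Two IPv4 entries and one IPv6 entry, keeping only the key we care about.
sample='{"prefixes":[{"region":"us-east-1"},{"region":"us-west-2"}],"ipv6_prefixes":[{"region":"us-east-1"}]}'

# Step 1: iterate the array and pull .region from each object.
# The result is a stream of separate values, one per line.
step1=$(printf '%s' "$sample" | jq -c '.prefixes[].region')

# Step 2: the comma emits the first stream followed by the second.
step2=$(printf '%s' "$sample" | jq -c '.prefixes[].region, .ipv6_prefixes[].region')

# Step 3: the outer [ ] collects the whole stream into one array.
step3=$(printf '%s' "$sample" | jq -c '[.prefixes[].region, .ipv6_prefixes[].region]')

echo "$step1"
echo "$step2"
echo "$step3"
```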

Notice the placement of the single quotes in the statement above: we are not taking the output of jq and using a Unix pipe to send it to uniq. Rather, jq includes a built-in notion of pipes, along with many useful functions.
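To make the distinction concrete, here is a sketch that gets the distinct regions both ways on a small inline sample (the data and variable names are illustrative). The first pipe lives inside the quotes and is handled by jq. The second is an ordinary shell pipe to sort -u.

```shell
sample='{"prefixes":[{"region":"us-west-2"},{"region":"us-east-1"},{"region":"us-west-2"}]}'

# jq's own pipe: the array is handed to jq's unique, and the result is still JSON.
inside=$(printf '%s' "$sample" | jq -c '[.prefixes[].region] | unique')

# Shell pipe: jq prints raw lines and an external tool removes the duplicates.
outside=$(printf '%s' "$sample" | jq -r '.prefixes[].region' | sort -u)

echo "$inside"
echo "$outside"
```

Staying inside jq keeps the result as JSON, which matters if another JSON-aware step follows.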

Here we are piping our 2,233-element array (cough, counted with length):

jq '[.prefixes[].region, .ipv6_prefixes[].region] | length' ip-ranges.json

to jq's unique, which returns a sorted list of regions with the duplicates removed.
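The effect of unique is easy to see by counting before and after. A minimal sketch on made-up inline data (the real file gives much larger numbers):

```shell
sample='{"prefixes":[{"region":"us-west-2"},{"region":"us-east-1"},{"region":"us-west-2"}]}'

# Number of entries in the collected array, duplicates included.
total=$(printf '%s' "$sample" | jq '[.prefixes[].region] | length')

# Number of entries once unique has sorted and de-duplicated the array.
distinct=$(printf '%s' "$sample" | jq '[.prefixes[].region] | unique | length')

echo "total=$total distinct=$distinct"
```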

Let's do one more: what are all the current IPv4 address ranges for EC2 in us-west-2?

jq '.prefixes[] | select(.service=="EC2" and .region=="us-west-2")
 | .ip_prefix' ip-ranges.json

I'll leave interpreting it as an exercise for the reader.
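As a hint for that exercise, here is a sketch of the same query on three inline entries. The first entry comes from the sample above. The other two are hypothetical, using documentation-only address ranges, so the output says nothing about real AWS allocations.

```shell
sample='{"prefixes":[
  {"ip_prefix":"18.208.0.0/13","region":"us-east-1","service":"AMAZON"},
  {"ip_prefix":"192.0.2.0/24","region":"us-west-2","service":"EC2"},
  {"ip_prefix":"198.51.100.0/24","region":"us-west-2","service":"AMAZON"}]}'

# select passes an object through only when its condition is true;
# everything else is dropped from the stream before .ip_prefix is applied.
matches=$(printf '%s' "$sample" | jq -r '.prefixes[] | select(.service=="EC2" and .region=="us-west-2") | .ip_prefix')

echo "$matches"
```

Only the middle entry satisfies both halves of the condition, so it is the only prefix printed.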

jq is really pretty awesome, but as usual, there is more than one way to do it. If jq isn't quite your cup of tea, then there is an entire set of related tools: