Our beloved GNU utils, especially sed and awk, work better with some file types than others. JSON, YAML or XML files can be a bit of a pain to work with.
jq is a parser specifically designed to handle JSON files, and there's a bonus tool at the end for YAML and XML files as well!
The basics
Let's take a simple JSON response as an example: run curl 'https://til.hashrocket.com/api/developer_posts.json?username=doriankarter' on your command line to see the data.
Since this data is presented as a one-liner, we can use jq to format the output:
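A minimal sketch of that formatting step. To keep it runnable offline, a small inline sample stands in for the live response; the data and posts nesting and the title field come from this post, while the slug field and all values are made up for illustration.

```shell
# Hypothetical stand-in for the API response (one line, like the real thing).
sample='{"data":{"posts":[{"slug":"a1","title":"Use jq"}]}}'

# The identity filter '.' returns its input unchanged; jq pretty-prints by default.
# With the live endpoint this would be: curl -s '<the URL above>' | jq '.'
pretty=$(printf '%s' "$sample" | jq '.')
echo "$pretty"
```

The one-liner comes back indented, one key per line.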
We can query the interesting bits and remove some noise simply by referring to nodes by name:
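As a sketch, assuming the response nests its entries under data and posts (as described later in this post) and using made-up sample values:

```shell
# Hypothetical sample; only the data/posts/title names come from the post.
sample='{"data":{"posts":[{"slug":"a1","title":"Use jq"},{"slug":"c3","title":"Vim marks"}]}}'

# Walk down by node name; [] iterates over every element of the posts array.
# -r prints raw strings instead of JSON-quoted ones.
titles=$(printf '%s' "$sample" | jq -r '.data.posts[].title')
echo "$titles"
```

Each title is printed on its own line, with everything else in the response left out.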
To output the data as an array, we can just enclose the query in []:
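A small sketch of the array form, again on a hypothetical inline sample rather than the live response:

```shell
# Hypothetical sample; slug and the values are assumptions for illustration.
sample='{"data":{"posts":[{"slug":"a1","title":"Use jq"},{"slug":"c3","title":"Vim marks"}]}}'

# Wrapping the query in [] collects the stream of results into one array.
# -c prints compact, single-line JSON.
title_array=$(printf '%s' "$sample" | jq -c '[.data.posts[].title]')
echo "$title_array"
```

Instead of one string per line, we get a single JSON array of titles.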
Or we can do some interesting manipulation to the data and present a parsed version:
Again, we are accessing the data by their node name and doing some string concatenation.
Notice how we use a pipe (|) to pass the data from one command to the next.
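Something along these lines would do it. The slug field and the output format are assumptions chosen for this sketch, not necessarily what the original example used.

```shell
# Hypothetical sample; slug is an assumed field name.
sample='{"data":{"posts":[{"slug":"a1","title":"Use jq"},{"slug":"c3","title":"Vim marks"}]}}'

# The pipe feeds each post to the next filter, where + concatenates strings.
lines=$(printf '%s' "$sample" | jq -r '.data.posts[] | .slug + ": " + .title')
echo "$lines"
```

Each post is reduced to a single "slug: title" line.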
The not so basics
This tool has a bunch of very useful functions available; we'll go over a few of them.
From now on, there will be no reference to the curl command to keep the code blocks more concise.
Delete node
Use it to clear out unwanted noise:
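A minimal sketch using jq's del() built-in, assuming a hypothetical slug field we want to drop:

```shell
# Hypothetical sample; slug is an assumed field name.
sample='{"data":{"posts":[{"slug":"a1","title":"Use jq"},{"slug":"c3","title":"Vim marks"}]}}'

# del(.slug) removes the slug key from each post and keeps the rest.
cleaned=$(printf '%s' "$sample" | jq -c '.data.posts[] | del(.slug)')
echo "$cleaned"
```

Every post comes back without the deleted key.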
Filter data
Select only the entries that match the given condition:
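For instance, with jq's select() built-in. The channel_name field and its values are assumptions made up for this sketch.

```shell
# Hypothetical sample; channel_name is an assumed field name.
sample='{"data":{"posts":[{"title":"Use jq","channel_name":"cli"},{"title":"Vim marks","channel_name":"vim"}]}}'

# select() passes a post through only when the condition is true for it.
vim_posts=$(printf '%s' "$sample" | jq -c '.data.posts[] | select(.channel_name == "vim")')
echo "$vim_posts"
```

Only the entry whose channel_name equals "vim" survives the filter.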
Add a node
You can add nodes to the JSON:
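One way to sketch this is with plain assignment. The reviewed key is a hypothetical name picked for the example.

```shell
# Hypothetical sample data.
sample='{"data":{"posts":[{"title":"Use jq"},{"title":"Vim marks"}]}}'

# Assigning to a key that does not exist yet adds it to each post.
with_flag=$(printf '%s' "$sample" | jq -c '.data.posts[] | .reviewed = true')
echo "$with_flag"
```

The new key is appended after the existing ones.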
Conditional logic
Following the previous example, we can use if statements to add a node with variable content.
Here we create a new one called IS_VALID with the value "Too short!" or "yes" depending on the length of the .title.
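A sketch of how that could look. The threshold of 10 characters is an assumption for this example, as the post does not state the cutoff.

```shell
# Hypothetical sample data.
sample='{"data":{"posts":[{"title":"Use jq"},{"title":"Sorting arrays in jq"}]}}'

# The if expression runs against each post; 10 is an assumed cutoff.
validated=$(printf '%s' "$sample" | jq -c '
  .data.posts[]
  | .IS_VALID = (if (.title | length) < 10 then "Too short!" else "yes" end)
')
echo "$validated"
```

The 6-character title gets "Too short!" and the 20-character one gets "yes".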
Perhaps more useful, we can add the new node or not depending on the condition:
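A possible version, with the same assumed 10-character cutoff: the else branch returns the post untouched, so only short titles gain the node.

```shell
# Hypothetical sample data.
sample='{"data":{"posts":[{"title":"Use jq"},{"title":"Sorting arrays in jq"}]}}'

# "else ." keeps the post as-is; spelling it out also works on jq versions
# that require an else branch.
flagged=$(printf '%s' "$sample" | jq -c '
  .data.posts[]
  | if (.title | length) < 10 then .IS_VALID = "Too short!" else . end
')
echo "$flagged"
```

Only the short title ends up with an IS_VALID key.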
Group by
Group nodes by values using group_by():
Notice how in this case we create a new array with the data before sending it to group_by().
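A sketch of the grouping, assuming a hypothetical channel_name field to group on. The final map() is an addition for this example that summarizes each group so the result stays short.

```shell
# Hypothetical sample; channel_name is an assumed field name.
sample='{"data":{"posts":[{"title":"Use jq","channel_name":"cli"},{"title":"Vim marks","channel_name":"vim"},{"title":"Sorting arrays in jq","channel_name":"cli"}]}}'

# group_by() expects an array, so the posts are collected with [...] first.
# It returns an array of arrays, one per distinct channel_name, sorted by key.
groups=$(printf '%s' "$sample" | jq -c '
  [.data.posts[]]
  | group_by(.channel_name)
  | map({channel: .[0].channel_name, count: length})
')
echo "$groups"
```

Dropping the last map() line shows the raw groups instead of the per-group counts.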
Sort by length
Sorting is also possible, and the result can be reversed if needed:
Notice that we add a .len node with the result of passing .title to the length built-in function.
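Roughly like this, on made-up sample titles. The last step, which prints the length next to the title, is only there to make the ordering easy to read.

```shell
# Hypothetical sample data.
sample='{"data":{"posts":[{"title":"Use jq"},{"title":"Sorting arrays in jq"},{"title":"Vim marks"}]}}'

# Add .len to each post, sort ascending by it, then reverse for longest first.
sorted=$(printf '%s' "$sample" | jq -r '
  [.data.posts[] | .len = (.title | length)]
  | sort_by(.len)
  | reverse
  | .[]
  | "\(.len) \(.title)"
')
echo "$sorted"
```

Leaving out reverse gives the shortest title first.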
Modify in place
So far we've always focused on the content of the posts array, losing the posts and data node names in the process.
This might be what you want, but in some cases one needs to modify the data "in place", keeping the original data structure.
This can be done by swapping the pipe operator (|) for the modify-in-place operator (|=), so for this simple example from before:
If we wanted to modify the original data structure including the data and posts node names, we could instead do:
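To see the difference side by side, here is a sketch that runs the same del() call both ways. The choice of del() and the slug field are assumptions for illustration; any of the earlier filters could take its place.

```shell
# Hypothetical sample; slug is an assumed field name.
sample='{"data":{"posts":[{"slug":"a1","title":"Use jq"}]}}'

# With |, only the modified posts come out; data and posts are gone.
piped=$(printf '%s' "$sample" | jq -c '.data.posts[] | del(.slug)')

# With |=, each post is updated where it sits and the whole document is kept.
in_place=$(printf '%s' "$sample" | jq -c '.data.posts[] |= del(.slug)')

echo "$piped"
echo "$in_place"
```

The first command prints a bare post, the second prints the full structure with the post modified inside it.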
Handle other file types with yq
Since this is so useful, someone took the time to create yq (as in YAML query). It actually doesn't just handle YAML files, but also XML, CSV and TSV.
Not only that, you can easily use this application to convert one file type into another!
Check the docs to find out more.
Keep in mind that apart from what is shown below, all the previous operations can be applied to any of these file types.
Since yq uses a syntax similar to jq's, I'll keep the queries out of the examples to keep things simple.
This is just a quick overview of how you might want to use the tool; it can achieve much more than I'm showing here.
YAML to other types
For a cool.yaml file of the structure:
The command yq -o xml '.' your_cool.yaml would output it with XML structure:
Or you can run it like yq -o json '.' your_cool.yaml to get JSON instead:
Any Input, Any Output
Say you have a cool.csv file of the structure:
Convert it to YAML with yq -o yaml -p csv '.' your_cool.csv:
Again, use the -o flag to change the output format, for example yq -o json -p csv '.' your_cool.csv:
Notice the use of the -p flag to indicate the input format, since by default it will expect YAML.