Interactive examples for learning jq

Daily Digest email

Get the top HN stories in your inbox every day.

Octabrain

For me, bash and jq are, literally, the opposite of riding a bicycle. No matter how much time I spend working with them in a given week, a month later I am gonna have to skim through my bookmarks and Kagi results (and now also ChatGPT) to figure out how to do stuff I was easily doing a month ago.

absoluteunit1

I also observed this when using most CLI tools… I think it’s a common problem for tools you only reach for a couple of times a month or quarter (versus a programming language you code in almost every day).

My solution was literally to create Anki cards every time I discover a neat feature that I might not remember but that would be useful to know. I go through my Anki cards once a day for 10 minutes and it works like a charm. My memory for various CLI tools has drastically improved. Rarely do I need to reach for Google, man pages or ChatGPT for most CLI tool usage. I’d recommend spaced repetition for CLI tools.

mistrial9

new to me! link https://en.wikipedia.org/wiki/Anki_(software)

absoluteunit1

Thanks for adding the link!

luckman212

I do similar things with Obsidian (markdown).

hiAndrewQuinn

I get Bash, but for jq I found that my small fusillade of Anki flash cards was more than enough to get a fingertip feel for its syntax. Amazing what 50 flashcards of jq (or awk, or sed, or regexes, or any DSL really) gets you in the long run.

absoluteunit1

I wrote my response about using Anki cards for this right before seeing your comment! :D

Happy to see others have been doing that too!

ta8645

It's more important to understand the possibilities than remember the details. Details can always be quickly looked up, as long as you know what to look for and can conceptualize which tools to combine to achieve a goal.

racl101

Agreed. Some things I accept I will never memorize, mentally or in muscle memory. JQ is one of them. So this is a great reference.

I will never truly memorize how to use this because it's not my primary goal, nor is it the end product to process data.

Rather, it is a means to a means to a means to an end.

bobbylarrybobby

If I find myself struggling with a task I’ve done a handful of times, I just make a page for it in Obsidian with the snippet I need and an explanation of how it works.

sorenjan

I save snippets in a markdown file for that reason.

ollysb

ChatGPT is a great UI to both though

adamgordonbell

I always struggled with understanding JQ. Each time I was just googling things. But it actually makes a lot of sense if you understand the building blocks. I wrote it all down [1] but here is my summary:

jq lets you select elements like it's a JavaScript object using dot notation and array indexing.

    jq '.key.subkey.subsubkey'
    jq '.key[].subkey[2]'

You can wrap things in array constructors or object constructors to create new objects and lists:

    jq '[ .[].key ]'
    jq '{key1: .key1, key2: .key2}'

You can combine filters with pipes (|) to build complex transformations. Built-ins like map() and select() are useful for transforming arrays.

You put it all together into something like this:

    curl https://api.github.com/repos/stedolan/jq/issues |
      jq 'map({title: .title, labels: .labels}) |
       map(select(.labels | length > 0)) |
       map({issue: .title}) |
       sort_by(.issue) |
       [{issues: .[]}]'

This query fetches GitHub issues, transforms them into a simplified structure, filters out unlabeled issues, sorts them, and wraps the results in an array - demonstrating how you can chain together jq's query language to wrangle JSON data.

[1]: https://earthly.dev/blog/jq-select/
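
To try the same building blocks without hitting the network, here is a minimal sketch that runs a similar chain on inline data. The sample issues and the `labeled_titles` helper are made up for illustration; they are not from the GitHub API or the linked post. It uses `select(.labels | length > 0)` because an empty array still counts as truthy in jq.

```shell
# Hypothetical sample data standing in for an API response.
issues='[
  {"title": "Fix parser", "labels": ["bug"]},
  {"title": "Add docs", "labels": []},
  {"title": "Build on ARM", "labels": ["build", "help wanted"]}
]'

# Keep only issues with at least one label, reduce each to its title, sort.
labeled_titles() {
  jq -c 'map(select(.labels | length > 0)) | map(.title) | sort'
}

printf '%s' "$issues" | labeled_titles
```

On this sample it should print `["Build on ARM","Fix parser"]`.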

kspacewalk2

Had to add `-L` to the curl command to follow redirects, since their API endpoints seem to have changed.

evgpbfhnr

I was curious so looked up how it works before reading the summary at the end, and that led me to find another user of aioli.js jq implementation: https://jiehong.gitlab.io/jq_offline/ (featured https://news.ycombinator.com/item?id=28627172 two years ago); jqplay.org still sends all the data on every modification so they should learn from it...

Anyway, this article is neat! Good work!

If I were to nitpick: one of the last examples, the one with `path`, has no explanation and went over my head (I would have to open the documentation), and a reset button for each example would be nice after messing with it a bit. But it was a nice play.

ishandotpage

Hey, thanks for your kind words!

Regarding the reset button: I think that's a great suggestion and now it bugs me so much that I can't reset it. I'll add a reset button later tonight when I'm off work.

Regarding the confusing example: Yes, some of the examples are missing explanations (mainly because I spent more than a month on this post and I just did not want to put off putting it out any longer). Sorry haha. I'll try to improve the explanations and add more.

nonlogical

JQ is an insanely powerful language. Just to put to rest any doubts about what it is capable of, here is an implementation of JQ... in JQ itself:

https://github.com/wader/jqjq

It really is a super cool, super expressive little language: nearly (if not entirely) Turing complete, and purely functional.

You can:

* Define your own functions and libraries of functions

* Do light statistics

* Drastically reshape JSON data

* Create data indexes as part of your JQ scripts and summarize things

* Take JSON data, mangle it into TSV and pipe into SQLite

  cat data.json | jq -r '<expr>[]|@tsv' | sqlite3 -cmd ".mode tabs" -cmd ".import /dev/stdin YourTable"
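
A small sketch of the TSV step on its own, with made-up rows and a hypothetical `rows_to_tsv` helper. `@tsv` expects one array per row, and `-r` is what makes jq print the raw tab-separated line instead of a JSON-quoted string.

```shell
# Hypothetical sample rows; real data would come from data.json.
rows='[{"id": 1, "name": "ada"}, {"id": 2, "name": "bob"}]'

# Turn each object into an array of columns, then into one TSV line.
rows_to_tsv() {
  jq -r '.[] | [.id, .name] | @tsv'
}

printf '%s' "$rows" | rows_to_tsv
```

Each output line holds the id and the name separated by a tab, which is the shape the `sqlite3` import above consumes.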

For prototyping, you can also use it to tailor the output of APIs to what you need in a pinch, using JQ as a library, especially from something like Python:

https://pypi.org/project/jq/

As a part of the library you can compile your expressions down to "byte-code" once and reuse them.

Saying JQ is a best kept secret is an understatement. JQ gets more amazing the deeper you dig into it. Also it is kind of crazy fast for what it is.

edit: Formatting fixes

seanp2k2

JQ + journald is great too, but 20 years of muscle memory writing bash / python / perl / awk / sql / ruby / JS / CSS selectors / xpath / xmlstarlet one-liners keep getting in my way. I keep long notes on both with examples of common tasks. I still dislike yaml (significant whitespace is my “ick” as the kids say) too much to learn whatever the equivalent is for that and still find CSV/TSV easier to slice and dice at will due to my own personal history.

I’m sure at this point that many ETL jobs in notebooks we run at $BigCo today could be reduced to jq expressions that run 100x faster and use 1/10th the memory.

jameshart

The ‘nearly’ Turing complete is something I wonder about. It feels like jq might have some limitations - transformations it can’t do, due to some inherent limitation of how it handles scope or data flow. The esoteric syntax makes it hard to determine sometimes whether what you are attempting is actually possible.

As soon as jq scripts reach a certain level of complexity I break out to writing a node script instead.

And given how rapidly jq scripts acquire complexity, that level is pretty low. One nested lookup, and I’m out.

nonlogical

jq does often feel like a code golf language. I would say it has some of those Perl one-liner vibes, that is to say that it is often a write-only language.

Also, the ‘nearly’ part is because I don’t remember if it has infinite loops or if it is more like Starlark and thus decidable. I do have vague recollections of causing infinite cycles in JQ, so it could quite well be entirely Turing complete.

So far I have not found a single task that JQ was incapable of. And I have abused it pretty badly in my spare time =], for the intellectual challenge.

cryptonector

jq lacks coroutines, which means some tasks can be hard to accomplish in jq. It's still a very powerful language, and it is Turing complete, not just nearly.

wwader

Thanks for the jqjq shoutout! :) I'm quite sure jq is Turing complete: jq (and jqjq!) can implement brainfuck https://github.com/01mf02/jaq/blob/main/examples/bf.jq

nonlogical

Thank you so much for piecing together a great example (jqjq) to help open everyone’s eyes that JQ is not just a JSONpath implementation with weird syntax! I often reference it to drive home the fact that JQ is a full blown language.

The brainfuck one is also gonna be going into my notes. That implementation is quite terse.

wwader

Great to hear, and that was one of my hopes! But honestly it initially came to be because I was fiddling with some jq AST stuff for fq :) Weirdly, it was much easier to implement than I expected. The hardest part was how to handle infix operators (+/- etc.), that is, parsing without infinite recursion. But once I found and managed to implement precedence climbing, things got a lot easier; it's still a bit of magic to me how well it works :) The eval part had some difficulties but was mostly straightforward when you can piggy-back on the "host jq". I tried to stay away from piggy-backing too much, though; not piggy-backing at all probably requires implementing a VM somehow.

BTW you're very welcome to help improve the jq documentation. Some other maintainers and I have been talking about how it probably needs an overhaul to be more approachable and to better document some nice hidden features. Join the Discord if you want!

zmmmmm

For whatever reason jq is one tool that I simply can never remember the syntax for. It's ChatGPT every time for me. I just can't remember the specifics of how it differs from jsonpath vs jmespath (used by AWS) .... I wish there was a way for every tool to just use jsonpath instead.

imp0cat

Gron (+ grep) can also be a handy combination in this case.

hiAndrewQuinn

+1 for gron. Breaking a JSON document like { "foo": { "bar": ["baz", "bop"]}} into a greppable stream like

    json.foo.bar[0] = "baz";
    json.foo.bar[1] = "bop";

and then copying and pasting the results into jq makes the whole iteration loop much much tighter.

pacha--

Not a 1:1 replacement, but I created https://github.com/pacha/cels because I wanted to have a more intuitive way of working with JSON and YAML files

reegnz

JSONpath and jmespath are objectively worse than the jq language though, as you can only query, but jq also allows for very powerful transformations.

As for learning it, it's the same with any tool, key is repetition and regular use.

zmmmmm

I get that it's more powerful, but I almost consider it an anti-feature. I have very good tools for doing transformations (sed, awk, perl, etc.); the problem is they all want a line-oriented format and JSON breaks that. So all I want is a tool to go from json => line oriented and I will do the rest with the vast library of experience I already have at transformations on the command line.

reegnz

> So all I want is a tool to go from json => line oriented and I will do the rest with the vast library of experience I already have at transformations on the command line.

The tool for that is likely https://github.com/tomnomnom/gron. It's probably the best tool to go back and forth easily between JSON and line-oriented text.

cryptonector

`jq -c paths` will show you all the paths in a JSON text. You can totally use jq in a very gron-like manner. Try it.
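
One way to get gron-like output from jq itself, as a rough sketch: pair each scalar's path with its value. The `flatten_json` helper name and the sample document are made up, and the output format only imitates gron rather than matching it.

```shell
# Hypothetical sample document.
doc='{"foo": {"bar": ["baz", "bop"]}}'

# For every non-container value, print "dotted.path = json-value".
flatten_json() {
  jq -r '
    paths(type != "object" and type != "array") as $p
    | "\($p | map(tostring) | join(".")) = \(getpath($p) | tojson)"
  '
}

printf '%s' "$doc" | flatten_json
```

The result is one line per leaf value, so it can be fed straight into grep, sed or awk.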

cryptonector

We need to write a new tutorial.

The first and foremost thing to know about jq is that it's built on path expressions, so the first thing to learn is how to write path expressions. Fortunately path expressions are easy in jq!

  .a    # Get the value of the "a" key
        # in the current input object

  .[0]  # Get the value of the first
        # element in the current input
        # array

  .a[0] # Get the value of the first
        # element in the array at the
        # key named "a" in the current
        # input object.
        #
        # I.e., path expressions chain:

  .a[0].b # Get the value of the "b"
          # key in ...

Things get more interesting when you see that `.[]` is the iterator operator, and that you can use it in path expressions.

Things get really interesting when you see that `select(conditional expression)` can be used in path expressions joined with `|`.

Just this can be very useful. It's also useful to know about the magic `path()` function, and `paths`, which I often use to just list all the paths in an input JSON text. Try applying `jq -c paths` to a `kubectl get -o json pods` command's output!
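
A minimal sketch of those three ideas on a tiny made-up document (the `doc` variable and its contents are only for illustration):

```shell
# Hypothetical sample document.
doc='{"a": [{"b": 1}, {"b": 2}], "c": "x"}'

# Path expression: first element of the array at "a", then its "b" key.
printf '%s' "$doc" | jq '.a[0].b'

# Iterator plus select(): emit each "b" whose value is greater than 1.
printf '%s' "$doc" | jq '.a[] | select(.b > 1) | .b'

# List every path in the document, one compact JSON array per line.
printf '%s' "$doc" | jq -c 'paths'
```

The last command prints paths such as `["a",0,"b"]`, which map directly back to the path expression `.a[0].b`.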

navels

This is my main usage of ChatGPT :-)

Calzifer

One feature I found very useful when using jq in Bash is the "alternative operator" '//'.

  result=$(echo "$data" | jq -r '.optional // ""')
  if [[ -n "$result" ]] ...

feels more natural in Bash than

  result=$(echo "$data" | jq -r '.optional')
  if [[ "$result" != null ]] ...

Especially if an empty field should be handled the same way.

And when using the raw-output option it helps with the ambiguity between "null" and null.
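
A short sketch of that ambiguity, using made-up inputs and a hypothetical `get_optional` helper: with `-r`, a JSON `null` and the string `"null"` print identically, while `//` turns the missing or null case into an empty string. Note that `//` also falls through on `false`, not just `null`.

```shell
# Without the alternative operator, these two inputs print the same text.
echo '{"optional": null}'   | jq -r '.optional'
echo '{"optional": "null"}' | jq -r '.optional'

# Hypothetical helper: empty string when the field is missing, null, or false.
get_optional() {
  jq -r '.optional // ""'
}

echo '{}'                   | get_optional
echo '{"optional": "null"}' | get_optional
```

Only the last call prints `null`, and there it is unambiguous: it can only be the string.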

city41

Since it's a JSON tool, I wish it leaned more on JavaScript syntax. In this case I wish it was || or even better, ??

llimllib

I suspect they didn't use || because it makes parsing easier, given jq's reliance on the pipe operator.

It doesn't use ?? because it predates that operator's introduction into the language by about 8 years

reegnz

Let me drop a link to my jq zsh plug-in: https://github.com/reegnz/jq-zsh-plugin

I find the biggest problem with jq is that the feedback loop is not tight enough. With this jq-repl the expression is evaluated at every keystroke.

rustyminnow

Nice!

Let me piggyback to mention the (neo)vim plugin I use for tightening the loop... https://github.com/phelipetls/vim-jqplay

It's great for building large complex queries that will eventually live in scripts, but your zsh plugin seems to hit a real sweet spot of fast feedback for ad-hoc queries too! Huge props!

reegnz

Yeah, I tend to use that one as well, but for me it just feels 'right' as a line editor plugin. I'm running a lot of kubectl commands and for me this plugin proved to be invaluable.

neuromanser

https://github.com/reegnz/jq-zsh-plugin/blob/e61804e35a593ad...

zshbuiltins(1): Unlike parameter assignment statements, typeset's exit status on an assignment that involves a command substitution does not reflect the exit status of the command substitution. Therefore, to test for an error in a command substitution, separate the declaration of the parameter from its initialization.

reegnz

Fixed: https://github.com/reegnz/jq-zsh-plugin/pull/25

ykonstant

Nice plugin; I got it and will be using it. Browsing the code, I saw a couple of small errors; not too serious, but some error handling is incorrect. In your `jq_complete()` function, for instance, you have

    local query="$(__get_query)"
    local ret=$?

Unless the `local` assignment to `query` fails, `ret` will always be 0 regardless of the return value of `__get_query`. To fix this, you would need your first line to be

    local query; query="$(__get_query)"

and so on.
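
A self-contained sketch of the difference, which should behave the same in bash and zsh. `get_query_stub` is a made-up stand-in for `__get_query` that always fails with status 3.

```shell
# Hypothetical stand-in for __get_query: prints nothing and fails.
get_query_stub() { return 3; }

# Declaration and assignment together: $? reflects `local`, not the stub.
masked_status() {
  local query="$(get_query_stub)"
  local ret=$?
  echo "$ret"
}

# Declaration first, assignment second: the stub's status is preserved.
preserved_status() {
  local query ret=0
  query="$(get_query_stub)" || ret=$?
  echo "$ret"
}

masked_status
preserved_status
```

The first call reports 0 even though the stub failed; the second reports the stub's real status.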

neuromanser

I pointed out exactly this in https://news.ycombinator.com/item?id=38188500

reegnz

Fixed: https://github.com/reegnz/jq-zsh-plugin/pull/25

seanp2k2

Nice, esp. reading Calzifer’s comment above and remembering how many times I’ve cursed the JQ syntax because of quoting issues… Another “trick” I’ve been using: for any non-trivial JQ filter, stick it in a file or at least a heredoc and feed it to JQ using -f for much less quote-escaping malarkey.

bazzargh

nice, but... I'd written something like this (as a program you pipe to, not autocomplete) before, but when there's an error, I try to show the error then the last-good-output. The reason for this is that when you're typing a complex command you want to have the json visible to guide your thinking, just displaying the error hides it.

The way I did this was to store both the last working query and the last working output. I'd only reuse the output if the last working query was a prefix of the current query. That avoids the awkward case where you are deleting letters from the query, so you need an output further back in history (which I didn't store; it wasn't worth the hassle).

Feature request?

reegnz

Thanks for the idea, I've implemented it: https://github.com/reegnz/jq-zsh-plugin/commit/60d3b6fb3ca1b...

reegnz

Nice idea! I'll look into it.

tejtm

Learning jq is great; however, you still need to know something about every new scrap of JSON you feed it.

To that end I wrote a line of jq to emit every structural path from any JSON as a list of jq arguments.

You can use it to make queries or keep track of a document's structure.

https://github.com/TomConlin/json_to_paths

Marazan

For me jq is my epitome of "When faced with a problem a programmer says 'I know I can use X' and now they have two problems"

I continually bounce off the "language/philosophy" of jq in quite embarrassing ways. Every time I go "Ah, I can use this as a reason to learn jq", and half an hour later I've written a Python script to extract the data instead.

marliechiller

x1000 this. I find I apply similar reasoning to awk. I _know_ some people get massive benefits out of using it, I just don't need it often enough to actually pick it up... GPT to the rescue, I suppose.

zwischenzug

I started writing a book on jq, but realised it wasn't really enough for a full book, so put it out as a series of blog posts:

https://zwischenzugs.com/2023/06/27/learn-jq-the-hard-way-pa...

JQ really is the best kept secret in data.

ishandotpage

Oh my - your series of blog posts come up regularly for me when googling jq things.

tarruda

I'm curious about this web page: Did you compile jq targeting WASI/WASM to run an in-browser version?

ishandotpage

I am using https://github.com/biowasm/aioli which provides an already compiled wasm jq along with all the related support code for calling it.

zwischenzug

Haha, well, we should probably talk!

ishandotpage

I'll shoot you an email after work

highmastdon

Great article. Nice to have it interactive. How does it work? Do you have a terminal running somewhere or does it run in the browser?

One thing I noticed, and where I stopped continuing, is that the jump from Filtering Nested Arrays to Flattening Nested JSON Objects, is WAAAY too big. From a simple filter to triple nested filters with keywords that had no introduction in a simpler example, isn’t working for me

jve

Seems he is using something called biowasm aioli: https://github.com/biowasm/aioli

> Aioli is a library for running genomics command-line tools in the browser using WebAssembly. See Who uses biowasm for example use cases.

https://biowasm.com/cdn/v3/jq/1.6

ollybee

Exercism also has a jq track with interactive lessons : https://exercism.org/tracks/jq
