jq reads JSON and applies a filter to it. A filter takes JSON in and produces JSON out. Filters compose with the pipe operator |. The simplest filter is . — identity, returns whatever it receives. Everything else builds on this.
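To make that concrete, a paste-able sanity check (sample data invented for illustration; assumes jq is on your PATH):

```shell
# Identity: jq pretty-prints its input unchanged
echo '{"users":[{"name":"Ann","active":true}]}' | jq '.'

# Everything else is a refinement of this pass-through:
echo '{"users":[{"name":"Ann","active":true}]}' | jq -r '.users[0].name'
# → Ann
```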
- `jq`: the program. Reads stdin (or files), applies the filter, writes to stdout
- `'...'`: single-quoted filter string (prevents shell interpolation)
- `.users[]`: path expression: get the `users` key, iterate the array
- `| select(.active)`: pipe to a filter: keep only elements where `.active` is truthy
- `{name, email}`: object construction: build a new JSON object
Filters can be chained across processes — `curl api | jq '.items[]' | jq 'select(.price > 100)'` — or, better, chained inside one jq invocation with `|`: `curl api | jq '.items[] | select(.price > 100)'`. The pipe is the unit of composition: everything is a transformation of JSON flowing through.
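A minimal demonstration that one filter with an internal pipe matches two chained jq processes (invented sample data):

```shell
data='{"items":[{"price":50},{"price":150}]}'

# Two jq processes:
echo "$data" | jq '.items[]' | jq 'select(.price > 100)'

# One process, pipe inside the filter — same output, cheaper:
echo "$data" | jq '.items[] | select(.price > 100)'
```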
| Flag | Effect |
|---|---|
| -r | Raw output — strings without quotes (for shell use) |
| -c | Compact output — one JSON value per line (NDJSON) |
| -n | Null input — no stdin; use null as input |
| -s | Slurp — read all inputs into one array |
| -R | Raw input — read lines as strings (not JSON) |
| -e | Exit status 1 if output is false/null |
| --arg k v | Bind shell variable as $k string in filter |
| --argjson k v | Bind shell variable as $k JSON in filter |
| --slurpfile k f | Load JSON file as $k array |
| --rawfile k f | Load text file as $k string |
| --args | Remaining args become $ARGS.positional array |
| --tab | Tab-indented output |
| --indent N | N-space indented output (default 2) |
| -f file | Read filter from file instead of argument |
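How the most common flags combine in practice (toy data, not tied to any real API):

```shell
# -r: raw strings for shell consumption
echo '[{"n":"a"},{"n":"b"}]' | jq -r '.[].n'
# → a
# → b

# -c: one compact JSON value per line (NDJSON)
echo '[{"n":"a"},{"n":"b"}]' | jq -c '.[]'
# → {"n":"a"}
# → {"n":"b"}

# -n + --arg: build output with no input at all
jq -cn --arg who world '{greeting: "hello \($who)"}'
# → {"greeting":"hello world"}
```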
```
# ── IDENTITY & RECURSION ──────────────────────────────
.                  # identity — pass input unchanged
..                 # recursive descent — every value in tree
.[]                # iterate — outputs each element separately

# ── OBJECT FIELD ACCESS ───────────────────────────────
.name              # field by name (fails if not object)
.name?             # optional: no error if .name doesn't exist
.["field-name"]    # bracket syntax — needed for special chars
.a.b.c             # chain — equivalent to .a | .b | .c
.a.b?              # optional at any level — suppress errors

# ── ARRAY INDEX & SLICING ─────────────────────────────
.items[0]          # first element
.items[-1]         # last element
.items[2:5]        # slice [2,5) — elements at index 2, 3, 4
.items[2:]         # from index 2 to end
.items[:3]         # first 3 elements
.items[]           # iterate: outputs each element individually

# ── THE COMMA OPERATOR ────────────────────────────────
.a, .b             # outputs .a then .b as separate values
.users[].name, .users[].email   # all names, then all emails

# ── PIPE ──────────────────────────────────────────────
.users[] | .name   # iterate users, get name of each
. | length         # pipe identity into length

# ── OBJECT CONSTRUCTION ───────────────────────────────
{name: .name}      # build new object
{name}             # shorthand: key = name, value = .name
{name: .first, score: .scores[0]}
{(.key): .value}   # dynamic key — key is a computed expression

# ── ARRAY CONSTRUCTION ────────────────────────────────
[.a, .b, .c]       # build array from 3 values
[.items[]]         # collect iterated values into array
[.items[] | .name] # map names → array
[range(5)]         # [0,1,2,3,4]
```
Appending `?` to any expression suppresses errors and produces no output instead. Essential for heterogeneous JSON where not every object has the same keys. Without `?`, accessing a key on a non-object value throws a fatal error and stops processing.
```
# Without ?: blows up if any item isn't an object
.items[].name        # ERROR if any item is a string/number

# With ?: silently skips non-objects
.items[].name?       # safe — only outputs where .name exists

# Try-catch for controlled error handling:
try .name catch "N/A"

# Alternative operator // (like null-coalescing ??):
.name // "unknown"   # if .name is null/false, use "unknown"
.count // 0          # default numeric fallback
.tags // []          # default empty array

# CRITICAL: // is not the same as ?
# .x?              → suppress error, produce nothing
# .x // "default"  → produce "default" if null/false
# .x? // "default" → suppress error AND default if null

# Real-world: safe deep access
.user?.address?.city? // "unknown"
```
```
# jq types: null, boolean, number, string, array, object
type                 # → "object", "array", "string", etc.

# Test type:
. | type == "array"
. | arrays           # pass-through only if input is array
. | objects          # pass-through only if input is object
. | strings          # pass-through only if string
. | numbers          # pass-through only if number
. | booleans         # pass-through only if bool
. | nulls            # pass-through only if null
. | scalars          # pass-through if not array/object
. | iterables        # pass-through if array or object

# Type coercion:
tostring             # any → "string"
tonumber             # "42" → 42
ascii_downcase       # string → lowercase
ascii_upcase         # string → UPPERCASE
```
```
# select(cond): pass through if cond is true, produce nothing if false
# It is the jq equivalent of SQL WHERE.

# Basic:
.items[] | select(.active)
.items[] | select(.price > 100)
.items[] | select(.status == "active")

# String matching:
.items[] | select(.name | startswith("Alice"))
.items[] | select(.email | endswith(".com"))
.items[] | select(.url | contains("github"))
.items[] | select(.name | test("^[Aa]"))    # regex

# Boolean combinators:
.items[] | select(.active and .verified)
.items[] | select(.active or .legacy)
.items[] | select(.active | not)

# Membership test:
.items[] | select(.role | IN("admin","owner"))
.items[] | select(.id | IN($ids[]))   # --argjson ids '[1,2,3]'

# Null guard:
.items[] | select(.name != null)
.items[] | select(.price | numbers)   # type-safe select

# Collect results back into array (select inside []):
[.items[] | select(.price > 100)]
# ↑ equivalent to: .items | map(select(.price > 100))
```
.items[] | select(...) outputs filtered values as separate items. Wrap in [...] to collect: [.items[] | select(...)]. Or use .items | map(select(...)) — identical result, different readability. Choose based on what comes next: piping onward → bare iterate; building a JSON array → wrap in [].
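A quick side-by-side of the three forms, on invented data, so the stream/array distinction is visible:

```shell
data='{"items":[{"p":5},{"p":20}]}'

# Bare iterate: a stream of separate values
echo "$data" | jq -c '.items[] | select(.p > 10)'
# → {"p":20}

# Wrapped in [...]: a single JSON array
echo "$data" | jq -c '[.items[] | select(.p > 10)]'
# → [{"p":20}]

# map(select(...)): identical to the wrapped form
echo "$data" | jq -c '.items | map(select(.p > 10))'
# → [{"p":20}]
```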
```
# Pass shell variables safely with --arg and --argjson:
STATUS="active"
jq --arg status "$STATUS" \
   '[.[] | select(.status == $status)]' data.json

# Numeric comparison with --argjson:
MIN_PRICE=100
jq --argjson min "$MIN_PRICE" \
   '[.[] | select(.price > $min)]' data.json

# Pass an array for IN() test:
IDS='[1,2,3,4]'
jq --argjson ids "$IDS" \
   '[.[] | select(.id | IN($ids[]))]' data.json
```
```
# map(f): apply f to each element of array
# Equivalent to: [.[] | f]
map(.name)             # extract name from each object
map(.price * 1.2)      # multiply each price by 1.2
map(tostring)          # convert each element to string
map(ascii_downcase)    # lowercase each string
map({name: .name, id: .id})   # reshape each object

# map(select(...)): filter an array (most common pattern!)
map(select(.active))       # keep active items
map(select(.price > 10))   # keep items where price > 10

# Chain map and select:
map(select(.active))
| map({name: .name, email: .email})
| sort_by(.name)

# ──────────────────────────────────────────────────────
# map_values(f): apply f to each VALUE in object or array
# For objects: preserves keys, transforms values
# For arrays: same as map()
{a: 1, b: 2, c: 3} | map_values(. * 10)
# → {"a":10,"b":20,"c":30}
map_values(. // "N/A")    # replace nulls with "N/A" in all values
map_values(tostring)      # stringify every value in object
map_values(if . == null then empty else . end)  # remove null values

# del: remove keys
del(.password)                    # remove one key
del(.password, .ssn, .token)      # remove multiple
del(.items[].internal)            # remove key from each item
del(.items[] | select(.active | not))   # remove inactive items
```
```
# |= : update operator. Read as "update with".
# Takes the current value, applies the expression, stores result.
.price |= . * 1.1          # add 10% to price
.name |= ascii_upcase      # uppercase the name
.tags |= . + ["new-tag"]   # append to array
.count |= . + 1            # increment counter
.items[].price |= . * 0.9  # 10% discount on all items

# += shorthand (equivalent to |= . + x):
.count += 1
.tags += ["extra"]
.price *= 1.1
.name //= "anonymous"      # set if null

# Set a key that might not exist:
.meta.processed = true     # creates path if needed
.meta += {ts: "2026-04-05"}   # merge into existing object
```
```
# ── REDUCTION ─────────────────────────────────────────
length               # array: count; string: chars; object: keys
add                  # sum numbers, concat strings/arrays, merge objects
any                  # true if any element is truthy
all                  # true if all elements are truthy
any(. > 10)          # true if any element > 10
all(.active)         # true if all objects have active=true
first                # first element
last                 # last element
first(.items[] | select(.active))   # first active item
nth(2; .items[])     # 3rd item (0-indexed)
min, max             # min/max of number array
min_by(.price)       # object with lowest price
max_by(.score)       # object with highest score

# ── SORTING ───────────────────────────────────────────
sort                 # sort array of scalars
sort_by(.name)       # sort objects by field
sort_by(.name, .age) # multi-key sort
reverse              # reverse array
sort_by(.price) | reverse   # descending sort

# ── DEDUPLICATION ─────────────────────────────────────
unique               # deduplicate (sorts first)
unique_by(.id)       # deduplicate by field (keep first seen)
unique_by(.email | ascii_downcase)  # case-insensitive dedup

# ── SET & SEQUENCE OPERATIONS ─────────────────────────
a | inside(b)        # true if a is contained in b
contains([1,2])      # true if array contains [1,2] as subset
flatten              # fully flatten nested arrays
flatten(1)           # flatten one level deep only
indices("x")         # array of indices where "x" appears
index("x")           # first index of "x"
rindex("x")          # last index
transpose            # [[a,b],[c,d]] → [[a,c],[b,d]]
                     # (jq has no zip builtin — transpose does that job)
combinations         # cartesian product: [[a,b],[c,d]] → [a,c],[a,d],...
range(5)             # 0,1,2,3,4 (as separate outputs)
range(2;10;2)        # 2,4,6,8 (start;end;step)
[range(5)]           # collect into array: [0,1,2,3,4]
```
group_by(.field) sorts the array then groups into sub-arrays of objects with the same field value. It returns an array of arrays. Combine with map to aggregate each group.
```
# group_by(.field): sort + partition by field value
group_by(.status)
# → [[{status:"a",...},{status:"a",...}],[{status:"b",...}]]

# Count per group (like COUNT(*) GROUP BY):
group_by(.status) | map({
  status: first.status,
  count: length
})

# Sum per group (like SUM(amount) GROUP BY status):
group_by(.status) | map({
  status: first.status,
  total: (map(.amount) | add),
  avg: ((map(.amount) | add) / length)
})

# Multi-key grouping:
group_by([.status, .region])
```
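A worked run on invented data. Note the output order: `group_by` sorts by the key first, so `"a"` comes out before `"b"` even though the input starts with `"b"`:

```shell
echo '[{"s":"b","n":2},{"s":"a","n":1},{"s":"a","n":3}]' |
  jq -c 'group_by(.s) | map({s: first.s, total: (map(.n) | add)})'
# → [{"s":"a","total":4},{"s":"b","total":2}]
```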
```
# ── INTROSPECTION ─────────────────────────────────────
keys                 # sorted array of object keys
keys_unsorted        # keys in insertion order
[.[]]                # array of the object's values
                     # (the builtin `values` is select(. != null),
                     #  NOT an object-values accessor)
has("key")           # true if key exists (even if value is null)
in({a:1})            # true if input key exists in given object
length               # number of keys in object

# ── MERGING ───────────────────────────────────────────
{a:1} + {b:2}        # → {a:1, b:2} (right overwrites left)
. + {extra: "field"} # add field to existing object

# ── to_entries / from_entries / with_entries ──────────
# to_entries: object → [{key,value}]
{a:1,b:2} | to_entries
# → [{"key":"a","value":1},{"key":"b","value":2}]

# from_entries: [{key,value}] → object
[{key:"x",value:99}] | from_entries   # → {"x": 99}
# Also accepts {name,value} and {k,v} instead of {key,value}

# with_entries(f): to_entries | map(f) | from_entries
# The POWER move: transform keys or values of an object
with_entries(.key |= ascii_upcase)     # all keys uppercased
with_entries(.value |= tostring)       # all values stringified
with_entries(select(.value != null))   # remove null-valued keys
with_entries(select(.key | startswith("_") | not))
# → remove all keys starting with underscore (strip private fields)

# Rename a key:
with_entries(if .key == "old_name" then .key = "new_name" else . end)
```
with_entries(f) is jq's secret weapon for object manipulation. It converts an object to key-value pairs, applies a filter to each pair, then rebuilds. This lets you filter, rename, and transform keys AND values simultaneously — in one expression. Most jq beginners use to_entries | map(...) | from_entries; power users use with_entries.
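A one-pass filter-and-rename on invented data: drop underscore-prefixed keys and uppercase the survivors in a single `with_entries`:

```shell
echo '{"_token":"x","name":"ann"}' |
  jq -c 'with_entries(select(.key | startswith("_") | not) | .key |= ascii_upcase)'
# → {"NAME":"ann"}
```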
```
# Build an object from an array (INDEX — like a hash map):
INDEX(.items[]; .id)     # → {"id1":{...}, "id2":{...}} — keyed by id
INDEX(.items[]; .email)  # lookup table: O(1) access by email

# IN() — membership test (complement to INDEX):
.id | IN($ids[])         # true if .id appears in the $ids array

# Zip two arrays into key-value pairs:
["a","b"] as $keys | [1,2] as $vals
| [$keys, $vals] | transpose
| map({key: .[0], value: .[1]})
| from_entries
# → {"a":1,"b":2}
```
```
# Basic if/then/else
if .active then "active" else "inactive" end
# (since jq 1.6 the else branch is optional; omitting it passes . through)

# Multiple branches with elif:
if   .score >= 90 then "A"
elif .score >= 80 then "B"
elif .score >= 70 then "C"
else "F" end

# Inside map/select:
map(.price |= if . > 100 then . * 0.9 else . end)
# 10% discount on items over $100

# jq has NO switch/case — use if/elif chains
# Or use a lookup object:
{ "pending": 0, "active": 1, "cancelled": -1 }[.status] // 99
# lookup with default

# empty — produce no output (useful in conditionals):
if .active then . else empty end   # equivalent to: select(.active)

# error — throw a custom error:
if .id | not then error("id required") else . end
```
```
# try/catch: handle errors gracefully
try .name catch null        # null on error
try tonumber catch 0        # 0 if not a number
try (. | fromjson) catch .  # return original if not valid JSON

# The error message is available in catch:
try .a.b.c catch {error: ., input: $__loc__}

# try without catch: suppress all errors (like ? but for expressions)
try (.data | fromjson | .result)

# Practical: parse JSON strings safely in a stream:
.events[] | try (.payload | fromjson)
# silently skips events with non-JSON payloads
```
```
# .. recursive descent: outputs EVERY value in the tree
# (all scalars, all sub-objects, all sub-arrays)

# Find all strings anywhere in a deep JSON tree:
.. | strings

# Find all numbers:
.. | numbers

# Find any value matching a condition (anywhere in tree):
.. | objects | select(has("error"))
# → find all objects at any depth that have an "error" key

# Extract all values for a key anywhere in deep JSON:
.. | objects | .id?
# → all "id" values, at any nesting depth

# Extract all error messages from a complex API response:
.. | strings | select(test("error|Error|ERROR"))

# walk(f): applies f bottom-up to every node in tree
# Useful for: type coercion, normalization, transformation

# Stringify all numbers in any JSON structure:
walk(if type == "number" then tostring else . end)

# Remove all null values recursively:
walk(
  if type == "object"
  then with_entries(select(.value != null))
  else . end
)

# Normalize all keys to snake_case (simplified):
walk(
  if type == "object" then
    with_entries(
      .key |= (gsub("(?<=.)(?=[A-Z])"; "_") | ascii_downcase)
    )
  else . end
)
```
```
# paths: emit the path array of every value (containers included)
{a:{b:1},c:2} | paths
# → ["a"]
# → ["a","b"]
# → ["c"]

# paths(filter): only paths to values matching filter
paths(numbers)         # paths to numeric values only
paths(strings)         # paths to string values
paths(type == "null")  # paths to null values

# leaf_paths: paths to scalar (non-container) values
leaf_paths

# getpath / setpath — dynamic path access:
getpath(["a","b"])        # same as .a.b
setpath(["a","b"]; 99)    # set .a.b = 99
delpaths([["a","b"]])     # delete .a.b

# Dynamic path from string — parse "a.b.c" into ["a","b","c"]:
"user.address.city" | split(".") as $p | $data | getpath($p)
# (getpath takes ONE argument, the path array, applied to its input)

# Find and replace at all matching paths:
. as $doc
| reduce paths(type == "string") as $p
    ($doc; setpath($p; getpath($p) | ascii_downcase))
# Lowercase every string value in entire document
```
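A tiny demonstration of `paths`/`getpath` on invented data — find where the strings live, then read one back by path:

```shell
echo '{"a":{"b":"X"},"c":1}' | jq -c '[paths(type == "string")]'
# → [["a","b"]]

echo '{"a":{"b":"X"},"c":1}' | jq -r 'getpath(["a","b"])'
# → X
```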
env returns the entire process environment as a JSON object. env.HOME gives $HOME. $ENV is an alias. This lets jq filters read env vars without needing --arg, which is powerful for config-driven filters in scripts.
```
env.HOME             # → "/home/alice"
env.OPENAI_API_KEY   # → "sk-proj-..."
$ENV.LOG_LEVEL       # same as env.LOG_LEVEL

# Filter only keys matching a prefix:
env | with_entries(select(.key | startswith("AWS_")))
```
```
# ── INTERPOLATION ─────────────────────────────────────
"Hello, \(.name)!"             # string interpolation with \(expr)
"\(.first) \(.last)"           # combine fields
"Score: \(.score*100|floor)%"  # expressions inside
"\(.items|length) items found"

# ── BASIC OPERATIONS ──────────────────────────────────
length                # character count
ltrimstr("prefix")    # remove prefix if present
rtrimstr(".json")     # remove suffix if present
startswith("http")    # boolean
endswith(".com")      # boolean
ascii_downcase        # lowercase
ascii_upcase          # UPPERCASE
explode               # string → [codepoints]
implode               # [codepoints] → string
split(",")            # → array of strings
join(",")             # array → join with separator

# ── REGEX (PCRE) ──────────────────────────────────────
test("pattern")         # boolean: does string match?
test("pattern"; "gi")   # flags: g=global, i=ignorecase, x=extended
match("(\\w+)@(\\w+)")  # first match with captures
capture("(?<user>\\w+)@(?<domain>\\w+)")
# → {"user":"alice","domain":"example"}
scan("\\d+")            # all non-overlapping matches
sub("foo"; "bar")       # replace first
gsub("foo"; "bar")      # replace all
gsub("(?<n>\\d+)"; (.n|tonumber*2|tostring))
# Double every number in a string — captures as named group
splits("[,;|]")         # split on any of: comma, semicolon, pipe
scan("\\b\\w+\\b")      # extract all words
```
```
# Format strings: @format converters, also usable on interpolations

# @base64 — encode / decode:
"hello world" | @base64    # → "aGVsbG8gd29ybGQ="
"aGVsbG8=" | @base64d      # → "hello"

# @uri — URL encode:
"hello world/foo" | @uri   # → "hello%20world%2Ffoo"
@uri "\(.name)/\(.id)"     # encode inside a template

# @html — escape HTML:
"<b>bold</b>" | @html      # → "&lt;b&gt;bold&lt;/b&gt;"

# @csv — format array as CSV line:
["Alice",32,true] | @csv   # → "\"Alice\",32,true"
# Combine with map for multi-line CSV:
.users[] | [.name,.age,.email] | @csv

# @tsv — tab-separated (better for shell pipelines):
["Alice",32] | @tsv        # → "Alice\t32"

# @json — embed JSON inside a string:
{a:1} | @json              # → "{\"a\":1}"
# Useful for putting JSON in a JSON string field:
{payload: (.data | @json)}

# @sh — shell-safe quoting:
"user's data" | @sh        # → "'user'\\''s data'"
# Use with -r to produce safe shell arguments:
# jq -r '.[] | @sh' | xargs curl

# @text — identity (default format)

# FORMAT INTERPOLATION — combine with \():
@uri "https://api.com/user/\(.id)?key=\(.key)"
# → URL with both values properly encoded
@html "<td>\(.name)</td>"
# → HTML with name safely escaped
```
```
# reduce expr as $var (init; update):
# Like Array.reduce() in JS or functools.reduce() in Python.
# Most powerful when add/map aren't expressive enough.

# Sum an array (same as add, but explicit):
reduce .[] as $x (0; . + $x)

# Build an object from an array:
reduce .items[] as $item ({};
  . + {($item.id|tostring): $item.name}
)
# → {"1":"Alice","2":"Bob",...}

# Running totals (build array of cumulative sums):
reduce .values[] as $v ([[],0];
  [first + [last + $v], last + $v]
) | first
# → [v1, v1+v2, v1+v2+v3, ...]

# Word frequency count from array of strings:
reduce .words[] as $w ({}; .[$w] += 1)

# Merge array of objects, later values win:
reduce .patches[] as $p (.base; . * $p)
# * (multiply) on objects = recursive merge

# Custom deep merge (objects: recurse, arrays: concatenate;
# note plain $a + $b is SHALLOW — right wins wholesale):
def merge($a; $b):
  if ($a|type) == "object" and ($b|type) == "object" then
    reduce ($b | keys_unsorted[]) as $k
      ($a; .[$k] = merge($a[$k]; $b[$k]))
  elif ($a|type) == "array" and ($b|type) == "array" then
    $a + $b
  else $b end;
```
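The word-frequency reduce above, run end to end on invented input (`.[$w] += 1` works because `null + 1` is `1` on the first sighting):

```shell
echo '{"words":["a","b","a"]}' |
  jq -c 'reduce .words[] as $w ({}; .[$w] += 1)'
# → {"a":2,"b":1}
```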
```
# limit(n; expr): take first n outputs of expr
limit(5; .items[])                    # first 5 items
limit(1; .items[] | select(.active))  # first active
first(.items[] | select(.active))     # same as limit(1; ...)

# until(cond; update): loop until condition true
1 | until(. > 100; . * 2)    # 1→2→4→8→16→32→64→128

# while(cond; update): emit values while condition true
1 | while(. < 100; . * 2)    # 1,2,4,8,16,32,64
[1 | while(. < 100; . * 2)]  # collect: [1,2,4,8,16,32,64]

# foreach expr as $x (init; update; extract):
# like reduce but emits intermediate values
foreach .events[] as $e (
  {count:0, total:0};                              # init
  {count: (.count+1), total: (.total+$e.amount)};  # update
  {running_avg: (.total/.count)}                   # extract (emitted each step)
)
# Outputs running average after each event — useful for streaming

# label-break: early exit from infinite generators
label $out
| foreach range(infinite) as $i (0;
    . + $i;
    if . > 100 then ., break $out else empty end
  )
# → 105 — first triangular number > 100 (0+1+2+...+14)

# infinite: generates 0,1,2,... forever (needs limit or break)
first(range(1; infinite) | select(. % 7 == 0 and . % 11 == 0))
# → 77 — first positive multiple of both 7 and 11
#   (range(infinite) starts at 0, which trivially matches)
```
```
# def name: body;
# Functions are defined with 'def', end with ';'
# They take the input via . (like all jq filters)

# Zero-argument function (operates on .):
def double: . * 2;
5 | double            # → 10

# With arguments (separated by semicolons):
def clamp($min; $max):
  if . < $min then $min
  elif . > $max then $max
  else . end;
150 | clamp(0; 100)   # → 100

# Function argument can be a FILTER (higher-order):
def apply_twice(f): f | f;
3 | apply_twice(.*2)  # → 12 (3→6→12)

# Recursion with def:
def flatten_keys($prefix):
  if type == "object" then
    to_entries[] | .key as $k
    | (.value | flatten_keys($prefix + "." + $k))
  else {($prefix): .} end;

# Built-in recursive: .[] | recurse
recurse                   # same as ..
recurse(.children[]?)     # walk only .children
recurse(.children[]?; length > 0)   # with a condition guard
```
```
# 'as' binding — name a value for use in an expression:
.price as $p | {original: $p, discounted: ($p*0.9)}

# Array destructuring:
.coords as [$x, $y] | "(\($x),\($y))"

# Object destructuring:
. as {name: $n, age: $a} | "\($n) is \($a)"

# $__loc__ — current file + line number (debugging):
$__loc__              # → {"file":"<stdin>","line":1}

# Math functions:
sqrt                  # square root
floor, ceil, round    # rounding
fabs                  # absolute value
pow(.; 2)             # . squared
log, log2, log10      # logarithms
exp, exp2, exp10      # exponentials
nan, infinite         # special float values
isinfinite, isnan, isnormal     # float tests
significand, exponent, frexp    # IEEE 754 decomposition

# now — current unix timestamp:
now                          # → 1743811200.0
now | todate                 # → "2026-04-05T00:00:00Z"
now | strftime("%Y-%m-%d")
"2026-01-15" | strptime("%Y-%m-%d") | mktime
# → unix timestamp of that date

# Read a library file with import / include:
# jq -L ~/jq-lib -r 'import "utils" as U; U::flatten(.)'
```
```
# ── PAGINATED API ─────────────────────────────────────
# Response: {"data":[...],"meta":{"next_cursor":"abc"}}
curl api.com/users | jq '
  .data
  | map(select(.active))
  | map({id, name, email: (.email | ascii_downcase)})
  | sort_by(.name)
'

# Extract next cursor for shell loop:
CURSOR=$(curl api.com/users | jq -r '.meta.next_cursor // ""')

# ── GITHUB API ────────────────────────────────────────
# List open PRs with reviewer counts:
gh pr list --json number,title,reviewRequests,state |
  jq '[.[] | select(.state == "OPEN") | {
    pr: .number,
    title: .title,
    reviewers: (.reviewRequests | length)
  }] | sort_by(.reviewers) | reverse'

# ── KUBERNETES ────────────────────────────────────────
kubectl get pods -o json | jq '
  .items[]
  | select(.status.phase != "Running")
  | {
      name: .metadata.name,
      phase: .status.phase,
      reason: (.status.containerStatuses[0].state | to_entries[0].key)
    }
'

# All pod resource limits:
kubectl get pods -o json | jq '
  [.items[]
   | .metadata.name as $pod
   | .spec.containers[]
   | {
       pod: $pod,
       container: .name,
       cpu: (.resources.limits.cpu // "none"),
       memory: (.resources.limits.memory // "none")
     }]
'

# ── DOCKER ────────────────────────────────────────────
docker inspect my-container | jq '
  .[0] | {
    id: .Id[:12],
    image: .Config.Image,
    status: .State.Status,
    ports: (.NetworkSettings.Ports | to_entries
            | map("\(.key) → \(.value[0].HostPort // "none")")),
    envs: (.Config.Env | map(split("=") | {(.[0]): .[1]}) | add)
  }
'
```
```
# AWS EC2: find instances with no Name tag
aws ec2 describe-instances | jq '
  .Reservations[].Instances[]
  | select((.Tags // []) | map(.Key) | contains(["Name"]) | not)
  | {id: .InstanceId, type: .InstanceType, state: .State.Name}
'
# (.Tags // [] guards against instances with no Tags key at all)

# Count running instances by type:
aws ec2 describe-instances | jq '
  [.Reservations[].Instances[]
   | select(.State.Name == "running")
   | .InstanceType]
  | group_by(.)
  | map({type: first, count: length})
  | sort_by(.count) | reverse
'

# CloudWatch: find all alarms in ALARM state
aws cloudwatch describe-alarms | jq '
  [.MetricAlarms[]
   | select(.StateValue == "ALARM")
   | {name: .AlarmName, metric: .MetricName, reason: .StateReason}]
'

# Transform for Slack notification:
aws cloudwatch describe-alarms | jq -r '
  .MetricAlarms[]
  | select(.StateValue == "ALARM")
  | "🚨 \(.AlarmName): \(.StateReason)"
'
```
```
# OpenAI chat completion — extract just the text:
curl https://api.openai.com/v1/chat/completions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{"model":"gpt-4o","messages":[{"role":"user","content":"Hi"}]}' |
  jq -r '.choices[0].message.content'

# Extract all tool call arguments:
jq '
  .choices[0].message.tool_calls // []
  | map({
      fn: .function.name,
      args: (.function.arguments | fromjson)
    })
' response.json

# Anthropic Claude — extract text from content blocks:
jq -r '.content[] | select(.type == "text") | .text' claude_response.json

# Parse structured output (JSON mode response):
jq '
  .choices[0].message.content
  | fromjson              # parse the JSON string
  | {
      sentiment: .sentiment,
      score: .confidence_score,
      entities: [.entities[] | select(.type == "PERSON") | .text]
    }
' response.json

# Token usage analytics across many completions:
cat responses/*.json | jq -s '
  {
    total_calls: length,
    total_prompt: (map(.usage.prompt_tokens) | add),
    total_completion: (map(.usage.completion_tokens) | add),
    avg_completion: (map(.usage.completion_tokens) | add / length | round),
    models: ([.[].model] | group_by(.) | map({model: first, calls: length}))
  }
'

# Extract all function calls from a multi-step agent run log:
cat agent_log.ndjson | jq -c '
  select(.type == "tool_call")
  | {ts: .timestamp, fn: .function_name, args: .arguments, ms: .duration_ms}
'
```
fromjson parses a JSON string into a jq value. tojson serializes a jq value to a JSON string. These two builtins are essential for working with LLM responses that use JSON mode.
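A round trip on invented data — a JSON string field parsed into a real value, and a value serialized back into a string:

```shell
# Parse a stringified payload:
printf '%s' '{"payload":"{\"ok\":true,\"n\":2}"}' | jq -c '.payload | fromjson'
# → {"ok":true,"n":2}

# Serialize back into a string field:
echo '{"ok":true}' | jq -c '{wrapped: tojson}'
# → {"wrapped":"{\"ok\":true}"}
```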
```
# Harvest results from 8 parallel agents running in tmux panes.
# Each agent writes a result.json. Aggregate across all.
cat results/agent_*.json | jq -s '
  {
    agents_run: length,
    agents_success: (map(select(.status == "success")) | length),
    agents_failed: (map(select(.status == "error")) | length),
    total_tokens: (map(.token_count // 0) | add),
    avg_duration_s: (map(.duration_s) | add / length | round),
    outputs: map(select(.status == "success") | .result),
    errors: map(select(.status == "error") | {id: .agent_id, err: .error})
  }
'

# Build a markdown report from agent results:
cat results/agent_*.json | jq -rs '
  "# Agent Run Report\n\n"
  + "Total: \(length) agents\n\n"
  + (map("## Agent \(.agent_id)\n\(.result)\n") | join("\n"))
'
```
```
# NDJSON: one JSON object per line. jq processes each line.
# No -s flag = streaming mode (each line processed independently)

# Kafka consumer → jq pipeline (kcat/kafkacat):
kcat -b broker:9092 -t orders -C -o beginning | jq -c '
  select(.status == "COMPLETED")
  | {order_id, customer_id, amount, ts: .event_ts}
'

# Count events by type from live Kafka stream:
kcat -b broker:9092 -t events -C -e |
  jq -r '.event_type' | sort | uniq -c | sort -rn

# Process a large NDJSON file in two passes:
jq -c 'select(.amount > 1000)' events.ndjson |
  jq -s 'group_by(.customer_id) | map({cid: first.customer_id, n: length})'

# --stream: event-based parser — process huge JSON without loading it all.
# It emits [path, value] pairs; reassemble the elements of a giant
# top-level array with fromstream + truncate_stream:
jq -cn --stream 'fromstream(1 | truncate_stream(inputs))' giant_array.json
# → each element of the top-level array, one per line

# JSON log mining — parse structured logs:
tail -f /var/log/app.log | jq -cr '
  select(.level == "error")
  | "\(.timestamp) [\(.service)] \(.message)"
'

# -R flag: read raw lines (non-JSON), then parse:
tail -f mixed.log | jq -Rc 'try fromjson catch {raw: ., parsed: false}'
# Tries to parse each line as JSON, falls back to {raw: line}
```
jq -s loads ALL NDJSON lines into a single array — convenient but memory-hungry for large files. For files >1GB, pipe through jq twice: first pass per-line selection with -c, second pass slurp the filtered smaller set. Or use --stream for true constant-memory streaming of massive JSON arrays.
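The two-pass pattern in miniature (toy file under /tmp, invented amounts): the first jq filters line by line in constant memory, the second slurps only the survivors:

```shell
printf '%s\n' '{"amount":5}' '{"amount":2000}' '{"amount":3000}' > /tmp/events.ndjson

# Pass 1 streams; pass 2 aggregates the much smaller result:
jq -c 'select(.amount > 1000)' /tmp/events.ndjson | jq -s 'length'
# → 2
```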
```
# -s: slurp all inputs into one array, then aggregate
cat events.ndjson | jq -sc '
  {
    total: length,
    errors: (map(select(.level == "error")) | length),
    p50_ms: (sort_by(.duration_ms) | .[length/2 | floor].duration_ms),
    p99_ms: (sort_by(.duration_ms) | .[(length * 0.99) | floor].duration_ms)
  }
'

# Multiple array files — -s yields [[...],[...],[...]]; add concatenates:
jq -s 'add | group_by(.region) | map({region: first.region, n: length})' \
  jan.json feb.json mar.json
```
```
# -r flag: raw output (no quotes on strings)
# Essential when piping jq output to shell commands

# Loop over jq array output:
jq -r '.users[].id' data.json | while read -r id; do
  curl -s "api.com/user/$id" >> results.json
done

# xargs — parallel downloads:
jq -r '.repos[].clone_url' gh.json | xargs -P 4 -I {} git clone {}

# Pass multiple fields as tab-separated to while read:
jq -r '.[] | [.id, .name, .email] | @tsv' users.json |
  while IFS=$'\t' read -r id name email; do
    echo "Processing $name ($id) at $email"
  done

# Use jq as a conditional in bash (-e exits 1 if null/false):
if curl api.com/status | jq -e '.healthy' >/dev/null; then
  echo "Service is healthy"
else
  echo "Service is DOWN"
fi

# Build JSON payload for a POST request:
PAYLOAD=$(jq -cn \
  --arg name "Alice" \
  --arg email "alice@example.com" \
  --argjson active true \
  --argjson score 42 \
  '{name: $name, email: $email, active: $active, score: $score}'
)
curl -XPOST -d "$PAYLOAD" api.com/users

# Update a specific field in a JSON file in-place:
jq '.config.debug = true' config.json | sponge config.json
# sponge (from moreutils) avoids the redirect-to-same-file issue
# Alternative: jq ... config.json > /tmp/tmp.json && mv /tmp/tmp.json config.json
```
jq -n (null input) lets you build JSON from scratch without needing input. Combine with --arg (string) and --argjson (typed JSON value) to safely embed shell variables into JSON. Never use string interpolation to build JSON — it breaks on quotes and special characters.
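A small demonstration of why `--arg` beats interpolation (invented value): embedded quotes come out correctly escaped instead of producing invalid JSON:

```shell
msg='She said "hi"'
jq -cn --arg m "$msg" '{note: $m}'
# → {"note":"She said \"hi\""}
```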
```
# Merge two JSON config files (right overrides left):
jq -s '.[0] * .[1]' defaults.json overrides.json

# Convert between formats — JSON to env vars:
jq -r 'to_entries | .[] | "\(.key | ascii_upcase)=\(.value)"' \
  config.json >> .env

# .env file to JSON:
cat .env | jq -Rn '
  [inputs
   | select(length > 0 and (startswith("#") | not))
   | split("=")
   | {(.[0]): (.[1:] | join("="))}]
  | add
'
# (parens around `startswith("#") | not` matter:
#  `|` binds looser than `and`, so without them empty lines slip through)
```
```
# ── MULTIPLE INPUT FILES ──────────────────────────────
# jq processes each file as a separate input by default
jq '.name' alice.json bob.json carol.json
# → "Alice"
# → "Bob"
# → "Carol"

# -s slurps all files into one array:
jq -s 'map(.name)' alice.json bob.json carol.json
# → ["Alice","Bob","Carol"]

# Per-file stats — use -n + inputs so input_filename tracks each file
# (with -s, input_filename reports only the last file read):
jq -n '
  [inputs | {file: input_filename, count: (.items | length)}]
  | sort_by(.count) | reverse
' data/*.json

# input_filename: built-in, gives current file path
jq '{file: input_filename, keys: keys}' *.json

# input / inputs — explicit iteration over files
jq -n '[inputs | select(.active) | .name]' users/*.json
# -n + inputs: process lazily without loading everything at once
# inputs = the remaining inputs (after -n consumed "nothing")

# ── IN-PLACE EDIT PATTERNS ────────────────────────────
# Pattern 1: sponge (moreutils package)
jq '.version |= . + 1' package.json | sponge package.json

# Pattern 2: temp file
jq '.version |= . + 1' package.json > /tmp/pkg.json \
  && mv /tmp/pkg.json package.json

# Pattern 3: process all JSON files in-place
for f in configs/*.json; do
  jq '.environment = "production"' "$f" | sponge "$f"
done

# Validate all JSON files (exit 1 on any invalid):
for f in **/*.json; do
  jq -e '.' "$f" >/dev/null || echo "INVALID: $f"
done

# Convert JSON array file to NDJSON:
jq -c '.[]' array.json > stream.ndjson

# Convert NDJSON to JSON array:
jq -sc '.' stream.ndjson > array.json

# Sort + dedup a JSON array file:
jq 'unique_by(.id) | sort_by(.created_at)' input.json | sponge input.json
```
| Task | jq | Python |
|---|---|---|
| One-liner extraction | ✓ Perfect | Overkill |
| Shell pipeline integration | ✓ Native | Awkward |
| Stream large files | ✓ --stream | Possible but verbose |
| Complex business logic | Gets hard | ✓ Better |
| External API calls | No | ✓ Yes |
| Multiple data sources | Limited | ✓ Natural |
| Regex + transforms | ✓ PCRE native | re module |
| REPL exploration | OK (jqplay.org) | ✓ IPython |
| No dependencies needed | ✓ Single binary | pip install... |