jq reads JSON and applies a filter to it. A filter takes JSON in and produces JSON out. Filters compose with the pipe operator |. The simplest filter is . — identity, returns whatever it receives. Everything else builds on this.
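To make that concrete, a paste-able sanity check (sample data invented for illustration; assumes jq is on your PATH):

```shell
# Identity: jq pretty-prints its input unchanged
echo '{"users":[{"name":"Ann","active":true}]}' | jq '.'

# Everything else is a refinement of this pass-through:
echo '{"users":[{"name":"Ann","active":true}]}' | jq -r '.users[0].name'
# → Ann
```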
- `jq`: the program. Reads stdin (or files), applies the filter, writes to stdout
- `'...'`: single-quoted filter string (prevents shell interpolation)
- `.users[]`: path expression: get the `users` key, iterate the array
- `| select(.active)`: pipe to a filter: keep only elements where `.active` is truthy
- `{name, email}`: object construction: build a new JSON object
Filters can be chained across processes — `curl api | jq '.items[]' | jq 'select(.price > 100)'` — or, better, chained inside one jq invocation with `|`: `curl api | jq '.items[] | select(.price > 100)'`. The pipe is the unit of composition: everything is a transformation of JSON flowing through.
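A minimal demonstration that one filter with an internal pipe matches two chained jq processes (invented sample data):

```shell
data='{"items":[{"price":50},{"price":150}]}'

# Two jq processes:
echo "$data" | jq '.items[]' | jq 'select(.price > 100)'

# One process, pipe inside the filter — same output, cheaper:
echo "$data" | jq '.items[] | select(.price > 100)'
```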
| Flag | Effect |
|---|---|
| -r | Raw output — strings without quotes (for shell use) |
| -c | Compact output — one JSON value per line (NDJSON) |
| -n | Null input — no stdin; use null as input |
| -s | Slurp — read all inputs into one array |
| -R | Raw input — read lines as strings (not JSON) |
| -e | Exit status 1 if output is false/null |
| --arg k v | Bind shell variable as $k string in filter |
| --argjson k v | Bind shell variable as $k JSON in filter |
| --slurpfile k f | Load JSON file as $k array |
| --rawfile k f | Load text file as $k string |
| --args | Remaining args become $ARGS.positional array |
| --tab | Tab-indented output |
| --indent N | N-space indented output (default 2) |
| -f file | Read filter from file instead of argument |
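How the most common flags combine in practice (toy data, not tied to any real API):

```shell
# -r: raw strings for shell consumption
echo '[{"n":"a"},{"n":"b"}]' | jq -r '.[].n'
# → a
# → b

# -c: one compact JSON value per line (NDJSON)
echo '[{"n":"a"},{"n":"b"}]' | jq -c '.[]'
# → {"n":"a"}
# → {"n":"b"}

# -n + --arg: build output with no input at all
jq -cn --arg who world '{greeting: "hello \($who)"}'
# → {"greeting":"hello world"}
```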
```
# ── IDENTITY & RECURSION ──────────────────────────────
.                  # identity — pass input unchanged
..                 # recursive descent — every value in tree
.[]                # iterate — outputs each element separately

# ── OBJECT FIELD ACCESS ───────────────────────────────
.name              # field by name (fails if not object)
.name?             # optional: no error if .name doesn't exist
.["field-name"]    # bracket syntax — needed for special chars
.a.b.c             # chain — equivalent to .a | .b | .c
.a.b?              # optional at any level — suppress errors

# ── ARRAY INDEX & SLICING ─────────────────────────────
.items[0]          # first element
.items[-1]         # last element
.items[2:5]        # slice [2,5) — elements at index 2, 3, 4
.items[2:]         # from index 2 to end
.items[:3]         # first 3 elements
.items[]           # iterate: outputs each element individually

# ── THE COMMA OPERATOR ────────────────────────────────
.a, .b             # outputs .a then .b as separate values
.users[].name, .users[].email   # all names, then all emails

# ── PIPE ──────────────────────────────────────────────
.users[] | .name   # iterate users, get name of each
. | length         # pipe identity into length

# ── OBJECT CONSTRUCTION ───────────────────────────────
{name: .name}      # build new object
{name}             # shorthand: key = name, value = .name
{name: .first, score: .scores[0]}
{(.key): .value}   # dynamic key — key is a computed expression

# ── ARRAY CONSTRUCTION ────────────────────────────────
[.a, .b, .c]       # build array from 3 values
[.items[]]         # collect iterated values into array
[.items[] | .name] # map names → array
[range(5)]         # [0,1,2,3,4]
```
Appending `?` to any expression suppresses errors and produces no output instead. Essential for heterogeneous JSON where not every object has the same keys. Without `?`, accessing a key on a non-object value throws a fatal error and stops processing.
```
# Without ?: blows up if any item isn't an object
.items[].name        # ERROR if any item is a string/number

# With ?: silently skips non-objects
.items[].name?       # safe — only outputs where .name exists

# Try-catch for controlled error handling:
try .name catch "N/A"

# Alternative operator // (like null-coalescing ??):
.name // "unknown"   # if .name is null/false, use "unknown"
.count // 0          # default numeric fallback
.tags // []          # default empty array

# CRITICAL: // is not the same as ?
# .x?              → suppress error, produce nothing
# .x // "default"  → produce "default" if null/false
# .x? // "default" → suppress error AND default if null

# Real-world: safe deep access
.user?.address?.city? // "unknown"
```
```
# jq types: null, boolean, number, string, array, object
type                 # → "object", "array", "string", etc.

# Test type:
. | type == "array"
. | arrays           # pass-through only if input is array
. | objects          # pass-through only if input is object
. | strings          # pass-through only if string
. | numbers          # pass-through only if number
. | booleans         # pass-through only if bool
. | nulls            # pass-through only if null
. | scalars          # pass-through if not array/object
. | iterables        # pass-through if array or object

# Type coercion:
tostring             # any → "string"
tonumber             # "42" → 42
ascii_downcase       # string → lowercase
ascii_upcase         # string → UPPERCASE
```
```
# select(cond): pass through if cond is true, produce nothing if false
# It is the jq equivalent of SQL WHERE.

# Basic:
.items[] | select(.active)
.items[] | select(.price > 100)
.items[] | select(.status == "active")

# String matching:
.items[] | select(.name | startswith("Alice"))
.items[] | select(.email | endswith(".com"))
.items[] | select(.url | contains("github"))
.items[] | select(.name | test("^[Aa]"))    # regex

# Boolean combinators:
.items[] | select(.active and .verified)
.items[] | select(.active or .legacy)
.items[] | select(.active | not)

# Membership test:
.items[] | select(.role | IN("admin","owner"))
.items[] | select(.id | IN($ids[]))   # --argjson ids '[1,2,3]'

# Null guard:
.items[] | select(.name != null)
.items[] | select(.price | numbers)   # type-safe select

# Collect results back into array (select inside []):
[.items[] | select(.price > 100)]
# ↑ equivalent to: .items | map(select(.price > 100))
```
.items[] | select(...) outputs filtered values as separate items. Wrap in [...] to collect: [.items[] | select(...)]. Or use .items | map(select(...)) — identical result, different readability. Choose based on what comes next: piping onward → bare iterate; building a JSON array → wrap in [].
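A quick side-by-side of the three forms, on invented data, so the stream/array distinction is visible:

```shell
data='{"items":[{"p":5},{"p":20}]}'

# Bare iterate: a stream of separate values
echo "$data" | jq -c '.items[] | select(.p > 10)'
# → {"p":20}

# Wrapped in [...]: a single JSON array
echo "$data" | jq -c '[.items[] | select(.p > 10)]'
# → [{"p":20}]

# map(select(...)): identical to the wrapped form
echo "$data" | jq -c '.items | map(select(.p > 10))'
# → [{"p":20}]
```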
```
# Pass shell variables safely with --arg and --argjson:
STATUS="active"
jq --arg status "$STATUS" \
   '[.[] | select(.status == $status)]' data.json

# Numeric comparison with --argjson:
MIN_PRICE=100
jq --argjson min "$MIN_PRICE" \
   '[.[] | select(.price > $min)]' data.json

# Pass an array for IN() test:
IDS='[1,2,3,4]'
jq --argjson ids "$IDS" \
   '[.[] | select(.id | IN($ids[]))]' data.json
```
```
# map(f): apply f to each element of array
# Equivalent to: [.[] | f]
map(.name)             # extract name from each object
map(.price * 1.2)      # multiply each price by 1.2
map(tostring)          # convert each element to string
map(ascii_downcase)    # lowercase each string
map({name: .name, id: .id})   # reshape each object

# map(select(...)): filter an array (most common pattern!)
map(select(.active))       # keep active items
map(select(.price > 10))   # keep items where price > 10

# Chain map and select:
map(select(.active))
| map({name: .name, email: .email})
| sort_by(.name)

# ──────────────────────────────────────────────────────
# map_values(f): apply f to each VALUE in object or array
# For objects: preserves keys, transforms values
# For arrays: same as map()
{a: 1, b: 2, c: 3} | map_values(. * 10)
# → {"a":10,"b":20,"c":30}
map_values(. // "N/A")    # replace nulls with "N/A" in all values
map_values(tostring)      # stringify every value in object
map_values(if . == null then empty else . end)  # remove null values

# del: remove keys
del(.password)                    # remove one key
del(.password, .ssn, .token)      # remove multiple
del(.items[].internal)            # remove key from each item
del(.items[] | select(.active | not))   # remove inactive items
```
```
# |= : update operator. Read as "update with".
# Takes the current value, applies the expression, stores result.
.price |= . * 1.1          # add 10% to price
.name |= ascii_upcase      # uppercase the name
.tags |= . + ["new-tag"]   # append to array
.count |= . + 1            # increment counter
.items[].price |= . * 0.9  # 10% discount on all items

# += shorthand (equivalent to |= . + x):
.count += 1
.tags += ["extra"]
.price *= 1.1
.name //= "anonymous"      # set if null

# Set a key that might not exist:
.meta.processed = true     # creates path if needed
.meta += {ts: "2026-04-05"}   # merge into existing object
```
```
# ── REDUCTION ─────────────────────────────────────────
length               # array: count; string: chars; object: keys
add                  # sum numbers, concat strings/arrays, merge objects
any                  # true if any element is truthy
all                  # true if all elements are truthy
any(. > 10)          # true if any element > 10
all(.active)         # true if all objects have active=true
first                # first element
last                 # last element
first(.items[] | select(.active))   # first active item
nth(2; .items[])     # 3rd item (0-indexed)
min, max             # min/max of number array
min_by(.price)       # object with lowest price
max_by(.score)       # object with highest score

# ── SORTING ───────────────────────────────────────────
sort                 # sort array of scalars
sort_by(.name)       # sort objects by field
sort_by(.name, .age) # multi-key sort
reverse              # reverse array
sort_by(.price) | reverse   # descending sort

# ── DEDUPLICATION ─────────────────────────────────────
unique               # deduplicate (sorts first)
unique_by(.id)       # deduplicate by field (keep first seen)
unique_by(.email | ascii_downcase)  # case-insensitive dedup

# ── SET & SEQUENCE OPERATIONS ─────────────────────────
a | inside(b)        # true if a is contained in b
contains([1,2])      # true if array contains [1,2] as subset
flatten              # fully flatten nested arrays
flatten(1)           # flatten one level deep only
indices("x")         # array of indices where "x" appears
index("x")           # first index of "x"
rindex("x")          # last index
transpose            # [[a,b],[c,d]] → [[a,c],[b,d]]
                     # (jq has no zip builtin — transpose does that job)
combinations         # cartesian product: [[a,b],[c,d]] → [a,c],[a,d],...
range(5)             # 0,1,2,3,4 (as separate outputs)
range(2;10;2)        # 2,4,6,8 (start;end;step)
[range(5)]           # collect into array: [0,1,2,3,4]
```
group_by(.field) sorts the array then groups into sub-arrays of objects with the same field value. It returns an array of arrays. Combine with map to aggregate each group.
```
# group_by(.field): sort + partition by field value
group_by(.status)
# → [[{status:"a",...},{status:"a",...}],[{status:"b",...}]]

# Count per group (like COUNT(*) GROUP BY):
group_by(.status) | map({
  status: first.status,
  count: length
})

# Sum per group (like SUM(amount) GROUP BY status):
group_by(.status) | map({
  status: first.status,
  total: (map(.amount) | add),
  avg: ((map(.amount) | add) / length)
})

# Multi-key grouping:
group_by([.status, .region])
```
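A worked run on invented data. Note the output order: `group_by` sorts by the key first, so `"a"` comes out before `"b"` even though the input starts with `"b"`:

```shell
echo '[{"s":"b","n":2},{"s":"a","n":1},{"s":"a","n":3}]' |
  jq -c 'group_by(.s) | map({s: first.s, total: (map(.n) | add)})'
# → [{"s":"a","total":4},{"s":"b","total":2}]
```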
```
# ── INTROSPECTION ─────────────────────────────────────
keys                 # sorted array of object keys
keys_unsorted        # keys in insertion order
[.[]]                # array of the object's values
                     # (the builtin `values` is select(. != null),
                     #  NOT an object-values accessor)
has("key")           # true if key exists (even if value is null)
in({a:1})            # true if input key exists in given object
length               # number of keys in object

# ── MERGING ───────────────────────────────────────────
{a:1} + {b:2}        # → {a:1, b:2} (right overwrites left)
. + {extra: "field"} # add field to existing object

# ── to_entries / from_entries / with_entries ──────────
# to_entries: object → [{key,value}]
{a:1,b:2} | to_entries
# → [{"key":"a","value":1},{"key":"b","value":2}]

# from_entries: [{key,value}] → object
[{key:"x",value:99}] | from_entries   # → {"x": 99}
# Also accepts {name,value} and {k,v} instead of {key,value}

# with_entries(f): to_entries | map(f) | from_entries
# The POWER move: transform keys or values of an object
with_entries(.key |= ascii_upcase)     # all keys uppercased
with_entries(.value |= tostring)       # all values stringified
with_entries(select(.value != null))   # remove null-valued keys
with_entries(select(.key | startswith("_") | not))
# → remove all keys starting with underscore (strip private fields)

# Rename a key:
with_entries(if .key == "old_name" then .key = "new_name" else . end)
```
with_entries(f) is jq's secret weapon for object manipulation. It converts an object to key-value pairs, applies a filter to each pair, then rebuilds. This lets you filter, rename, and transform keys AND values simultaneously — in one expression. Most jq beginners use to_entries | map(...) | from_entries; power users use with_entries.
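A one-pass filter-and-rename on invented data: drop underscore-prefixed keys and uppercase the survivors in a single `with_entries`:

```shell
echo '{"_token":"x","name":"ann"}' |
  jq -c 'with_entries(select(.key | startswith("_") | not) | .key |= ascii_upcase)'
# → {"NAME":"ann"}
```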
```
# Build an object from an array (INDEX — like a hash map):
INDEX(.items[]; .id)     # → {"id1":{...}, "id2":{...}} — keyed by id
INDEX(.items[]; .email)  # lookup table: O(1) access by email

# IN() — membership test (complement to INDEX):
.id | IN($ids[])         # true if .id appears in the $ids array

# Zip two arrays into key-value pairs:
["a","b"] as $keys | [1,2] as $vals
| [$keys, $vals] | transpose
| map({key: .[0], value: .[1]})
| from_entries
# → {"a":1,"b":2}
```
```
# Basic if/then/else
if .active then "active" else "inactive" end
# (since jq 1.6 the else branch is optional; omitting it passes . through)

# Multiple branches with elif:
if   .score >= 90 then "A"
elif .score >= 80 then "B"
elif .score >= 70 then "C"
else "F" end

# Inside map/select:
map(.price |= if . > 100 then . * 0.9 else . end)
# 10% discount on items over $100

# jq has NO switch/case — use if/elif chains
# Or use a lookup object:
{ "pending": 0, "active": 1, "cancelled": -1 }[.status] // 99
# lookup with default

# empty — produce no output (useful in conditionals):
if .active then . else empty end   # equivalent to: select(.active)

# error — throw a custom error:
if .id | not then error("id required") else . end
```
```
# try/catch: handle errors gracefully
try .name catch null        # null on error
try tonumber catch 0        # 0 if not a number
try (. | fromjson) catch .  # return original if not valid JSON

# The error message is available in catch:
try .a.b.c catch {error: ., input: $__loc__}

# try without catch: suppress all errors (like ? but for expressions)
try (.data | fromjson | .result)

# Practical: parse JSON strings safely in a stream:
.events[] | try (.payload | fromjson)
# silently skips events with non-JSON payloads
```
```
# .. recursive descent: outputs EVERY value in the tree
# (all scalars, all sub-objects, all sub-arrays)

# Find all strings anywhere in a deep JSON tree:
.. | strings

# Find all numbers:
.. | numbers

# Find any value matching a condition (anywhere in tree):
.. | objects | select(has("error"))
# → find all objects at any depth that have an "error" key

# Extract all values for a key anywhere in deep JSON:
.. | objects | .id?
# → all "id" values, at any nesting depth

# Extract all error messages from a complex API response:
.. | strings | select(test("error|Error|ERROR"))

# walk(f): applies f bottom-up to every node in tree
# Useful for: type coercion, normalization, transformation

# Stringify all numbers in any JSON structure:
walk(if type == "number" then tostring else . end)

# Remove all null values recursively:
walk(
  if type == "object"
  then with_entries(select(.value != null))
  else . end
)

# Normalize all keys to snake_case (simplified):
walk(
  if type == "object" then
    with_entries(
      .key |= (gsub("(?<=.)(?=[A-Z])"; "_") | ascii_downcase)
    )
  else . end
)
```
```
# paths: emit the path array of every value (containers included)
{a:{b:1},c:2} | paths
# → ["a"]
# → ["a","b"]
# → ["c"]

# paths(filter): only paths to values matching filter
paths(numbers)         # paths to numeric values only
paths(strings)         # paths to string values
paths(type == "null")  # paths to null values

# leaf_paths: paths to scalar (non-container) values
leaf_paths

# getpath / setpath — dynamic path access:
getpath(["a","b"])        # same as .a.b
setpath(["a","b"]; 99)    # set .a.b = 99
delpaths([["a","b"]])     # delete .a.b

# Dynamic path from string — parse "a.b.c" into ["a","b","c"]:
"user.address.city" | split(".") as $p | $data | getpath($p)
# (getpath takes ONE argument, the path array, applied to its input)

# Find and replace at all matching paths:
. as $doc
| reduce paths(type == "string") as $p
    ($doc; setpath($p; getpath($p) | ascii_downcase))
# Lowercase every string value in entire document
```
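A tiny demonstration of `paths`/`getpath` on invented data — find where the strings live, then read one back by path:

```shell
echo '{"a":{"b":"X"},"c":1}' | jq -c '[paths(type == "string")]'
# → [["a","b"]]

echo '{"a":{"b":"X"},"c":1}' | jq -r 'getpath(["a","b"])'
# → X
```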
env returns the entire process environment as a JSON object. env.HOME gives $HOME. $ENV is an alias. This lets jq filters read env vars without needing --arg, which is powerful for config-driven filters in scripts.
```
env.HOME             # → "/home/alice"
env.OPENAI_API_KEY   # → "sk-proj-..."
$ENV.LOG_LEVEL       # same as env.LOG_LEVEL

# Filter only keys matching a prefix:
env | with_entries(select(.key | startswith("AWS_")))
```
```
# ── INTERPOLATION ─────────────────────────────────────
"Hello, \(.name)!"             # string interpolation with \(expr)
"\(.first) \(.last)"           # combine fields
"Score: \(.score*100|floor)%"  # expressions inside
"\(.items|length) items found"

# ── BASIC OPERATIONS ──────────────────────────────────
length                # character count
ltrimstr("prefix")    # remove prefix if present
rtrimstr(".json")     # remove suffix if present
startswith("http")    # boolean
endswith(".com")      # boolean
ascii_downcase        # lowercase
ascii_upcase          # UPPERCASE
explode               # string → [codepoints]
implode               # [codepoints] → string
split(",")            # → array of strings
join(",")             # array → join with separator

# ── REGEX (PCRE) ──────────────────────────────────────
test("pattern")         # boolean: does string match?
test("pattern"; "gi")   # flags: g=global, i=ignorecase, x=extended
match("(\\w+)@(\\w+)")  # first match with captures
capture("(?<user>\\w+)@(?<domain>\\w+)")
# → {"user":"alice","domain":"example"}
scan("\\d+")            # all non-overlapping matches
sub("foo"; "bar")       # replace first
gsub("foo"; "bar")      # replace all
gsub("(?<n>\\d+)"; (.n|tonumber*2|tostring))
# Double every number in a string — captures as named group
splits("[,;|]")         # split on any of: comma, semicolon, pipe
scan("\\b\\w+\\b")      # extract all words
```
```
# Format strings: @format converters, also usable on interpolations

# @base64 — encode / decode:
"hello world" | @base64    # → "aGVsbG8gd29ybGQ="
"aGVsbG8=" | @base64d      # → "hello"

# @uri — URL encode:
"hello world/foo" | @uri   # → "hello%20world%2Ffoo"
@uri "\(.name)/\(.id)"     # encode inside a template

# @html — escape HTML:
"<b>bold</b>" | @html      # → "&lt;b&gt;bold&lt;/b&gt;"

# @csv — format array as CSV line:
["Alice",32,true] | @csv   # → "\"Alice\",32,true"
# Combine with map for multi-line CSV:
.users[] | [.name,.age,.email] | @csv

# @tsv — tab-separated (better for shell pipelines):
["Alice",32] | @tsv        # → "Alice\t32"

# @json — embed JSON inside a string:
{a:1} | @json              # → "{\"a\":1}"
# Useful for putting JSON in a JSON string field:
{payload: (.data | @json)}

# @sh — shell-safe quoting:
"user's data" | @sh        # → "'user'\\''s data'"
# Use with -r to produce safe shell arguments:
# jq -r '.[] | @sh' | xargs curl

# @text — identity (default format)

# FORMAT INTERPOLATION — combine with \():
@uri "https://api.com/user/\(.id)?key=\(.key)"
# → URL with both values properly encoded
@html "<td>\(.name)</td>"
# → HTML with name safely escaped
```
```
# reduce expr as $var (init; update):
# Like Array.reduce() in JS or functools.reduce() in Python.
# Most powerful when add/map aren't expressive enough.

# Sum an array (same as add, but explicit):
reduce .[] as $x (0; . + $x)

# Build an object from an array:
reduce .items[] as $item ({};
  . + {($item.id|tostring): $item.name}
)
# → {"1":"Alice","2":"Bob",...}

# Running totals (build array of cumulative sums):
reduce .values[] as $v ([[],0];
  [first + [last + $v], last + $v]
) | first
# → [v1, v1+v2, v1+v2+v3, ...]

# Word frequency count from array of strings:
reduce .words[] as $w ({}; .[$w] += 1)

# Merge array of objects, later values win:
reduce .patches[] as $p (.base; . * $p)
# * (multiply) on objects = recursive merge

# Custom deep merge (objects: recurse, arrays: concatenate;
# note plain $a + $b is SHALLOW — right wins wholesale):
def merge($a; $b):
  if ($a|type) == "object" and ($b|type) == "object" then
    reduce ($b | keys_unsorted[]) as $k
      ($a; .[$k] = merge($a[$k]; $b[$k]))
  elif ($a|type) == "array" and ($b|type) == "array" then
    $a + $b
  else $b end;
```
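The word-frequency reduce above, run end to end on invented input (`.[$w] += 1` works because `null + 1` is `1` on the first sighting):

```shell
echo '{"words":["a","b","a"]}' |
  jq -c 'reduce .words[] as $w ({}; .[$w] += 1)'
# → {"a":2,"b":1}
```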
```
# limit(n; expr): take first n outputs of expr
limit(5; .items[])                    # first 5 items
limit(1; .items[] | select(.active))  # first active
first(.items[] | select(.active))     # same as limit(1; ...)

# until(cond; update): loop until condition true
1 | until(. > 100; . * 2)    # 1→2→4→8→16→32→64→128

# while(cond; update): emit values while condition true
1 | while(. < 100; . * 2)    # 1,2,4,8,16,32,64
[1 | while(. < 100; . * 2)]  # collect: [1,2,4,8,16,32,64]

# foreach expr as $x (init; update; extract):
# like reduce but emits intermediate values
foreach .events[] as $e (
  {count:0, total:0};                              # init
  {count: (.count+1), total: (.total+$e.amount)};  # update
  {running_avg: (.total/.count)}                   # extract (emitted each step)
)
# Outputs running average after each event — useful for streaming

# label-break: early exit from infinite generators
label $out
| foreach range(infinite) as $i (0;
    . + $i;
    if . > 100 then ., break $out else empty end
  )
# → 105 — first triangular number > 100 (0+1+2+...+14)

# infinite: generates 0,1,2,... forever (needs limit or break)
first(range(1; infinite) | select(. % 7 == 0 and . % 11 == 0))
# → 77 — first positive multiple of both 7 and 11
#   (range(infinite) starts at 0, which trivially matches)
```
```
# def name: body;
# Functions are defined with 'def', end with ';'
# They take the input via . (like all jq filters)

# Zero-argument function (operates on .):
def double: . * 2;
5 | double            # → 10

# With arguments (separated by semicolons):
def clamp($min; $max):
  if . < $min then $min
  elif . > $max then $max
  else . end;
150 | clamp(0; 100)   # → 100

# Function argument can be a FILTER (higher-order):
def apply_twice(f): f | f;
3 | apply_twice(.*2)  # → 12 (3→6→12)

# Recursion with def:
def flatten_keys($prefix):
  if type == "object" then
    to_entries[] | .key as $k
    | (.value | flatten_keys($prefix + "." + $k))
  else {($prefix): .} end;

# Built-in recursive: .[] | recurse
recurse                   # same as ..
recurse(.children[]?)     # walk only .children
recurse(.children[]?; length > 0)   # with a condition guard
```
```
# 'as' binding — name a value for use in an expression:
.price as $p | {original: $p, discounted: ($p*0.9)}

# Array destructuring:
.coords as [$x, $y] | "(\($x),\($y))"

# Object destructuring:
. as {name: $n, age: $a} | "\($n) is \($a)"

# $__loc__ — current file + line number (debugging):
$__loc__              # → {"file":"<stdin>","line":1}

# Math functions:
sqrt                  # square root
floor, ceil, round    # rounding
fabs                  # absolute value
pow(.; 2)             # . squared
log, log2, log10      # logarithms
exp, exp2, exp10      # exponentials
nan, infinite         # special float values
isinfinite, isnan, isnormal     # float tests
significand, exponent, frexp    # IEEE 754 decomposition

# now — current unix timestamp:
now                          # → 1743811200.0
now | todate                 # → "2026-04-05T00:00:00Z"
now | strftime("%Y-%m-%d")
"2026-01-15" | strptime("%Y-%m-%d") | mktime
# → unix timestamp of that date

# Read a library file with import / include:
# jq -L ~/jq-lib -r 'import "utils" as U; U::flatten(.)'
```
```
# ── PAGINATED API ─────────────────────────────────────
# Response: {"data":[...],"meta":{"next_cursor":"abc"}}
curl api.com/users | jq '
  .data
  | map(select(.active))
  | map({id, name, email: (.email | ascii_downcase)})
  | sort_by(.name)
'

# Extract next cursor for shell loop:
CURSOR=$(curl api.com/users | jq -r '.meta.next_cursor // ""')

# ── GITHUB API ────────────────────────────────────────
# List open PRs with reviewer counts:
gh pr list --json number,title,reviewRequests,state |
  jq '[.[] | select(.state == "OPEN") | {
    pr: .number,
    title: .title,
    reviewers: (.reviewRequests | length)
  }] | sort_by(.reviewers) | reverse'

# ── KUBERNETES ────────────────────────────────────────
kubectl get pods -o json | jq '
  .items[]
  | select(.status.phase != "Running")
  | {
      name: .metadata.name,
      phase: .status.phase,
      reason: (.status.containerStatuses[0].state | to_entries[0].key)
    }
'

# All pod resource limits:
kubectl get pods -o json | jq '
  [.items[]
   | .metadata.name as $pod
   | .spec.containers[]
   | {
       pod: $pod,
       container: .name,
       cpu: (.resources.limits.cpu // "none"),
       memory: (.resources.limits.memory // "none")
     }]
'

# ── DOCKER ────────────────────────────────────────────
docker inspect my-container | jq '
  .[0] | {
    id: .Id[:12],
    image: .Config.Image,
    status: .State.Status,
    ports: (.NetworkSettings.Ports | to_entries
            | map("\(.key) → \(.value[0].HostPort // "none")")),
    envs: (.Config.Env | map(split("=") | {(.[0]): .[1]}) | add)
  }
'
```
```
# AWS EC2: find instances with no Name tag
aws ec2 describe-instances | jq '
  .Reservations[].Instances[]
  | select((.Tags // []) | map(.Key) | contains(["Name"]) | not)
  | {id: .InstanceId, type: .InstanceType, state: .State.Name}
'
# (.Tags // [] guards against instances with no Tags key at all)

# Count running instances by type:
aws ec2 describe-instances | jq '
  [.Reservations[].Instances[]
   | select(.State.Name == "running")
   | .InstanceType]
  | group_by(.)
  | map({type: first, count: length})
  | sort_by(.count) | reverse
'

# CloudWatch: find all alarms in ALARM state
aws cloudwatch describe-alarms | jq '
  [.MetricAlarms[]
   | select(.StateValue == "ALARM")
   | {name: .AlarmName, metric: .MetricName, reason: .StateReason}]
'

# Transform for Slack notification:
aws cloudwatch describe-alarms | jq -r '
  .MetricAlarms[]
  | select(.StateValue == "ALARM")
  | "🚨 \(.AlarmName): \(.StateReason)"
'
```
```
# OpenAI chat completion — extract just the text:
curl https://api.openai.com/v1/chat/completions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{"model":"gpt-4o","messages":[{"role":"user","content":"Hi"}]}' |
  jq -r '.choices[0].message.content'

# Extract all tool call arguments:
jq '
  .choices[0].message.tool_calls // []
  | map({
      fn: .function.name,
      args: (.function.arguments | fromjson)
    })
' response.json

# Anthropic Claude — extract text from content blocks:
jq -r '.content[] | select(.type == "text") | .text' claude_response.json

# Parse structured output (JSON mode response):
jq '
  .choices[0].message.content
  | fromjson              # parse the JSON string
  | {
      sentiment: .sentiment,
      score: .confidence_score,
      entities: [.entities[] | select(.type == "PERSON") | .text]
    }
' response.json

# Token usage analytics across many completions:
cat responses/*.json | jq -s '
  {
    total_calls: length,
    total_prompt: (map(.usage.prompt_tokens) | add),
    total_completion: (map(.usage.completion_tokens) | add),
    avg_completion: (map(.usage.completion_tokens) | add / length | round),
    models: ([.[].model] | group_by(.) | map({model: first, calls: length}))
  }
'

# Extract all function calls from a multi-step agent run log:
cat agent_log.ndjson | jq -c '
  select(.type == "tool_call")
  | {ts: .timestamp, fn: .function_name, args: .arguments, ms: .duration_ms}
'
```
fromjson parses a JSON string into a jq value. tojson serializes a jq value to a JSON string. These two builtins are essential for working with LLM responses that use JSON mode.
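A round trip on invented data — a JSON string field parsed into a real value, and a value serialized back into a string:

```shell
# Parse a stringified payload:
printf '%s' '{"payload":"{\"ok\":true,\"n\":2}"}' | jq -c '.payload | fromjson'
# → {"ok":true,"n":2}

# Serialize back into a string field:
echo '{"ok":true}' | jq -c '{wrapped: tojson}'
# → {"wrapped":"{\"ok\":true}"}
```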
```
# Harvest results from 8 parallel agents running in tmux panes.
# Each agent writes a result.json. Aggregate across all.
cat results/agent_*.json | jq -s '
  {
    agents_run: length,
    agents_success: (map(select(.status == "success")) | length),
    agents_failed: (map(select(.status == "error")) | length),
    total_tokens: (map(.token_count // 0) | add),
    avg_duration_s: (map(.duration_s) | add / length | round),
    outputs: map(select(.status == "success") | .result),
    errors: map(select(.status == "error") | {id: .agent_id, err: .error})
  }
'

# Build a markdown report from agent results:
cat results/agent_*.json | jq -rs '
  "# Agent Run Report\n\n"
  + "Total: \(length) agents\n\n"
  + (map("## Agent \(.agent_id)\n\(.result)\n") | join("\n"))
'
```
```
# NDJSON: one JSON object per line. jq processes each line.
# No -s flag = streaming mode (each line processed independently)

# Kafka consumer → jq pipeline (kcat/kafkacat):
kcat -b broker:9092 -t orders -C -o beginning | jq -c '
  select(.status == "COMPLETED")
  | {order_id, customer_id, amount, ts: .event_ts}
'

# Count events by type from live Kafka stream:
kcat -b broker:9092 -t events -C -e |
  jq -r '.event_type' | sort | uniq -c | sort -rn

# Process a large NDJSON file in two passes:
jq -c 'select(.amount > 1000)' events.ndjson |
  jq -s 'group_by(.customer_id) | map({cid: first.customer_id, n: length})'

# --stream: event-based parser — process huge JSON without loading it all.
# It emits [path, value] pairs; reassemble the elements of a giant
# top-level array with fromstream + truncate_stream:
jq -cn --stream 'fromstream(1 | truncate_stream(inputs))' giant_array.json
# → each element of the top-level array, one per line

# JSON log mining — parse structured logs:
tail -f /var/log/app.log | jq -cr '
  select(.level == "error")
  | "\(.timestamp) [\(.service)] \(.message)"
'

# -R flag: read raw lines (non-JSON), then parse:
tail -f mixed.log | jq -Rc 'try fromjson catch {raw: ., parsed: false}'
# Tries to parse each line as JSON, falls back to {raw: line}
```
jq -s loads ALL NDJSON lines into a single array — convenient but memory-hungry for large files. For files >1GB, pipe through jq twice: first pass per-line selection with -c, second pass slurp the filtered smaller set. Or use --stream for true constant-memory streaming of massive JSON arrays.
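The two-pass pattern in miniature (toy file under /tmp, invented amounts): the first jq filters line by line in constant memory, the second slurps only the survivors:

```shell
printf '%s\n' '{"amount":5}' '{"amount":2000}' '{"amount":3000}' > /tmp/events.ndjson

# Pass 1 streams; pass 2 aggregates the much smaller result:
jq -c 'select(.amount > 1000)' /tmp/events.ndjson | jq -s 'length'
# → 2
```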
```
# -s: slurp all inputs into one array, then aggregate
cat events.ndjson | jq -sc '
  {
    total: length,
    errors: (map(select(.level == "error")) | length),
    p50_ms: (sort_by(.duration_ms) | .[length/2 | floor].duration_ms),
    p99_ms: (sort_by(.duration_ms) | .[(length * 0.99) | floor].duration_ms)
  }
'

# Multiple array files — -s yields [[...],[...],[...]]; add concatenates:
jq -s 'add | group_by(.region) | map({region: first.region, n: length})' \
  jan.json feb.json mar.json
```
```
# -r flag: raw output (no quotes on strings)
# Essential when piping jq output to shell commands

# Loop over jq array output:
jq -r '.users[].id' data.json | while read -r id; do
  curl -s "api.com/user/$id" >> results.json
done

# xargs — parallel downloads:
jq -r '.repos[].clone_url' gh.json | xargs -P 4 -I {} git clone {}

# Pass multiple fields as tab-separated to while read:
jq -r '.[] | [.id, .name, .email] | @tsv' users.json |
  while IFS=$'\t' read -r id name email; do
    echo "Processing $name ($id) at $email"
  done

# Use jq as a conditional in bash (-e exits 1 if null/false):
if curl api.com/status | jq -e '.healthy' >/dev/null; then
  echo "Service is healthy"
else
  echo "Service is DOWN"
fi

# Build JSON payload for a POST request:
PAYLOAD=$(jq -cn \
  --arg name "Alice" \
  --arg email "alice@example.com" \
  --argjson active true \
  --argjson score 42 \
  '{name: $name, email: $email, active: $active, score: $score}'
)
curl -XPOST -d "$PAYLOAD" api.com/users

# Update a specific field in a JSON file in-place:
jq '.config.debug = true' config.json | sponge config.json
# sponge (from moreutils) avoids the redirect-to-same-file issue
# Alternative: jq ... config.json > /tmp/tmp.json && mv /tmp/tmp.json config.json
```
jq -n (null input) lets you build JSON from scratch without needing input. Combine with --arg (string) and --argjson (typed JSON value) to safely embed shell variables into JSON. Never use string interpolation to build JSON — it breaks on quotes and special characters.
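A small demonstration of why `--arg` beats interpolation (invented value): embedded quotes come out correctly escaped instead of producing invalid JSON:

```shell
msg='She said "hi"'
jq -cn --arg m "$msg" '{note: $m}'
# → {"note":"She said \"hi\""}
```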
```
# Merge two JSON config files (right overrides left):
jq -s '.[0] * .[1]' defaults.json overrides.json

# Convert between formats — JSON to env vars:
jq -r 'to_entries | .[] | "\(.key | ascii_upcase)=\(.value)"' \
  config.json >> .env

# .env file to JSON:
cat .env | jq -Rn '
  [inputs
   | select(length > 0 and (startswith("#") | not))
   | split("=")
   | {(.[0]): (.[1:] | join("="))}]
  | add
'
# (parens around `startswith("#") | not` matter:
#  `|` binds looser than `and`, so without them empty lines slip through)
```
```
# ── MULTIPLE INPUT FILES ──────────────────────────────
# jq processes each file as a separate input by default
jq '.name' alice.json bob.json carol.json
# → "Alice"
# → "Bob"
# → "Carol"

# -s slurps all files into one array:
jq -s 'map(.name)' alice.json bob.json carol.json
# → ["Alice","Bob","Carol"]

# Per-file stats — use -n + inputs so input_filename tracks each file
# (with -s, input_filename reports only the last file read):
jq -n '
  [inputs | {file: input_filename, count: (.items | length)}]
  | sort_by(.count) | reverse
' data/*.json

# input_filename: built-in, gives current file path
jq '{file: input_filename, keys: keys}' *.json

# input / inputs — explicit iteration over files
jq -n '[inputs | select(.active) | .name]' users/*.json
# -n + inputs: process lazily without loading everything at once
# inputs = the remaining inputs (after -n consumed "nothing")

# ── IN-PLACE EDIT PATTERNS ────────────────────────────
# Pattern 1: sponge (moreutils package)
jq '.version |= . + 1' package.json | sponge package.json

# Pattern 2: temp file
jq '.version |= . + 1' package.json > /tmp/pkg.json \
  && mv /tmp/pkg.json package.json

# Pattern 3: process all JSON files in-place
for f in configs/*.json; do
  jq '.environment = "production"' "$f" | sponge "$f"
done

# Validate all JSON files (exit 1 on any invalid):
for f in **/*.json; do
  jq -e '.' "$f" >/dev/null || echo "INVALID: $f"
done

# Convert JSON array file to NDJSON:
jq -c '.[]' array.json > stream.ndjson

# Convert NDJSON to JSON array:
jq -sc '.' stream.ndjson > array.json

# Sort + dedup a JSON array file:
jq 'unique_by(.id) | sort_by(.created_at)' input.json | sponge input.json
```
| Task | jq | Python |
|---|---|---|
| One-liner extraction | ✓ Perfect | Overkill |
| Shell pipeline integration | ✓ Native | Awkward |
| Stream large files | ✓ --stream | Possible but verbose |
| Complex business logic | Gets hard | ✓ Better |
| External API calls | No | ✓ Yes |
| Multiple data sources | Limited | ✓ Natural |
| Regex + transforms | ✓ PCRE native | re module |
| REPL exploration | OK (jqplay.org) | ✓ IPython |
| No dependencies needed | ✓ Single binary | pip install... |