# JSONL

JSONL (JSON Lines), also known as Newline Delimited JSON (NDJSON), is a text format in which each line is a complete, valid [[JavaScript Object Notation (JSON)]] value, most often an object. It is designed for storing and streaming structured data one record at a time.

Unlike a standard JSON array, a JSONL file does not have to be loaded into memory in full. Each line can be parsed independently, which makes the format well suited to large datasets, log files, and streaming applications.

## Format

```jsonl
{"name": "Alice", "age": 30}
{"name": "Bob", "age": 25}
{"name": "Charlie", "age": 35}
```

## Key Characteristics

- **One JSON value per line**: Each line is a complete, valid JSON value, usually an object
- **Newline separated**: Lines end with `\n` (or `\r\n` on Windows)
- **No commas between records**: Unlike JSON arrays
- **No enclosing brackets**: No `[` or `]` wrapping the data
- **Streamable**: Process line by line without loading the entire file
- **Appendable**: Add new records by appending lines

## Advantages Over JSON Arrays

| Aspect | JSONL | JSON Array |
|--------|-------|------------|
| Memory usage | Low (stream line by line) | High (load entire file) |
| Append data | Simple (add new line) | Complex (rewrite file) |
| Partial reads | Easy | Difficult |
| Error recovery | Skip bad lines | Entire file invalid |
| Parallel processing | Natural | Requires splitting |

## Common Use Cases

- **Log files**: Each event as a separate JSON object
- **Data pipelines**: Streaming between services
- **Machine learning**: Training data formats (OpenAI, Hugging Face)
- **Database exports**: One record per line
- **Issue tracking**: [[Beads]] stores issues as `.beads/beads.jsonl`
- **Event sourcing**: Append-only event logs
- **Analytics**: Clickstream and telemetry data

## Working with JSONL

### Command Line with jq

```bash
# Extract a field from each line
jq -c '.name' data.jsonl

# Keep only the records that match a condition
jq -c 'select(.age > 30)' data.jsonl

# Convert JSON array to JSONL
jq -c '.[]' array.json > data.jsonl

# Convert JSONL to JSON array
jq -s '.' data.jsonl > array.json
```

### Python

```python
import json

# Read JSONL: parse one record per line
with open('data.jsonl', 'r', encoding='utf-8') as f:
    for line in f:
        if not line.strip():
            continue  # skip blank lines, which json.loads would reject
        record = json.loads(line)
        print(record)

# Write JSONL: one compact JSON document per line
records = [{"name": "Alice", "age": 30}, {"name": "Bob", "age": 25}]
with open('data.jsonl', 'w', encoding='utf-8') as f:
    for record in records:
        f.write(json.dumps(record) + '\n')
```

### JavaScript/Node.js

```javascript
const fs = require('fs');
const readline = require('readline');

// Read JSONL
const rl = readline.createInterface({
  input: fs.createReadStream('data.jsonl'),
  crlfDelay: Infinity // treat \r\n as a single line break
});

rl.on('line', (line) => {
  if (!line.trim()) return; // skip blank lines
  const record = JSON.parse(line);
  console.log(record);
});
```

## File Extensions

- `.jsonl` (most common)
- `.ndjson`
- `.json` (sometimes, context-dependent)

## Specifications

There's no official RFC, but two community specifications exist:

- JSON Lines: https://jsonlines.org/
- NDJSON: https://github.com/ndjson/ndjson-spec

## References

- https://jsonlines.org/
- https://en.wikipedia.org/wiki/JSON_streaming#Newline-delimited_JSON

## Related

- [[JavaScript Object Notation (JSON)]]
- [[jq]]
- [[Beads]]
- [[Data Formats]]
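
## Example: Appending and Error Recovery

The "Append data" and "Error recovery" rows of the comparison table can be illustrated with a small Python sketch. The helper names `append_record` and `read_records`, the temporary file path, and the sample data are hypothetical and chosen only for this illustration. Silently skipping a malformed line is one possible policy; a real pipeline might log or quarantine such lines instead.

```python
import json
import os
import tempfile


def append_record(path, record):
    """Append one record as a single line; existing lines are left untouched."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")


def read_records(path):
    """Yield (line_number, record) pairs, skipping blank or malformed lines."""
    with open(path, "r", encoding="utf-8") as f:
        for line_number, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue
            try:
                yield line_number, json.loads(line)
            except json.JSONDecodeError:
                # A bad line costs only that one record, not the whole file.
                continue


# Demo on a temporary file with made-up data.
path = os.path.join(tempfile.mkdtemp(), "data.jsonl")
append_record(path, {"name": "Alice", "age": 30})
with open(path, "a", encoding="utf-8") as f:
    f.write('{"name": "Bob", "age":\n')  # simulate a truncated write
append_record(path, {"name": "Charlie", "age": 35})

names = [record["name"] for _, record in read_records(path)]
print(names)  # ['Alice', 'Charlie']
```

Because each record occupies exactly one line, the truncated second line does not affect the records before or after it. The same damage inside a single JSON array would make the whole document unparseable.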