# JSONL
JSONL (JSON Lines), also known as Newline Delimited JSON (NDJSON), is a text format in which each line is a complete, valid [[JavaScript Object Notation (JSON)]] value, most often an object. It's designed for storing and streaming structured data one record at a time.
Unlike standard JSON arrays, JSONL doesn't require loading the entire dataset into memory. Each line can be parsed independently, making it ideal for large datasets, log files, and streaming applications.
## Format
```jsonl
{"name": "Alice", "age": 30}
{"name": "Bob", "age": 25}
{"name": "Charlie", "age": 35}
```
## Key Characteristics
- **One JSON value per line**: Each line is a complete, valid JSON value, usually an object
- **Newline separated**: Lines end with `\n`; `\r\n` also works because JSON parsers treat the trailing `\r` as ignorable whitespace
- **No commas between records**: Unlike JSON arrays
- **No enclosing brackets**: No `[` or `]` wrapping the data
- **Streamable**: Process line by line without loading the entire file
- **Appendable**: Add new records by appending lines
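
The streamable and appendable properties above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the helper names `append_record` and `iter_records` and the file name `people.jsonl` are assumptions made up for this example.

```python
import json
import os
import tempfile


def append_record(path, record):
    """Append one record as a single line of JSON (hypothetical helper)."""
    with open(path, "a", encoding="utf-8") as f:
        # json.dumps escapes newlines inside strings, so each record stays on one line.
        f.write(json.dumps(record) + "\n")


def iter_records(path):
    """Yield records one at a time without reading the whole file into memory."""
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            if line.strip():  # tolerate blank lines
                yield json.loads(line)


# Hypothetical file in a fresh temporary directory, used only for this demo.
path = os.path.join(tempfile.mkdtemp(), "people.jsonl")
append_record(path, {"name": "Alice", "age": 30})
append_record(path, {"name": "Bob", "age": 25})

names = [record["name"] for record in iter_records(path)]
print(names)  # ['Alice', 'Bob']
```

Because `iter_records` is a generator, only one line is held in memory at a time, and appending never requires rewriting earlier records.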
## Advantages Over JSON Arrays
| Aspect | JSONL | JSON Array |
|--------|-------|------------|
| Memory usage | Low (stream line by line) | High (typically parse the entire file) |
| Append data | Simple (add new line) | Complex (rewrite file) |
| Partial reads | Easy | Difficult |
| Error recovery | Skip bad lines | Entire file invalid |
| Parallel processing | Natural | Requires splitting |
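
The "skip bad lines" row can be illustrated with a short Python sketch. It assumes that dropping a malformed line and recording its line number is acceptable for the data at hand; the function name `parse_lenient` is hypothetical.

```python
import json


def parse_lenient(lines):
    """Return (records, errors); errors holds (line_number, message) pairs."""
    records, errors = [], []
    for line_number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # ignore blank lines
        try:
            records.append(json.loads(line))
        except json.JSONDecodeError as exc:
            # One bad line does not affect the lines around it.
            errors.append((line_number, exc.msg))
    return records, errors


sample = [
    '{"name": "Alice", "age": 30}\n',
    '{"name": "Bob", "age": \n',  # truncated, invalid JSON
    '{"name": "Charlie", "age": 35}\n',
]
records, errors = parse_lenient(sample)
print(len(records), [n for n, _ in errors])  # 2 [2]
```

The same malformed record inside a JSON array would make a standard parser reject the whole document.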
## Common Use Cases
- **Log files**: Each event as a separate JSON object
- **Data pipelines**: Streaming between services
- **Machine learning**: Training data formats (OpenAI, Hugging Face)
- **Database exports**: One record per line
- **Issue tracking**: [[Beads]] stores issues as `.beads/beads.jsonl`
- **Event sourcing**: Append-only event logs
- **Analytics**: Clickstream and telemetry data
## Working with JSONL
### Command Line with jq
```bash
# Extract a field from each line
jq -c '.name' data.jsonl
# Keep only the lines that match a condition
jq -c 'select(.age > 30)' data.jsonl
# Convert JSON array to JSONL
jq -c '.[]' array.json > data.jsonl
# Convert JSONL to JSON array (slurps all records into memory)
jq -s '.' data.jsonl > array.json
```
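
For environments without jq, the two conversions can be approximated with Python's standard `json` module. This is a sketch that works on in-memory strings; the function names `array_to_jsonl` and `jsonl_to_array` are assumptions for illustration, and like `jq -s` both load every record at once.

```python
import json


def array_to_jsonl(array_text):
    """Convert the text of a JSON array into JSONL text."""
    items = json.loads(array_text)
    return "".join(json.dumps(item) + "\n" for item in items)


def jsonl_to_array(jsonl_text):
    """Convert JSONL text into the text of a JSON array."""
    # Split on "\n" only; a trailing "\r" is whitespace that json.loads ignores.
    lines = jsonl_text.split("\n")
    items = [json.loads(line) for line in lines if line.strip()]
    return json.dumps(items)


jsonl_text = array_to_jsonl('[{"name": "Alice"}, {"name": "Bob"}]')
print(jsonl_text, end="")
# {"name": "Alice"}
# {"name": "Bob"}
print(jsonl_to_array(jsonl_text))
# [{"name": "Alice"}, {"name": "Bob"}]
```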
### Python
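```python
import json

# Read JSONL one record at a time
with open('data.jsonl', 'r', encoding='utf-8') as f:
    for line in f:
        if not line.strip():
            continue  # skip blank lines
        record = json.loads(line)
        print(record)

# Write JSONL (example records; mode 'w' overwrites the file, use 'a' to append)
records = [{"name": "Alice", "age": 30}, {"name": "Bob", "age": 25}]
with open('data.jsonl', 'w', encoding='utf-8') as f:
    for record in records:
        f.write(json.dumps(record) + '\n')
```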
### JavaScript/Node.js
```javascript
const fs = require('fs');
const readline = require('readline');
// Read JSONL one line at a time
const rl = readline.createInterface({
  input: fs.createReadStream('data.jsonl'),
  crlfDelay: Infinity // treat \r\n as a single line break
});
rl.on('line', (line) => {
  if (!line.trim()) return; // skip blank lines
  const record = JSON.parse(line);
  console.log(record);
});
```
## File Extensions
- `.jsonl` (most common)
- `.ndjson`
- `.json` (sometimes, context-dependent)
## Specifications
There's no official RFC, but two community specifications exist:
- JSON Lines: https://jsonlines.org/
- NDJSON: https://github.com/ndjson/ndjson-spec
## References
- https://jsonlines.org/
- https://en.wikipedia.org/wiki/JSON_streaming#Newline-delimited_JSON
## Related
- [[JavaScript Object Notation (JSON)]]
- [[jq]]
- [[Beads]]
- [[Data Formats]]