Cloudflare Pipelines - DeveloPassion

# Cloudflare Pipelines

[[Cloudflare]] Pipelines is a managed data-ingestion service that accepts high-volume event streams via HTTP or Worker bindings, buffers them, and writes the output to [[Cloudflare R2]] as partitioned Parquet or JSON files. Think of it as "Kafka → S3 for the lazy": no brokers, no Schema Registry, no Flink job.

It is the missing primitive between [[Cloudflare Queues]] (transactional messaging) and analytics workloads (batch over big data). Queues is for "process each message once"; Pipelines is for "ingest millions of events and dump them somewhere queryable."

## Why It Matters

Building event ingestion yourself means standing up Kafka or Kinesis, writing a sink, partitioning files, handling backpressure, and monitoring lag. Pipelines collapses all of that to a binding call: `env.MY_PIPELINE.send(events)`. Output lands in [[Cloudflare R2]], ready to query with DuckDB, Athena, ClickHouse, or whichever analytics engine you prefer.

## Common Use Cases

- **Analytics event ingestion**: product analytics, clickstreams
- **Log aggregation** to R2 for cheap long-term storage
- **CDC sinks**: write database change events to columnar files
- **AI training data collection**: gather inference inputs/outputs for fine-tuning
- **Audit trails**: immutable append-only event records

## Architecture Shape

- HTTP endpoint or Worker binding accepts events
- Cloudflare buffers, batches, and partitions them (by time and key)
- Writes Parquet/JSON files to a configured R2 bucket
- Downstream analytics queries the R2 bucket directly

## References

- Pipelines home: https://developers.cloudflare.com/pipelines/

## Related

- [[Cloudflare]]
- [[Cloudflare Workers]]
- [[Cloudflare R2]]
- [[Cloudflare Queues]]
- [[Wrangler]]
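
## Example Sketch

The binding call mentioned under Why It Matters, `env.MY_PIPELINE.send(events)`, might look roughly like this in a Worker. This is a minimal sketch, not the official API surface: only the `send(events)` call comes from the note itself. The `PipelineBinding` and `Env` interfaces, the `ClickEvent` shape, and the helpers `buildEvent` and `ingest` are hypothetical names introduced for illustration, and the real binding type may differ, so check the Pipelines documentation before relying on it.

```typescript
// Assumed shape of the binding: only `send(events)` is taken from the note.
interface PipelineBinding {
  send(events: object[]): Promise<void>;
}

// Assumed Worker environment with a single pipeline binding.
interface Env {
  MY_PIPELINE: PipelineBinding;
}

// Hypothetical clickstream event; the field names are illustrative only.
interface ClickEvent {
  userId: string;
  path: string;
  ts: string; // ISO-8601 timestamp
}

// Build one event record from the visited path and the current time.
function buildEvent(path: string, userId: string, now: Date): ClickEvent {
  return { userId, path, ts: now.toISOString() };
}

// Hand a batch of events to the pipeline and report how many were sent.
async function ingest(env: Env, events: ClickEvent[]): Promise<number> {
  if (events.length === 0) return 0; // skip the call when there is nothing to send
  await env.MY_PIPELINE.send(events);
  return events.length;
}
```

Inside a Worker's `fetch` handler, a call such as `await ingest(env, [buildEvent(pathname, userId, new Date())])` would hand one event per request to the pipeline. Buffering, batching, and writing files to R2 then happen on Cloudflare's side, as described under Architecture Shape.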