Architecture

Supermetal ships as a single binary and runs the entire pipeline from source to target in one process. Data moves through the pipeline as Apache Arrow record batches, read from the source, staged in an object store buffer, and written to the target with no intermediate serialization between stages. Source transactions arrive at the target intact.

Design

Loading diagram...

Single process

A typical CDC pipeline chains a producer, a broker, and consumers. Debezium reads the database log and writes Avro or JSON into Kafka, and a consumer reads it back, decodes it, and encodes it again as Parquet before loading. Each step is a separate process behind a network, so a row is serialized and deserialized several times on its way to the target, and each component brings its own deployment, monitoring, and configuration.

Supermetal runs the whole pipeline in one process.

  • The source reader, buffer, and target writer pass Arrow record batches in memory. Data is serialized once, into the format the target loads.
  • The UI, management APIs, and metrics are built into the binary. There is nothing else to deploy.
  • The engine is built on Tokio, Rust's asynchronous runtime, and moves data in record batches of configurable size, so memory use is bounded by batch size rather than dataset size. Batch serialization runs on a separate work stealing pool so compute and IO proceed in parallel. One agent runs many connectors.

Arrow as the data format

Sources read data into Arrow record batches and targets consume them.

  • Numeric precision and scale, signedness, and temporal resolution are preserved exactly. A DECIMAL(38,12) at the source arrives as a DECIMAL(38,12) at the target. Pipelines that exchange JSON, CSV, or Avro flatten these into a handful of supertypes, and the target either loses precision or spends extra compute recovering it.
  • Converting to a target's bulk format, such as Parquet for warehouses, is batched and columnar instead of serialized row by row, so it vectorizes with SIMD.
  • When a source schema changes, Supermetal detects it through Arrow's schema metadata, propagates the change through the pipeline, and adapts the target.

Object store native

Supermetal uses object storage (S3, GCS, Azure Blob), local NVMe, or any filesystem as a durable buffer between source and target. Buffer files are written in a format the target ingests natively (Arrow, Parquet, CSV, JSON), defaulting to Parquet. The buffer decouples the two ends of the pipeline. A slow target does not backpressure the source, and a target outage does not affect the source pipeline, which keeps capturing into the buffer.

Transactional consistency

At any point, the target reflects a consistent moment in the source's history. Supermetal preserves transaction boundaries end to end. Multi table transactions arrive as they committed, with changes to every table applied together. For throughput, it can batch several source transactions into one atomic target transaction, and it never splits one transaction across target commits.

Replication lifecycle

A deployment runs one or more connectors, each moving data from one source to one target. A connector replicates in two phases, an initial snapshot of existing data followed by continuous capture of changes. On startup it runs a workflow specific to the source.

  1. Establish a consistent point in the source database log (for Postgres, a WAL LSN).
  2. Discover the tables to replicate and their schemas.
  3. Snapshot existing data.
  4. Read the log forward from the consistent point.

The connector checkpoints the consistent point to the state store only after the snapshot completes, so transactions that commit while the snapshot runs are picked up when log reading begins.

Initial snapshot

Loading diagram...

Snapshots run in parallel at two levels.

  • Across tables, many tables are copied at once.
  • Within a table, large tables are split into chunks copied in parallel across all available cores, so a single large table cannot stall the rest of the snapshot. Where the source exposes physical row locations (ctid in Postgres, ROWID in Oracle), chunks follow disk pages and reads stay sequential.

Each chunk streams through the pipeline as record batches, so memory use stays bounded no matter how large the table.

Continuous CDC

After the snapshot, the connector reads the log from the consistent point, decodes entries into inserts, updates, and deletes, and groups them by transaction. Changes accumulate in the buffer and flush to the target on an interval controlled by flush_interval_ms, batching multiple source transactions into each load.

Target loading

Each target ingests data in the form it loads fastest.

  • Warehouses (Snowflake, Databricks, ClickHouse) receive columnar files such as Parquet. When the warehouse can load directly from object storage, Supermetal points it at files already staged in the buffer rather than moving data through the Supermetal process again.
  • Databases (Postgres, MySQL, SQL Server) receive binary data through their bulk load APIs.

Where the target supports transactions, Supermetal applies changes within them, preserving source consistency. Supermetal adds two metadata columns to every target table, a deletion tombstone and a version derived from the source transaction commit timestamp, and uses the version to deduplicate changes during merge and upsert.

State management

Replication state is checkpoint metadata, such as source log positions, not the replicated data itself. It lives on disk and is updated at transaction boundaries from source to target. After a crash or restart, a connector resumes from the last transaction delivered to the target and reprocesses only data that was still in flight.

Correctness

Supermetal is tested with two frameworks. Both run real source and target databases and assert that the replicated data matches the source exactly.

  • Sqllogictest suites, the test format used by query engines such as DuckDB, ClickHouse, and DataFusion, adapted for replication, cover DML, data types, transactional integrity, schema evolution, crash recovery, and deduplication.
  • Type fuzzing sends randomized values for every source type through the pipeline, and each must arrive at the target with value and precision intact.

Last updated on

On this page