Kafka

Apache Kafka is a distributed event streaming platform for high-performance data pipelines.

This guide walks you through configuring Kafka to work seamlessly with Supermetal.

Protobuf & Apicurio support

Protobuf serialization and Apicurio Schema Registry are not yet supported in the Kafka target. Use JSON or Avro, and choose Confluent for Schema Registry.


Features

FeatureDetails
Message Formats

Serialization

Schema Registry

Topic Management

Schema Changes

Transactions

Deletes


Prerequisites

Before you begin, ensure you have:

  • A Kafka cluster that Supermetal can reach.
  • A set of Kafka credentials (or a service account) that can produce to your target topics.
  • (Optional) A Schema Registry if you plan to use Avro.

Setup

Configure Kafka Connection

In the Kafka target form, configure:

  • Brokers: a comma-separated list of bootstrap brokers (for example, broker1:9092, broker2:9092).
  • Security Protocol and SASL Mechanism (if required by your cluster).
  • Username and Password for SASL mechanisms that require credentials.
  • SSL CA / Certificate / Key paths when using Ssl or SaslSsl.

SSL file paths must be available to Supermetal

If you reference certificate or key files, make sure they exist in the Supermetal runtime environment (for example, mount them as Kubernetes secrets or Docker volumes).

Configure Topics

Set Topic Name Template to control where each table is written.

The default template is:

  • supermetal.{database_or_schema}.{table}

Optionally configure Topic Options (partitions, replication factor, and topic config) if you want Supermetal to create missing topics.

If you plan to use automatic topic creation, make sure your Kafka principal is allowed to create topics. Otherwise, pre-create the topics that match your template.

Existing topics are not modified

Topic options are only applied when Supermetal creates a topic. If a topic already exists, its configuration is left unchanged.

Select Message Format and Serialization

Configure:

  • Format Type: Debezium or Supermetal
  • Value SerDe: Json (no registry required) or Avro (requires Schema Registry)
  • Key SerDe (optional): configure if you need structured keys or if your tables have composite primary keys

If you choose Avro, configure the Value Schema Registry (and Key Schema Registry if you also use Avro/Protobuf for keys).

Protobuf & Apicurio

Protobuf serialization and Apicurio Registry are not yet supported in the Kafka target.

Configure Optional Behaviors

Depending on your needs, configure:

  • Tombstones on Delete: emit an extra null value record after deletes (requires a key)
  • Schema Changes: enable schema change events and choose dedicated vs same-topic routing
  • Transactions: enable to wrap snapshot batches and CDC writes in Kafka transactions
  • Producer Preset and tuning knobs (compression, batch size, linger, retries) to match your workload

Transactions require acks=All and idempotence

Transactions automatically enforce idempotence and acks=All. These settings are only relevant for non-transactional producers.


Topic Naming

Supermetal writes each table to a Kafka topic derived from Topic Name Template.

The default template is:

  • supermetal.{database_or_schema}.{table}

Supported placeholders:

PlaceholderResolves to
{database}The source database name (when available)
{schema}The source schema/namespace name (when available)
{table}The table name
{database_or_schema}Database if present, otherwise schema

Supermetal also normalizes topic names by collapsing consecutive periods and trimming leading/trailing periods. This keeps topic names clean when a placeholder is empty.

Single-topic routing

You can route multiple tables into a single topic by using a fixed template (for example, supermetal.cdc). If you do this, ensure your consumers can identify the source table.


Message Formats

Debezium format is designed for compatibility with consumers that expect Debezium-style CDC events.

Change events Supermetal emits the standard Debezium operation types:

  • Inserts and snapshots (op = c / r)
  • Updates (op = u)
  • Deletes (op = d)
  • Truncates (op = t)

Schema and payload With JSON serialization, the value can be emitted as:

  • A Debezium payload only (schemas disabled), or
  • A {schema, payload} wrapper (schemas enabled)

This is controlled by Embed Schema in Message when you are using JSON without a Schema Registry.

Headers (optional) Debezium headers can be enabled to simplify routing and debugging:

  • __debezium-operation
  • __debezium-source-schema
  • __debezium-source-table
  • __debezium.oldkey and __debezium.newkey (for primary key updates)

Tombstones on delete (optional) When enabled, Supermetal emits an additional record with the same key and a null value after each delete.

Transaction metadata (optional) When Emit TX Metadata is enabled, Supermetal publishes Debezium-style transaction begin/end events to a separate topic.

For background on the Debezium envelope format, see the Debezium documentation on change event structure: Debezium change event structure.

Snapshot Last Row Mode

Snapshot Last Row Mode is reserved for future snapshot coordination behavior and currently does not change how snapshots are emitted.

Supermetal format is a compact, row-oriented format intended for direct consumption.

Data events Each message value includes the table columns plus two metadata fields:

  • _sm_version: an event version (nanoseconds)
  • _sm_deleted: a soft delete flag

Delete events are emitted as a "soft delete" row with _sm_deleted=true. If Tombstones on Delete is enabled, Supermetal can also emit a Kafka tombstone after the soft delete.

Operational events Truncates and schema changes are emitted as explicit event records (either on a dedicated schema topic or in-band, depending on configuration).

Composite primary keys

If your tables use composite primary keys and you are using the Supermetal format, configure a Key SerDe (JSON or Avro). Composite keys require a configured key serializer.


Serialization

The Kafka target has separate settings for:

  • Value SerDe (required)
  • Key SerDe (optional)

Value SerDe

  • JSON: easiest to start with; can run without a Schema Registry.
  • Avro: requires a Schema Registry.

Key SerDe

If you don't set a Key SerDe, Supermetal uses these defaults:

  • Debezium: emits Debezium-style JSON keys when the table has a primary key.
  • Supermetal:
    • Scalar (single-column) primary keys are encoded directly into the Kafka message key (bytes).
    • Composite primary keys require a configured Key SerDe.

For scalar primary keys with no Key SerDe, keys are encoded as:

Primary key typeKey bytes
StringUTF-8 bytes
BinaryRaw bytes
Other scalar typesString representation (UTF-8 bytes)

Tombstones require keys

Tombstones on delete are only emitted when a message key can be produced.


Schema Registry

If you configure a Schema Registry, Supermetal registers schemas and uses schema IDs when serializing messages:

  • Avro always uses the registry.
  • JSON can use the registry (JSON Schema) or run without it.
  • Confluent Schema Registry is supported.

See the Confluent Schema Registry documentation for compatibility settings, subject naming, and client configuration.


Transactions

Kafka transactions are optional. When enabled, Supermetal uses transactions to:

  • Wrap snapshot batches in transactions.
  • Align CDC writes to source transaction boundaries (begin/commit).

Last updated on

On this page