Kafka
Apache Kafka is a distributed event streaming platform for high-performance data pipelines.
This guide walks you through configuring Kafka to work seamlessly with Supermetal.
Protobuf & Apicurio support
Protobuf serialization and Apicurio Schema Registry are not yet supported in the Kafka target. Use JSON or Avro, and choose Confluent for Schema Registry.
Features
| Feature | Details |
|---|---|
| Message Formats | |
| Serialization | |
| Schema Registry | |
| Topic Management | |
| Schema Changes | |
| Transactions | |
| Deletes | |
Prerequisites
Before you begin, ensure you have:
- A Kafka cluster that Supermetal can reach.
- A set of Kafka credentials (or a service account) that can produce to your target topics.
- (Optional) A Schema Registry if you plan to use Avro.
Setup
Configure Kafka Connection
In the Kafka target form, configure:
- Brokers: a comma-separated list of bootstrap brokers (for example, broker1:9092, broker2:9092).
- Security Protocol and SASL Mechanism (if required by your cluster).
- Username and Password for SASL mechanisms that require credentials.
- SSL CA / Certificate / Key paths when using Ssl or SaslSsl.
SSL file paths must be available to Supermetal
If you reference certificate or key files, make sure they exist in the Supermetal runtime environment (for example, mount them as Kubernetes secrets or Docker volumes).
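Before saving the form, it can help to sanity-check the values you plan to enter. The sketch below is not part of Supermetal: parse_brokers and missing_files are hypothetical helper names, and the file paths are placeholders. It splits a comma-separated broker list into host/port pairs and reports certificate or key paths that are not readable in the environment where it runs.

```python
import os


def parse_brokers(brokers: str) -> list[tuple[str, int]]:
    """Split 'host:port, host:port' into (host, port) pairs."""
    pairs = []
    for entry in brokers.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate a trailing comma
        host, sep, port = entry.rpartition(":")
        if not sep or not host or not port.isdigit():
            raise ValueError(f"expected host:port, got {entry!r}")
        pairs.append((host, int(port)))
    return pairs


def missing_files(paths: list[str]) -> list[str]:
    """Return the paths that are not readable files in this environment."""
    return [p for p in paths if not (os.path.isfile(p) and os.access(p, os.R_OK))]


print(parse_brokers("broker1:9092, broker2:9092"))
# Placeholder paths; substitute the ones you enter in the form.
print(missing_files(["/etc/kafka/ca.pem", "/etc/kafka/client.key"]))
```

To be meaningful, the path check has to run inside the same container or pod as Supermetal, since that is where the files must exist.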
Configure Topics
Set Topic Name Template to control where each table is written.
The default template is:
supermetal.{database_or_schema}.{table}
Optionally configure Topic Options (partitions, replication factor, and topic config) if you want Supermetal to create missing topics.
If you plan to use automatic topic creation, make sure your Kafka principal is allowed to create topics. Otherwise, pre-create the topics that match your template.
Existing topics are not modified
Topic options are only applied when Supermetal creates a topic. If a topic already exists, its configuration is left unchanged.
Select Message Format and Serialization
Configure:
- Format Type: Debezium or Supermetal
- Value SerDe: Json (no registry required) or Avro (requires Schema Registry)
- Key SerDe (optional): configure if you need structured keys or if your tables have composite primary keys
If you choose Avro, configure the Value Schema Registry (and the Key Schema Registry if you also use Avro for keys).
Protobuf & Apicurio
Protobuf serialization and Apicurio Registry are not yet supported in the Kafka target.
Configure Optional Behaviors
Depending on your needs, configure:
- Tombstones on Delete: emit an extra null value record after deletes (requires a key)
- Schema Changes: enable schema change events and choose dedicated vs same-topic routing
- Transactions: enable to wrap snapshot batches and CDC writes in Kafka transactions
- Producer Preset and tuning knobs (compression, batch size, linger, retries) to match your workload
Transactions require acks=All and idempotence
When transactions are enabled, Supermetal enforces idempotence and acks=All automatically. Setting those two options yourself only matters for non-transactional producers.
Topic Naming
Supermetal writes each table to a Kafka topic derived from Topic Name Template.
The default template is:
supermetal.{database_or_schema}.{table}
Supported placeholders:
| Placeholder | Resolves to |
|---|---|
| {database} | The source database name (when available) |
| {schema} | The source schema/namespace name (when available) |
| {table} | The table name |
| {database_or_schema} | Database if present, otherwise schema |
Supermetal also normalizes topic names by collapsing consecutive periods and trimming leading/trailing periods. This keeps topic names clean when a placeholder is empty.
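To predict which topic a table will land in (for example, when pre-creating topics), the placeholder and normalization rules can be approximated as below. This is a minimal sketch rather than Supermetal's implementation: render_topic is a hypothetical name, and it assumes that an unavailable placeholder resolves to an empty string.

```python
import re


def render_topic(template, table, database=None, schema=None):
    """Expand the placeholders, then collapse and trim periods."""
    values = {
        "database": database or "",
        "schema": schema or "",
        "table": table,
        "database_or_schema": database or schema or "",
    }
    name = template
    for key, value in values.items():
        name = name.replace("{" + key + "}", value)
    name = re.sub(r"\.{2,}", ".", name)  # collapse consecutive periods
    return name.strip(".")  # trim leading/trailing periods


print(render_topic("supermetal.{database_or_schema}.{table}", "orders", database="shop"))
print(render_topic("supermetal.{database}.{schema}.{table}", "orders", schema="public"))
```

In the second call the database is empty, so the doubled period left by the placeholder is collapsed.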
Single-topic routing
You can route multiple tables into a single topic by using a fixed template (for example, supermetal.cdc).
If you do this, ensure your consumers can identify the source table.
Message Formats
Debezium format is designed for compatibility with consumers that expect Debezium-style CDC events.
Change events
Supermetal emits the standard Debezium operation types:
- Inserts and snapshots (op = c/r)
- Updates (op = u)
- Deletes (op = d)
- Truncates (op = t)
Schema and payload
With JSON serialization, the value can be emitted as:
- A Debezium payload only (schemas disabled), or
- A {schema, payload} wrapper (schemas enabled)
This is controlled by Embed Schema in Message when you are using JSON without a Schema Registry.
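On the consumer side, both JSON shapes can be handled by one small function. The sketch below assumes the standard Debezium envelope fields (op, before, after), a single key column named id (a placeholder), and that delete events carry the key column in before. unwrap and apply_event are hypothetical names, not Supermetal APIs.

```python
import json


def unwrap(value):
    """Return the Debezium payload from a JSON value, or None for a tombstone."""
    if value is None:
        return None
    event = json.loads(value)
    # With schemas enabled the value is {"schema": ..., "payload": ...}.
    if isinstance(event, dict) and "schema" in event and "payload" in event:
        return event["payload"]
    return event


def apply_event(table, payload, key_column="id"):
    """Apply one change event to an in-memory dict keyed by primary key."""
    op = payload["op"]
    if op in ("c", "r", "u"):  # insert, snapshot read, update
        row = payload["after"]
        table[row[key_column]] = row
    elif op == "d":  # delete
        table.pop(payload["before"][key_column], None)
    elif op == "t":  # truncate
        table.clear()


table = {}
apply_event(table, unwrap('{"op": "c", "before": null, "after": {"id": 1, "name": "a"}}'))
print(table)
```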
Headers (optional)
Debezium headers can be enabled to simplify routing and debugging:
- __debezium-operation
- __debezium-source-schema
- __debezium-source-table
- __debezium.oldkey and __debezium.newkey (for primary key updates)
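Headers let a consumer route a record without parsing its value. As a rough illustration, the sketch below treats Kafka headers as a list of (name, bytes) pairs, which is how most Python clients expose them, and assumes the header values are UTF-8 text. header_map and route are hypothetical helper names.

```python
def header_map(headers):
    """Turn Kafka headers, a list of (name, bytes) pairs, into a dict of strings."""
    result = {}
    for name, raw in headers or []:
        result[name] = raw.decode("utf-8") if raw is not None else None
    return result


def route(headers, handlers, default=None):
    """Pick a handler by source table, using the __debezium-source-table header."""
    table = header_map(headers).get("__debezium-source-table")
    return handlers.get(table, default)


headers = [("__debezium-operation", b"c"), ("__debezium-source-table", b"orders")]
print(route(headers, {"orders": "orders_handler"}, default="fallback_handler"))
```

This is most useful with single-topic routing, where several tables share one topic.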
Tombstones on delete (optional)
When enabled, Supermetal emits an additional record with the same key and a null value after each delete.
Transaction metadata (optional)
When Emit TX Metadata is enabled, Supermetal publishes Debezium-style transaction begin/end events to a separate topic.
For background on the Debezium envelope format, see the Debezium documentation on change event structure.
Snapshot Last Row Mode
Snapshot Last Row Mode is reserved for future snapshot coordination behavior and currently does not change how snapshots are emitted.
Supermetal format is a compact, row-oriented format intended for direct consumption.
Data events
Each message value includes the table columns plus two metadata fields:
- _sm_version: an event version (nanoseconds)
- _sm_deleted: a soft delete flag
Delete events are emitted as a "soft delete" row with _sm_deleted=true. If Tombstones on Delete is enabled,
Supermetal can also emit a Kafka tombstone after the soft delete.
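A consumer can materialize this format with a small amount of state. The following is a sketch under stated assumptions, not a reference consumer: values are JSON, a larger _sm_version means a newer event, and the message key identifies the row. apply_record and live_rows are hypothetical names.

```python
import json


def apply_record(state, key, value):
    """Apply one Supermetal-format record to an in-memory table keyed by message key."""
    if value is None:
        # Kafka tombstone: forget the key entirely.
        state.pop(key, None)
        return
    row = json.loads(value)
    current = state.get(key)
    # Assumption: a larger _sm_version means a newer event, so skip stale rows.
    if current is not None and current["_sm_version"] > row["_sm_version"]:
        return
    state[key] = row


def live_rows(state):
    """Rows that have not been soft-deleted."""
    return {k: r for k, r in state.items() if not r["_sm_deleted"]}


state = {}
apply_record(state, b"1", '{"id": 1, "_sm_version": 100, "_sm_deleted": false}')
apply_record(state, b"1", '{"id": 1, "_sm_version": 200, "_sm_deleted": true}')
print(live_rows(state))
```

Keeping the soft-deleted row until a tombstone arrives lets the version check reject older events that show up late.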
Operational events
Truncates and schema changes are emitted as explicit event records (either on a dedicated schema topic or in-band, depending on configuration).
Composite primary keys
If your tables use composite primary keys and you are using the Supermetal format, configure a Key SerDe (JSON or Avro). Composite keys require a configured key serializer.
Serialization
The Kafka target has separate settings for:
- Value SerDe (required)
- Key SerDe (optional)
Value SerDe
- JSON: easiest to start with; can run without a Schema Registry.
- Avro: requires a Schema Registry.
Key SerDe
If you don't set a Key SerDe, Supermetal uses these defaults:
- Debezium: emits Debezium-style JSON keys when the table has a primary key.
- Supermetal:
- Scalar (single-column) primary keys are encoded directly into the Kafka message key (bytes).
- Composite primary keys require a configured Key SerDe.
For scalar primary keys with no Key SerDe, keys are encoded as:
| Primary key type | Key bytes |
|---|---|
| String | UTF-8 bytes |
| Binary | Raw bytes |
| Other scalar types | String representation (UTF-8 bytes) |
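If a consumer or test needs to compute the expected key for a row, the table above translates roughly into the function below. It is an approximation: encode_scalar_key is a hypothetical name, and using Python's str() for other scalar types is an assumption, since the exact string representation for types such as decimals or timestamps is not specified here.

```python
def encode_scalar_key(value) -> bytes:
    """Encode a single-column primary key value as Kafka key bytes."""
    if isinstance(value, (bytes, bytearray)):
        return bytes(value)  # binary: raw bytes
    if isinstance(value, str):
        return value.encode("utf-8")  # string: UTF-8 bytes
    # Other scalars: string representation as UTF-8 (assumed to match str()).
    return str(value).encode("utf-8")


print(encode_scalar_key("order-17"))
print(encode_scalar_key(42))
```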
Tombstones require keys
Tombstones on delete are only emitted when a message key can be produced.
Schema Registry
If you configure a Schema Registry, Supermetal registers schemas and uses schema IDs when serializing messages:
- Avro always uses the registry.
- JSON can use the registry (JSON Schema) or run without it.
- Confluent Schema Registry is supported.
See the Confluent Schema Registry documentation for compatibility settings, subject naming, and client configuration.
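Confluent serializers frame each message as a zero magic byte, a 4-byte big-endian schema ID, and then the encoded payload. Assuming Supermetal's registry-backed output follows that standard wire format (an inference from its use of Confluent Schema Registry, not something stated above), the schema ID can be inspected like this. split_confluent_frame is a hypothetical helper name.

```python
import struct

MAGIC_BYTE = 0  # first byte of the Confluent wire format


def split_confluent_frame(data: bytes) -> tuple[int, bytes]:
    """Return (schema_id, payload) from a Confluent-framed message."""
    if len(data) < 5:
        raise ValueError("message too short for a Confluent frame")
    magic, schema_id = struct.unpack(">bI", data[:5])
    if magic != MAGIC_BYTE:
        raise ValueError(f"unexpected magic byte {magic}")
    return schema_id, data[5:]


frame = b"\x00" + (42).to_bytes(4, "big") + b"avro-bytes"
print(split_confluent_frame(frame))
```

In practice a registry-aware deserializer does this for you; the sketch is mainly useful for debugging raw messages.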
Transactions
Kafka transactions are optional. When enabled, Supermetal uses transactions to:
- Wrap snapshot batches in transactions.
- Align CDC writes to source transaction boundaries (begin/commit).
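Transactional writes only help downstream if consumers read committed data. In Kafka this is controlled by the consumer's isolation.level setting, and the default differs between clients, so it is safer to set it explicitly. The sketch below builds a consumer configuration using standard librdkafka-style property names; consumer_config, the group name, and the broker address are placeholders, and passing the dict to a client such as confluent-kafka is an assumption about your consumer stack.

```python
def consumer_config(brokers: str, group_id: str) -> dict:
    """Consumer properties for reading only committed transactional records."""
    return {
        "bootstrap.servers": brokers,
        "group.id": group_id,
        # Hide records from open or aborted transactions.
        "isolation.level": "read_committed",
        "enable.auto.commit": False,
        "auto.offset.reset": "earliest",
    }


print(consumer_config("broker1:9092,broker2:9092", "orders-consumer"))
```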