MySQL

MySQL's binary log (binlog) provides a reliable foundation for Change Data Capture and data replication. Supermetal supports two replication modes, auto-detected at startup:

  • GTID mode — when gtid_mode=ON, uses Global Transaction Identifiers for position tracking. Recommended when possible, as it supports failover to a different server.
  • File+position mode — when GTID is not enabled, tracks position using binlog file name and byte offset. Works with any MySQL 8.0+ server that has binary logging enabled.

This guide covers the features, prerequisites, and configuration steps required to connect MySQL with Supermetal.


Features

FeatureNotes
Initial Data Sync

Change Data Capture

Schema Evolution

Transactional Consistency

Catalog Support


Prerequisites

Before you begin, ensure you have:

  • MySQL Requirements:

    • Version: MySQL 8.0 or later
    • Supported Deployments:
      • Self-Hosted MySQL
      • Amazon RDS for MySQL
      • Azure Database for MySQL

    Replication Mode

    Supermetal auto-detects the replication mode at startup. If gtid_mode=ON, it uses GTID-based tracking. Otherwise, it falls back to binlog file+position tracking. GTID is recommended for failover support but not required.

  • Binary Log Requirements:

    • Binary Log Format: Must be set to ROW format
    • Binary Log Retention: Configure appropriate retention period (recommended: at least 3 days)
  • Deployment Requirements:

    • Database Permissions: User with SELECT, REPLICATION SLAVE, and REPLICATION CLIENT privileges
    • Network Connectivity: Ensure Supermetal can reach your MySQL server (default port: 3306)

Setup

Permissions Overview

Supermetal requires a dedicated MySQL user with appropriate permissions to read data and access the binary log for CDC.

Required PermissionPurpose
SELECTRead data from tables during snapshot and CDC
REPLICATION SLAVEAccess the binary log for CDC
REPLICATION CLIENTQuery server status and replication position

Setup Instructions

Self-Hosted MySQL Setup

Enable Binary Logging

Edit your MySQL configuration file (my.cnf or my.ini):

[mysqld]
# Server identification (must be unique in your infrastructure)
server-id = 223344

# Binary logging (required)
log_bin = mysql-bin
binlog_format = ROW
binlog_row_image = FULL
binlog_row_metadata = FULL
   
# GTID (recommended, not required)
gtid_mode = ON
enforce_gtid_consistency = ON
   
# Binary log retention (3 days)
binlog_expire_logs_seconds = 259200

Configuration Parameters

  • server-id: Must be unique for each server in your replication topology. Can be any non-zero value between 1 and 4294967295.
  • log_bin: Base name for binary log files.
  • binlog_format: Must be ROW.
  • binlog_row_image: Must be FULL to capture complete before/after row images.
  • binlog_row_metadata: Must be FULL to include column metadata in binary log events.
  • gtid_mode and enforce_gtid_consistency: Recommended for failover support. If you cannot enable GTID (e.g., other CDC tools on the same server depend on non-GTID mode), Supermetal will use file+position tracking instead.
  • binlog_expire_logs_seconds: How long binary logs are kept before automatic deletion.

Create a Dedicated User for Replication

Connect to MySQL as a privileged user and run:

CREATE USER 'supermetal_user'@'%' IDENTIFIED BY 'strong-password';
GRANT SELECT, REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'supermetal_user'@'%';
FLUSH PRIVILEGES;

Script Variables

Replace the following variables with your own values:

  • supermetal_user: Username for Supermetal access
  • strong-password: Secure password for the Supermetal user

Verify Configuration

Verify your configuration:

SHOW VARIABLES LIKE 'log_bin';
SHOW VARIABLES LIKE 'binlog_format';
SHOW VARIABLES LIKE 'binlog_row_image';
SHOW VARIABLES LIKE 'binlog_row_metadata';
SHOW VARIABLES LIKE 'server_id';
SHOW VARIABLES LIKE 'gtid_mode';

Required values:

  • log_bin: ON
  • binlog_format: ROW
  • binlog_row_image: FULL
  • binlog_row_metadata: FULL
  • server_id: A non-zero value

GTID values (if enabled):

  • gtid_mode: ON
  • enforce_gtid_consistency: ON

If gtid_mode is OFF, Supermetal will use file+position replication automatically.

Amazon RDS MySQL Setup

Create a Parameter Group

  1. In the AWS RDS console, navigate to "Parameter groups"
  2. Click "Create parameter group"
  3. Select the MySQL engine and version that matches your RDS instance
  4. Enter a name and description (e.g., "supermetal-cdc-params")
  5. Click "Create"
  6. Select the newly created parameter group and modify the following parameters:
    • binlog_format: ROW
    • binlog_row_image: FULL
    • binlog_row_metadata: FULL
    • gtid-mode: ON (recommended, not required)
    • enforce_gtid_consistency: ON (only if gtid-mode is ON)

RDS Limitations

In RDS, server-id is automatically assigned and cannot be modified.

Apply the Parameter Group to Your RDS Instance

  1. Navigate to your RDS instances and select your MySQL instance
  2. Click "Modify"
  3. Under "Additional configuration", select the parameter group you created
  4. Choose "Apply immediately" or schedule for your next maintenance window
  5. Click "Continue" and confirm the changes

Set Binary Log Retention Period

Connect to your RDS MySQL instance and run:

CALL mysql.rds_set_configuration('binlog retention hours', 72);

RDS Specific

Amazon RDS uses binlog retention hours instead of the standard binlog_expire_logs_seconds parameter. The default value is 0 (immediate removal), so this step is essential.

Create a User for Replication

CREATE USER 'supermetal_user'@'%' IDENTIFIED BY 'strong-password';
GRANT SELECT, REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'supermetal_user'@'%';
FLUSH PRIVILEGES;

Script Variables

Replace the following variables with your own values:

  • supermetal_user: Username for Supermetal access
  • strong-password: Secure password for the Supermetal user

Azure Database for MySQL Setup

Configure Server Parameters

  1. In the Azure portal, navigate to your MySQL server resource
  2. Select "Server parameters" under the Settings section
  3. Find and set the following parameters:
    • binlog_format: ROW
    • binlog_row_image: FULL
    • binlog_row_metadata: FULL
    • gtid_mode: ON (recommended, not required)
    • enforce_gtid_consistency: ON (only if gtid_mode is ON)
  4. Click "Save" to apply the changes
  5. Restart your Azure Database for MySQL server if prompted

Azure Limitations

Some parameters might be read-only in Azure Database for MySQL, particularly in Flexible Server configurations. The server-id is automatically assigned by Azure and cannot be modified.

Create a User for Replication

Connect to your Azure MySQL instance and run:

CREATE USER 'supermetal_user'@'%' IDENTIFIED BY 'strong-password';
GRANT SELECT, REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'supermetal_user'@'%';
FLUSH PRIVILEGES;

Script Variables

Replace the following variables with your own values:

  • supermetal_user: Username for Supermetal access
  • strong-password: Secure password for the Supermetal user

Network Security

Ensure your Azure MySQL server's firewall rules allow connections from the Supermetal service IP addresses.


Data Types Mapping

Supermetal automatically maps MySQL data types to Apache Arrow data types according to the following mapping:

MySQL TypeArrow TypeNotes
TINYINTInt8 / UInt8Signed/unsigned determined by column definition
SMALLINTInt16 / UInt16Signed/unsigned determined by column definition
MEDIUMINTInt32 / UInt32Signed/unsigned determined by column definition
INTInt32 / UInt32Signed/unsigned determined by column definition
BIGINTInt64 / UInt64Signed/unsigned determined by column definition
FLOATFloat32
DOUBLEFloat64
DECIMALDecimal128 / Decimal256 / Utf8Decimal128 for precision ≤ 38, Decimal256 for precision ≤ 76, Utf8 for precision > 76
MySQL TypeArrow TypeNotes
CHARUtf8
VARCHARUtf8
TEXTUtf8Includes all TEXT variants (TINYTEXT, TEXT, MEDIUMTEXT, LONGTEXT)
ENUMUtf8Original enum values preserved as strings
SETList<Utf8>Converted to arrays of strings (e.g., 'value1,value2' → ["value1", "value2"])
JSONUtf8Preserved as JSON string
MySQL TypeArrow TypeNotes
BINARYBinary
VARBINARYBinary
BLOBBinaryIncludes all BLOB variants (TINYBLOB, BLOB, MEDIUMBLOB, LONGBLOB)
BITUtf8Converted to binary string representation (e.g., BIT(4) value 5 becomes "0101")
MySQL TypeArrow TypeNotes
DATEDate32
TIMEIntervalDayTime / Utf8IntervalDayTime for precision ≤ 3 (milliseconds), Utf8 for precision > 3 (microseconds)
DATETIMETimestamp(Microsecond)Full microsecond precision without timezone
TIMESTAMPTimestamp(Microsecond)Full microsecond precision, stored in UTC internally by MySQL
YEARUInt16
MySQL TypeArrow Type
GEOMETRYUnsupported
POINTUnsupported
LINESTRINGUnsupported
POLYGONUnsupported
MULTIPOINTUnsupported
MULTILINESTRINGUnsupported
MULTIPOLYGONUnsupported
GEOMCOLLECTIONUnsupported
VECTORUnsupported

Columns with unsupported data types are automatically excluded from replication. Tables containing only unsupported column types cannot be replicated.

Last updated on

On this page