MySQL
Supermetal replicates from MySQL using the binary log (binlog) for change data capture and parallel snapshots for initial loads.
Prerequisites
- MySQL 5.7 or later, self hosted or managed (Amazon RDS, Azure Database for MySQL).
- Binary logging enabled with ROW format. A retention period of at least 3 days is recommended.
- A user with SELECT, REPLICATION SLAVE, and REPLICATION CLIENT privileges.
- Network connectivity from the Supermetal agent to the MySQL server (default port 3306).
Replication modes
Supermetal detects the replication mode at startup.
- GTID mode. When gtid_mode=ON, position tracking uses Global Transaction Identifiers. Recommended where possible, since it survives failover to a different server.
- File+position mode. Tracks the binlog file name and byte offset. Use when GTID cannot be enabled on the server.
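To see in advance which mode applies to a given server, the two statements below can be run by hand. This is only a sketch of a manual check; the source does not describe how Supermetal itself performs detection.

```sql
-- Returns ON, OFF, ON_PERMISSIVE, or OFF_PERMISSIVE.
SELECT @@GLOBAL.gtid_mode AS gtid_mode;

-- Shows the current binlog file name and byte offset, plus
-- Executed_Gtid_Set when GTID is enabled.
-- Requires the REPLICATION CLIENT privilege.
-- On MySQL 8.4 and later the statement is SHOW BINARY LOG STATUS.
SHOW MASTER STATUS;
```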
Setup
Supermetal requires a dedicated MySQL user with permissions to read data and access the binary log.
| Permission | Purpose |
|---|---|
| SELECT | Read data from tables during snapshot and CDC |
| REPLICATION SLAVE | Access the binary log for CDC |
| REPLICATION CLIENT | Query server status and replication position |
Self hosted MySQL setup
Enable binary logging
Edit your MySQL configuration file (my.cnf or my.ini):
[mysqld]
# Server identification (must be unique in your infrastructure)
server-id = 223344
# Binary logging (required)
log_bin = mysql-bin
binlog_format = ROW
binlog_row_image = FULL
binlog_row_metadata = FULL
# GTID (recommended, not required)
gtid_mode = ON
enforce_gtid_consistency = ON
# Binary log retention (3 days)
binlog_expire_logs_seconds = 259200
Configuration parameters
- server-id: must be unique for each server in your replication topology. Any value from 1 to 4294967295.
- log_bin: base name for binary log files.
- binlog_format: must be ROW.
- binlog_row_image: must be FULL to capture complete before and after row images.
- binlog_row_metadata: must be FULL to include column metadata in binary log events. This variable exists in MySQL 8.0.1 and later.
- gtid_mode and enforce_gtid_consistency: recommended for failover support. If you cannot enable GTID (for example, other CDC tools on the same server depend on non GTID mode), Supermetal uses file+position tracking instead.
- binlog_expire_logs_seconds: how long binary logs are kept before automatic deletion. Available in MySQL 8.0.1 and later; MySQL 5.7 uses expire_logs_days instead.
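The retention value in the configuration above is three days expressed in seconds. The arithmetic, along with a way to list the binary log files the server currently holds, looks like this (a small sketch for manual checking):

```sql
-- 3 days * 24 hours * 60 minutes * 60 seconds
SELECT 3 * 24 * 60 * 60 AS retention_seconds;  -- 259200

-- Lists the binary log files currently kept on the server.
-- Requires the REPLICATION CLIENT privilege.
SHOW BINARY LOGS;
```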
Create a dedicated user for replication
Connect to MySQL as a privileged user and run:
CREATE USER 'supermetal_user'@'%' IDENTIFIED BY 'strong-password';
GRANT SELECT, REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'supermetal_user'@'%';
FLUSH PRIVILEGES;
Script variables
Replace supermetal_user and strong-password with your own values and store the password securely.
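To confirm the grants took effect, the account's privileges can be listed. The user name below is the placeholder from the script above; substitute your own.

```sql
-- The output should include SELECT, REPLICATION SLAVE,
-- and REPLICATION CLIENT on *.*
SHOW GRANTS FOR 'supermetal_user'@'%';
```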
Verify configuration
SHOW VARIABLES LIKE 'log_bin';
SHOW VARIABLES LIKE 'binlog_format';
SHOW VARIABLES LIKE 'binlog_row_image';
SHOW VARIABLES LIKE 'binlog_row_metadata';
SHOW VARIABLES LIKE 'server_id';
SHOW VARIABLES LIKE 'gtid_mode';
SHOW VARIABLES LIKE 'enforce_gtid_consistency';
Required values:
- log_bin: ON
- binlog_format: ROW
- binlog_row_image: FULL
- binlog_row_metadata: FULL
- server_id: a non zero value
GTID values (if enabled):
- gtid_mode: ON
- enforce_gtid_consistency: ON
If gtid_mode is OFF, Supermetal uses file+position replication automatically.
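As an alternative to running each statement separately, the same variables can be fetched in one result set. This is a convenience sketch; a variable the server version does not have (for example binlog_row_metadata on MySQL 5.7) is simply absent from the output.

```sql
SHOW GLOBAL VARIABLES
WHERE Variable_name IN (
  'log_bin', 'binlog_format', 'binlog_row_image', 'binlog_row_metadata',
  'server_id', 'gtid_mode', 'enforce_gtid_consistency'
);
```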
Amazon RDS MySQL setup
Create a parameter group
- In the AWS RDS console, navigate to "Parameter groups"
- Click "Create parameter group"
- Select the MySQL engine and version that matches your RDS instance
- Enter a name and description (e.g., "supermetal-cdc-params")
- Click "Create"
- Select the newly created parameter group and modify the following parameters:
  - binlog_format: ROW
  - binlog_row_image: FULL
  - binlog_row_metadata: FULL
  - gtid-mode: ON (recommended, not required)
  - enforce_gtid_consistency: ON (only if gtid-mode is ON)
RDS limitations
In RDS, server-id is automatically assigned and cannot be modified.
Apply the parameter group to your RDS instance
- Navigate to your RDS instances and select your MySQL instance
- Click "Modify"
- Under "Additional configuration", select the parameter group you created
- Choose "Apply immediately" or schedule for your next maintenance window
- Click "Continue" and confirm the changes
Set the binary log retention period
Connect to your RDS MySQL instance and run:
CALL mysql.rds_set_configuration('binlog retention hours', 72);
RDS specific
Amazon RDS uses the binlog retention hours setting instead of the standard binlog_expire_logs_seconds parameter. By default, RDS does not retain binary logs and purges them as soon as possible, so this step is essential.
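To check that the retention setting was stored, RDS provides a companion procedure that prints the current configuration:

```sql
-- Lists RDS configuration values, including 'binlog retention hours'.
CALL mysql.rds_show_configuration;
```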
Create a user for replication
CREATE USER 'supermetal_user'@'%' IDENTIFIED BY 'strong-password';
GRANT SELECT, REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'supermetal_user'@'%';
FLUSH PRIVILEGES;
Script variables
Replace supermetal_user and strong-password with your own values and store the password securely.
Azure Database for MySQL setup
Configure server parameters
- In the Azure portal, navigate to your MySQL server resource
- Select "Server parameters" under the Settings section
- Find and set the following parameters:
  - binlog_format: ROW
  - binlog_row_image: FULL
  - binlog_row_metadata: FULL
  - gtid_mode: ON (recommended, not required)
  - enforce_gtid_consistency: ON (only if gtid_mode is ON)
- Click "Save" to apply the changes
- Restart your Azure Database for MySQL server if prompted
Azure limitations
Some parameters might be read only in Azure Database for MySQL, particularly in Flexible Server configurations. The server-id is automatically assigned by Azure and cannot be modified.
Create a user for replication
Connect to your Azure MySQL instance and run:
CREATE USER 'supermetal_user'@'%' IDENTIFIED BY 'strong-password';
GRANT SELECT, REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'supermetal_user'@'%';
FLUSH PRIVILEGES;
Script variables
Replace supermetal_user and strong-password with your own values and store the password securely.
Network security
Ensure your Azure MySQL server's firewall rules allow connections from the Supermetal agent.
Data Types Mapping
| MySQL Type | Arrow Type | Notes |
|---|---|---|
| TINYINT(1) | Boolean | By default. Disable the TINYINT(1) as boolean option to keep integers when columns hold values outside 0 and 1. |
| TINYINT | Int8 / UInt8 | Signed or unsigned per column definition |
| SMALLINT | Int16 / UInt16 | Signed or unsigned per column definition |
| MEDIUMINT | Int32 / UInt32 | Signed or unsigned per column definition |
| INT | Int32 / UInt32 | Signed or unsigned per column definition |
| BIGINT | Int64 / UInt64 | Signed or unsigned per column definition |
| FLOAT | Float32 | |
| DOUBLE | Float64 | |
| DECIMAL | Decimal128 / Decimal256 / Utf8 | Decimal128 for precision ≤ 38, Decimal256 for precision ≤ 76, Utf8 beyond |
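Before relying on the default TINYINT(1) to Boolean mapping, it can help to find the affected columns and check whether any hold values other than 0 and 1. The queries below are a sketch; my_db, my_table, and my_flag are hypothetical names to replace with your own.

```sql
-- List TINYINT(1) columns outside the system schemas.
SELECT table_schema, table_name, column_name
FROM information_schema.columns
WHERE column_type LIKE 'tinyint(1)%'
  AND table_schema NOT IN ('mysql', 'sys', 'information_schema', 'performance_schema');

-- Count rows in one such column that would not fit a Boolean.
-- my_db.my_table and my_flag are placeholders.
SELECT COUNT(*) AS non_boolean_rows
FROM my_db.my_table
WHERE my_flag NOT IN (0, 1);
```

A non zero count suggests disabling the TINYINT(1) as boolean option for that source.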
| MySQL Type | Arrow Type | Notes |
|---|---|---|
| CHAR | Utf8 | |
| VARCHAR | Utf8 | |
| TEXT | Utf8 | All TEXT variants (TINYTEXT, TEXT, MEDIUMTEXT, LONGTEXT) |
| ENUM | Utf8 | Enum values preserved as strings |
| SET | List<Utf8> | Converted to arrays of strings ('value1,value2' becomes ["value1", "value2"]) |
| JSON | Utf8 | Preserved as a JSON string |
| MySQL Type | Arrow Type | Notes |
|---|---|---|
| BINARY | Binary | |
| VARBINARY | Binary | |
| BLOB | Binary | All BLOB variants (TINYBLOB, BLOB, MEDIUMBLOB, LONGBLOB) |
| BIT(1) | Boolean | MySQL constrains its storage to one bit |
| BIT(n) | Utf8 | Binary string representation (BIT(4) value 5 becomes "0101") |
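The BIT(n) text form in the table can be reproduced in MySQL to preview what a value will look like downstream. This only illustrates the zero padded binary string format; it is not a description of Supermetal's implementation.

```sql
-- BIN(5) yields '101'; padding to the column width of 4 gives '0101'.
SELECT LPAD(BIN(5), 4, '0') AS bit4_as_text;
```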
| MySQL Type | Arrow Type | Notes |
|---|---|---|
| DATE | Date32 | |
| TIME | IntervalDayTime / Utf8 | IntervalDayTime for precision ≤ 3 (milliseconds), Utf8 beyond |
| DATETIME | Timestamp(µs) | Microsecond precision, timezone naive |
| TIMESTAMP | Timestamp(µs) | Microsecond precision, stored in UTC internally by MySQL |
| YEAR | UInt16 | |
| MySQL Type | Arrow Type |
|---|---|
| GEOMETRY | Unsupported |
| POINT | Unsupported |
| LINESTRING | Unsupported |
| POLYGON | Unsupported |
| MULTIPOINT | Unsupported |
| MULTILINESTRING | Unsupported |
| MULTIPOLYGON | Unsupported |
| GEOMCOLLECTION | Unsupported |
| VECTOR | Unsupported |
Columns with unsupported data types are excluded from replication. Tables containing only unsupported column types cannot be replicated.
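To find out ahead of time which columns would be excluded, information_schema can be queried for the unsupported types. This is a sketch to adapt; both spellings of the geometry collection type are listed because the name the server reports can differ by MySQL version.

```sql
SELECT table_schema, table_name, column_name, data_type
FROM information_schema.columns
WHERE data_type IN (
    'geometry', 'point', 'linestring', 'polygon',
    'multipoint', 'multilinestring', 'multipolygon',
    'geomcollection', 'geometrycollection', 'vector'
  )
  AND table_schema NOT IN ('mysql', 'sys', 'information_schema', 'performance_schema')
ORDER BY table_schema, table_name, column_name;
```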
Changelog
0.1.7
2026-06-16
Snapshot chunking now interpolates datetime, date, and decimal primary keys.
SSH tunnel errors now surface during validation.
0.1.5
2026-06-08
MySQL 5.7 is now supported. 5.7 reached end-of-life in October 2023.
0.1.2
2026-05-21
Source connections can route through an SSH tunnel to a bastion host.