MySQL
MySQL's binary log (binlog) provides a reliable foundation for Change Data Capture and data replication. Supermetal builds on this capability to deliver efficient data extraction with minimal database overhead, leveraging Global Transaction Identifiers (GTIDs) for guaranteed consistency.
This guide covers the features, prerequisites, and configuration steps required to connect MySQL with Supermetal mainly focusing on binlog setup and permissions.
Features
| Feature | Notes | 
|---|---|
| Initial Data Sync | |
| Change Data Capture | |
| Schema Evolution | |
| Transactional Consistency | |
| Catalog Support | 
Prerequisites
Before you begin, ensure you have:
- 
MySQL Requirements:
- Version: MySQL 8.0 or later with GTID mode enabled
 - Supported Deployments:
Self-Hosted MySQLAmazon RDS for MySQL(Coming Soon)Azure Database for MySQL(Coming Soon)
 
GTID Requirement
GTID mode must be enabled on your MySQL server for Change Data Capture functionality. See MySQL GTID Documentation for details.
 - 
Binary Log Requirements:
- Binary Log Format: Must be set to 
ROWformat - Binary Log Retention: Configure appropriate retention period (recommended: at least 3 days)
 
 - Binary Log Format: Must be set to 
 - 
Deployment Requirements:
- Database Permissions: User with 
SELECT,REPLICATION SLAVE, andREPLICATION CLIENTprivileges - Network Connectivity: Ensure Supermetal can reach your MySQL server (default port: 3306)
 
 - Database Permissions: User with 
 
Setup
Permissions Overview
Supermetal requires a dedicated MySQL user with appropriate permissions to read data and access the binary log for CDC.
| Required Permission | Purpose | 
|---|---|
SELECT | Read data from tables during snapshot and CDC | 
REPLICATION SLAVE | Access the binary log for CDC | 
REPLICATION CLIENT | Query server status information including GTID position | 
Setup Instructions
Self-Hosted MySQL Setup
Enable Binary Logging and GTID Mode
Edit your MySQL configuration file (my.cnf or my.ini):
[mysqld]
# Server identification (must be unique in your infrastructure)
server-id = 223344
# Enable binary logging
log_bin = mysql-bin
binlog_format = ROW
binlog_row_image = FULL
binlog_row_metadata = FULL
   
# Enable GTID mode
gtid_mode = ON
enforce_gtid_consistency = ON
   
# Set binary log retention (3 days)
binlog_expire_logs_seconds = 259200Configuration Parameters
server-id: Must be unique for each server in your replication topology. Can be any non-zero value between 1 and 4294967295.log_bin: Base name for binary log files.binlog_format: Must be set to ROW for CDC to work properly.binlog_row_image: Must be set to FULL to capture complete before/after row images, which is required for CDC.binlog_row_metadata: Must be set to FULL to include complete metadata in binary log events.gtid_modeandenforce_gtid_consistency: Required for reliable replication position tracking.binlog_expire_logs_seconds: Controls how long binary logs are retained before automatic deletion.
Create a Dedicated User for Replication
Connect to MySQL as a privileged user and run:
CREATE USER 'supermetal_user'@'%' IDENTIFIED BY 'strong-password';
GRANT SELECT, REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'supermetal_user'@'%';
FLUSH PRIVILEGES;Script Variables
Replace the following variables with your own values:
supermetal_user: Username for Supermetal accessstrong-password: Secure password for the Supermetal user
Verify Configuration
Verify binary logging and GTID configuration:
SHOW VARIABLES LIKE 'gtid_mode';
SHOW VARIABLES LIKE 'enforce_gtid_consistency';
SHOW VARIABLES LIKE 'log_bin';
SHOW VARIABLES LIKE 'binlog_format';
SHOW VARIABLES LIKE 'binlog_row_image';
SHOW VARIABLES LIKE 'binlog_row_metadata';
SHOW VARIABLES LIKE 'server_id';The output should show:
gtid_mode: ONenforce_gtid_consistency: ONlog_bin: ONbinlog_format: ROWbinlog_row_image: FULLbinlog_row_metadata: FULLserver_id: A non-zero value
Amazon RDS MySQL Setup
Create a Parameter Group for GTID Mode
- In the AWS RDS console, navigate to "Parameter groups"
 - Click "Create parameter group"
 - Select the MySQL engine and version that matches your RDS instance
 - Enter a name and description (e.g., "supermetal-cdc-params")
 - Click "Create"
 - Select the newly created parameter group and modify the following parameters:
gtid-mode: ONenforce_gtid_consistency: ONbinlog_format: ROWbinlog_row_image: FULLbinlog_row_metadata: FULL
 
RDS Limitations
In RDS, some parameters cannot be modified, including server-id. RDS automatically assigns a unique server-id to each instance, so you don't need to configure this manually.
Apply the Parameter Group to Your RDS Instance
- Navigate to your RDS instances and select your MySQL instance
 - Click "Modify"
 - Under "Additional configuration", select the parameter group you created
 - Choose "Apply immediately" or schedule for your next maintenance window
 - Click "Continue" and confirm the changes
 
Set Binary Log Retention Period
Connect to your RDS MySQL instance and run:
CALL mysql.rds_set_configuration('binlog retention hours', 72);RDS Specific
Amazon RDS uses binlog retention hours instead of the standard binlog_expire_logs_seconds parameter. The default value is 0 (immediate removal), so this step is essential.
Create a User for Replication
CREATE USER 'supermetal_user'@'%' IDENTIFIED BY 'strong-password';
GRANT SELECT, REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'supermetal_user'@'%';
FLUSH PRIVILEGES;Script Variables
Replace the following variables with your own values:
supermetal_user: Username for Supermetal accessstrong-password: Secure password for the Supermetal user
Azure Database for MySQL Setup
Configure Server Parameters
- In the Azure portal, navigate to your MySQL server resource
 - Select "Server parameters" under the Settings section
 - Find and set the following parameters:
gtid_mode: ONenforce_gtid_consistency: ONbinlog_format: ROWbinlog_row_image: FULLbinlog_row_metadata: FULL
 - Click "Save" to apply the changes
 - Restart your Azure Database for MySQL server if prompted
 
Azure Limitations
Some parameters might be read-only in Azure Database for MySQL, particularly in Flexible Server configurations. The server-id is automatically assigned by Azure and cannot be modified.
Create a User for Replication
Connect to your Azure MySQL instance and run:
CREATE USER 'supermetal_user'@'%' IDENTIFIED BY 'strong-password';
GRANT SELECT, REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'supermetal_user'@'%';
FLUSH PRIVILEGES;Script Variables
Replace the following variables with your own values:
supermetal_user: Username for Supermetal accessstrong-password: Secure password for the Supermetal user
Network Security
Ensure your Azure MySQL server's firewall rules allow connections from the Supermetal service IP addresses.
Data Types Mapping
Supermetal automatically maps MySQL data types to Apache Arrow data types according to the following mapping:
| MySQL Type | Arrow Type | Notes | 
|---|---|---|
TINYINT | Int8 / UInt8 | Signed/unsigned determined by column definition | 
SMALLINT | Int16 / UInt16 | Signed/unsigned determined by column definition | 
MEDIUMINT | Int32 / UInt32 | Signed/unsigned determined by column definition | 
INT | Int32 / UInt32 | Signed/unsigned determined by column definition | 
BIGINT | Int64 / UInt64 | Signed/unsigned determined by column definition | 
FLOAT | Float32 | |
DOUBLE | Float64 | |
DECIMAL | Decimal128 / Decimal256 / Utf8 | Decimal128 for precision ≤ 38, Decimal256 for precision ≤ 76, Utf8 for precision > 76 | 
| MySQL Type | Arrow Type | Notes | 
|---|---|---|
CHAR | Utf8 | |
VARCHAR | Utf8 | |
TEXT | Utf8 | Includes all TEXT variants (TINYTEXT, TEXT, MEDIUMTEXT, LONGTEXT) | 
ENUM | Utf8 | Original enum values preserved as strings | 
SET | List<Utf8> | Converted to arrays of strings (e.g., 'value1,value2' → ["value1", "value2"]) | 
JSON | Utf8 | Preserved as JSON string | 
| MySQL Type | Arrow Type | Notes | 
|---|---|---|
BINARY | Binary | |
VARBINARY | Binary | |
BLOB | Binary | Includes all BLOB variants (TINYBLOB, BLOB, MEDIUMBLOB, LONGBLOB) | 
BIT | Utf8 | Converted to binary string representation (e.g., BIT(4) value 5 becomes "0101") | 
| MySQL Type | Arrow Type | Notes | 
|---|---|---|
DATE | Date32 | |
TIME | IntervalDayTime / Utf8 | IntervalDayTime for precision ≤ 3 (milliseconds), Utf8 for precision > 3 (microseconds) | 
DATETIME | Timestamp(Microsecond) | Full microsecond precision without timezone | 
TIMESTAMP | Timestamp(Microsecond) | Full microsecond precision, stored in UTC internally by MySQL | 
YEAR | UInt16 | 
| MySQL Type | Arrow Type | 
|---|---|
GEOMETRY | Unsupported | 
POINT | Unsupported | 
LINESTRING | Unsupported | 
POLYGON | Unsupported | 
MULTIPOINT | Unsupported | 
MULTILINESTRING | Unsupported | 
MULTIPOLYGON | Unsupported | 
GEOMCOLLECTION | Unsupported | 
VECTOR | Unsupported | 
Columns with unsupported data types are automatically excluded from replication. Tables containing only unsupported column types cannot be replicated.
Last updated on