SQL Server

SQL Server's Change Data Capture (CDC) through capture tables enables reliable, low impact change capture for data integration. Supermetal builds on this foundation to deliver efficient data extraction and replication with minimal source database overhead.

This guide covers features and configuration steps required to connect SQL Server with Supermetal, focusing on CDC setup and permissions.


Features

FeatureNotes
Initial Data Sync

Change Data Capture

CDC Processing Modes

Configurable CDC Processing

Catalog Support


Prerequisites

Before you begin, ensure you have:

  • SQL Server Requirements:

    • Version: SQL Server 2016 or higher
    • Supported Editions:
      • Standard SQL Server (Enterprise, Standard, Personal, Azure SQL Managed Instance)
      • Azure SQL Database
      • Amazon RDS for SQL Server

    Unsupported SQL Server Editions

    CDC is not supported on Azure Synapse, Azure SQL Edge, or Azure Synapse Serverless.

  • SQL Server Agent: The SQL Server Agent service must be running, as it manages the CDC capture jobs.

  • Setup Requirements:

    • Database Permissions: sysadmin or db_owner role for CDC setup (see Setup for role-specific instructions)
    • Network Connectivity: Ensure Supermetal can reach your SQL Server (default port: 1433)

Setup

Permissions Overview

Supermetal's SQL Server integration provides different levels of Change Data Capture (CDC) automation depending on your SQL Server edition and user permissions. The table below summarizes the automation capabilities:

SQL Server EditionUser RoleAutomation Capabilities
StandardsysadminFull CDC workflow automation (database CDC enablement, table CDC enablement, job configuration)
Standarddb_ownerPartial automation (table CDC enablement, job configuration)
StandardRegular userLimited automation (requires manual setup by a privileged user)
Azure SQL Databasedb_ownerFull CDC workflow automation (all CDC operations)
Azure SQL DatabaseRegular userLimited automation (requires manual setup by a privileged user)
AWS RDSdb_ownerPartial automation (table CDC enablement, job configuration)
AWS RDSRegular userLimited automation (requires manual setup by a privileged user)

Recommended

We recommend creating a db_owner user for all SQL server editions as it provides the best tradeoff between automation and permissions.

Setup Instructions

SYSADMIN Role

The sysadmin server role provides the highest level of permissions, enabling Supermetal to fully automate the CDC workflow, including enabling CDC at the database level, configuring tables for CDC, and managing CDC jobs.

Edition Support

The sysadmin role is only available on Standard SQL Server editions. Azure SQL Database and AWS RDS restrict the use of sysadmin privileges.

Create a SYSADMIN User

AWS RDS for SQL Server does not allow the creation of sysadmin users. Please use the DB_OWNER role instead.

Azure SQL Database does not support the sysadmin role. Please use the DB_OWNER role instead.

Password Security

Replace 'strong-password' with a secure, unique password for the Supermetal user. Store this password securely, as you'll need it when configuring the source in Supermetal.

DB_OWNER Role

The db_owner database role provides sufficient permissions for most CDC operations except enabling CDC at the database level on Standard SQL Server and AWS RDS (this step must be performed by a sysadmin). On Azure SQL Database, db_owner can automate the entire CDC workflow.

Create a DB_OWNER User

Script Variables

Replace the placeholder values in the script with your actual information:

  • 'strong-password': Replace with a secure, unique password for the supermetal_user.
  • DATABASE_NAME: The name of the database you want to replicate from.

Enable Snapshot Isolation

Enable snapshot isolation on the database to ensure consistent reads during CDC operations. This step must be performed by a sysadmin user .

Not required for Azure SQL Database. Supermetal automatically manages snapshot isolation enablement.

Script Variables

Replace DATABASE_NAME with the name of the database you want to replicate from.

Enable CDC at Database Level

This step is required for Standard SQL Server and AWS RDS. It creates the CDC schema and starts the SQL Server Agent jobs necessary for CDC. This step must be performed by a user with sysadmin privileges.

Not required for Azure SQL Database. Supermetal automatically manages CDC enablement at the database level.

MINIMAL_ACCESS_USER Role

A minimal access user has limited permissions and requires a user with higher privileges (such as sysadmin or db_owner) to set up the CDC environment. After setup, the minimal access user can be used for ongoing replication.

Create a Minimal Access User

Script Variables

Replace the placeholder values in the script with your actual information:

  • 'strong-password': Replace with a secure, unique password for the supermetal_user.
  • DATABASE_NAME: The name of the database you want to replicate from.
  • SCHEMA_NAME: The schema of the table you want to replicate from.
  • TABLE_NAME: The name of the table you want to replicate from.

Enable Snapshot Isolation

Enable snapshot isolation on the database to ensure consistent reads during CDC operations. This step must be performed by a user with appropriate privileges (typically sysadmin or db_owner).

Script Variables

Replace DATABASE_NAME with the name of the database you want to replicate from.

Enable CDC at Database Level

This step must be performed by a user with appropriate privileges (typically sysadmin for Standard and RDS, db_owner for Azure).

Grant CDC Schema Access

If this user does not have SELECT permission on the database schema, then we need to explicitly grant SELECT permission on the [cdc] schema:

GRANT SELECT on SCHEMA::[cdc] to [supermetal_user];

Enable CDC for Tables

The following query creates a capture instance for a specific table. This step must be performed by a user with appropriate privileges.

Script Variables

Replace the placeholder values in the script with your actual information:

  • SCHEMA_NAME: The schema of the table you want to replicate from.
  • TABLE_NAME: The name of the table you want to replicate from.

Configure CDC Job Parameters (Optional)

You can customize the CDC job parameters to optimize performance based on your workload.

Replace the parameter values with appropriate values for your environment:

  • INSERT_MAX_SCANS_VALUE: Maximum number of scan cycles (default: 10)
  • INSERT_MAX_TRANS_VALUE: Maximum number of transactions (default: 500)
  • INSERT_POLL_INTERVAL_SECS_VALUE: Polling interval in seconds (default: 5)

Data Types Mapping

SQL Server Type(s)Apache Arrow DataTypeNotes
TINYINTUInt8
SMALLINTInt16
INTInt32
BIGINTInt64
SMALLMONEYDecimal128(10, 4)Fixed precision with 4 decimal places.
MONEYDecimal128(19, 4)Fixed precision with 4 decimal places.
DECIMAL(p,s)Decimal128(p, s) / Decimal256(p, s)Uses SQL Server precision and scale directly. Decimal256 for precision > 38.
NUMERIC(p,s)Decimal128(p, s) / Decimal256(p, s)Same as DECIMAL. Default precision is 18.
REALFloat64
FLOATFloat64
SQL Server Type(s)Apache Arrow DataTypeNotes
BITBoolean
SQL Server Type(s)Apache Arrow DataTypeNotes
DATEDate32Days since UNIX epoch.
TIMETime64(ns)Nanoseconds since midnight.
SMALLDATETIMETimestamp(s)Second-level precision, minute-level accuracy (rounds to nearest minute).
DATETIMETimestamp(ms)Millisecond precision, 3.33ms accuracy.
DATETIME2(0)Timestamp(s)Second precision.
DATETIME2(1-3)Timestamp(ms)Millisecond precision (1-3 fractional digits).
DATETIME2(4-6)Timestamp(μs)Microsecond precision (4-6 fractional digits).
DATETIME2(7)Utf8Mapped to string to preserve full range (years 1-9999).
DATETIMEOFFSET(0)Timestamp(s, "UTC")Second precision, normalized to UTC.
DATETIMEOFFSET(1-3)Timestamp(ms, "UTC")Millisecond precision (1-3 fractional digits), normalized to UTC.
DATETIMEOFFSET(4-6)Timestamp(μs, "UTC")Microsecond precision (4-6 fractional digits), normalized to UTC.
DATETIMEOFFSET(7)Utf8Mapped to string to preserve full range (years 0001-9999).
SQL Server Type(s)Apache Arrow DataTypeNotes
CHAR, VARCHARUtf8
NCHAR, NVARCHARUtf8Unicode character strings.
TEXTUtf8Legacy large text type (deprecated). Use VARCHAR(MAX) instead.
NTEXTUtf8Legacy large Unicode text type (deprecated). Use NVARCHAR(MAX) instead.
UNIQUEIDENTIFIERUtf8UUID/GUID represented as a string.
SQL Server Type(s)Apache Arrow DataTypeNotes
BINARY, VARBINARYBinary
ROWVERSIONBinaryAutomatically incremented binary identifier. See rowversion.
IMAGEBinaryLegacy large binary type (deprecated). Use VARBINARY(MAX) instead.
SQL Server Type(s)Apache Arrow DataType
XMLUtf8
GEOGRAPHYUnsupported
GEOMETRYUnsupported
HIERARCHYIDUnsupported

Last updated on