PostgreSQL
Supermetal replicates to PostgreSQL through the binary COPY protocol for bulk loads, applying changes with inserts and merges. Deletes are hard, removing rows from the target rather than flagging them.
Prerequisites
- PostgreSQL 9.4 or higher, self managed or managed (AWS RDS, AWS Aurora, Google Cloud SQL, Azure Database).
- Privileges to create users and grant permissions.
- Network connectivity from the Supermetal agent to the database.
Setup
Log in to your PostgreSQL database and run the following:
-- Create a user for Supermetal
CREATE USER supermetal_user WITH PASSWORD 'strong-password';
-- Grant permissions
-- allows supermetal to connect to the database, create schemas and (temp) tables
GRANT CONNECT, CREATE, TEMPORARY ON DATABASE target_database TO supermetal_user;Password security
Replace 'strong-password' with a unique password and store it securely. You will enter it when configuring the target in Supermetal.
Script variables
Replace target_database with the name of your PostgreSQL database.
Foreign Key Constraints
If the destination schema has foreign key constraints, tables loaded in parallel may temporarily violate them until all referenced rows arrive. To let Supermetal defer constraint checks during writes, grant elevated privileges:
-- AWS RDS
GRANT rds_superuser TO supermetal_user;
-- Standard PostgreSQL (run as superuser)
ALTER USER supermetal_user WITH SUPERUSER;Snapshot Performance
For large snapshots, tune checkpointing on the target server to keep COPY throughput high and avoid I/O storms. Checkpoints trigger when either limit is reached first.
max_wal_size. Maximum WAL size before a checkpoint.COPYgenerates WAL rapidly, and the 1GB default causes frequent checkpoints that throttle loads. Increase to 16GB to 64GB.checkpoint_timeout. Maximum time between checkpoints. The 5 minute default triggers checkpoints before large snapshots complete. Increase to 30 minutes to 1 hour.
Lock Timeout
Data operations (COPY, INSERT, MERGE) wait indefinitely when a target table is locked by a long running transaction, DDL, or maintenance. Setting the operation lock timeout makes them fail fast instead, applying a lock_timeout in seconds to each operation.
Data Types Mapping
| Apache Arrow DataType | PostgreSQL Type | Notes |
|---|---|---|
Int8 | SMALLINT | |
Int16 | SMALLINT | |
Int32 | INTEGER | |
Int64 | BIGINT | |
UInt8 | SMALLINT | |
UInt16 | INTEGER | |
UInt32 | BIGINT | |
UInt64 | NUMERIC | |
Float16 | REAL | |
Float32 | REAL | |
Float64 | DOUBLE PRECISION | |
Decimal128(p,s) | NUMERIC(p,s) | |
Decimal256(p,s) | NUMERIC(p,s) |
| Apache Arrow DataType | PostgreSQL Type | Notes |
|---|---|---|
Date32 | DATE | |
Date64 | DATE | |
Time32 | TIME | |
Time64 | TIME | |
Time64(ns) | BIGINT | Stored as an integer since Postgres lacks nanosecond precision. |
Timestamp(s/ms/us, [tz]) | TIMESTAMP[Z] | |
Timestamp(ns, [tz]) | BIGINT | Stored as an integer since Postgres lacks nanosecond precision. |
Interval | INTERVAL | |
Duration | Unsupported |
| Apache Arrow DataType | PostgreSQL Type | Notes |
|---|---|---|
Utf8, LargeUtf8 | TEXT |
| Apache Arrow DataType | PostgreSQL Type | Notes |
|---|---|---|
Boolean | BOOLEAN |
| Apache Arrow DataType | PostgreSQL Type | Notes |
|---|---|---|
Utf8 JSON Extension (arrow.json) | JSONB | |
Struct | Unsupported | |
Map | Unsupported |
| Apache Arrow DataType | PostgreSQL Type | Notes |
|---|---|---|
Binary, LargeBinary, FixedSizeBinary | BYTEA |
| Apache Arrow DataType | PostgreSQL Type | Notes |
|---|---|---|
List<T>, LargeList<T>, FixedSizeList<T> | T[] | Only one dimensional arrays of primitive types. |
Changelog
0.1.2
2026-05-21
Target connections can route through an SSH tunnel to a bastion host.
Last updated on