PostgreSQL

PostgreSQL (Postgres) is a powerful open-source relational database. This guide walks you through configuring your PostgreSQL database to work seamlessly with Supermetal.


Features

FeatureNotes
Bulk Insert

Transactional Consistency

Schema Evolution

Hard Deletes


Prerequisites

Before you begin, ensure you have:


Setup

Log in to your PostgreSQL database and run the following:

-- Create a user for Supermetal
CREATE USER supermetal_user WITH PASSWORD 'strong-password';

-- Grant permissions
-- allows supermetal to connect to the database, create schemas and (temp) tables
GRANT CONNECT, CREATE, TEMPORARY ON DATABASE target_database TO supermetal_user;

Password Security

Replace 'strong-password' with a secure, unique password for the supermetal_user. Store this password securely, as you'll need it when configuring the target in Supermetal.

Script Variables

Replace the placeholders in the script with your specific values:

  • target_database: The name of your PostgreSQL database

Snapshot Load Performance (Optional)

For large snapshots, optimize checkpoint frequency to maximize COPY throughput and avoid I/O storms. Checkpoints trigger when either limit is reached first.

  • max_wal_size: Maximum WAL size before checkpoint. COPY generates WAL rapidly, and small values (default: 1GB) cause frequent checkpoints that throttle performance. Increase to 16GB - 64GB to allow bulk loads to complete with fewer I/O interruptions.
  • checkpoint_timeout: Maximum time between checkpoints. Default (5min) triggers checkpoints before large snapshots complete. Increase to 30min - 1h to prevent premature interruptions during extended loads.

Data Types Mapping

Apache Arrow DataTypePostgreSQL TypeNotes
Int8SMALLINT
Int16SMALLINT
Int32INTEGER
Int64BIGINT
UInt8SMALLINT
UInt16INTEGER
UInt32BIGINT
UInt64NUMERIC
Float16REAL
Float32REAL
Float64DOUBLE PRECISION
Decimal128(p,s)NUMERIC(p,s)
Decimal256(p,s)NUMERIC(p,s)
Apache Arrow DataTypePostgreSQL TypeNotes
Date32DATE
Date64DATE
Time32TIME
Time64TIME
Time64(ns)BIGINTNanosecond precision stored as integer because Postgres does not support nanosecond precision
Timestamp(s/ms/us, [tz])TIMESTAMP[Z]
Timestamp(ns, [tz])BIGINTNanosecond timestamps stored as integer because Postgres does not support nanosecond precision
IntervalINTERVAL
DurationNot supportedDurations are explicitly not supported
Apache Arrow DataTypePostgreSQL TypeNotes
Utf8, LargeUtf8TEXT
Apache Arrow DataTypePostgreSQL TypeNotes
BooleanBOOLEANBoolean (true/false)
Apache Arrow DataTypePostgreSQL TypeNotes
Utf8 JSON Extension (arrow.json)JSONB
StructNot supported
MapNot supported
Apache Arrow DataTypePostgreSQL TypeNotes
Binary, LargeBinary, FixedSizeBinaryBYTEA
Apache Arrow DataTypePostgreSQL TypeNotes
List<T>, LargeList<T>, FixedSizeList<T>T[]Only one-dimensional arrays of primitive types are supported

Last updated on