PostgreSQL
PostgreSQL (Postgres) is a powerful open-source relational database. This guide walks you through configuring your PostgreSQL database to work seamlessly with Supermetal.
Features
| Feature | Notes | 
|---|---|
| Bulk Insert | |
| Transactional Consistency | |
| Schema Evolution | |
| Hard Deletes | 
Prerequisites
Before you begin, ensure you have:
- Supported PostgreSQL Version: PostgreSQL 9.4 or higher.
 - Database Admin Access: Sufficient privileges to create users and grant permissions.
 - Network Connectivity: Ensure that Supermetal's services can reach your PostgreSQL database. You may need to configure security groups.
 
Setup
Log in to your PostgreSQL database and run the following:
-- Create a user for Supermetal
CREATE USER supermetal_user WITH PASSWORD 'strong-password';
-- Grant permissions
-- allows supermetal to connect to the database, create schemas and (temp) tables
GRANT CONNECT, CREATE, TEMPORARY ON DATABASE target_database TO supermetal_user;Password Security
Replace 'strong-password' with a secure, unique password for the supermetal_user. Store this password securely, as you'll need it when configuring the target in Supermetal.
Script Variables
Replace the placeholders in the script with your specific values:
target_database: The name of your PostgreSQL database
Snapshot Load Performance (Optional)
For large snapshots, optimize checkpoint frequency to maximize COPY throughput and avoid I/O storms. Checkpoints trigger when either limit is reached first.
max_wal_size: Maximum WAL size before checkpoint. COPY generates WAL rapidly, and small values (default: 1GB) cause frequent checkpoints that throttle performance. Increase to 16GB - 64GB to allow bulk loads to complete with fewer I/O interruptions.checkpoint_timeout: Maximum time between checkpoints. Default (5min) triggers checkpoints before large snapshots complete. Increase to 30min - 1h to prevent premature interruptions during extended loads.
Data Types Mapping
| Apache Arrow DataType | PostgreSQL Type | Notes | 
|---|---|---|
Int8 | SMALLINT | |
Int16 | SMALLINT | |
Int32 | INTEGER | |
Int64 | BIGINT | |
UInt8 | SMALLINT | |
UInt16 | INTEGER | |
UInt32 | BIGINT | |
UInt64 | NUMERIC | |
Float16 | REAL | |
Float32 | REAL | |
Float64 | DOUBLE PRECISION | |
Decimal128(p,s) | NUMERIC(p,s) | |
Decimal256(p,s) | NUMERIC(p,s) | 
| Apache Arrow DataType | PostgreSQL Type | Notes | 
|---|---|---|
Date32 | DATE | |
Date64 | DATE | |
Time32 | TIME | |
Time64 | TIME | |
Time64(ns) | BIGINT | Nanosecond precision stored as integer because Postgres does not support nanosecond precision | 
Timestamp(s/ms/us, [tz]) | TIMESTAMP[Z] | |
Timestamp(ns, [tz]) | BIGINT | Nanosecond timestamps stored as integer because Postgres does not support nanosecond precision | 
Interval | INTERVAL | |
Duration | Not supported | Durations are explicitly not supported | 
| Apache Arrow DataType | PostgreSQL Type | Notes | 
|---|---|---|
Utf8, LargeUtf8 | TEXT | 
| Apache Arrow DataType | PostgreSQL Type | Notes | 
|---|---|---|
Boolean | BOOLEAN | Boolean (true/false) | 
| Apache Arrow DataType | PostgreSQL Type | Notes | 
|---|---|---|
Utf8 JSON Extension (arrow.json) | JSONB | |
Struct | Not supported | |
Map | Not supported | 
| Apache Arrow DataType | PostgreSQL Type | Notes | 
|---|---|---|
Binary, LargeBinary, FixedSizeBinary | BYTEA | 
| Apache Arrow DataType | PostgreSQL Type | Notes | 
|---|---|---|
List<T>, LargeList<T>, FixedSizeList<T> | T[] | Only one-dimensional arrays of primitive types are supported | 
Last updated on