PostgreSQL
PostgreSQL (Postgres) is a powerful open-source relational database that supports logical replication, enabling real-time Change Data Capture (CDC).
This guide walks you through configuring your PostgreSQL database to work seamlessly with Supermetal.
Features
| Feature | Notes | 
|---|---|
| Initial Data Sync | |
| Change Data Capture | |
| Transactional Consistency | |
| Catalog Support | 
Prerequisites
Before you begin, ensure you have:
- Supported PostgreSQL Version: PostgreSQL version 9.4 or higher.
 - Database Admin Access: Sufficient privileges to create users, grant permissions, and modify database configuration parameters. This might require 
superuserprivileges or specific cloud provider roles (e.g.,rds_superuseron AWS RDS,azure_pg_adminon Azure). - Network Connectivity: Ensure that Supermetal's services can reach your PostgreSQL database. You may need to configure firewall rules, security groups, or VPC peering.
 
Setup
Supermetal supports two replication types for PostgreSQL sources:
- Snapshot: One-time or periodic full table snapshots.
 - Logical Replication: Real-time change data capture (CDC) with initial snapshot.
 
Data Types Mapping
| PostgreSQL Type(s) | Apache Arrow DataType | Notes | 
|---|---|---|
BOOL | Boolean | 
| PostgreSQL Type(s) | Apache Arrow DataType | Notes | 
|---|---|---|
INT2 | Int16 | |
INT4 | Int32 | |
INT8, MONEY | Int64 | MONEY is mapped as a fixed-point integer. | 
FLOAT4 | Float32 | Single precision floating-point. | 
FLOAT8 | Float64 | Double precision floating-point. | 
NUMERIC(p, s) | Decimal128(p, s) / Decimal256(p, s) | Uses PostgreSQL precision (p) and scale (s) directly to ensure exact numeric representation. | 
NUMERIC (unconstrained) | Utf8 | Unconstrained NUMERIC defaults to Utf8 as precision/scale are undefined. | 
OID | UInt32 | Object Identifier. | 
PG_LSN | UInt64 | PostgreSQL Log Sequence Number. | 
| PostgreSQL Type(s) | Apache Arrow DataType | Notes | 
|---|---|---|
DATE | Date32 | Days since UNIX epoch. Precision preserved at the day level. | 
TIME | Time64(µs) | Microseconds since midnight. Precision preserved at the microsecond level. | 
TIMESTAMP | Timestamp(µs) | Microseconds since UNIX epoch, timezone naive. Precision preserved at microsecond level. | 
TIMESTAMPTZ | Timestamp(µs, "UTC") | Microseconds since UNIX epoch, assumed UTC. Precision preserved at microsecond level. | 
TIMETZ, INTERVAL | Utf8 | Temporal types with string representation in Arrow. | 
| PostgreSQL Type(s) | Apache Arrow DataType | Notes | 
|---|---|---|
CHAR, VARCHAR, TEXT, BPCHAR, NAME, UNKNOWN | Utf8 | General-purpose text types. | 
citext, ltree, lquery, ltxtquery | Utf8 | Special text types (checked by name in code). | 
BIT, VARBIT | Utf8 | Bit string types represented as text. | 
UUID | Utf8 | Universally Unique Identifier. | 
ENUM | Utf8 | Enumerated types represented as their text labels. | 
INET, CIDR, MACADDR, MACADDR8 | Utf8 | Network address types. | 
TSVECTOR | Utf8 | Text search vector type represented as text. Positions are not preserved | 
| PostgreSQL Type(s) | Apache Arrow DataType | Notes | 
|---|---|---|
BYTEA | Binary | Variable-length byte array. | 
| PostgreSQL Type(s) | Apache Arrow DataType | Notes | 
|---|---|---|
JSON, JSONB | Utf8 | Mapped to Utf8. Using the Canonical JSON Extension (arrow.json) is recommended for interoperability, allowing downstream systems to recognize the content as JSON. This metadata signals to downstream supermetal targets (like Snowflake, Clickhouse, etc.) that the string contains JSON, allowing them to map it to their appropriate semi-structured types (e.g., VARIANT, JSON). | 
XML | Utf8 | Mapped to Utf8. A canonical extension type for XML might also exist or be defined depending on specific library support, similar to JSON. | 
| PostgreSQL Type(s) | Apache Arrow DataType | Notes | 
|---|---|---|
BOOL[] | List<Boolean> | Base types map as per their categories. | 
BYTEA[] | List<Binary> | |
INT2[] | List<Int16> | |
INT4[] | List<Int32> | |
INT8[], MONEY[] | List<Int64> | |
OID[] | List<UInt32> | |
PG_LSN[] | List<UInt64> | |
FLOAT4[] | List<Float32> | |
FLOAT8[] | List<Float64> | |
NUMERIC(p, s)[] | List<Decimal(p,s)> | Precision and scale preserved within list elements. | 
DATE[] | List<Date32> | |
TIME[] | List<Time64(µs)> | |
TIMESTAMP[] | List<Timestamp(µs)> | |
TIMESTAMPTZ[] | List<Timestamp(µs, "UTC")> | |
CHAR[], VARCHAR[], TEXT[], etc., _citext, _ltree, etc., BIT[], VARBIT[], TIMETZ[], INTERVAL[], XML[], UUID[], INET[], CIDR[], MACADDR8[], MACADDR[], JSON[], JSONB[], NUMERIC[] (unconstrained) | List<Utf8> | Includes arrays of types mapped to Utf8. | 
| Multi-dimensional Arrays (>1 dim) | Unsupported | Multi-dimensional arrays (dimension > 1) are not supported. | 
| PostgreSQL Type(s) | Apache Arrow DataType | Notes | 
|---|---|---|
INT4RANGE, INT8RANGE, NUMRANGE, TSRANGE, TSTZRANGE, DATERANGE | Utf8 | Range types are serialized to JSON format using the Canonical JSON Extension (arrow.json). | 
| Multirange types | Unsupported | Multirange types are not supported. | 
| PostgreSQL Type(s) | Apache Arrow DataType | Notes | 
|---|---|---|
DOMAIN(base_type) | Mapped as base_type | Domains are resolved to their underlying base type. | 
| Composite Types (ROW) | Utf8 | Composite types (structs/rows) are not explicitly mapped and default to a string (e.g., JSON) representation (Utf8). | 
| PostgreSQL Type(s) | Apache Arrow DataType | Notes | 
|---|---|---|
POINT, LSEG, PATH, BOX, POLYGON, LINE, CIRCLE | Unsupported | Geometric types are unsupported. | 
PostgreSQL to PostgreSQL Replication
When replicating from PostgreSQL to PostgreSQL, Supermetal replicates the source PostgreSQL data types 1:1, preserving the exact type from source to target when possible.
Last updated on