Apache Doris
Apache Doris is a real-time analytical MPP database with a MySQL-compatible interface.
Prerequisites
- Supported Apache Doris Distributions
  - VeloDB Cloud
  - Apache Doris 4.0 or higher
Cluster Topology
The connector talks to the Frontend (FE) HTTP endpoint and follows Stream Load redirects to Backends (BEs). The machine running Supermetal must reach both FE and BE nodes. If either is blocked, snapshot and CDC ingest will fail.
- Database Admin Access. An account with Admin_priv (typically the built-in root). The setup steps below issue CREATE USER, CREATE DATABASE, and GRANT, which require this privilege.
- Network Connectivity. Open inbound access from Supermetal to:
  - FE HTTP (default 8030). Stream Load entry point. Issues a 307 redirect to a BE.
  - BE HTTP (default 8040). The client follows the 307 directly to the BE holding the target tablet.
  - FE MySQL (default 9030). SQL, schema reflection, and INSERT WITH LABEL via the S3 TVF path. Override via fe_mysql_port.
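Before configuring the target, it can help to confirm that the three ports are reachable from the machine running Supermetal. The following is a minimal sketch using only the Python standard library. The function names are illustrative and not part of Supermetal, and a successful TCP connection only shows that the port is open, not that authentication or Stream Load will succeed.

```python
import socket

# Default ports from the list above; adjust if your cluster overrides them.
DEFAULT_PORTS = {"FE HTTP": 8030, "BE HTTP": 8040, "FE MySQL": 9030}


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


def check_endpoints(endpoints: dict) -> dict:
    """Map each endpoint name to whether it accepted a TCP connection.

    `endpoints` maps a label to a (host, port) tuple.
    """
    return {name: can_connect(host, port) for name, (host, port) in endpoints.items()}
```

For example, `check_endpoints({"FE HTTP": ("fe.example.internal", 8030), "BE HTTP": ("be1.example.internal", 8040)})` returns a dictionary of booleans keyed by label. The hostnames here are placeholders. Run the check against every BE, since the 307 redirect can point at whichever BE holds the target tablet.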
Setup
Create a user & grant permissions
Connect to a Frontend
Log in to any FE as root (or another account with Admin_priv) using the MySQL protocol on port 9030.
mysql -h <fe-host> -P 9030 -u root -p
Create a user
CREATE USER 'supermetal_user' IDENTIFIED BY 'strong-password';
Create the target database
CREATE DATABASE IF NOT EXISTS target_database;
Grant privileges
GRANT SELECT_PRIV, LOAD_PRIV, ALTER_PRIV, CREATE_PRIV, DELETE_PRIV
ON target_database.*
TO 'supermetal_user';
Connection details
You'll need the following to configure the target in Supermetal.
- Frontend HTTP URL (for example http://fe.example.internal:8030, or https://...:8030 if you've fronted FE with TLS)
- Username and password you created above
- Target database name
Multiple Frontends
If you run more than one FE, point Supermetal at a single URL behind a load balancer. The connector follows Stream Load's 307 redirects to BEs on its own. FE failover should be handled by your LB, not by listing FEs individually.
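To make the redirect flow concrete, here is a rough sketch of the two steps a Stream Load client performs: building the FE endpoint URL, then resolving the `Location` header of a 307 response into the BE URL that receives the same request again. The `/api/{db}/{table}/_stream_load` path is the standard Doris Stream Load endpoint. The helper names are illustrative only and do not describe Supermetal's internals.

```python
from typing import Optional
from urllib.parse import urljoin, urlsplit


def stream_load_url(fe_url: str, database: str, table: str) -> str:
    """Build the Stream Load endpoint on the FE (or on the load balancer in front of it)."""
    return f"{fe_url.rstrip('/')}/api/{database}/{table}/_stream_load"


def next_hop(request_url: str, status: int, headers: dict) -> Optional[str]:
    """Return the URL to re-send the same PUT to if the response is a 307, else None."""
    if status != 307:
        return None
    location = headers.get("Location")
    if not location:
        raise ValueError("307 response without a Location header")
    # Location is usually an absolute BE URL; urljoin also handles a relative one.
    return urljoin(request_url, location)
```

A 307 preserves the HTTP method and body, so the client repeats the same PUT, with the same headers, against the returned URL. Because that URL names a BE directly, the load balancer only needs to cover FEs.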
TLS / mTLS
For HTTPS endpoints, optionally set ssl_root_cert (private CA) and ssl_client_cert_pem + ssl_client_key_pem (mTLS). The same SSL config is applied to both the Stream Load HTTP client and the MySQL client.
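As an illustration of what sharing one SSL config between two clients can look like, the sketch below builds a single TLS context with Python's standard `ssl` module. It assumes the CA, client certificate, and client key are available as PEM file paths. The parameter names are hypothetical and only loosely mirror ssl_root_cert, ssl_client_cert_pem, and ssl_client_key_pem. How Supermetal itself consumes those settings is not shown here.

```python
import ssl
from typing import Optional


def build_ssl_context(
    root_cert_path: Optional[str] = None,
    client_cert_path: Optional[str] = None,
    client_key_path: Optional[str] = None,
) -> ssl.SSLContext:
    """Build one TLS context that an HTTP client and a MySQL client could both reuse."""
    # With no cafile the system trust store is loaded; with one, only that CA file is trusted.
    ctx = ssl.create_default_context(cafile=root_cert_path)
    if client_cert_path and client_key_path:
        # Presenting a client certificate is what turns plain TLS into mTLS.
        ctx.load_cert_chain(certfile=client_cert_path, keyfile=client_key_path)
    return ctx
```

Building the context once and passing it to both clients keeps certificate verification identical on the HTTP and MySQL paths.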
Open the SQL editor
Log in to VeloDB Cloud and open the SQL editor for your warehouse. Cloud warehouses expose a MySQL-compatible endpoint. The same DDL works on the web console or any MySQL client pointed at port 9030.
Create a user
CREATE USER 'supermetal_user' IDENTIFIED BY 'strong-password';
Create the target database
CREATE DATABASE IF NOT EXISTS target_database;
Grant privileges
GRANT SELECT_PRIV, LOAD_PRIV, ALTER_PRIV, CREATE_PRIV, DELETE_PRIV
ON target_database.*
TO 'supermetal_user';
Connection details
You'll need the following to configure the target in Supermetal.
- Frontend HTTP(S) URL (for example https://<warehouse-id>.cloud.velodb.io:8030)
- Username and password you created above
- Target database name
IP Allowlist
VeloDB Cloud restricts inbound traffic by default. From the warehouse settings, add the public IP (or VPC peering range) of the machine running Supermetal to the allowlist for both the FE HTTP port and the BE HTTP port.
Table Model
| Value | Description |
|---|---|
| Auto (default) | Tables with primary keys land as Unique Key with Merge-on-Write (sequence column _sm_version). Tables without primary keys land as Duplicate Key. |
| UniqueKey | Forces Unique Key with Merge-on-Write. Requires primary keys. |
| DuplicateKey | Append-only, no deduplication. Lower write amplification when row-level updates are not needed. |
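The selection rules in the table can be written as a small decision function. This is a sketch of the documented behavior, not Supermetal's code. The function name and the choice to raise an error when UniqueKey is requested without primary keys are assumptions.

```python
def resolve_table_model(setting: str, primary_keys: list) -> str:
    """Return the Doris table model a source table would land as."""
    if setting == "Auto":
        # Primary keys present: Unique Key (Merge-on-Write); otherwise Duplicate Key.
        return "UniqueKey" if primary_keys else "DuplicateKey"
    if setting == "UniqueKey":
        if not primary_keys:
            # Assumed failure mode; the table above only states that keys are required.
            raise ValueError("UniqueKey requires primary keys")
        return "UniqueKey"
    if setting == "DuplicateKey":
        return "DuplicateKey"
    raise ValueError(f"unknown table model setting: {setting!r}")
```

With the default setting, `resolve_table_model("Auto", ["id"])` yields `"UniqueKey"` and `resolve_table_model("Auto", [])` yields `"DuplicateKey"`.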
Binary Handling
Doris has no native binary type. BINARY payloads land in STRING. The binary_handling_mode setting controls encoding.
| Mode | Description |
|---|---|
| Bytes (default) | Raw bytes pass through. Fastest, but Doris string functions (LIKE, REGEXP, UPPER) may misbehave on non-UTF-8 content. |
| Hex | Hex-encoded on write. Decode with unhex(). |
| Base64 | Standard base64 on write. Decode with from_base64(). |
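To see what each mode produces for the same payload, the sketch below encodes a byte string three ways. It illustrates the encodings themselves. Whether Supermetal emits lowercase or uppercase hex is not stated above, so the lowercase output here is an assumption.

```python
import base64


def encode_binary(payload: bytes, mode: str = "Bytes"):
    """Encode a binary payload the way each binary_handling_mode is described."""
    if mode == "Bytes":
        return payload  # raw bytes, stored as-is in the STRING column
    if mode == "Hex":
        return payload.hex()  # two hex digits per byte (lowercase assumed)
    if mode == "Base64":
        return base64.b64encode(payload).decode("ascii")  # standard alphabet with padding
    raise ValueError(f"unknown binary_handling_mode: {mode!r}")


def is_valid_utf8(payload: bytes) -> bool:
    """Return True if the payload would be safe for string functions in Bytes mode."""
    try:
        payload.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False
```

For the two-byte payload `b"\x00\xff"`, Hex gives `00ff` and Base64 gives `AP8=`. `is_valid_utf8` returns False for that payload, which is the kind of content that makes Hex or Base64 the safer choice when the column will be queried with string functions.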
Data Types Mapping
| Apache Arrow DataType | Doris Type | Notes |
|---|---|---|
| Int8 | TINYINT | |
| Int16 | SMALLINT | |
| Int32 | INT | |
| Int64 | BIGINT | |
| UInt8, UInt16 | INT | Widened one signed level. Doris ignores Parquet's UINT_N annotation. |
| UInt32 | BIGINT | Widened one signed level. |
| UInt64 | DECIMALV3(20, 0) | BIGINT silently nulls values at or above 2⁶³. Promoted to DECIMAL. |
| Float16 | FLOAT | |
| Float32 | FLOAT | NaN and ±Inf preserved. |
| Float64 | DOUBLE | NaN and ±Inf preserved. |
| Decimal128(p, s) | DECIMALV3(p, s) | p < 38. |
| Decimal128(38, s) | STRING | Workaround for Doris 4.0/4.1 silently nulling DECIMAL(38, *) on parquet ingest. Decode with CAST AS DECIMAL on read. |
| Decimal256(p, s) | DECIMALV3(p, s) | p < 38. Narrowed to Decimal128. |
| Decimal256(p, s) | STRING | p ≥ 38. |
| Date32, Date64 | DATEV2 | Values outside 0000-01-01..9999-12-31 are nulled. |
| Timestamp(*, [tz]) | DATETIMEV2(N) | Cast to UTC. N matches source-declared precision (capped at 6). Plain Arrow timestamps without source metadata land at DATETIMEV2(6). Values before 1900-01-01 are nulled. Doris parquet ingest corrupts pre-1900 timestamps. |
| Time32, Time64 | STRING | No native TIME type. |
| Duration, Interval | STRING | No Doris equivalent. |
| Utf8, LargeUtf8, Utf8View | STRING | |
| Utf8 JSON extension (arrow.json) | JSON | |
| Boolean | BOOLEAN | |
| Binary, LargeBinary, BinaryView, FixedSizeBinary | STRING | Encoding controlled by binary_handling_mode. |
| Struct | STRUCT<...> | |
| Map | MAP<K, V> | |
| List<T>, LargeList<T>, FixedSizeList<T> | ARRAY<T> | Stream Load rejects complex parquet types. ARRAY columns require an object-store buffer so loads go through the S3 TVF. |
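The integer, decimal, and timestamp rows follow a few simple rules, sketched below as lookup functions. The helpers are illustrative and transcribe the table rather than any Supermetal API. The constants show why UInt64 needs DECIMALV3(20, 0): its maximum value exceeds the signed 64-bit range and has 20 decimal digits.

```python
from typing import Optional

BIGINT_MAX = 2**63 - 1   # largest value a signed 64-bit BIGINT can hold
UINT64_MAX = 2**64 - 1   # largest UInt64 value; does not fit in BIGINT

INTEGER_TYPES = {
    "Int8": "TINYINT", "Int16": "SMALLINT", "Int32": "INT", "Int64": "BIGINT",
    "UInt8": "INT", "UInt16": "INT", "UInt32": "BIGINT",
    "UInt64": "DECIMALV3(20, 0)",
}


def doris_integer_type(arrow_type: str) -> str:
    """Look up the Doris column type for an Arrow integer type name."""
    return INTEGER_TYPES[arrow_type]


def doris_decimal_type(precision: int, scale: int) -> str:
    """Decimals below precision 38 keep their type; precision 38 and above land as STRING."""
    return f"DECIMALV3({precision}, {scale})" if precision < 38 else "STRING"


def doris_timestamp_type(source_precision: Optional[int] = None) -> str:
    """Use the source-declared precision capped at 6, or 6 when none is declared."""
    n = 6 if source_precision is None else min(source_precision, 6)
    return f"DATETIMEV2({n})"
```

For instance, a nanosecond-precision source timestamp (precision 9) maps to DATETIMEV2(6), and a Decimal128(38, 10) column maps to STRING and is read back with a cast.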
Nullability
All non-primary-key columns are nullable in Doris by default. Enable preserve_source_nullability to carry NOT NULL from the source schema.
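The nullability rule can be summarized as a single function. This is a sketch under one stated assumption: primary-key columns are treated as NOT NULL, which the sentence above implies but does not spell out. The function and parameter names are hypothetical.

```python
def column_nullable(
    is_primary_key: bool,
    source_nullable: bool,
    preserve_source_nullability: bool = False,
) -> bool:
    """Return whether the Doris column would be declared nullable."""
    if is_primary_key:
        return False  # assumption: key columns are always NOT NULL
    if preserve_source_nullability:
        return source_nullable  # carry NOT NULL over from the source schema
    return True  # default: every non-key column is nullable
```

With the default setting, a source column declared NOT NULL still lands as nullable. With preserve_source_nullability enabled it stays NOT NULL.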