@@ -6,11 +6,11 @@ A [dlt](https://dlthub.com/) destination for [Apache Iceberg](https://iceberg.ap
 
 - **Atomic Multi-File Commits**: Multiple parquet files committed as single Iceberg snapshot per table
 - **REST Catalog Support**: Works with Nessie, Polaris, AWS Glue, Unity Catalog
+- **Credential Vending**: Most REST catalogs vend storage credentials automatically
 - **Partitioning**: Full support for Iceberg partition transforms via `iceberg_adapter()`
 - **Merge Strategies**: Delete-insert and upsert with hard delete support
 - **DuckDB Integration**: Query loaded data via `pipeline.dataset()`
 - **Schema Evolution**: Automatic schema updates when adding columns
-- **Authentication**: OAuth2, Bearer token, AWS SigV4
 
 ## Installation
 
@@ -26,10 +26,6 @@ uv add dlt-iceberg
 
 ## Quick Start
 
-See [examples/](examples/) directory for working examples.
-
-### Incremental Load
-
 ```python
 import dlt
 from dlt_iceberg import iceberg_rest
@@ -41,12 +37,11 @@ def generate_events():
 pipeline = dlt.pipeline(
     pipeline_name="my_pipeline",
     destination=iceberg_rest(
-        catalog_uri="http://localhost:19120/iceberg/main",
+        catalog_uri="https://my-catalog.example.com/api/catalog",
         namespace="analytics",
-        s3_endpoint="http://localhost:9000",
-        s3_access_key_id="minioadmin",
-        s3_secret_access_key="minioadmin",
-        s3_region="us-east-1",
+        warehouse="my_warehouse",
+        credential="client-id:client-secret",
+        oauth2_server_uri="https://my-catalog.example.com/oauth/tokens",
     ),
 )
 
@@ -83,26 +78,38 @@ def generate_users():
 pipeline.run(generate_users())
 ```
 
-## Configuration Options
+## Configuration
 
-All configuration options can be passed to `iceberg_rest()`:
+### Required Options
 
 ```python
 iceberg_rest(
-    catalog_uri="...",  # Required: REST catalog URI
-    namespace="...",  # Required: Iceberg namespace (database)
-    warehouse="...",  # Optional: Warehouse location
+    catalog_uri="...",  # REST catalog endpoint (or sqlite:// for local)
+    namespace="...",  # Iceberg namespace (database)
+)
+```
 
-    # Authentication
-    credential="...",  # OAuth2 client credentials
-    oauth2_server_uri="...",  # OAuth2 token endpoint
-    token="...",  # Bearer token
+### Authentication
 
-    # AWS SigV4
-    sigv4_enabled=True,
-    signing_region="us-east-1",
+Choose based on your catalog:
+
+| Catalog | Auth Method |
+|---------|-------------|
+| Polaris, Lakekeeper | `credential` + `oauth2_server_uri` |
+| Unity Catalog | `token` |
+| AWS Glue | `sigv4_enabled` + `signing_region` |
+| Local SQLite | None needed |
 
-    # S3 configuration
+Most REST catalogs (Polaris, Lakekeeper, etc.) **vend storage credentials automatically** via the catalog API. You typically don't need to configure S3/GCS/Azure credentials manually.
+
+<details>
+<summary><b>Advanced Options</b></summary>
+
+```python
+iceberg_rest(
+    # ... required options ...
+
+    # Manual storage credentials (usually not needed with credential vending)
     s3_endpoint="...",
     s3_access_key_id="...",
     s3_secret_access_key="...",
@@ -121,34 +128,43 @@ iceberg_rest(
 )
 ```
 
-### Catalog Examples
+</details>
+
+## Catalog Examples
 
-#### Nessie (Docker)
+<details>
+<summary><b>Polaris / Lakekeeper</b></summary>
 
 ```python
 iceberg_rest(
-    catalog_uri="http://localhost:19120/iceberg/main",
-    namespace="my_namespace",
-    s3_endpoint="http://localhost:9000",
-    s3_access_key_id="minioadmin",
-    s3_secret_access_key="minioadmin",
-    s3_region="us-east-1",
+    catalog_uri="https://polaris.example.com/api/catalog",
+    warehouse="my_warehouse",
+    namespace="production",
+    credential="client-id:client-secret",
+    oauth2_server_uri="https://polaris.example.com/api/catalog/v1/oauth/tokens",
 )
 ```
 
-Start services: `docker compose up -d`
+Storage credentials are vended automatically by the catalog.
+
+</details>
 
-#### Local SQLite Catalog
+<details>
+<summary><b>Unity Catalog (Databricks)</b></summary>
 
 ```python
 iceberg_rest(
-    catalog_uri="sqlite:///catalog.db",
-    warehouse="file:///path/to/warehouse",
-    namespace="my_namespace",
+    catalog_uri="https://<workspace>.cloud.databricks.com/api/2.1/unity-catalog/iceberg-rest",
+    warehouse="<catalog-name>",
+    namespace="<schema-name>",
+    token="<databricks-token>",
 )
 ```
 
-#### AWS Glue
+</details>
+
+<details>
+<summary><b>AWS Glue</b></summary>
 
 ```python
 iceberg_rest(
@@ -160,31 +176,43 @@ iceberg_rest(
 )
 ```
 
-AWS credentials via environment variables.
+Requires AWS credentials in environment (`AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`).
+
+</details>
 
-#### Polaris
+<details>
+<summary><b>Local SQLite Catalog</b></summary>
 
 ```python
 iceberg_rest(
-    catalog_uri="https://polaris.example.com/api/catalog",
-    warehouse="s3://bucket/warehouse",
-    namespace="production",
-    credential="client-id:client-secret",
-    oauth2_server_uri="https://polaris.example.com/api/catalog/v1/oauth/tokens",
+    catalog_uri="sqlite:///catalog.db",
+    warehouse="file:///path/to/warehouse",
+    namespace="my_namespace",
 )
 ```
 
-#### Unity Catalog
+Great for local development and testing.
+
+</details>
+
+<details>
+<summary><b>Nessie (Docker)</b></summary>
 
 ```python
 iceberg_rest(
-    catalog_uri="https://<workspace>.cloud.databricks.com/api/2.1/unity-catalog/iceberg-rest",
-    warehouse="<catalog-name>",
-    namespace="<schema-name>",
-    token="<databricks-token>",
+    catalog_uri="http://localhost:19120/iceberg/main",
+    namespace="my_namespace",
+    s3_endpoint="http://localhost:9000",
+    s3_access_key_id="minioadmin",
+    s3_secret_access_key="minioadmin",
+    s3_region="us-east-1",
 )
 ```
 
+Start Nessie + MinIO with `docker compose up -d` (see docker-compose.yml in repo).
+
+</details>
+
 ## Partitioning
 
 ### Using iceberg_adapter (Recommended)
@@ -315,7 +343,7 @@ def users_with_deletes():
 ### Run Tests
 
 ```bash
-# Start Docker services
+# Start Docker services (for Nessie tests)
 docker compose up -d
 
 # Run all tests
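Note on the new Authentication table: it can be read as a mapping from catalog family to the keyword arguments that `iceberg_rest()` expects. The sketch below is only an illustration of that reading. `AUTH_KEYS` and `auth_kwargs` are hypothetical names that do not exist in dlt-iceberg; the parameter names (`credential`, `oauth2_server_uri`, `token`, `sigv4_enabled`, `signing_region`) are the ones shown in the diff, and the short catalog keys (`"polaris"`, `"unity"`, and so on) are assumptions made for the example.

```python
from typing import Any, Dict, Tuple

# Hypothetical lookup, not part of dlt-iceberg: which auth keyword arguments
# each catalog family needs, following the README's Authentication table.
AUTH_KEYS: Dict[str, Tuple[str, ...]] = {
    "polaris": ("credential", "oauth2_server_uri"),
    "lakekeeper": ("credential", "oauth2_server_uri"),
    "unity": ("token",),
    "glue": ("sigv4_enabled", "signing_region"),
    "sqlite": (),  # local SQLite catalog: no authentication needed
}


def auth_kwargs(catalog: str, **values: Any) -> Dict[str, Any]:
    """Return only the auth kwargs the table lists for `catalog`.

    Raises ValueError for an unknown catalog family, for a missing
    required argument, or for an argument the table does not list.
    """
    if catalog not in AUTH_KEYS:
        raise ValueError(f"unknown catalog family: {catalog!r}")
    required = AUTH_KEYS[catalog]
    missing = [key for key in required if key not in values]
    if missing:
        raise ValueError(f"{catalog} requires: {', '.join(missing)}")
    unexpected = [key for key in values if key not in required]
    if unexpected:
        raise ValueError(f"{catalog} does not use: {', '.join(unexpected)}")
    return {key: values[key] for key in required}
```

Under these assumptions, the result could be splatted into the destination call, for example `iceberg_rest(catalog_uri="...", namespace="...", **auth_kwargs("unity", token="..."))`. Whether `iceberg_rest()` itself validates these combinations is not shown in the diff.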