Commit 0bb440d

Improve README: collapsible catalogs, clearer config section
1 parent c86342b commit 0bb440d

1 file changed

Lines changed: 78 additions & 50 deletions

README.md

@@ -6,11 +6,11 @@ A [dlt](https://dlthub.com/) destination for [Apache Iceberg](https://iceberg.ap
66

77
- **Atomic Multi-File Commits**: Multiple parquet files committed as single Iceberg snapshot per table
88
- **REST Catalog Support**: Works with Nessie, Polaris, AWS Glue, Unity Catalog
9+
- **Credential Vending**: Most REST catalogs vend storage credentials automatically
910
- **Partitioning**: Full support for Iceberg partition transforms via `iceberg_adapter()`
1011
- **Merge Strategies**: Delete-insert and upsert with hard delete support
1112
- **DuckDB Integration**: Query loaded data via `pipeline.dataset()`
1213
- **Schema Evolution**: Automatic schema updates when adding columns
13-
- **Authentication**: OAuth2, Bearer token, AWS SigV4
1414

1515
## Installation
1616

@@ -26,10 +26,6 @@ uv add dlt-iceberg
2626

2727
## Quick Start
2828

29-
See [examples/](examples/) directory for working examples.
30-
31-
### Incremental Load
32-
3329
```python
3430
import dlt
3531
from dlt_iceberg import iceberg_rest
@@ -41,12 +37,11 @@ def generate_events():
4137
pipeline = dlt.pipeline(
4238
pipeline_name="my_pipeline",
4339
destination=iceberg_rest(
44-
catalog_uri="http://localhost:19120/iceberg/main",
40+
catalog_uri="https://my-catalog.example.com/api/catalog",
4541
namespace="analytics",
46-
s3_endpoint="http://localhost:9000",
47-
s3_access_key_id="minioadmin",
48-
s3_secret_access_key="minioadmin",
49-
s3_region="us-east-1",
42+
warehouse="my_warehouse",
43+
credential="client-id:client-secret",
44+
oauth2_server_uri="https://my-catalog.example.com/oauth/tokens",
5045
),
5146
)
5247

@@ -83,26 +78,38 @@ def generate_users():
8378
pipeline.run(generate_users())
8479
```
8580

86-
## Configuration Options
81+
## Configuration
8782

88-
All configuration options can be passed to `iceberg_rest()`:
83+
### Required Options
8984

9085
```python
9186
iceberg_rest(
92-
catalog_uri="...", # Required: REST catalog URI
93-
namespace="...", # Required: Iceberg namespace (database)
94-
warehouse="...", # Optional: Warehouse location
87+
catalog_uri="...", # REST catalog endpoint (or sqlite:// for local)
88+
namespace="...", # Iceberg namespace (database)
89+
)
90+
```
9591

96-
# Authentication
97-
credential="...", # OAuth2 client credentials
98-
oauth2_server_uri="...", # OAuth2 token endpoint
99-
token="...", # Bearer token
92+
### Authentication
10093

101-
# AWS SigV4
102-
sigv4_enabled=True,
103-
signing_region="us-east-1",
94+
Choose based on your catalog:
95+
96+
| Catalog | Auth Method |
97+
|---------|-------------|
98+
| Polaris, Lakekeeper | `credential` + `oauth2_server_uri` |
99+
| Unity Catalog | `token` |
100+
| AWS Glue | `sigv4_enabled` + `signing_region` |
101+
| Local SQLite | None needed |
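
The table above could be encoded as a small lookup so that only the relevant authentication arguments are passed for a given catalog. This is a minimal sketch, not part of the package: the `AUTH_FIELDS` mapping, its catalog labels, and the `auth_kwargs` helper are hypothetical names introduced here; only the keyword names themselves come from the table.

```python
from typing import Any

# Keyword names are the ones listed in the authentication table.
# The catalog labels used as keys are hypothetical, chosen for this sketch.
AUTH_FIELDS: dict[str, tuple[str, ...]] = {
    "polaris": ("credential", "oauth2_server_uri"),
    "lakekeeper": ("credential", "oauth2_server_uri"),
    "unity": ("token",),
    "glue": ("sigv4_enabled", "signing_region"),
    "sqlite": (),  # local SQLite catalog: no auth arguments
}


def auth_kwargs(catalog: str, **values: Any) -> dict[str, Any]:
    """Return only the auth keyword arguments that apply to `catalog`."""
    try:
        fields = AUTH_FIELDS[catalog]
    except KeyError:
        raise ValueError(f"unknown catalog type: {catalog!r}") from None
    # Fail early if a field the table lists for this catalog was not supplied.
    missing = [name for name in fields if name not in values]
    if missing:
        raise ValueError(f"{catalog} requires: {', '.join(missing)}")
    # Drop any extra values that do not belong to this catalog's auth method.
    return {name: values[name] for name in fields}


polaris_auth = auth_kwargs(
    "polaris",
    credential="client-id:client-secret",
    oauth2_server_uri="https://polaris.example.com/api/catalog/v1/oauth/tokens",
    token="ignored-for-polaris",
)
print(polaris_auth)
```

The resulting dictionary could then be unpacked into the destination call, for example `iceberg_rest(catalog_uri="...", namespace="...", **polaris_auth)`.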
104102

105-
# S3 configuration
103+
Most REST catalogs (Polaris, Lakekeeper, etc.) **vend storage credentials automatically** via the catalog API. You typically don't need to configure S3/GCS/Azure credentials manually.
104+
105+
<details>
106+
<summary><b>Advanced Options</b></summary>
107+
108+
```python
109+
iceberg_rest(
110+
# ... required options ...
111+
112+
# Manual storage credentials (usually not needed with credential vending)
106113
s3_endpoint="...",
107114
s3_access_key_id="...",
108115
s3_secret_access_key="...",
@@ -121,34 +128,43 @@ iceberg_rest(
121128
)
122129
```
123130

124-
### Catalog Examples
131+
</details>
132+
133+
## Catalog Examples
125134

126-
#### Nessie (Docker)
135+
<details>
136+
<summary><b>Polaris / Lakekeeper</b></summary>
127137

128138
```python
129139
iceberg_rest(
130-
catalog_uri="http://localhost:19120/iceberg/main",
131-
namespace="my_namespace",
132-
s3_endpoint="http://localhost:9000",
133-
s3_access_key_id="minioadmin",
134-
s3_secret_access_key="minioadmin",
135-
s3_region="us-east-1",
140+
catalog_uri="https://polaris.example.com/api/catalog",
141+
warehouse="my_warehouse",
142+
namespace="production",
143+
credential="client-id:client-secret",
144+
oauth2_server_uri="https://polaris.example.com/api/catalog/v1/oauth/tokens",
136145
)
137146
```
138147

139-
Start services: `docker compose up -d`
148+
Storage credentials are vended automatically by the catalog.
149+
150+
</details>
140151

141-
#### Local SQLite Catalog
152+
<details>
153+
<summary><b>Unity Catalog (Databricks)</b></summary>
142154

143155
```python
144156
iceberg_rest(
145-
catalog_uri="sqlite:///catalog.db",
146-
warehouse="file:///path/to/warehouse",
147-
namespace="my_namespace",
157+
catalog_uri="https://<workspace>.cloud.databricks.com/api/2.1/unity-catalog/iceberg-rest",
158+
warehouse="<catalog-name>",
159+
namespace="<schema-name>",
160+
token="<databricks-token>",
148161
)
149162
```
150163

151-
#### AWS Glue
164+
</details>
165+
166+
<details>
167+
<summary><b>AWS Glue</b></summary>
152168

153169
```python
154170
iceberg_rest(
@@ -160,31 +176,43 @@ iceberg_rest(
160176
)
161177
```
162178

163-
AWS credentials via environment variables.
179+
Requires AWS credentials in environment (`AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`).
180+
181+
</details>
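
Because the Glue example depends on credentials being present in the environment, a pre-flight check can surface a missing variable before the pipeline runs. The sketch below checks only the two variables named above; the `missing_aws_env` helper is a hypothetical name, and AWS tooling may also resolve credentials from other sources (profiles, instance roles) that this check does not cover.

```python
import os
from collections.abc import Mapping

# The two variables named in the README's AWS Glue note.
REQUIRED_AWS_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def missing_aws_env(environ: Mapping[str, str] = os.environ) -> list[str]:
    """Return the required AWS variable names that are unset or empty."""
    return [name for name in REQUIRED_AWS_VARS if not environ.get(name)]


# Example with an explicit mapping instead of the real process environment.
missing = missing_aws_env({"AWS_ACCESS_KEY_ID": "example-key"})
print(missing)  # ['AWS_SECRET_ACCESS_KEY']
```

Calling `missing_aws_env()` with no argument inspects the real process environment; an empty list means both variables are set.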
164182

165-
#### Polaris
183+
<details>
184+
<summary><b>Local SQLite Catalog</b></summary>
166185

167186
```python
168187
iceberg_rest(
169-
catalog_uri="https://polaris.example.com/api/catalog",
170-
warehouse="s3://bucket/warehouse",
171-
namespace="production",
172-
credential="client-id:client-secret",
173-
oauth2_server_uri="https://polaris.example.com/api/catalog/v1/oauth/tokens",
188+
catalog_uri="sqlite:///catalog.db",
189+
warehouse="file:///path/to/warehouse",
190+
namespace="my_namespace",
174191
)
175192
```
176193

177-
#### Unity Catalog
194+
Great for local development and testing.
195+
196+
</details>
197+
198+
<details>
199+
<summary><b>Nessie (Docker)</b></summary>
178200

179201
```python
180202
iceberg_rest(
181-
catalog_uri="https://<workspace>.cloud.databricks.com/api/2.1/unity-catalog/iceberg-rest",
182-
warehouse="<catalog-name>",
183-
namespace="<schema-name>",
184-
token="<databricks-token>",
203+
catalog_uri="http://localhost:19120/iceberg/main",
204+
namespace="my_namespace",
205+
s3_endpoint="http://localhost:9000",
206+
s3_access_key_id="minioadmin",
207+
s3_secret_access_key="minioadmin",
208+
s3_region="us-east-1",
185209
)
186210
```
187211

212+
Start Nessie + MinIO with `docker compose up -d` (see docker-compose.yml in repo).
213+
214+
</details>
215+
188216
## Partitioning
189217

190218
### Using iceberg_adapter (Recommended)
@@ -315,7 +343,7 @@ def users_with_deletes():
315343
### Run Tests
316344

317345
```bash
318-
# Start Docker services
346+
# Start Docker services (for Nessie tests)
319347
docker compose up -d
320348

321349
# Run all tests
