8 changes: 6 additions & 2 deletions docs/cloud/features/04_infrastructure/warehouses.md

## What is compute-compute separation? {#what-is-compute-compute-separation}

In ClickHouse Cloud, compute runs on dedicated CPU and memory clusters called **services**.


Each ClickHouse Cloud service includes:
- Two or more ClickHouse nodes (referred to as **replicas**); child services may run with a single replica
- An endpoint (or multiple endpoints, created via the ClickHouse Cloud console): the service URL you use to connect to the service (for example, `https://dv2fzne24g.us-east-1.aws.clickhouse.cloud:8443`)
- An object storage folder where the service stores all of its data and part of its metadata

The initial service you create is the primary service. Subsequently, you can create additional services that have access to the same data as the primary service.


:::note
Unlike single-replica parent services, single-replica child services can scale vertically.
:::

Compute-compute separation is available for Scale and Enterprise tiers.

<Image img={compute_1} size="md" alt="Current service in ClickHouse Cloud" />

<br />
102 changes: 102 additions & 0 deletions docs/use-cases/data_lake/polaris.md
---
slug: /use-cases/data-lake/polaris-catalog
sidebar_label: 'Polaris catalog'
title: 'Polaris catalog'
pagination_prev: null
pagination_next: null
description: 'In this guide, we will walk you through the steps to query
your data using ClickHouse and the Snowflake Polaris catalog.'
keywords: ['Polaris', 'Snowflake', 'Data Lake']
show_related_blogs: true
doc_type: 'guide'
---

import BetaBadge from '@theme/badges/BetaBadge';

<BetaBadge/>

ClickHouse supports integration with multiple catalogs (Unity, Glue, Polaris,
etc.). In this guide, we will walk you through the steps to query your data
using ClickHouse and the [Apache Polaris Catalog](https://polaris.apache.org/releases/1.1.0/getting-started/using-polaris/#setup).
Apache Polaris supports Iceberg tables and Delta tables (via Generic Tables); this integration currently supports only Iceberg tables.

:::note
As this feature is experimental, you will need to enable it using:
`SET allow_experimental_database_unity_catalog = 1;`
:::

## Prerequisites {#prerequisites}

To connect to the Polaris catalog, you will need:

- Snowflake Open Catalog (hosted Polaris) or self-hosted Polaris Catalog
- Your Polaris catalog URI (for example, `https://<account-id>.<region>.aws.snowflakecomputing.com/polaris/api/catalog/v1` for Snowflake Open Catalog, or `http://polaris:8181/api/catalog/v1` for a self-hosted instance)
- Catalog credentials (client ID and client secret)
- The OAuth tokens URI for your Polaris instance
- Storage endpoint for the object store where your Iceberg data lives (for example, S3)
- ClickHouse version 26.1+

For Open Catalog (Snowflake's managed Polaris offering), your URI will include `/polaris`; for a self-hosted deployment, it may not.

<VerticalStepper>

## Creating a connection between Polaris and ClickHouse {#connecting}

Create a database that connects ClickHouse to your Polaris catalog:

```sql
CREATE DATABASE polaris_catalog
ENGINE = DataLakeCatalog('https://<catalog_uri>/api/catalog/v1')
SETTINGS
catalog_type = 'rest',
catalog_credential = '<client-id>:<client-secret>',
warehouse = 'snowflake',
auth_scope = 'PRINCIPAL_ROLE:ALL',
oauth_server_uri = 'https://<catalog_uri>/api/catalog/v1/oauth/tokens',
storage_endpoint = '<storage_endpoint>'
```
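
As a quick sanity check (a sketch, assuming the database name used above), you can confirm that the catalog database was created and attached:

```sql
-- List databases matching the name we just created
SHOW DATABASES LIKE 'polaris_catalog';
```

If the connection settings are wrong, the error typically surfaces on the first catalog operation rather than at `CREATE DATABASE` time.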

## Query the Polaris catalog using ClickHouse {#query-polaris-catalog}

Once the connection is in place, you can query Polaris:

```sql title="Query"
USE polaris_catalog;
SHOW TABLES;
```

To query a table:

```sql title="Query"
SELECT count(*) FROM `polaris_db.my_iceberg_table`;
```

:::note
Backticks are required around the full `schema.table` identifier because ClickHouse exposes the namespace-qualified name as a single table name.
:::

To inspect the table DDL:

```sql
SHOW CREATE TABLE `polaris_db.my_iceberg_table`;
```
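
You can also inspect just the column names and types (using the same hypothetical table name as above):

```sql
-- Show the ClickHouse-mapped schema of the Iceberg table
DESCRIBE TABLE `polaris_db.my_iceberg_table`;
```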

## Loading data from Polaris into ClickHouse {#loading-data-into-clickhouse}

To load data from Polaris into a ClickHouse table, create the target table with your desired schema, then insert from the Polaris table:

```sql title="Query"
CREATE TABLE my_clickhouse_table
(
-- define columns to match your Iceberg table
`id` Int64,
`name` String,
`event_time` DateTime64(3)
)
ENGINE = MergeTree
ORDER BY id;

INSERT INTO my_clickhouse_table
SELECT * FROM polaris_catalog.`polaris_db.my_iceberg_table`;
```
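
As a quick verification after the load (a sketch using the same hypothetical names), compare row counts between the source and the target:

```sql
SELECT
    (SELECT count(*) FROM polaris_catalog.`polaris_db.my_iceberg_table`) AS source_rows,
    (SELECT count(*) FROM my_clickhouse_table) AS loaded_rows;
```

The two counts should match if the insert completed without filtering or errors.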
</VerticalStepper>