Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub.
3 Skipped Deployments
|
507aa28 to
d416e81
Compare
|
The docs look great! A couple of small pieces of feedback, but lgtm overall. I can't wait to get these in there I reviewed this preview on mobile: https://clickhouse-docs-9zj3bpftf-clickhouse.vercel.app/docs/integrations/apache-flink
|
|
|
||
| This is the official Apache Flink Sink Connector supported by ClickHouse. It is built using Flink's [AsyncSinkBase](https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink) and the official ClickHouse [java client](https://github.com/ClickHouse/clickhouse-java). | ||
|
|
||
| The connector supports Apache Flink's DataStream API. Table API support is planned for a future release. |
There was a problem hiding this comment.
Table API support is planned for a future release.
we can link to an issue to follow and upvote or delete the note
There was a problem hiding this comment.
i'll link the issue and remove the ## Using the Table API section
|
|
||
| ### Table API Connection Options {#table-api-connection-options} | ||
|
|
||
| Planned for a future release — this section will be updated once available. |
There was a problem hiding this comment.
I'm not sure what immediate value the note gives, I'd consider removing it
There was a problem hiding this comment.
understood, i'll remove these sections
|
|
||
| Notes: | ||
| * A `ZoneId` must be provided when performing date operations. | ||
| * Precision and scale must be provided when performing decimal operations. |
There was a problem hiding this comment.
do we have an example to link to?
There was a problem hiding this comment.
yeah - i will link directly to CH docs
docs/integrations/data-ingestion/apache-flink/flink-connector.md
Outdated
Show resolved
Hide resolved
|
|
||
| ## ClickHouse Version Compatibility and Security {#compatibility-and-security} | ||
|
|
||
| - All artifacts and versions of the connector are tested with all [active LTS versions](https://github.com/ClickHouse/ClickHouse/pulls?q=is%3Aopen+is%3Apr+label%3Arelease) of ClickHouse. |
There was a problem hiding this comment.
active and LTS?
| - All artifacts and versions of the connector are tested with all [active LTS versions](https://github.com/ClickHouse/ClickHouse/pulls?q=is%3Aopen+is%3Apr+label%3Arelease) of ClickHouse. | |
| - All artifacts and versions of the connector are tested with all [active and LTS versions](https://github.com/ClickHouse/ClickHouse/pulls?q=is%3Aopen+is%3Apr+label%3Arelease) of ClickHouse. |
BentsiLeviav
left a comment
There was a problem hiding this comment.
Left a few comments. In addition, can we go over the repo and find relevant locations where Flink is mentioned, and make sure we link them to this page?
Found these locations, but we might have more:
| ClickHouseClientConfig clickHouseClientConfig = new ClickHouseClientConfig(url, username, password, database, tableName); | ||
| ``` | ||
|
|
||
| Let's say you want to insert RAW CSV data as is: |
There was a problem hiding this comment.
Can you change this example to be copy-pastable like we have here?
If needed, you can add comments in between explaining each step.
| For more detailed instructions, see the [Example Guide](https://github.com/ClickHouse/flink-connector-clickhouse/blob/main/examples/README.md) | ||
|
|
||
| ### DataStream API Connection Options {#datastream-api-connection-options} | ||
| #### Clickhouse Client Options {#client-options} |
There was a problem hiding this comment.
Can you elaborate on how to set Java Client options? https://github.com/ClickHouse/flink-connector-clickhouse/blob/main/flink-connector-clickhouse-1.17/src/main/java/org/apache/flink/connector/clickhouse/sink/ClickHouseClientConfig.java#L79
|
|
||
| For more detailed instructions, see the [Example Guide](https://github.com/ClickHouse/flink-connector-clickhouse/blob/main/examples/README.md) | ||
|
|
||
| ### DataStream API Connection Options {#datastream-api-connection-options} |
There was a problem hiding this comment.
Can we add a section explaining how to add ClickHouse settings to the queries? https://github.com/ClickHouse/flink-connector-clickhouse/blob/main/flink-connector-clickhouse-1.17/src/main/java/org/apache/flink/connector/clickhouse/sink/ClickHouseClientConfig.java#L33
There was a problem hiding this comment.
addressed by the previous comment
There was a problem hiding this comment.
Tnx. Can you make it more explicit?
There are two types of settings users can tune:
- Java client options https://clickhouse.com/docs/integrations/language-clients/java/client#configuration
- ClickHouse session settings https://clickhouse.com/docs/operations/settings/settings
With the existing phrasing, the only indication forthe clickhouse settings is by this line
Map.of(), // serverSettings.
|
|
||
| ## Limitations {#limitations} | ||
|
|
||
| * Currently, the sink does not support exactly-once semantics. |
There was a problem hiding this comment.
In reference to what @mshustov said earlier, let's make sure we have a point and a linked issue for each of these:
- Exactly once semantics
- DLQ
- TableAPI
A link for each one of these will allow the community/user to vote and give us feedback
|
|
||
| ### Table API Connection Options {#table-api-connection-options} | ||
|
|
||
| Planned for a future release — this section will be updated once available. |
d37e53d to
5d8487b
Compare
5d8487b to
c211779
Compare
@BentsiLeviav yes - i confirmed the locations you identified are the only other mentions of flink in the repo besides the same pages in japanese & other languages. Should I update those as well? |
|
thanks @alexfrancoeur!
i've updated the flink icon link to point to the new docs page 👍
sure will do
yup, it's removed now |
9a10228 to
2a9c54a
Compare
docs/integrations/data-ingestion/apache-flink/flink-connector.md
Outdated
Show resolved
Hide resolved
docs/integrations/data-ingestion/apache-flink/flink-connector.md
Outdated
Show resolved
Hide resolved
|
|
||
| The following options come directly from Flink's `AsyncSinkBase`: | ||
|
|
||
| | Parameters | Description | Default Value | |
There was a problem hiding this comment.
All options' default values are marked with N/A. If there truly are no defaults and users must always provide them, it would be worth adding a note saying these are all required (like was done in the client options section
There was a problem hiding this comment.
yeah that's a good callout - I think we should set sane defaults for these options, but that can be a followup.
There was a problem hiding this comment.
it would be worth adding a note saying these are all required
done
|
|
||
| ## ClickHouse Version Compatibility and Security {#compatibility-and-security} | ||
|
|
||
| - All artifacts and versions of the connector are tested with all [active and LTS versions](https://github.com/ClickHouse/ClickHouse/pulls?q=is%3Aopen+is%3Apr+label%3Arelease) of ClickHouse. |
There was a problem hiding this comment.
Given that the CH versions are static in our workflows and not yet fully automated to be updated, we should consider being more accurate, maybe something like:
| - All artifacts and versions of the connector are tested with all [active and LTS versions](https://github.com/ClickHouse/ClickHouse/pulls?q=is%3Aopen+is%3Apr+label%3Arelease) of ClickHouse. | |
| - The connector is tested against a range of recent ClickHouse versions, including latest and head, via a daily CI workflow. The tested versions are updated periodically as new ClickHouse releases become active. |
Adding a link to the matrix might be useful as well, so users would get an accurate status. https://github.com/ClickHouse/flink-connector-clickhouse/blob/main/.github/workflows/tests-nightly.yaml#L15
There was a problem hiding this comment.
thank you, i've updated this line
docs/integrations/data-ingestion/apache-flink/flink-connector.md
Outdated
Show resolved
Hide resolved
| <Tabs groupId="client_options_example"> | ||
| <TabItem value="Java" label="Java" default> | ||
|
|
||
| ```java |
There was a problem hiding this comment.
I think best would be best if we give some network options as example configurations as an example (I think it will be more useful)
docs/integrations/data-ingestion/apache-flink/flink-connector.md
Outdated
Show resolved
Hide resolved
f455e1f to
9a4861b
Compare
a2f55b4 to
78c12c7
Compare
| ## Advanced and recommended usage {#advanced-and-recommended-usage} | ||
|
|
||
| - For optimal performance, ensure your DataStream element type is **not** a Generic type - see [here for Flink's type distinction](https://nightlies.apache.org/flink/flink-docs-release-2.2/docs/dev/datastream/fault-tolerance/serialization/types_serialization/#flinks-typeinformation-class). Non-generic elements will avoid the serialization overhead incurred by Kryo and improve throughput to ClickHouse. | ||
| - We recommend setting `maxBatchSize` to at least 1000 and ideally between 10,000 to 100,000. See [this guide on bulk inserts](https://clickhouse.com/docs/optimize/bulk-inserts) for more information. |
There was a problem hiding this comment.
items 12 and 15 in the doc matrix
Co-authored-by: Bentsi Leviav <bentsi.997@gmail.com>
@kurnoolsaketh no need to update any of the translations, there is a bot which will open a PR after this gets merged with any updates to translations. |
Summary
Introduce documentation for the official Flink connector. This doc page is adapted from the flink connector repo's README.