diff --git a/docs/advanced-querying.mdx b/docs/advanced-querying.mdx
index 824ae88d..d52cf791 100644
--- a/docs/advanced-querying.mdx
+++ b/docs/advanced-querying.mdx
@@ -45,8 +45,8 @@ Each of these methods returns a `ContextStream` object that must be opened via a
 ### Data blocks {#data-blocks}
 ClickHouse Connect processes all data from the primary `query` method as a stream of blocks received from the ClickHouse server. These blocks are transmitted in the custom "Native" format to and from ClickHouse. A "block" is simply a sequence of columns of binary data, where each column contains an equal number of data values of the specified data type. (As a columnar database, ClickHouse stores this data in a similar form.) The size of a block returned from a query is governed by two user settings that can be set at several levels (user profile, user, session, or query). They're:
 
-- [max_block_size](/core/reference/settings/session-settings#max_block_size) -- Limit on the size of the block in rows. Default 65536.
-- [preferred_block_size_bytes](/core/reference/settings/session-settings#preferred_block_size_bytes) -- Soft limit on the size of the block in bytes. Default 1,000,0000.
+- [max_block_size](/reference/settings/session-settings#max_block_size) -- Limit on the size of the block in rows. Default 65536.
+- [preferred_block_size_bytes](/reference/settings/session-settings#preferred_block_size_bytes) -- Soft limit on the size of the block in bytes. Default 1,000,0000.
 
 Regardless of the `preferred_block_size_setting`, each block will never be more than `max_block_size` rows. Depending on the type of query, the actual blocks returned can be of any size. For example, queries to a distributed table covering many shards may contain smaller blocks retrieved directly from each shard.
 
@@ -408,14 +408,14 @@ client.query('SELECT device_id, dev_address, gw_address from devices', column_fo
 
 ## External data {#external-data}
 
-ClickHouse queries can accept external data in any ClickHouse format. This binary data is sent along with the query string to be used to process the data. Details of the External Data feature are [here](/core/reference/engines/table-engines/special/external-data). The client `query*` methods accept an optional `external_data` parameter to take advantage of this feature. The value for the `external_data` parameter should be a `clickhouse_connect.driver.external.ExternalData` object. The constructor for that object accepts the following arguments:
+ClickHouse queries can accept external data in any ClickHouse format. This binary data is sent along with the query string to be used to process the data. Details of the External Data feature are [here](/reference/engines/table-engines/special/external-data). The client `query*` methods accept an optional `external_data` parameter to take advantage of this feature. The value for the `external_data` parameter should be a `clickhouse_connect.driver.external.ExternalData` object. The constructor for that object accepts the following arguments:
 
 | Name | Type | Description |
 |-----------|-------------------|-----------------------------------------------------------------------------------------------------------------------------------------------|
 | file_path | str | Path to a file on the local system path to read the external data from. Either `file_path` or `data` is required |
 | file_name | str | The name of the external data "file". If not provided, will be determined from the `file_path` (without extensions) |
 | data | bytes | The external data in binary form (instead of being read from a file). Either `data` or `file_path` is required |
-| fmt | str | The ClickHouse [Input Format](/core/reference/formats) of the data. Defaults to `TSV` |
+| fmt | str | The ClickHouse [Input Format](/reference/formats) of the data. Defaults to `TSV` |
 | types | str or seq of str | A list of column data types in the external data. If a string, types should be separated by commas. Either `types` or `structure` is required |
 | structure | str or seq of str | A list of column name + data type in the data (see examples). Either `structure` or `types` is required |
 | mime_type | str | Optional MIME type of the file data. Currently ClickHouse ignores this HTTP subheader |
diff --git a/docs/advanced-usage.mdx b/docs/advanced-usage.mdx
index 09dccaa5..b7749144 100644
--- a/docs/advanced-usage.mdx
+++ b/docs/advanced-usage.mdx
@@ -71,7 +71,7 @@ The code above yields an `output.csv` file with the following content:
 4,"4"
 ```
 
-Similarly, you could save data in [TabSeparated](/core/reference/formats/TabSeparated/TabSeparated) and other formats. See [Formats for Input and Output Data](/core/reference/formats) for an overview of all available format options.
+Similarly, you could save data in [TabSeparated](/reference/formats/TabSeparated/TabSeparated) and other formats. See [Formats for Input and Output Data](/reference/formats) for an overview of all available format options.
 
 ## Multithreaded, multiprocess, and async/event driven use cases {#multithreaded-multiprocess-and-asyncevent-driven-use-cases}
 
@@ -113,8 +113,8 @@ See also: [run_async example](https://github.com/ClickHouse/clickhouse-connect/b
 ## Managing ClickHouse session IDs {#managing-clickhouse-session-ids}
 
 Each ClickHouse query occurs within the context of a ClickHouse "session". Sessions are currently used for two purposes:
-- To associate specific ClickHouse settings with multiple queries (see the [user settings](/core/reference/settings/session-settings)). The ClickHouse `SET` command is used to change the settings for the scope of a user session.
-- To track [temporary tables.](/core/reference/statements/create/table#temporary-tables)
+- To associate specific ClickHouse settings with multiple queries (see the [user settings](/reference/settings/session-settings)). The ClickHouse `SET` command is used to change the settings for the scope of a user session.
+- To track [temporary tables.](/reference/statements/create/table#temporary-tables)
 
 By default, each query executed with a ClickHouse Connect `Client` instance uses that client's session ID. `SET` statements and temporary tables work as expected when using a single client. However, the ClickHouse server doesn't allow concurrent queries within the same session (the client will raise a `ProgrammingError` if attempted). For applications that execute concurrent queries, use one of the following patterns:
 1. Create a separate `Client` instance for each thread/process/event handler that needs session isolation. This preserves per-client session state (temporary tables and `SET` values).
diff --git a/docs/driver-api.mdx b/docs/driver-api.mdx
index ae7155ba..104d753c 100644
--- a/docs/driver-api.mdx
+++ b/docs/driver-api.mdx
@@ -72,7 +72,7 @@ Finally, the `settings` argument to `get_client` is used to pass additional Clic
 | wait_end_of_query | Buffers the entire response on the ClickHouse server. This setting is required to return summary information, and is set automatically on non-streaming queries. |
 | role | ClickHouse role to be used for the session. Valid transport setting that can be included in query context. |
 
-For other ClickHouse settings that can be sent with each query, see [the ClickHouse documentation](/core/reference/settings/session-settings).
+For other ClickHouse settings that can be sent with each query, see [the ClickHouse documentation](/reference/settings/session-settings).
 
 ### Client creation examples {#client-creation-examples}
 
@@ -249,7 +249,7 @@ ClickHouse Connect Client `query*` and `command` methods accept an optional `par
 
 #### Server-side binding {#server-side-binding}
 
-ClickHouse supports [server side binding](/core/concepts/features/interfaces/client#cli-queries-with-parameters) for most query values, where the bound value is sent separate from the query as an HTTP query parameter. ClickHouse Connect will add the appropriate query parameters if it detects a binding expression of the form `{<name>:<datatype>}`. For server side binding, the `parameters` argument should be a Python dictionary.
+ClickHouse supports [server side binding](/concepts/features/interfaces/client#cli-queries-with-parameters) for most query values, where the bound value is sent separate from the query as an HTTP query parameter. ClickHouse Connect will add the appropriate query parameters if it detects a binding expression of the form `{<name>:<datatype>}`. For server side binding, the `parameters` argument should be a Python dictionary.
 
 - Server-side binding with Python dictionary, DateTime value, and string value
 
@@ -339,7 +339,7 @@ To bind DateTime64 arguments (ClickHouse types with sub-second precision) requir
 
 ### Settings argument {#settings-argument-1}
 
-All the key ClickHouse Connect Client "insert" and "select" methods accept an optional `settings` keyword argument to pass ClickHouse server [user settings](/core/reference/settings/session-settings) for the included SQL statement. The `settings` argument should be a dictionary. Each item should be a ClickHouse setting name and its associated value. Note that values will be converted to strings when sent to the server as query parameters.
+All the key ClickHouse Connect Client "insert" and "select" methods accept an optional `settings` keyword argument to pass ClickHouse server [user settings](/reference/settings/session-settings) for the included SQL statement. The `settings` argument should be a dictionary. Each item should be a ClickHouse setting name and its associated value. Note that values will be converted to strings when sent to the server as query parameters.
 
 As with client level settings, ClickHouse Connect will drop any settings that the server marks as *readonly*=*1*, with an associated log message. Settings that apply only to queries via the ClickHouse HTTP interface are always valid. Those settings are described under the `get_client` [API](#settings-argument).
 
diff --git a/docs/index.mdx b/docs/index.mdx
index 76ac65cd..f1678fac 100644
--- a/docs/index.mdx
+++ b/docs/index.mdx
@@ -2,7 +2,7 @@
 keywords: ['clickhouse', 'python', 'client', 'connect', 'integrate']
 slug: /integrations/python
 description: 'The ClickHouse Connect project suite for connecting Python to ClickHouse'
-title: 'Python integration with ClickHouse Connect'
+title: Introduction
 sidebarTitle: 'Introduction'
 doc_type: 'guide'
 integration:
@@ -13,8 +13,6 @@ integration:
 
 import ConnectionDetails from '/snippets/_gather_your_details_http.mdx';
 
-# Introduction {#introduction}
-
 ClickHouse Connect is a core database driver providing interoperability with a wide range of Python applications.
 
 - The main interface is the `Client` object in the package `clickhouse_connect.driver`. That core package also includes assorted helper classes and utility functions used for communicating with the ClickHouse server and "context" implementations for advanced management of insert and select queries.