`check_parts_columns` fails on tables with millions of rows in `system.parts_columns`: hardcoded `max_memory_usage=200000000` cannot be raised, even with `parts_columns_batch_size=1`

Since v2.7.2, the `CheckSystemPartsColumnsForTables` query carries a hardcoded `SETTINGS max_bytes_before_external_group_by=100000000, max_memory_usage=200000000` clause, introduced in c7613c01 as part of the fix for #1360.

This makes `create` / `create_remote` fail with `MEMORY_LIMIT_EXCEEDED (code: 241)` on any instance where a **single table** produces more rows in `system.parts_columns` than fit in ~190 MiB. Because the limit is injected directly in the query's `SETTINGS` clause, it overrides any user profile, and `parts_columns_batch_size=1` cannot help since the batch is already a single table.

Backups that worked fine on v2.7.1 (and earlier) now fail 100% of the time.

## Environment

- clickhouse-backup: **v2.7.2** (server mode with `watch`, Kubernetes sidecar)
- ClickHouse server: **26.3**
- Already set: `CLICKHOUSE_PARTS_COLUMNS_BATCH_SIZE=1` → no effect (expected, see below)

## Error

```
ERR pkg/backup/create.go:161 > backup failed error: CheckSystemPartsColumnsForTables failed: CheckSystemPartsColumnsForTables: select parts columns: code: 241, message: Query memory limit exceeded: would use 193.91 MiB (attempt to allocate chunk of 16.00 MiB), maximum: 190.73 MiB.
ERR pkg/server/server.go:688 > Watch error, error=too many errors create_remote: 1, delete local: 1, abort watching
```

Note `maximum: 190.73 MiB` = exactly the hardcoded `200000000` bytes.

## Root cause analysis

The offending table is an [Akvorado](https://github.com/akvorado/akvorado) flow table (~130 columns, ~30k active parts):

```sql
SELECT database, table, count() AS rows_pc
FROM system.parts_columns
WHERE active
GROUP BY database, table
ORDER BY rows_pc DESC
LIMIT 4;
```

| database | table | rows_pc |
| :--- | :--- | :--- |
| akvorado | flows_local | 3903900 |
| akvorado | flows_5m0s_local | 2540871 |
| akvorado | flows_1m0s_local | 1513986 |
| akvorado | flows_1h0m0s_local | 384081 |

With ~3.9M rows for a single table, the memory is consumed while **scanning/materializing** `system.parts_columns` for that table, not in the GROUP BY state — so `max_bytes_before_external_group_by` does not help (it only spills the aggregation state), and `parts_columns_batch_size=1` does not help either since the smallest possible batch (one table) already exceeds the cap.

This kind of volumetry is normal for wide, high-ingest tables with long retention (netflow, observability, etc.), so the hardcoded 200 MB cap effectively breaks `check_parts_columns: true` for this whole class of workloads.

## Steps to reproduce

1. Create a MergeTree table with ~100+ columns and enough active parts so that `SELECT count() FROM system.parts_columns WHERE active AND database = ... AND table = ...` returns a few million rows.
2. Run `clickhouse-backup create` with default config (`check_parts_columns: true`) on v2.7.2.
3. Backup fails with `code: 241, Query memory limit exceeded ... maximum: 190.73 MiB`, regardless of `parts_columns_batch_size` and regardless of the server's `max_memory_usage` profile (the query-level `SETTINGS` clause takes precedence).

## Expected behavior

Either of:

1. Make the memory limit configurable, e.g. `clickhouse.parts_columns_max_memory_usage` (env `CLICKHOUSE_PARTS_COLUMNS_MAX_MEMORY_USAGE`), defaulting to the current `200000000` to preserve the #1360 fix for the many-tables case, with `0` meaning "don't inject the setting at all" (pre-2.7.2 behavior);
2. And/or, on `MEMORY_LIMIT_EXCEEDED` (code 241) for a batch, retry that batch without the `SETTINGS` cap (or with a higher one) instead of failing the whole backup.

## Workaround currently in place

`CLICKHOUSE_CHECK_PARTS_COLUMNS=false`, after verifying manually (without the cap) that no type inconsistency exists:

```sql
SELECT database, table, column, min(type) AS min_type, max(type) AS max_type
FROM system.parts_columns
WHERE active
GROUP BY database, table, column
HAVING min_type != max_type;
-- 0 rows
```

We would prefer to keep the check enabled, hence this issue.

## Related

- #1360 (original memory regression in 2.6.42, fixed by the cap + batching)
- c7613c01 (introduces the hardcoded `SETTINGS` clause)
- a5b82987 (adds `parts_columns_batch_size`)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

`check_parts_columns` fails on tables with millions of rows in `system.parts_columns`: hardcoded `max_memory_usage=200000000` cannot be raised, even with `parts_columns_batch_size=1` #1420

Environment

Error

Root cause analysis

Steps to reproduce

Expected behavior

Workaround currently in place

Related

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

database	table	rows_pc
akvorado	flows_local	3903900
akvorado	flows_5m0s_local	2540871
akvorado	flows_1m0s_local	1513986
akvorado	flows_1h0m0s_local	384081

Uh oh!

check_parts_columns fails on tables with millions of rows in system.parts_columns: hardcoded max_memory_usage=200000000 cannot be raised, even with parts_columns_batch_size=1 #1420

Description

Environment

Error

Root cause analysis

Steps to reproduce

Expected behavior

Workaround currently in place

Related

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions

`check_parts_columns` fails on tables with millions of rows in `system.parts_columns`: hardcoded `max_memory_usage=200000000` cannot be raised, even with `parts_columns_batch_size=1` #1420