Skip to content

check_parts_columns fails on tables with millions of rows in system.parts_columns: hardcoded max_memory_usage=200000000 cannot be raised, even with parts_columns_batch_size=1 #1420

Description

@anthonypelletier

Since v2.7.2, the CheckSystemPartsColumnsForTables query carries a hardcoded SETTINGS max_bytes_before_external_group_by=100000000, max_memory_usage=200000000 clause, introduced in c7613c0 as part of the fix for #1360.

This makes create / create_remote fail with MEMORY_LIMIT_EXCEEDED (code: 241) on any instance where a single table produces more rows in system.parts_columns than fit in ~190 MiB. Because the limit is injected directly in the query's SETTINGS clause, it overrides any user profile, and parts_columns_batch_size=1 cannot help since the batch is already a single table.

Backups that worked fine on v2.7.1 (and earlier) now fail 100% of the time.

Environment

  • clickhouse-backup: v2.7.2 (server mode with watch, Kubernetes sidecar)
  • ClickHouse server: 26.3
  • Already set: CLICKHOUSE_PARTS_COLUMNS_BATCH_SIZE=1 → no effect (expected, see below)

Error

ERR pkg/backup/create.go:161 > backup failed error: CheckSystemPartsColumnsForTables failed: CheckSystemPartsColumnsForTables: select parts columns: code: 241, message: Query memory limit exceeded: would use 193.91 MiB (attempt to allocate chunk of 16.00 MiB), maximum: 190.73 MiB.
ERR pkg/server/server.go:688 > Watch error, error=too many errors create_remote: 1, delete local: 1, abort watching

Note maximum: 190.73 MiB = exactly the hardcoded 200000000 bytes.

Root cause analysis

The offending table is an Akvorado flow table (~130 columns, ~30k active parts):

SELECT database, table, count() AS rows_pc
FROM system.parts_columns
WHERE active
GROUP BY database, table
ORDER BY rows_pc DESC
LIMIT 4;
database table rows_pc
akvorado flows_local 3903900
akvorado flows_5m0s_local 2540871
akvorado flows_1m0s_local 1513986
akvorado flows_1h0m0s_local 384081

With ~3.9M rows for a single table, the memory is consumed while scanning/materializing system.parts_columns for that table, not in the GROUP BY state — so max_bytes_before_external_group_by does not help (it only spills the aggregation state), and parts_columns_batch_size=1 does not help either since the smallest possible batch (one table) already exceeds the cap.

This kind of volumetry is normal for wide, high-ingest tables with long retention (netflow, observability, etc.), so the hardcoded 200 MB cap effectively breaks check_parts_columns: true for this whole class of workloads.

Steps to reproduce

  1. Create a MergeTree table with ~100+ columns and enough active parts so that SELECT count() FROM system.parts_columns WHERE active AND database = ... AND table = ... returns a few million rows.
  2. Run clickhouse-backup create with default config (check_parts_columns: true) on v2.7.2.
  3. Backup fails with code: 241, Query memory limit exceeded ... maximum: 190.73 MiB, regardless of parts_columns_batch_size and regardless of the server's max_memory_usage profile (the query-level SETTINGS clause takes precedence).

Expected behavior

Either of:

  1. Make the memory limit configurable, e.g. clickhouse.parts_columns_max_memory_usage (env CLICKHOUSE_PARTS_COLUMNS_MAX_MEMORY_USAGE), defaulting to the current 200000000 to preserve the Backups in v2.6.42+ exhausting memory for database with lots of tables #1360 fix for the many-tables case, with 0 meaning "don't inject the setting at all" (pre-2.7.2 behavior);
  2. And/or, on MEMORY_LIMIT_EXCEEDED (code 241) for a batch, retry that batch without the SETTINGS cap (or with a higher one) instead of failing the whole backup.

Workaround currently in place

CLICKHOUSE_CHECK_PARTS_COLUMNS=false, after verifying manually (without the cap) that no type inconsistency exists:

SELECT database, table, column, min(type) AS min_type, max(type) AS max_type
FROM system.parts_columns
WHERE active
GROUP BY database, table, column
HAVING min_type != max_type;
-- 0 rows

We would prefer to keep the check enabled, hence this issue.

Related

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions