Description
✅ I checked the Altinity Stable Builds lifecycle table, and the Altinity Stable Build
version I'm using is still supported.
Type of problem
Bug report - something's broken
Describe the situation
A regression was introduced in PR #1407, which changes the default value of write_marks_for_substreams_in_compact_parts from true to false.
When this setting is false, the ClickHouse server crashes with SIGABRT when reading data from tables that contain Array(Object('json')) columns with nested array structures inside the JSON objects.
The crash occurs during deserialization of tuple elements, where the reader expects per-substream marks that don't exist when the setting is disabled.
Error message:
Logical error: 'Unexpected size of tuple element 1: 0. Expected size: 1'.
This issue:
- Is reproducible with debug builds (assertions enabled)
- Affects tables with Array(Object('json')) containing nested arrays
- Does not occur when write_marks_for_substreams_in_compact_parts=true
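To check which value a given server actually uses before running the reproduction, the effective MergeTree setting can be read from the system.merge_tree_settings table. This is a minimal sketch, assuming the build exposes the setting there under the same name:

```sql
-- Shows the server-wide value of the setting and whether it differs
-- from the compiled-in default (changed = 1).
SELECT name, value, changed
FROM system.merge_tree_settings
WHERE name = 'write_marks_for_substreams_in_compact_parts';
```

If the value is false and the table does not override it, newly written compact parts will not carry per-substream marks.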
How to reproduce the behavior
Environment
- Version: 25.8.16.10001.altinitytest (debug build)
- Build type: Debug (required to trigger the assertion)
Option 1: Using the debug binary
Download the debug binary from CI artifacts:
wget https://altinity-build-artifacts.s3.amazonaws.com/PRs/1407/98b1107b14d7fe362c5619374621bcc6efde9477/build_amd_debug/clickhouse
chmod +x clickhouse
mv clickhouse clickhouse-debug
./clickhouse-debug server
Option 2: Using stateless test
Run the existing test 01825_type_json_in_array which covers this scenario.
Manual reproduction steps
Connect to the server and execute:
SET allow_experimental_object_type = 1;
DROP TABLE IF EXISTS t_json_complex;
CREATE TABLE t_json_complex (id UInt32, arr Array(Object('json')))
ENGINE = MergeTree ORDER BY id;
-- Insert data with nested arrays inside JSON objects
INSERT INTO t_json_complex FORMAT JSONEachRow {"id": 1, "arr": [{"k1": [{"k2": "aaa", "k3": "bbb"}, {"k2": "ccc"}]}]}
INSERT INTO t_json_complex FORMAT JSONEachRow {"id": 2, "arr": [{"k1": [{"k3": "ddd", "k4": 10}, {"k4": 20}], "k5": {"k6": "foo"}}]}
-- This query crashes the server
SELECT id, arr.k1.k2, arr.k1.k3, arr.k1.k4, arr.k5.k6 FROM t_json_complex ORDER BY id;
Expected behavior
The SELECT query should return the nested JSON data correctly:
┌─id─┬─arr.k1.k2─────────┬─arr.k1.k3─────────┬─arr.k1.k4─┬─arr.k5.k6─┐
│  1 │ [['aaa','ccc']]   │ [['bbb','']]      │ [[0,0]]   │ ['']      │
│  2 │ [['','']]         │ [['ddd','']]      │ [[10,20]] │ ['foo']   │
└────┴───────────────────┴───────────────────┴───────────┴───────────┘
Actual behavior
The server crashes with SIGABRT:
2026.02.17 02:25:21.414533 [ 1169016 ] {37b7e5de-6cdf-4b14-a514-2d567821155b} <Fatal> : Logical error: 'Unexpected size of tuple element 1: 0. Expected size: 1'.
2026.02.17 02:25:21.449885 [ 1169016 ] {37b7e5de-6cdf-4b14-a514-2d567821155b} <Fatal> : Stack trace (when copying this message, always include the lines below):
0. /home/ubuntu/_work/ClickHouse/ClickHouse/contrib/llvm-project/libcxx/include/__exception/exception.h:113: Poco::Exception::Exception(String const&, int) @ 0x000000002755deb2
1. /home/ubuntu/_work/ClickHouse/ClickHouse/src/Common/Exception.cpp:128: DB::Exception::Exception(DB::Exception::MessageMasked&&, int, bool) @ 0x00000000145ea2e9
2. /home/ubuntu/_work/ClickHouse/ClickHouse/src/Common/Exception.h:123: DB::Exception::Exception(String&&, int, String, bool) @ 0x000000000d24e18e
3. /home/ubuntu/_work/ClickHouse/ClickHouse/src/Common/Exception.h:58: DB::Exception::Exception(PreformattedMessage&&, int) @ 0x000000000d24db91
4. /home/ubuntu/_work/ClickHouse/ClickHouse/src/Common/Exception.h:141: DB::Exception::Exception<unsigned long&, unsigned long, unsigned long&>(int, FormatStringHelperImpl<std::type_identity<unsigned long&>::type, std::type_identity<unsigned long>::type, std::type_identity<unsigned long&>::type>, unsigned long&, unsigned long&&, unsigned long&) @ 0x000000001a6a9336
5. /home/ubuntu/_work/ClickHouse/ClickHouse/src/DataTypes/Serializations/SerializationTuple.cpp:810: DB::SerializationTuple::deserializeBinaryBulkWithMultipleStreams(COW<DB::IColumn>::immutable_ptr<DB::IColumn>&, unsigned long, unsigned long, DB::ISerialization::DeserializeBinaryBulkSettings&, std::shared_ptr<DB::ISerialization::DeserializeBinaryBulkState>&, std::unordered_map<String, std::unique_ptr<DB::ISerialization::ISubstreamsCacheElement, std::default_delete<DB::ISerialization::ISubstreamsCacheElement>>, std::hash<String>, std::equal_to<String>, std::allocator<std::pair<String const, std::unique_ptr<DB::ISerialization::ISubstreamsCacheElement, std::default_delete<DB::ISerialization::ISubstreamsCacheElement>>>>>*) const @ 0x000000001a6d50d6
6. /home/ubuntu/_work/ClickHouse/ClickHouse/src/DataTypes/Serializations/SerializationArray.cpp:492: DB::SerializationArray::deserializeBinaryBulkWithMultipleStreams(COW<DB::IColumn>::immutable_ptr<DB::IColumn>&, unsigned long, unsigned long, DB::ISerialization::DeserializeBinaryBulkSettings&, std::shared_ptr<DB::ISerialization::DeserializeBinaryBulkState>&, std::unordered_map<String, std::unique_ptr<DB::ISerialization::ISubstreamsCacheElement, std::default_delete<DB::ISerialization::ISubstreamsCacheElement>>, std::hash<String>, std::equal_to<String>, std::allocator<std::pair<String const, std::unique_ptr<DB::ISerialization::ISubstreamsCacheElement, std::default_delete<DB::ISerialization::ISubstreamsCacheElement>>>>>*) const @ 0x000000001a5df852
7. /home/ubuntu/_work/ClickHouse/ClickHouse/src/Storages/MergeTree/MergeTreeReaderCompact.cpp:248: DB::MergeTreeReaderCompact::readData(unsigned long, COW<DB::IColumn>::immutable_ptr<DB::IColumn>&, unsigned long, unsigned long, unsigned long, unsigned long, DB::MergeTreeReaderStream&, std::unordered_map<String, COW<DB::IColumn>::immutable_ptr<DB::IColumn>, std::hash<String>, std::equal_to<String>, std::allocator<std::pair<String const, COW<DB::IColumn>::immutable_ptr<DB::IColumn>>>>&, std::unordered_map<String, COW<DB::IColumn>::immutable_ptr<DB::IColumn>, std::hash<String>, std::equal_to<String>, std::allocator<std::pair<String const, COW<DB::IColumn>::immutable_ptr<DB::IColumn>>>>*, std::unordered_map<String, std::unique_ptr<DB::ISerialization::ISubstreamsCacheElement, std::default_delete<DB::ISerialization::ISubstreamsCacheElement>>, std::hash<String>, std::equal_to<String>, std::allocator<std::pair<String const, std::unique_ptr<DB::ISerialization::ISubstreamsCacheElement, std::default_delete<DB::ISerialization::ISubstreamsCacheElement>>>>>*) @ 0x000000001eed960d
8. /home/ubuntu/_work/ClickHouse/ClickHouse/src/Storages/MergeTree/MergeTreeReaderCompactSingleBuffer.cpp:70: DB::MergeTreeReaderCompactSingleBuffer::readRows(unsigned long, unsigned long, bool, unsigned long, unsigned long, std::vector<COW<DB::IColumn>::immutable_ptr<DB::IColumn>, std::allocator<COW<DB::IColumn>::immutable_ptr<DB::IColumn>>>&) @ 0x000000001eedee93
9. /home/ubuntu/_work/ClickHouse/ClickHouse/src/Storages/MergeTree/MergeTreeRangeReader.cpp:127: DB::MergeTreeRangeReader::DelayedStream::finalize(std::vector<COW<DB::IColumn>::immutable_ptr<DB::IColumn>, std::allocator<COW<DB::IColumn>::immutable_ptr<DB::IColumn>>>&) @ 0x000000001eeca184
10. /home/ubuntu/_work/ClickHouse/ClickHouse/src/Storages/MergeTree/MergeTreeRangeReader.cpp:309: DB::MergeTreeRangeReader::startReadingChain(unsigned long, DB::MarkRanges&) @ 0x000000001eed1ec3
11. /home/ubuntu/_work/ClickHouse/ClickHouse/src/Storages/MergeTree/MergeTreeReadersChain.cpp:68: DB::MergeTreeReadersChain::read(unsigned long, DB::MarkRanges&, std::vector<DB::MarkRanges, std::allocator<DB::MarkRanges>>&) @ 0x000000001eef8715
12. /home/ubuntu/_work/ClickHouse/ClickHouse/src/Storages/MergeTree/MergeTreeReadTask.cpp:229: DB::MergeTreeReadTask::read() @ 0x000000001eef60b3
13. /home/ubuntu/_work/ClickHouse/ClickHouse/src/Storages/MergeTree/MergeTreeSelectAlgorithms.h:53: DB::MergeTreeInOrderSelectAlgorithm::readFromTask(DB::MergeTreeReadTask&) @ 0x000000001fab9b0c
14. /home/ubuntu/_work/ClickHouse/ClickHouse/src/Storages/MergeTree/MergeTreeSelectProcessor.cpp:234: DB::MergeTreeSelectProcessor::read() @ 0x000000001ef08112
15. /home/ubuntu/_work/ClickHouse/ClickHouse/src/Storages/MergeTree/MergeTreeSource.cpp:229: DB::MergeTreeSource::tryGenerate() @ 0x000000001faa51a9
16. /home/ubuntu/_work/ClickHouse/ClickHouse/src/Processors/ISource.cpp:110: DB::ISource::work() @ 0x000000001f4dc742
17. /home/ubuntu/_work/ClickHouse/ClickHouse/src/Processors/Executors/ExecutionThreadContext.cpp:53: DB::ExecutionThreadContext::executeTask() @ 0x000000001f4f9950
18. /home/ubuntu/_work/ClickHouse/ClickHouse/src/Processors/Executors/PipelineExecutor.cpp:351: DB::PipelineExecutor::executeStepImpl(unsigned long, DB::IAcquiredSlot*, std::atomic<bool>*) @ 0x000000001f4ebbc5
19. /home/ubuntu/_work/ClickHouse/ClickHouse/src/Processors/Executors/PipelineExecutor.cpp:279: DB::PipelineExecutor::executeSingleThread(unsigned long, DB::IAcquiredSlot*) @ 0x000000001f4ec129
20. /home/ubuntu/_work/ClickHouse/ClickHouse/src/Processors/Executors/PipelineExecutor.cpp:565: void std::__function::__policy_invoker<void ()>::__call_impl[abi:se190107]<std::__function::__default_alloc_func<DB::PipelineExecutor::spawnThreads(std::shared_ptr<DB::IAcquiredSlot>)::$_0, void ()>>(std::__function::__policy_storage const*) @ 0x000000001f4ed1c3
21. /home/ubuntu/_work/ClickHouse/ClickHouse/contrib/llvm-project/libcxx/include/__functional/function.h:716: ? @ 0x0000000014738b53
22. /home/ubuntu/_work/ClickHouse/ClickHouse/contrib/llvm-project/libcxx/include/__type_traits/invoke.h:117: ThreadFromGlobalPoolImpl<false, true>::ThreadFromGlobalPoolImpl<void (ThreadPoolImpl<ThreadFromGlobalPoolImpl<false, true>>::ThreadFromThreadPool::*)(), ThreadPoolImpl<ThreadFromGlobalPoolImpl<false, true>>::ThreadFromThreadPool*>(void (ThreadPoolImpl<ThreadFromGlobalPoolImpl<false, true>>::ThreadFromThreadPool::*&&)(), ThreadPoolImpl<ThreadFromGlobalPoolImpl<false, true>>::ThreadFromThreadPool*&&)::'lambda'()::operator()() @ 0x000000001473f226
23. /home/ubuntu/_work/ClickHouse/ClickHouse/contrib/llvm-project/libcxx/include/__functional/function.h:716: ? @ 0x0000000014735fe6
24. /home/ubuntu/_work/ClickHouse/ClickHouse/contrib/llvm-project/libcxx/include/__type_traits/invoke.h:117: void* std::__thread_proxy[abi:se190107]<std::tuple<std::unique_ptr<std::__thread_struct, std::default_delete<std::__thread_struct>>, void (ThreadPoolImpl<std::thread>::ThreadFromThreadPool::*)(), ThreadPoolImpl<std::thread>::ThreadFromThreadPool*>>(void*) @ 0x000000001473ca00
25. start_thread @ 0x000000000009caa4
26. clone3 @ 0x0000000000129c6c
2026.02.17 02:25:21.450091 [ 1167620 ] {} <Trace> BaseDaemon: Received signal 6
2026.02.17 02:25:21.450122 [ 1167620 ] {} <Fatal> BaseDaemon: ########## Short fault info ############
2026.02.17 02:25:21.450129 [ 1167620 ] {} <Fatal> BaseDaemon: (version 25.8.16.10001.altinitytest, build id: 5E3C8A71D175863BBF837F2D50BFBE92DC49FDA3, git hash: 84102805cd7eacfd49b38f81ca46d868189c5b82, architecture: x86_64) (from thread 1169016) Received signal 6
2026.02.17 02:25:21.450130 [ 1167620 ] {} <Fatal> BaseDaemon: Signal description: Aborted
...
(query: SELECT id, arr.k1.k2, arr.k1.k3, arr.k1.k4, arr.k5.k6 FROM t_json_complex ORDER BY id;)
Received signal Aborted (6)
Root cause analysis
The crash originates in src/DataTypes/Serializations/SerializationTuple.cpp:810:
if (column_tuple.getColumn(i).size() != expected_size)
    throw Exception(... ErrorCodes::LOGICAL_ERROR,
        "Unexpected size of tuple element {}: {}. Expected size: {}",
        i, column_tuple.getColumn(i).size(), expected_size);
When write_marks_for_substreams_in_compact_parts=false:
- Compact parts use .mrk3 format (column-level marks only)
- Nested JSON arrays with tuples require per-substream marks (.mrk4 format)
- Without per-substream marks, the deserializer cannot correctly position each substream
- This causes size mismatches between tuple elements during reading
Workaround
Set write_marks_for_substreams_in_compact_parts=true at table creation:
CREATE TABLE t_json_complex (id UInt32, arr Array(Object('json')))
ENGINE = MergeTree ORDER BY id
SETTINGS write_marks_for_substreams_in_compact_parts = true;
Or alter existing tables:
ALTER TABLE t_json_complex MODIFY SETTING write_marks_for_substreams_in_compact_parts = true;
Note: Existing parts written while the setting was false will still be unreadable.
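Because the workaround only affects parts written after the change, it can help to see which parts of the table predate it. The queries below are a sketch using system.parts. They rely on modification_time as a rough proxy, since no column is assumed here to expose the marks format of a part directly:

```sql
-- Table-level overrides appear in the SETTINGS clause of the DDL.
SHOW CREATE TABLE t_json_complex;

-- List active compact parts, oldest first. Parts last modified before
-- the setting was changed were most likely written without
-- per-substream marks.
SELECT name, part_type, rows, modification_time
FROM system.parts
WHERE database = currentDatabase()
  AND table = 't_json_complex'
  AND active
  AND part_type = 'Compact'
ORDER BY modification_time;
```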
Additional context
Test cases by complexity
| Case | Structure | Crashes? |
|---|---|---|
| Simple | Object('json') with flat fields | No |
| Intermediate | Array(Object('json')) with flat fields | No |
| Complex | Array(Object('json')) with nested arrays inside | Yes |
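For comparison with the manual reproduction steps, the two non-crashing cases could be exercised with something like the following. The table names t_json_simple and t_json_flat_array and the sample rows are illustrative assumptions, not taken from the failing tests:

```sql
SET allow_experimental_object_type = 1;

-- Simple: Object('json') with flat fields
DROP TABLE IF EXISTS t_json_simple;
CREATE TABLE t_json_simple (id UInt32, obj Object('json'))
ENGINE = MergeTree ORDER BY id;
INSERT INTO t_json_simple FORMAT JSONEachRow {"id": 1, "obj": {"k1": "aaa", "k2": 10}}
SELECT id, obj.k1, obj.k2 FROM t_json_simple ORDER BY id;

-- Intermediate: Array(Object('json')) with flat fields only
DROP TABLE IF EXISTS t_json_flat_array;
CREATE TABLE t_json_flat_array (id UInt32, arr Array(Object('json')))
ENGINE = MergeTree ORDER BY id;
INSERT INTO t_json_flat_array FORMAT JSONEachRow {"id": 1, "arr": [{"k1": "aaa"}, {"k1": "bbb", "k2": 10}]}
SELECT id, arr.k1, arr.k2 FROM t_json_flat_array ORDER BY id;
```

As in the original steps, each INSERT with inline JSONEachRow data needs to be sent as its own statement.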
Related PR
- PR #1407 (25.8.16 Stable: disable write_marks_for_substreams_in_compact_parts by default): Changed the default of write_marks_for_substreams_in_compact_parts from true to false to fix downgrade compatibility issues
CI failures
- Tests: 01825_type_json_in_array, 01825_type_json_17, 03208_array_of_json_read_subcolumns_1
- Run: https://github.com/Altinity/ClickHouse/actions/runs/22056137775/job/63817018890