Add hooks into velox core to initialize cudf-exchange components. #2
dan13bauer wants to merge 2 commits into
@majetideepak this is the hook PR. We know it doesn't work in its current form. I think the best solution would be to make callback registration possible for Task, so it knows which components need initialization and teardown.
@zoltan Is it accurate that the CudfOutputQueueManager is the GPU counterpart for the OutputBufferManager on the CPU?
Yes, it's the GPU counterpart. The thing is, almost everything has a GPU counterpart. I think we should instead define a common interface, let both the CPU and GPU sides implement it, and allow several callbacks of that interface to be registered. That sounds more extensible to me. Do the Velox folks frown upon multiple inheritance and generic designs like this? :)
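To make the proposal in this thread concrete, here is a minimal sketch of a common lifecycle interface with callback registration. Every name in it (`TaskLifecycleListener`, `TaskLifecycleRegistry`, `RecordingListener`) is hypothetical and does not exist in Velox; it only illustrates the shape of the idea, and it assumes registration happens once at startup, so it is not thread-safe.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical interface, not part of Velox: one implementation per
// component (CPU or GPU) that needs per-task setup and cleanup.
class TaskLifecycleListener {
 public:
  virtual ~TaskLifecycleListener() = default;
  virtual void onTaskInit(const std::string& taskId) = 0;
  virtual void onTaskTeardown(const std::string& taskId) = 0;
};

// Hypothetical registry that a Task could consult instead of hard-coding
// calls into specific managers.
class TaskLifecycleRegistry {
 public:
  void registerListener(std::shared_ptr<TaskLifecycleListener> listener) {
    listeners_.push_back(std::move(listener));
  }

  // Initializes components in registration order.
  void notifyInit(const std::string& taskId) const {
    for (const auto& listener : listeners_) {
      listener->onTaskInit(taskId);
    }
  }

  // Tears components down in reverse registration order.
  void notifyTeardown(const std::string& taskId) const {
    for (auto it = listeners_.rbegin(); it != listeners_.rend(); ++it) {
      (*it)->onTaskTeardown(taskId);
    }
  }

 private:
  std::vector<std::shared_ptr<TaskLifecycleListener>> listeners_;
};

// Stand-in listener that only records calls. A real implementation would
// wrap a CPU-side or GPU-side manager.
class RecordingListener : public TaskLifecycleListener {
 public:
  RecordingListener(std::string name, std::vector<std::string>* log)
      : name_(std::move(name)), log_(log) {}

  void onTaskInit(const std::string& taskId) override {
    log_->push_back(name_ + ":init:" + taskId);
  }

  void onTaskTeardown(const std::string& taskId) override {
    log_->push_back(name_ + ":teardown:" + taskId);
  }

 private:
  std::string name_;
  std::vector<std::string>* log_;
};
```

With two listeners registered as "cpu" and "gpu", `notifyInit` calls them in that order and `notifyTeardown` calls them in reverse, so a component that depends on an earlier one is torn down first. Whether Task should own such a registry or consult a process-wide one is left open here.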
Gpu local partition
Summary:
Fixes an OSS ASan SEGV caused by calling 'as->' on a nullptr.
```
=================================================================
==4058438==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x000000a563a4 bp 0x7ffd54ee5bc0 sp 0x7ffd54ee5aa0 T0)
==4058438==The signal is caused by a READ memory access.
==4058438==Hint: address points to the zero page.
#0 0x000000a563a4 in facebook::velox::FlatVector<int>* facebook::velox::BaseVector::as<facebook::velox::FlatVector<int>>() /velox/./velox/vector/BaseVector.h:116:12
#1 0x000000a563a4 in facebook::velox::test::(anonymous namespace)::FlatMapVectorTest_encodedKeys_Test::TestBody() /velox/velox/vector/tests/FlatMapVectorTest.cpp:156:5
#2 0x70874f90ce0b (/lib64/libgtest.so.1.11.0+0x4fe0b) (BuildId: 506b2df0fc901091ff83631fd797a325cae6b679)
#3 0x70874f8ed825 in testing::Test::Run() (/lib64/libgtest.so.1.11.0+0x30825) (BuildId: 506b2df0fc901091ff83631fd797a325cae6b679)
#4 0x70874f8ed9ef in testing::TestInfo::Run() (/lib64/libgtest.so.1.11.0+0x309ef) (BuildId: 506b2df0fc901091ff83631fd797a325cae6b679)
#5 0x70874f8edaf8 in testing::TestSuite::Run() (/lib64/libgtest.so.1.11.0+0x30af8) (BuildId: 506b2df0fc901091ff83631fd797a325cae6b679)
#6 0x70874f8fcfc4 in testing::internal::UnitTestImpl::RunAllTests() (/lib64/libgtest.so.1.11.0+0x3ffc4) (BuildId: 506b2df0fc901091ff83631fd797a325cae6b679)
#7 0x70874f8fa7c7 in testing::UnitTest::Run() (/lib64/libgtest.so.1.11.0+0x3d7c7) (BuildId: 506b2df0fc901091ff83631fd797a325cae6b679)
#8 0x70877c073153 in main (/lib64/libgtest_main.so.1.11.0+0x1153) (BuildId: c3a576d37d6cfc6875afdc98684c143107a226a0)
#9 0x70874f48460f in __libc_start_call_main (/lib64/libc.so.6+0x2a60f) (BuildId: 4dbf824d0f6afd9b2faee4787d89a39921c0a65e)
#10 0x70874f4846bf in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x2a6bf) (BuildId: 4dbf824d0f6afd9b2faee4787d89a39921c0a65e)
#11 0x00000044c1b4 in _start (/velox/_build/debug/velox/vector/tests/velox_vector_test+0x44c1b4) (BuildId: 6da0b0d1074134be8f4d4534e5dbac9eeb9d482b)
```
Reviewed By: peterenescu
Differential Revision: D91275269
fbshipit-source-id: 0806aa7562dc8cf4ad708fc6a8e4b29409507745
Summary:
Pull Request resolved: facebookincubator#16102

Fixes Asan error in S3Util.cpp, See stack trace below:
```
==4125762==ERROR: AddressSanitizer: global-buffer-overflow on address 0x0000006114ff at pc 0x70aa17bc0120 bp 0x7ffe905f3030 sp 0x7ffe905f3028
READ of size 1 at 0x0000006114ff thread T0
#0 0x70aa17bc011f in facebook::velox::filesystems::parseAWSStandardRegionName[abi:cxx11](std::basic_string_view<char, std::char_traits<char>>) /velox/velox/connectors/hive/storage_adapters/s3fs/S3Util.cpp:160:16
#1 0x00000055790b in facebook::velox::filesystems::S3UtilTest_parseAWSRegion_Test::TestBody() /velox/velox/connectors/hive/storage_adapters/s3fs/tests/S3UtilTest.cpp:147:3
#2 0x70aa2e89be0b (/lib64/libgtest.so.1.11.0+0x4fe0b) (BuildId: 506b2df0fc901091ff83631fd797a325cae6b679)
#3 0x70aa2e87c825 in testing::Test::Run() (/lib64/libgtest.so.1.11.0+0x30825) (BuildId: 506b2df0fc901091ff83631fd797a325cae6b679)
#4 0x70aa2e87c9ef in testing::TestInfo::Run() (/lib64/libgtest.so.1.11.0+0x309ef) (BuildId: 506b2df0fc901091ff83631fd797a325cae6b679)
#5 0x70aa2e87caf8 in testing::TestSuite::Run() (/lib64/libgtest.so.1.11.0+0x30af8) (BuildId: 506b2df0fc901091ff83631fd797a325cae6b679)
#6 0x70aa2e88bfc4 in testing::internal::UnitTestImpl::RunAllTests() (/lib64/libgtest.so.1.11.0+0x3ffc4) (BuildId: 506b2df0fc901091ff83631fd797a325cae6b679)
#7 0x70aa2e8897c7 in testing::UnitTest::Run() (/lib64/libgtest.so.1.11.0+0x3d7c7) (BuildId: 506b2df0fc901091ff83631fd797a325cae6b679)
#8 0x70aa2e8ba153 in main (/lib64/libgtest_main.so.1.11.0+0x1153) (BuildId: c3a576d37d6cfc6875afdc98684c143107a226a0)
#9 0x70aa01ceb60f in __libc_start_call_main (/lib64/libc.so.6+0x2a60f) (BuildId: 4dbf824d0f6afd9b2faee4787d89a39921c0a65e)
#10 0x70aa01ceb6bf in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x2a6bf) (BuildId: 4dbf824d0f6afd9b2faee4787d89a39921c0a65e)
#11 0x000000408684 in _start (/velox/_build/debug/velox/connectors/hive/storage_adapters/s3fs/tests/velox_s3file_test+0x408684) (BuildId: bbf3099c9a66a548c6da234b17ad1b631e9ed649)

0x0000006114ff is located 33 bytes before global variable '.str.135' defined in '/velox/velox/connectors/hive/storage_adapters/s3fs/tests/S3UtilTest.cpp:126' (0x000000611520) of size 46
  '.str.135' is ascii string 'isHostExcludedFromProxy(hostname, pair.first)'
0x0000006114ff is located 1 bytes before global variable '.str.133' defined in '/velox/velox/connectors/hive/storage_adapters/s3fs/tests/S3UtilTest.cpp:122' (0x000000611500) of size 1
  '.str.133' is ascii string ''
0x0000006114ff is located 42 bytes after global variable '.str.132' defined in '/velox/velox/connectors/hive/storage_adapters/s3fs/tests/S3UtilTest.cpp:121' (0x0000006114c0) of size 21
  '.str.132' is ascii string 'localhost,foobar.com'
AddressSanitizer: global-buffer-overflow /velox/velox/connectors/hive/storage_adapters/s3fs/S3Util.cpp:160:16 in facebook::velox::filesystems::parseAWSStandardRegionName[abi:cxx11](std::basic_string_view<char, std::char_traits<char>>)
Shadow bytes around the buggy address:
```
Reviewed By: pedroerp
Differential Revision: D91278230
fbshipit-source-id: 05283bc8408069fa3f5ab8a7840b2bd0835fa7d6
The DataAndMetadata struct has a stream field that is used in onData() to create the PackedTableWithStream, but the stream was never stored after being obtained from the pool. This meant onData() would use an uninitialized stream view. Set ptr->stream = stream immediately after obtaining the stream from the global pool, before the allocation try/catch. Review: @wence- comment #2
… mode (facebookincubator#16401)
Summary:
Pull Request resolved: facebookincubator#16401

This diff fixes data races detected by ThreadSanitizer (TSAN) in the barrier processing code under multi-threaded execution mode.

**Race condition #1**: Between `Driver::startBarrier()` and `Driver::hasBarrier()`
- Write: `startBarrier()` setting `barrier_` state
- Read: `hasBarrier()` (via `isDraining()`) checking barrier state
- These accesses happen concurrently from different driver threads.

**Race condition #2**: Between `Driver::dropInput()` and `Driver::shouldDropOutput()`
- Write: `dropInput()` modifying `barrier_.dropInputOpId` (called from a different driver's thread via `Task::dropInputLocked()`)
- Read: `shouldDropOutput()` reading `barrier_.dropInputOpId` (called from this driver's own thread)

**Fix approach:**
1. Added atomic flag `hasBarrier_` to track whether barrier processing is active, with `memory_order_acquire` on reads and `memory_order_release` on writes.
2. Changed `dropInputOpId` from `std::optional<int32_t>` to `std::atomic_int32_t` with sentinel value `kNoDropInput = -1` for thread-safe cross-driver access.
3. Added `BarrierState::reset()` method to cleanly reset barrier state.
4. Note that `barrier_` state is only meaningful when `hasBarrier_` is true.
5. Added `waitForAllTasksToBeDeleted()` in `barrierAfterNoMoreSplits` and `MergeJoinTest.barrier` tests to ensure all driver threads complete before test iterations end.

The acquire-release memory ordering ensures proper synchronization: any thread that reads `hasBarrier_` as `true` is guaranteed to see the fully initialized `barrier_` state.

Reviewed By: kunigami, srsuryadev
Differential Revision: D93355327
fbshipit-source-id: 5d7d3c636bef62f58daaa036089f41ea01572d3d
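The publish-with-release, read-with-acquire pattern and the atomic sentinel from the fix approach above can be sketched as follows. This is a simplified, hypothetical model and not the actual `Driver` code: the class `BarrierSketch`, the `generation_` field, and the method names `start()`, `generation()`, and `hasDropInput()` are invented for illustration. Only `hasBarrier_`, `dropInputOpId`, `kNoDropInput`, and `reset()` come from the commit message.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical, simplified model of the fix; not the Velox Driver code.
class BarrierSketch {
 public:
  static constexpr int32_t kNoDropInput = -1;

  // Writer side: fill in the plain state first, then publish it with a
  // release store on the flag.
  void start(int32_t generation) {
    generation_ = generation;
    hasBarrier_.store(true, std::memory_order_release);
  }

  // Reader side: the acquire load pairs with the release store in start(),
  // so a reader that observes true also observes generation_.
  bool hasBarrier() const {
    return hasBarrier_.load(std::memory_order_acquire);
  }

  // Only meaningful after hasBarrier() has returned true.
  int32_t generation() const {
    return generation_;
  }

  // May be called from another driver's thread, hence the atomic.
  void dropInput(int32_t operatorId) {
    dropInputOpId_.store(operatorId);
  }

  // The sentinel replaces the empty state of std::optional<int32_t>.
  bool hasDropInput() const {
    return dropInputOpId_.load() != kNoDropInput;
  }

  // Assumed to run when no other thread is still inspecting the barrier.
  void reset() {
    hasBarrier_.store(false, std::memory_order_release);
    dropInputOpId_.store(kNoDropInput);
    generation_ = 0;
  }

 private:
  std::atomic<bool> hasBarrier_{false};
  std::atomic<int32_t> dropInputOpId_{kNoDropInput};
  int32_t generation_{0}; // Plain field, published through hasBarrier_.
};
```

The sentinel works here because operator ids are assumed to be non-negative, so `-1` can never collide with a real id. An `std::optional<int32_t>` cannot be read and written concurrently without a lock, which is why the commit swaps it for an atomic integer.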
…sh (facebookincubator#16830)
Summary:
Pull Request resolved: facebookincubator#16830

## Root Cause Analysis

The production crash was a SIGSEGV at page-aligned address `0x7fa369c00000` in `PatternStringIterator::charAt()` during `LikeGeneric::apply()`. The root cause is that `StringView` is a non-owning pointer: when the backing memory (likely memory-mapped file pages from a scan operator) was reclaimed/unmapped under memory pressure (1.96GB peak spill), the pointer became dangling.

This is a known class of bugs in Velox. The DWRF FlatMap writer had an identical issue (`TestFlatMapDanglingStringViewKeyOnRehash`), where `StringView` keys in an F14 map dangled after the input vector's buffer was freed.

**Why the buffer can be freed during `apply()`:**
- Memory arbitration can be triggered by any pool allocation (e.g., `context.ensureWritable()`)
- The arbitrator reclaims from OTHER operators/tasks by spilling them
- The current operator is protected by `NonReclaimableSectionGuard`, but operators that produced the input vectors are NOT
- When those operators' memory is reclaimed, the mmap'd pages backing string data can be munmap'd
- `DecodedVector` stores raw pointers (NOT shared_ptr) to the base vector's data, so it doesn't prevent reclamation

## Fix

Copy the pattern string to a local `std::string` before passing it to `determinePatternKind()`, eliminating the dependency on the original buffer's lifetime. The performance cost is minimal: one heap allocation per row in the already-slow non-constant pattern path.

## Tests

- `likeGenericWithLongPatterns`: Exercises the full `LikeGeneric::apply` code path with patterns >12 bytes (non-inline StringView) across all optimized pattern kinds (fixed, prefix, suffix, substring, generic).
- `likePatternCopyProtectsAgainstDanglingPointer`: Uses `mmap`/`munmap` to precisely reproduce the production crash scenario where memory-mapped pages are unmapped. Verifies the defensive copy protects against the dangling pointer.
Death test guarded with `#ifndef RE2_BUILDING_WITH_SAN` following the established Velox pattern from `ThreadDebugInfoDeathTest`. ``` W0317 10:45:17.338729 962 [MemoryCheckerTh] PeriodicMemoryChecker.cpp:171] System used memory 98.92GB exceeded limit: 98.00GB I0317 10:45:17.338799 962 [MemoryCheckerTh] AsyncDataCache.cpp:883] Try to shrink cache to free up 8.92GB memory I0317 10:45:18.426632 962 [MemoryCheckerTh] AsyncDataCache.cpp:912] Freed 8.92GB cache memory, spent 1.09s AsyncDataCache: Cache size: 47.83GB tinySize: 104.37MB large size: 47.73GB Cache entries: 631619 read pins: 9 write pins: 0 pinned shared: 3.76MB pinned exclusive: 0B num write wait: 2214463 empty entries: 24640747 Cache access miss: 1015380044 hit: 689290647 hit bytes: 297.70TB eviction: 1014733085 savable eviction: 41203463 eviction checks: 210822364904 aged out: 15340 stales: 0 Prefetch entries: 66813 bytes: 299.18MB Alloc Megaclocks 187071714 Allocated pages: 17191373 cached pages: 12511075 Backing: Memory Allocator[MMAP total capacity 79.00GB free capacity 13.42GB allocated pages 17191373 mapped pages 18370872 external mapped pages 4642968 [size 1: 80775(315MB) allocated 126234 mapped] [size 2: 81815(639MB) allocated 168151 mapped] [size 4: 53466(835MB) allocated 85150 mapped] [size 8: 25931(810MB) allocated 41224 mapped] [size 16: 23254(1453MB) allocated 31144 mapped] [size 32: 15357(1919MB) allocated 19535 mapped] [size 64: 28495(7123MB) allocated 34663 mapped] [size 128: 10600(5300MB) allocated 10600 mapped] [size 256: 30620(30620MB) allocated 30845 mapped] ] SSD: Ssd cache IO: Write 18.22TB read 4.24TB Size 1.44TB Occupied 1.37TB 3745K entries (max 9765K). 
GroupStats: <dummy FileGroupStats> I0317 10:45:18.555488 962 [MemoryCheckerTh] PeriodicMemoryChecker.cpp:228] Memory pushback shrunk 8.92GB Effective bytes shrunk: 9.01GB I0317 10:45:18.669054 1527 BcAdaptiveTokenManager.cpp:976] BcAdaptiveTokenManager[RX]: AIMD adjustment - UNDERUTILIZED (1.624 GB/s -> 1.65 GB/s) I0317 10:45:19.350598 1457 BcAdaptiveTokenManager.cpp:974] BcAdaptiveTokenManager[TX]: AIMD adjustment - UNDERUTILIZED (2.619 GB/s -> 2.619 GB/s) E0317 10:45:23.854575 843003 [ExchangeCPU3305] PrestoExchangeSource.cpp:550] Abort results failed: proxygen::HTTPException: ingress timeout, streamID=123, timeout=60000ms, path /v1/task/20260317_174241_33921_32fmu.2.0.156.0/results/79 E0317 10:45:23.855425 843003 [ExchangeCPU3305] PrestoExchangeSource.cpp:550] Abort results failed: proxygen::HTTPException: ingress timeout, streamID=219, timeout=60000ms, path /v1/task/20260317_174241_33921_32fmu.3.0.156.0/results/79 E0317 10:45:27.578008 843003 [ExchangeCPU3305] PrestoExchangeSource.cpp:550] Abort results failed: proxygen::HTTPException: ingress timeout, streamID=2569, timeout=60000ms, path /v1/task/20260317_174245_33922_32fmu.2.0.29.0/results/7 I0317 10:45:28.113592 954 [clean_old_tasks] TaskManager.cpp:1003] cleanOldTasks: Cleaned 66 old task(s) in 0 ms E0317 10:45:28.130775 954 [clean_old_tasks] TaskManager.cpp:313] There are 1 zombie Task that satisfy cleanup conditions but could not be cleaned up, because the Task are referenced by more than 1 owners. 
RUNNING[0] FINISHED[0] CANCELED[0] ABORTED[1] FAILED[0] Sample task IDs (shows only 20 IDs): E0317 10:45:28.130795 954 [clean_old_tasks] TaskManager.cpp:323] Zombie Task [1/1]: Extra Refs: 1, 20260317_172201_33407_32fmu.1.0.16.0 I0317 10:45:32.531476 951 [report_spill_st] PeriodicStatsReporter.cpp:264] Spill memory usage: current[0B] peak[2.24GB] E0317 10:45:36.031675 843003 [ExchangeCPU3305] PrestoExchangeSource.cpp:550] Abort results failed: proxygen::HTTPException: ingress timeout, streamID=295, timeout=60000ms, path /v1/task/20260317_174249_33924_32fmu.4.0.15.0/results/30 E0317 10:45:36.049696 843003 [ExchangeCPU3305] PrestoExchangeSource.cpp:550] Abort results failed: proxygen::HTTPException: ingress timeout, streamID=179, timeout=60000ms, path /v1/task/20260317_174249_33924_32fmu.3.0.15.0/results/30 E0317 10:45:36.054131 843003 [ExchangeCPU3305] PrestoExchangeSource.cpp:550] Abort results failed: proxygen::HTTPException: ingress timeout, streamID=59, timeout=60000ms, path /v1/task/20260317_174249_33924_32fmu.2.0.15.0/results/30 E0317 10:45:36.055419 843003 [ExchangeCPU3305] PrestoExchangeSource.cpp:550] Abort results failed: proxygen::HTTPException: ingress timeout, streamID=207, timeout=60000ms, path /v1/task/20260317_174249_33924_32fmu.5.0.15.0/results/30 E0317 10:45:36.084476 843003 [ExchangeCPU3305] PrestoExchangeSource.cpp:550] Abort results failed: proxygen::HTTPException: ingress timeout, streamID=123, timeout=60000ms, path /v1/task/20260317_174249_33924_32fmu.9.0.15.0/results/30 I0317 10:45:36.100535 842798 [HTTPSrvCpu24857] TaskManager.cpp:877] Deleting task 20260317_174453_33962_32fmu.20.0.12.0 E0317 10:45:36.100556 842798 [HTTPSrvCpu24857] Exceptions.h:53] Line: fbcode/velox/exec/Task.cpp:2468, Function:terminate, Expression: Aborted for external error, Source: RUNTIME, ErrorCode: INVALID_STATE I0317 10:45:36.100762 842710 [HTTPSrvCpu24854] TaskManager.cpp:877] Deleting task 20260317_174453_33962_32fmu.24.0.94.0 E0317 10:45:36.100787 842710 
[HTTPSrvCpu24854] Exceptions.h:53] Line: fbcode/velox/exec/Task.cpp:2468, Function:terminate, Expression: Aborted for external error, Source: RUNTIME, ErrorCode: INVALID_STATE I0317 10:45:36.100847 842786 [HTTPSrvCpu24855] TaskManager.cpp:877] Deleting task 20260317_174453_33962_32fmu.21.0.12.0 E0317 10:45:36.100879 842786 [HTTPSrvCpu24855] Exceptions.h:53] Line: fbcode/velox/exec/Task.cpp:2468, Function:terminate, Expression: Aborted for external error, Source: RUNTIME, ErrorCode: INVALID_STATE I0317 10:45:36.101078 842800 [HTTPSrvCpu24857] TaskManager.cpp:877] Deleting task 20260317_174453_33962_32fmu.23.0.12.0 E0317 10:45:36.101096 842800 [HTTPSrvCpu24857] Exceptions.h:53] Line: fbcode/velox/exec/Task.cpp:2468, Function:terminate, Expression: Aborted for external error, Source: RUNTIME, ErrorCode: INVALID_STATE I0317 10:45:36.101325 842816 [HTTPSrvCpu24858] TaskManager.cpp:877] Deleting task 20260317_174453_33962_32fmu.22.0.12.0 E0317 10:45:36.101351 842816 [HTTPSrvCpu24858] Exceptions.h:53] Line: fbcode/velox/exec/Task.cpp:2468, Function:terminate, Expression: Aborted for external error, Source: RUNTIME, ErrorCode: INVALID_STATE I0317 10:45:36.101361 842408 [HTTPSrvCpu24850] TaskManager.cpp:877] Deleting task 20260317_174453_33962_32fmu.25.0.81.0 E0317 10:45:36.101398 842408 [HTTPSrvCpu24850] Exceptions.h:53] Line: fbcode/velox/exec/Task.cpp:2468, Function:terminate, Expression: Aborted for external error, Source: RUNTIME, ErrorCode: INVALID_STATE I0317 10:45:36.101961 843007 [HTTPSrvCpu24859] TaskManager.cpp:877] Deleting task 20260317_174453_33962_32fmu.27.0.12.0 E0317 10:45:36.101987 843007 [HTTPSrvCpu24859] Exceptions.h:53] Line: fbcode/velox/exec/Task.cpp:2468, Function:terminate, Expression: Aborted for external error, Source: RUNTIME, ErrorCode: INVALID_STATE I0317 10:45:36.102123 842791 [HTTPSrvCpu24856] TaskManager.cpp:877] Deleting task 20260317_174453_33962_32fmu.29.0.94.0 E0317 10:45:36.102156 842791 [HTTPSrvCpu24856] Exceptions.h:53] Line: 
fbcode/velox/exec/Task.cpp:2468, Function:terminate, Expression: Aborted for external error, Source: RUNTIME, ErrorCode: INVALID_STATE I0317 10:45:36.102209 843009 [HTTPSrvCpu24859] TaskManager.cpp:877] Deleting task 20260317_174453_33962_32fmu.28.0.12.0 E0317 10:45:36.102236 843009 [HTTPSrvCpu24859] Exceptions.h:53] Line: fbcode/velox/exec/Task.cpp:2468, Function:terminate, Expression: Aborted for external error, Source: RUNTIME, ErrorCode: INVALID_STATE I0317 10:45:36.107154 843015 [HTTPSrvCpu24860] TaskManager.cpp:877] Deleting task 20260317_174453_33962_32fmu.16.0.1.0 E0317 10:45:36.107189 843015 [HTTPSrvCpu24860] Exceptions.h:53] Line: fbcode/velox/exec/Task.cpp:2468, Function:terminate, Expression: Aborted for external error, Source: RUNTIME, ErrorCode: INVALID_STATE I0317 10:45:36.107321 843014 [HTTPSrvCpu24859] TaskManager.cpp:877] Deleting task 20260317_174453_33962_32fmu.18.0.12.0 E0317 10:45:36.107344 843014 [HTTPSrvCpu24859] Exceptions.h:53] Line: fbcode/velox/exec/Task.cpp:2468, Function:terminate, Expression: Aborted for external error, Source: RUNTIME, ErrorCode: INVALID_STATE I0317 10:45:36.108820 842799 [HTTPSrvCpu24857] TaskManager.cpp:877] Deleting task 20260317_174453_33962_32fmu.8.0.12.0 E0317 10:45:36.108840 842799 [HTTPSrvCpu24857] Exceptions.h:53] Line: fbcode/velox/exec/Task.cpp:2468, Function:terminate, Expression: Aborted for external error, Source: RUNTIME, ErrorCode: INVALID_STATE I0317 10:45:36.109087 843036 [HTTPSrvCpu24862] TaskManager.cpp:877] Deleting task 20260317_174453_33962_32fmu.10.0.84.0 E0317 10:45:36.109143 843036 [HTTPSrvCpu24862] Exceptions.h:53] Line: fbcode/velox/exec/Task.cpp:2468, Function:terminate, Expression: Aborted for external error, Source: RUNTIME, ErrorCode: INVALID_STATE I0317 10:45:36.109302 842807 [HTTPSrvCpu24858] TaskManager.cpp:877] Deleting task 20260317_174453_33962_32fmu.9.0.94.0 E0317 10:45:36.109320 842807 [HTTPSrvCpu24858] Exceptions.h:53] Line: fbcode/velox/exec/Task.cpp:2468, Function:terminate, 
Expression: Aborted for external error, Source: RUNTIME, ErrorCode: INVALID_STATE I0317 10:45:36.109992 843029 [HTTPSrvCpu24861] TaskManager.cpp:877] Deleting task 20260317_174453_33962_32fmu.12.0.12.0 E0317 10:45:36.110028 843029 [HTTPSrvCpu24861] Exceptions.h:53] Line: fbcode/velox/exec/Task.cpp:2468, Function:terminate, Expression: Aborted for external error, Source: RUNTIME, ErrorCode: INVALID_STATE I0317 10:45:36.110075 842709 [HTTPSrvCpu24854] TaskManager.cpp:877] Deleting task 20260317_174453_33962_32fmu.13.0.12.0 E0317 10:45:36.110100 842709 [HTTPSrvCpu24854] Exceptions.h:53] Line: fbcode/velox/exec/Task.cpp:2468, Function:terminate, Expression: Aborted for external error, Source: RUNTIME, ErrorCode: INVALID_STATE I0317 10:45:36.110234 843032 [HTTPSrvCpu24861] TaskManager.cpp:877] Deleting task 20260317_174453_33962_32fmu.14.0.94.0 E0317 10:45:36.110255 843032 [HTTPSrvCpu24861] Exceptions.h:53] Line: fbcode/velox/exec/Task.cpp:2468, Function:terminate, Expression: Aborted for external error, Source: RUNTIME, ErrorCode: INVALID_STATE I0317 10:45:36.271823 843028 [HTTPSrvCpu24861] TaskManager.cpp:877] Deleting task 20260317_174453_33962_32fmu.15.0.84.0 I0317 10:45:36.271857 843022 [HTTPSrvCpu24860] TaskManager.cpp:877] Deleting task 20260317_174453_33962_32fmu.7.0.12.0 I0317 10:45:36.271853 843027 [HTTPSrvCpu24861] TaskManager.cpp:877] Deleting task 20260317_174453_33962_32fmu.2.0.12.0 E0317 10:45:36.271932 843027 [HTTPSrvCpu24861] Exceptions.h:53] Line: fbcode/velox/exec/Task.cpp:2468, Function:terminate, Expression: Aborted for external error, Source: RUNTIME, ErrorCode: INVALID_STATE E0317 10:45:36.272019 843022 [HTTPSrvCpu24860] Exceptions.h:53] Line: fbcode/velox/exec/Task.cpp:2468, Function:terminate, Expression: Aborted for external error, Source: RUNTIME, ErrorCode: INVALID_STATE I0317 10:45:36.271862 843033 [HTTPSrvCpu24861] TaskManager.cpp:877] Deleting task 20260317_174453_33962_32fmu.1.0.12.0 I0317 10:45:36.271858 842793 [HTTPSrvCpu24856] 
TaskManager.cpp:877] Deleting task 20260317_174453_33962_32fmu.5.0.12.0 E0317 10:45:36.272382 843033 [HTTPSrvCpu24861] Exceptions.h:53] Line: fbcode/velox/exec/Task.cpp:2468, Function:terminate, Expression: Aborted for external error, Source: RUNTIME, ErrorCode: INVALID_STATE I0317 10:45:36.271858 843018 [HTTPSrvCpu24860] TaskManager.cpp:877] Deleting task 20260317_174453_33962_32fmu.3.0.12.0 E0317 10:45:36.272401 842793 [HTTPSrvCpu24856] Exceptions.h:53] Line: fbcode/velox/exec/Task.cpp:2468, Function:terminate, Expression: Aborted for external error, Source: RUNTIME, ErrorCode: INVALID_STATE E0317 10:45:36.271872 843028 [HTTPSrvCpu24861] Exceptions.h:53] Line: fbcode/velox/exec/Task.cpp:2468, Function:terminate, Expression: Aborted for external error, Source: RUNTIME, ErrorCode: INVALID_STATE E0317 10:45:36.272423 843018 [HTTPSrvCpu24860] Exceptions.h:53] Line: fbcode/velox/exec/Task.cpp:2468, Function:terminate, Expression: Aborted for external error, Source: RUNTIME, ErrorCode: INVALID_STATE I0317 10:45:36.271854 843019 [HTTPSrvCpu24860] TaskManager.cpp:877] Deleting task 20260317_174453_33962_32fmu.6.0.12.0 E0317 10:45:36.272475 843019 [HTTPSrvCpu24860] Exceptions.h:53] Line: fbcode/velox/exec/Task.cpp:2468, Function:terminate, Expression: Aborted for external error, Source: RUNTIME, ErrorCode: INVALID_STATE I0317 10:45:38.938845 964 [Announcement] PeriodicServiceInventoryManager.cpp:130] Announcement succeeded: HTTP 202. State: active. 
I0317 10:45:41.508436 842050 [HTTPSrvCpu24849] TaskManager.cpp:877] Deleting task 20260317_174452_33961_32fmu.3.0.93.0 E0317 10:45:41.508466 842050 [HTTPSrvCpu24849] Exceptions.h:53] Line: fbcode/velox/exec/Task.cpp:2468, Function:terminate, Expression: Aborted for external error, Source: RUNTIME, ErrorCode: INVALID_STATE I0317 10:45:41.510092 843020 [HTTPSrvCpu24860] TaskManager.cpp:877] Deleting task 20260317_174452_33961_32fmu.1.0.109.0 E0317 10:45:41.510118 843020 [HTTPSrvCpu24860] Exceptions.h:53] Line: fbcode/velox/exec/Task.cpp:2468, Function:terminate, Expression: Aborted for external error, Source: RUNTIME, ErrorCode: INVALID_STATE *** Aborted at 1773769541 (Unix time, try 'date -d 1773769541') *** *** Signal 11 (SIGSEGV) (0x7facb7e00000) received by PID 113 (pthread TID 0x7faeb1fff000) (linux TID 838941) (code: address not mapped to object), stack trace: *** @ 0000000017d3365a folly::symbolizer::(anonymous namespace)::innerSignalHandler(int, siginfo_t*, void*) [clone .__uniq.302291754384189453301783370447166124111] ./fbcode/folly/debugging/symbolizer/SignalHandler.cpp:552 @ 0000000017d335c7 folly::symbolizer::(anonymous namespace)::signalHandler(int, siginfo_t*, void*) [clone .__uniq.302291754384189453301783370447166124111] [clone .llvm.3532004345868697328] ./fbcode/folly/debugging/symbolizer/SignalHandler.cpp:573 @ 000000000004455f (unknown) /home/engshare/third-party2/glibc/2.34/src/glibc-2.34/signal/../sysdeps/unix/sysv/linux/libc_sigaction.c:8 -> /home/engshare/third-party2/glibc/2.34/src/glibc-2.34/signal/../sysdeps/unix/sysv/linux/x86_64/libc_sigaction.c @ 0000000047329fde facebook::velox::functions::determinePatternKind(std::basic_string_view<char, std::char_traits<char> >, std::optional<char>) ./fbcode/velox/functions/lib/Re2Functions.cpp:1988 @ 0000000047329d4f facebook::velox::functions::(anonymous namespace)::LikeGeneric::apply(facebook::velox::SelectivityVector const&, std::vector<std::shared_ptr<facebook::velox::BaseVector>, 
std::allocator<std::shared_ptr<facebook::velox::BaseVector> > >&, std::shared_ptr<facebook::velox::Type const> const&, facebook::velox::exec::EvalCtx&, std::shared_ptr<facebook::velox::BaseVector>&) const::{lambda(facebook::velox::StringView const&, facebook::velox::StringView const&, std::optional<char> const&)#2}::operator()(facebook::velox::StringView const&, facebook::velox::StringView const&, std::optional<char> const&) const [clone .__uniq.96824254187906811421847846720006205206] ./fbcode/velox/functions/lib/Re2Functions.cpp:947 @ 0000000015cbed20 _ZNK8facebook5velox17SelectivityVector15applyToSelectedIZNS0_4exec7EvalCtx22applyToSelectedNoThrowIZNKS0_9functions12_GLOBAL__N_111LikeGeneric5applyERKS1_RSt6vectorISt10shared_ptrINS0_10BaseVectorEESaISE_EERKSC_IKNS0_4TypeEERS4_RSE_EUlT_E_ZNS4_22applyToSelectedNoThrowISQ_EEvSA_SP_EUlSP_E_EEvSA_SP_T0_EUlSP_E_EEvSP_.__uniq.96824254187906811421847846720006205206 ./fbcode/velox/functions/lib/Re2Functions.cpp:1010 @ 0000000047102acd facebook::velox::functions::(anonymous namespace)::LikeGeneric::apply(facebook::velox::SelectivityVector const&, std::vector<std::shared_ptr<facebook::velox::BaseVector>, std::allocator<std::shared_ptr<facebook::velox::BaseVector> > >&, std::shared_ptr<facebook::velox::Type const> const&, facebook::velox::exec::EvalCtx&, std::shared_ptr<facebook::velox::BaseVector>&) const [clone .__uniq.96824254187906811421847846720006205206] fbcode/velox/expression/EvalCtx.h:299 @ 0000000046c466b0 facebook::velox::exec::Expr::applyFunction(facebook::velox::SelectivityVector const&, facebook::velox::exec::EvalCtx&, std::shared_ptr<facebook::velox::BaseVector>&) ./fbcode/velox/expression/Expr.cpp:1604 @ 0000000046c457fa facebook::velox::exec::Expr::evalWithNulls(facebook::velox::SelectivityVector const&, facebook::velox::exec::EvalCtx&, std::shared_ptr<facebook::velox::BaseVector>&) ./fbcode/velox/expression/Expr.cpp:1519 @ 0000000046c8e7d0 
facebook::velox::exec::ConjunctExpr::evalSpecialForm(facebook::velox::SelectivityVector const&, facebook::velox::exec::EvalCtx&, std::shared_ptr<facebook::velox::BaseVector>&) fbcode/velox/expression/Expr.cpp:1149 @ 0000000046c4572d facebook::velox::exec::Expr::evalWithNulls(facebook::velox::SelectivityVector const&, facebook::velox::exec::EvalCtx&, std::shared_ptr<facebook::velox::BaseVector>&) ./fbcode/velox/expression/Expr.cpp:1646 @ 0000000046c41399 facebook::velox::exec::Expr::eval(facebook::velox::SelectivityVector const&, facebook::velox::exec::EvalCtx&, std::shared_ptr<facebook::velox::BaseVector>&, facebook::velox::exec::ExprSet const*) ./fbcode/velox/expression/Expr.cpp:1149 @ 0000000046c40015 facebook::velox::exec::ExprSet::eval(int, int, bool, facebook::velox::SelectivityVector const&, facebook::velox::exec::EvalCtx&, std::vector<std::shared_ptr<facebook::velox::BaseVector>, std::allocator<std::shared_ptr<facebook::velox::BaseVector> > >&) ./fbcode/velox/expression/Expr.cpp:2064 @ 0000000046dda7e2 facebook::velox::exec::FilterProject::getOutput() ./fbcode/velox/exec/FilterProject.cpp:282 @ 0000000046c5f19d facebook::velox::exec::Driver::runInternal(std::shared_ptr<facebook::velox::exec::Driver>&, std::shared_ptr<facebook::velox::exec::BlockingState>&, std::shared_ptr<facebook::velox::RowVector>&) ./fbcode/velox/exec/Driver.cpp:486 @ 0000000046cfed97 facebook::velox::exec::Driver::run(std::shared_ptr<facebook::velox::exec::Driver>) ./fbcode/velox/exec/Driver.cpp:802 @ 0000000046cfdb98 folly::CPUThreadPoolExecutor::threadRun(std::shared_ptr<folly::ThreadPoolExecutor::Thread>) fbcode/velox/exec/Driver.cpp:281 @ 00000000162fd84d void std::__invoke_impl<void, void (folly::ThreadPoolExecutor::*&)(std::shared_ptr<folly::ThreadPoolExecutor::Thread>), folly::ThreadPoolExecutor*&, std::shared_ptr<folly::ThreadPoolExecutor::Thread>&>(std::__invoke_memfun_deref, void (folly::ThreadPoolExecutor::*&)(std::shared_ptr<folly::ThreadPoolExecutor::Thread>), 
folly::ThreadPoolExecutor*&, std::shared_ptr<folly::ThreadPoolExecutor::Thread>&) fbcode/third-party-buck/platform010/build/libgcc/include/c++/trunk/bits/invoke.h:74 @ 00000000000df5b4 execute_native_thread_routine /home/engshare/third-party2/libgcc/11.x/src/gcc-11.x/x86_64-facebook-linux/libstdc++-v3/src/c++11/../../../.././libstdc++-v3/src/c++11/thread.cc:82 @ 000000000009abc8 start_thread /home/engshare/third-party2/glibc/2.34/src/glibc-2.34/nptl/pthread_create.c:434 @ 000000000012ce4b __clone3 /home/engshare/third-party2/glibc/2.34/src/glibc-2.34/misc/../sysdeps/unix/sysv/linux/x86_64/clone3.S:81 ``` Reviewed By: spershin Differential Revision: D97223914 fbshipit-source-id: f334cac3f7ed3885d951fab857717d2ab370812b
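The defensive copy described in the Fix section of this commit can be illustrated with standard-library types. This is only a sketch under stated assumptions: `std::string_view` stands in for Velox's `StringView`, and the helpers `copyPattern` and `countWildcards` are hypothetical names, not functions from `Re2Functions.cpp`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper: takes an owning copy of the bytes that a non-owning
// view currently points to. After this returns, the result no longer
// depends on the lifetime of the view's backing buffer.
std::string copyPattern(std::string_view pattern) {
  return std::string(pattern.data(), pattern.size());
}

// Hypothetical consumer standing in for pattern analysis: counts the '%'
// wildcards in an owned pattern string.
std::size_t countWildcards(const std::string& pattern) {
  std::size_t count = 0;
  for (char c : pattern) {
    if (c == '%') {
      ++count;
    }
  }
  return count;
}
```

In use, the copy is taken while the backing buffer is still valid, and all later analysis reads the copy. Overwriting or releasing the original buffer afterwards then cannot affect the result. A view taken from the same buffer would either see the modified bytes or, if the memory was unmapped, fault as in the trace above.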
PR facebookincubator#16037 review comment #2: endpoints_ does not need a mutex because all access paths (assocEndpointRef, listenerCallback, removeEndpointRef) run on the single Communicator thread. Document this invariant on the member variable. Remove endpoints_.size() from stop() log message since stop() runs on an external thread — reading a non-thread-safe container from outside the owning thread is a data race. Also fix typo workQueue_._size().