CharacterSet: Memory-related refinements #2026
Closed
chloe-yeo wants to merge 32 commits into
…iftlang#1944) Co-authored-by: Thomas Krajacic <tkrajacic@users.noreply.github.com>
…-2026-05-06_09-47 Merge `release/6.3` into `main`
…-2026-05-11_10-14 Merge `release/6.4.x` into `main`
…-2026-05-13_09-54 Merge `release/6.4.x` into `main`
…-2026-05-14_09-47 Merge `release/6.4.x` into `main`
…-2026-05-15_09-54 Merge `release/6.4.x` into `main`
* Decimal: stop reading past endIndex when matching a multi-byte separator.
Prior to this patch, `stringViewContainsDecimalSeparator` walked `0..<decimalSeparator.count` and indexed into `utf8View` for each offset, but the caller only checked that the index of the first byte was inside the input. With a separator of length > 1 and an input that happened to match the separator's first bytes but ended before the full separator, the loop walked past the input's `endIndex` and crashed:
```swift
Decimal._decimal(from: "1,".utf8,
                 decimalSeparator: ",,".utf8,
                 matchEntireString: false)
// Fatal error: String index is out of bounds
```
Now, it walks both views with `formIndex(after:)` and returns `false` as soon as the input is exhausted.
`Decimal._decimal` is exposed via `@_spi(SwiftCorelibsFoundation)`, so this is reachable from outside the module.
* Removed comment from the new `decimalParseTruncatedMultiByteSeparator` test.
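The bounds-checked walk described in this commit can be sketched roughly as follows. This is a minimal illustration, not the actual swift-foundation code: `matchesSeparator(in:at:separator:)` is a hypothetical helper name, and it assumes both views are collections of `UInt8`.

```swift
/// Hypothetical helper (not the real `stringViewContainsDecimalSeparator`).
/// Returns true only if `separator` matches `input` in full, starting at `start`.
func matchesSeparator<Input: Collection, Separator: Collection>(
    in input: Input,
    at start: Input.Index,
    separator: Separator
) -> Bool where Input.Element == UInt8, Separator.Element == UInt8 {
    var inputIndex = start
    var separatorIndex = separator.startIndex
    while separatorIndex != separator.endIndex {
        // Stop as soon as the input is exhausted instead of indexing past endIndex.
        guard inputIndex != input.endIndex else { return false }
        guard input[inputIndex] == separator[separatorIndex] else { return false }
        // Advance both views in lockstep.
        input.formIndex(after: &inputIndex)
        separator.formIndex(after: &separatorIndex)
    }
    return true
}

let truncated = "1,".utf8
let truncatedStart = truncated.index(after: truncated.startIndex)
let truncatedMatch = matchesSeparator(in: truncated, at: truncatedStart, separator: ",,".utf8)

let complete = "1,,5".utf8
let completeStart = complete.index(after: complete.startIndex)
let completeMatch = matchesSeparator(in: complete, at: completeStart, separator: ",,".utf8)
```

With the truncated input the walk runs out of bytes after the first `,` and reports no match rather than trapping.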
…lang#1956)
* Move URL.FormatStyle and URL.ParseStrategy to swift-foundation now that URLComponents is available
1. Move URLFormatStyle.swift, URLParseStrategy.swift, and URL+UnicodeLookalikeTable.swift into FoundationInternationalization.
2. Migrate XCTest-based tests to Swift Testing.
Credit: @iCharlesHu
* Fix cmake build
…-2026-05-20_10-07 Merge `release/6.4.x` into `main`
…-2026-05-21_10-15 Merge `release/6.4.x` into `main`
* Add workflow step to validate CMake file lists
* Update CMakeLists.txt file lists
…ftlang#1990) (swiftlang#1998) Co-authored-by: Tina L <49205802+itingliu@users.noreply.github.com>
…lang#1996) The KeyedDecodingContainer.decodeIfPresent(_:forKey:configuration:) overloads only checked contains(key) before delegating to decode, so a present key with an explicit null value was forwarded to the configuration's decoder instead of being treated as nil. That threw a valueNotFound error rather than returning nil, which is inconsistent with the standard library's decodeIfPresent, with the UnkeyedDecodingContainer configuration overloads, and with the @CodableConfiguration property wrapper, all of which treat null as nil. Add the missing decodeNil(forKey:) check so a null value yields nil.
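The missing null check can be illustrated with a small stand-in. The sketch below uses plain `Decodable` rather than the configuration-based overloads the commit actually touches, and `sketchDecodeIfPresent(_:forKey:)` is a hypothetical name chosen for this example; only `contains(_:)`, `decodeNil(forKey:)`, and `decode(_:forKey:)` are existing container APIs.

```swift
import Foundation

extension KeyedDecodingContainer {
    /// Hypothetical helper showing the pattern: a missing key and an
    /// explicit null both produce nil instead of reaching `decode`.
    func sketchDecodeIfPresent<T: Decodable>(_ type: T.Type, forKey key: Key) throws -> T? {
        guard contains(key), try !decodeNil(forKey: key) else { return nil }
        return try decode(type, forKey: key)
    }
}

struct Payload: Decodable {
    let name: String?

    enum CodingKeys: String, CodingKey { case name }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        name = try container.sketchDecodeIfPresent(String.self, forKey: .name)
    }
}

let decoder = JSONDecoder()
let explicitNull = try decoder.decode(Payload.self, from: Data(#"{"name": null}"#.utf8))
let missingKey = try decoder.decode(Payload.self, from: Data(#"{}"#.utf8))
let present = try decoder.decode(Payload.self, from: Data(#"{"name": "a"}"#.utf8))
```

Without the `decodeNil(forKey:)` guard, the explicit-null payload would be forwarded to `decode` and throw a `valueNotFound` error.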
* Create _FoundationInternationalizationData library
* Fix FOUNDATION_FRAMEWORK build failure
* Fix build failure
* Introduce copy of _CShimsMacros.h
The internal `Storage` enum's range-replacement subscript setter for the `.pair` case enumerated `(0, 0)`, `(0, 1)`, `(0, 2)`, `(1, 2)`, and `(2, 2)` and trapped in the default branch, but it never handled `(1, 1)`. Prior to this patch, inserting at the interior position of a two-element IndexPath via `path[1..<1] = newValue` crashes with `Fatal error: Range 1..<1 is out of bounds of count 2`, even though the analogous `(1, 1)` case is supported for `.single` and `.array`.
`Storage` is a private size-bucketed representation of one logical thing: an `Array<Int>`. The `.single`/`.pair` cases are just compact spellings for arrays of length 1 and 2; the `.array` case is the general implementation. Whatever the public `IndexPath` API exposes has to behave identically regardless of which bucket happens to hold the data. Both `.single` and `.array` already handle the `(1, 1)` case; the latter does so because it handles every valid range, including interior insertions like `(1, 1)` on a 2-element path, via the generic `removeSubrange` + `insert` path. That means a 2-element `IndexPath` stored as `.array([a, b])` already accepts `path[1..<1] = newValue`. So this fix isn't changing semantics; it's making the `.pair` shortcut match the behaviour the `.array` fallback was already producing for the same input.
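The invariant argued for here, that every bucket must answer exactly as the equivalent array would, can be modelled with a simplified stand-in. `SketchStorage` below is hypothetical and is not the real `IndexPath.Storage`; it routes every replacement through the array form so that no range, including `1..<1` on a pair, is left unhandled.

```swift
/// Hypothetical, simplified model of a size-bucketed storage.
enum SketchStorage: Equatable {
    case empty
    case single(Int)
    case pair(Int, Int)
    case array([Int])

    /// Picks the most compact bucket for the given elements.
    init(_ elements: [Int]) {
        switch elements.count {
        case 0: self = .empty
        case 1: self = .single(elements[0])
        case 2: self = .pair(elements[0], elements[1])
        default: self = .array(elements)
        }
    }

    /// The logical array this bucket stands for.
    var elements: [Int] {
        switch self {
        case .empty: return []
        case .single(let a): return [a]
        case .pair(let a, let b): return [a, b]
        case .array(let values): return values
        }
    }

    /// Reference behaviour: replace the range on the logical array, then re-bucket.
    func replacing(_ range: Range<Int>, with newValues: [Int]) -> SketchStorage {
        var copy = elements
        copy.replaceSubrange(range, with: newValues)
        return SketchStorage(copy)
    }
}

let interiorInsert = SketchStorage.pair(10, 20).replacing(1..<1, with: [15])
let fullReplace = SketchStorage.pair(10, 20).replacing(0..<2, with: [7])
```

A model like this can also serve as an oracle in tests: a hand-written shortcut for `.pair` is correct only if it agrees with `replacing(_:with:)` for every valid range.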
* Add pure-Swift _CalendarHebrew + parity suites
* Expand Hebrew calendar parity coverage; fix 7 bugs at edge cases
Adds ~300 new probe dates across 13 edge-case topics (year-length regimes, Cheshvan/Kislev boundaries, full Metonic cycle, RH postponement, Adar I↔II, year/month boundaries, holidays, time-of-day, far past/future, week-of-year wrap, DST timezones, locale variations) plus a 64-case DST policy parity test against _CalendarGregorian.
Fixes:
1. ordinality(.weekday, in: .weekOfYear) ignored firstWeekday
2. dateInterval(.yearForWeekOfYear).duration used Hebrew calendar year
3. R&D post-hoc dehiyot disagreed with swift-foundation-icu's chained form
4. dateComponents extraction policy (use secondsFromGMT for UTC→local)
5. dateInterval(.hour/.minute/.second) re-construction policy
6. dateComponents(_:from:to:) iteration: cumulative, not iterative
7. utcDate must drop skippedTimePolicy at TZ-offset query (matches Gregorian)
Hebcal regression skips a documented 385-day window at Hebrew year 5806 where Hebcal (standard R&D) disagrees with swift-foundation-icu's chained dehiyot algorithm; PARITY mandates ICU as authoritative.
* Add bugs #8-#9 fixes, expand Suite B to ~300 dates, move DST policy tests
- Bug #8: split multi-field date(byAdding:) into sequential year-then-month operations (matches _CalendarGregorian's per-field decompose-adjust-clamp). Kislev 30 + 1 year landing in a deficient year now clamps to 29 before the month-add runs.
- Bug #9: nanosecond extraction simplified to a single fractional subtraction matching _CalendarGregorian's truncation, replacing chained subtractions with FP rounding error.
- Suite B expanded to ~300 dates across 11 topic-specific tests, mirroring Suite A's coverage through the public Calendar API.
- Moved utcDate_allPolicyCombinations_matchGregorian and date_from_hebrewVsGregorian_atDSTBoundaries from FoundationEssentialsTests/HebrewCalendarTests.swift to FoundationInternationalizationTests/HebrewDSTPolicyParityTests.swift. IANA TimeZone identifiers require _TimeZoneICU (linked via dynamic replacement from FoundationInternationalization), so these tests silently failed in the Essentials target.
- Added _CalendarHebrew.nextDate(after:matching:) as a proof-of-concept fast-path for {month, day} patterns. Not wired into _CalendarProtocol; Calendar.enumerateDates dispatches through Calendar_Enumerate.swift's generic framework which has no way to reach it.
* Add fast-path nextDate(after:matching:direction:) protocol method
Adds an optional fast-path on _CalendarProtocol that allows calendar implementations to answer Calendar.nextDate / Calendar.enumerateDates directly when they can compute the target in O(1), bypassing the generic month-loop in Calendar_Enumerate.swift. The default protocol extension returns nil, so all existing calendars (_CalendarICU, _CalendarGregorian, _CalendarChinese, etc.) continue using the existing framework path unchanged. Only _CalendarHebrew opts in.
Wiring (Calendar.swift):
- Calendar.nextDate(after:matching:matchingPolicy:repeatedTimePolicy:direction:) consults _calendar.nextDate(...) when policies are at their defaults (.nextTime + .first), falling through to enumerateDates otherwise.
- Calendar.enumerateDates(...) does the same: if the calendar can answer the first match, drive the block via repeated nextDate calls; else use the generic framework.
Hebrew fast paths (Calendar_Hebrew.swift), recognized patterns:
1. {month, day, h?, m?, s?, ns?}: annual recurrence (e.g. Hanukkah, Passover; with optional time-of-day preserved).
2. {month, h?, m?, s?, ns?}: month-only (treated as day=1).
3. {day, h?, m?, s?, ns?}: month-walking (e.g. Rosh Chodesh, the 1st of every Hebrew month).
4. {weekday, h?, m?, s?, ns?}: weekday RD-modular arithmetic. Same-weekday inputs always step a full ±7 days to match ICU's nextWeekend semantics; we don't try to be clever about same-day-with-later-time.
Any other component combination (era, year, weekdayOrdinal, weekOf*, yearForWeekOfYear, dayOfYear, mixed weekday+month) returns nil and falls through.
Performance (Intel iMac, debug, GMT, _CalendarHebrew vs _CalendarICU through public Calendar.enumerateDates):
| Pattern | Example | Before (µs/match) | After (µs/match) | Speedup |
| --- | --- | --- | --- | --- |
| {m,d} | Hanukkah | 1,447 | 2 | 723× |
| {m,d,h,m,s} | Hanukkah 18:30 | 1,910 | 2 | 955× |
| {day:1} | Rosh Chodesh | 1,570 | 2 | 785× |
| {month:1} | Tishri 1 | 186 | 2 | 93× |
| {weekday:7} | Saturdays | 432 | 1 | 432× |
Correctness verified against ICU's enumerateDates as ground truth: 9 patterns × 50–100 matches each, 0 divergences. Full Hebrew suite (49 tests across 7 suites) passes; 1,386/1,386 full Foundation tests pass.
* Extend Hebrew fast-path to {month, weekday, weekdayOrdinal} + cache YearData
- Recognize {m, wd, wdOrd} pattern in _CalendarHebrew.nextDate (e.g. "4th Thursday of November"). O(1) Hebrew arithmetic per candidate year, with year iteration to honor strict-after-input, leap-only Adar I, and out-of-range ordinals. Negative ordinals fall through to the generic framework to match ICU's enumerate contract.
- Add single-slot YearData cache in HebrewArithmetic. Route call sites through it; skip caching inside hebrewFromFixed's year-approximation loop (would thrash a 1-slot cache).
* Update Calendar_Hebrew for Swift 6.4: LockedState → Mutex, fix warning
* Remove diagnostic probe files and icu4swift references
* Address PR swiftlang#1953 review feedback: remove unused imports, fix DST tests, add benchmarks
* Address PR swiftlang#1953 review feedback: trim verbose comments, add TODO for shared weekend logic
* Fix missing `floor` and add a feature flag
* Move ICU-dependent Hebrew calendar tests to FoundationInternationalizationTests
FoundationEssentialsTests does not link FoundationInternationalization, so Calendar(identifier: .hebrew) resolves to _calendarICUClass() returning nil there, causing a SIGSEGV. Move the three tests that use Calendar(identifier: .hebrew) to a new HebrewCalendarICUTests.swift in FoundationInternationalizationTests where the ICU backing is available.
---------
Co-authored-by: Tina Liu <tinaliu@apple.com>
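The opt-in fast path with a nil-returning default, and the weekday arithmetic that always steps a full week for same-weekday inputs, might be sketched like this. All names are hypothetical and the model is deliberately tiny: days are plain integers and a weekday is the day number modulo 7, which is an assumption of this example and not how `_CalendarProtocol` or `_CalendarHebrew` represent dates.

```swift
/// Hypothetical backend protocol; returns nil when no fast path applies.
protocol SketchCalendarBackend {
    func fastNextDay(after day: Int, weekday: Int) -> Int?
}

extension SketchCalendarBackend {
    // Default: no fast path, so callers fall through to the generic search.
    func fastNextDay(after day: Int, weekday: Int) -> Int? { nil }
}

/// Weekday in 0...6 for a day number; the double modulo keeps negative days in range.
func weekdayIndex(of day: Int) -> Int { ((day % 7) + 7) % 7 }

/// Uses the default implementation, i.e. never takes the fast path.
struct GenericBackend: SketchCalendarBackend {}

/// Opts in with constant-time modular arithmetic.
struct FastBackend: SketchCalendarBackend {
    func fastNextDay(after day: Int, weekday target: Int) -> Int? {
        guard (0..<7).contains(target) else { return nil }
        let delta = (target - weekdayIndex(of: day) + 7) % 7
        // A same-weekday input steps a full week rather than matching itself.
        return day + (delta == 0 ? 7 : delta)
    }
}

func nextDay(after day: Int, weekday target: Int, using backend: some SketchCalendarBackend) -> Int {
    precondition((0..<7).contains(target), "weekday must be in 0...6")
    if let fast = backend.fastNextDay(after: day, weekday: target) {
        return fast
    }
    // Generic path: scan forward one day at a time.
    var candidate = day + 1
    while weekdayIndex(of: candidate) != target { candidate += 1 }
    return candidate
}

let fastResult = nextDay(after: 10, weekday: 3, using: FastBackend())
let genericResult = nextDay(after: 10, weekday: 3, using: GenericBackend())
```

Because the requirement is declared on the protocol and only defaulted in the extension, a backend that implements it is dispatched to dynamically, while every other backend keeps the generic behaviour unchanged.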
* Add a CONTRIBUTION_GUIDELINE.md for coding style and testing practices that we use in this repo.
* Add mention of exit testing
…wiftlang#2015) This reverts commit 953cf80.
* Reapply "Add Swift Hebrew calendar implementation (swiftlang#1953)" (swiftlang#2015)
This reverts commit c15f44a.
* Add Hebrew Calendar file to cmakelist
…-2026-06-03_10-34 Merge `release/6.4.x` into `main`
…-2026-06-04_10-18 Merge `release/6.4.x` into `main`
…es of Data allocation
… Data but we get a buffer that we fill in
cleanup 2 clean up 3 clean up 3 cleanup 4 cleanup 4 cleanup add comments + cleanup cleanup SetAlgebra file add comment about safety of immortal pointer refactor to avoid logic duplication add fatalError add check for cachedBMP explicitly copy Data for built-in sets Data that was initialized by bytesNoCopy streamline bitmapAll & bitmapEmpty cases handling + callsites cleanup remove conditional checking refactor bitmapBacked init deduplication address comments cleanup add test for copy add unit test for union and intersection add tests + address warnings address comment fix cleanup add comment + test add release only call makeBitmap() inside .bitmapFilled cases avoid busy work + add union test for additional validation preserve existing behavior recover previous method
@swift-ci please test
This PR introduces a method that fills a buffer for a specified plane of a `BuiltInUnicodeScalarSet`, so that mutating methods in `CharacterSet`, such as union and intersection, no longer instantiate 8KB of `Data` that is thrown away immediately after it has been used to mutate another buffer. The method is functionally equivalent to the previous `bitmap(forPlane:isInverted:)` method.
Motivation:
This change reduces the memory usage of `CharacterSet` when filling `Annex` planes of built-in `CharacterSet`s.
Modifications:
Added a `bitmap(forPlane:isInverted:into:apply:)` method and refactored `bitmap(forPlane:isInverted:)` to call into it to avoid duplicate logic.
Result:
There should be no behavioral change.
Testing:
All the existing unit tests should still pass.
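As a rough illustration of the approach described above, the sketch below contrasts an allocating bitmap method with one that fills an existing buffer. Everything here is hypothetical: `SketchBuiltInSet`, `fill(plane:into:apply:)`, and the meaning given to the `apply` closure (combining the existing byte with the newly computed one) are assumptions made for this example, not the PR's actual signatures, and inversion is left out for brevity.

```swift
/// One plane holds 65,536 code points, one bit each: 8,192 bytes (8KB).
let planeByteCount = 8192

/// Hypothetical stand-in for a built-in set, defined by a membership predicate.
struct SketchBuiltInSet {
    let contains: (UInt32) -> Bool

    /// Buffer-filling form: computes each byte of the plane and combines it
    /// with the byte already in `buffer`, without allocating a temporary bitmap.
    func fill(
        plane: UInt8,
        into buffer: UnsafeMutableBufferPointer<UInt8>,
        apply: (UInt8, UInt8) -> UInt8
    ) {
        precondition(buffer.count == planeByteCount)
        let base = UInt32(plane) << 16
        for byteIndex in 0..<planeByteCount {
            var byte: UInt8 = 0
            for bit in 0..<8 where contains(base + UInt32(byteIndex * 8 + bit)) {
                byte |= UInt8(1) << bit
            }
            buffer[byteIndex] = apply(buffer[byteIndex], byte)
        }
    }

    /// Allocating form, expressed through the filling form to avoid duplicate logic.
    func bitmap(forPlane plane: UInt8) -> [UInt8] {
        var bytes = [UInt8](repeating: 0, count: planeByteCount)
        bytes.withUnsafeMutableBufferPointer { buffer in
            fill(plane: plane, into: buffer) { _, new in new }
        }
        return bytes
    }
}

// Union the uppercase ASCII letters into a buffer that already has one bit set.
let uppercase = SketchBuiltInSet { (0x41...0x5A).contains($0) }
var target = [UInt8](repeating: 0, count: planeByteCount)
target[0] = 0b0000_0001
target.withUnsafeMutableBufferPointer { buffer in
    uppercase.fill(plane: 0, into: buffer) { existing, new in existing | new }
}
```

In this model a union passes `|` and an intersection would pass `&`, so the caller's buffer is mutated in place and no 8KB temporary is created per operation.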