Skip to content

Conversation

@86xsk
Copy link
Contributor

@86xsk 86xsk commented Jan 13, 2026

Description

Reduces the amount of time it takes to run cargo test.

Before rebasing on master and with only the first two commits, this reduced the amount of time it takes to run cargo test from ~78s to ~57s on my system. This is primarily achieved by not running the chunker/tagger when creating the Document in FixedPhrase::from_phrase().

After rebasing, the does_not_oom_with_repeated_lints() test seems to have regressed significantly in performance, and running cargo test took about ~180s. I took a quick crack at trying to resolve this in 0659fab. This reduced the cargo test time to ~53s, albeit I'm not 100% certain that the test will still work as intended (I'm not super familiar with Wasm).

(The regressions in does_not_oom_with_repeated_lints() seemed to have been caused by lex_email_address() and lex_url() and the way they're used in some manner related to Weir. It seemed to be caused by the iterators at the beginning of those functions trying to find the ':' for the URL, or the '@' for the email. Replacing those iterators with a loop and adding early exits seemed to significantly reduce the impact of this. Without any changes that test took >100s, with those early exits it took ~37s, and with 0659fab it takes about ~1.5s. I didn't spend too much time investigating why this became so much of an issue after Weir was added, but I think it could be an interesting direction for further optimization.)

How Has This Been Tested?

  • cargo test

Checklist

  • I have performed a self-review of my own code
  • I have added tests to cover my changes

86xsk added 3 commits January 12, 2026 17:28
Avoid running expensive metadata generation when generating `Document`
in `FixedPhrase::from_phrase()`.
Use generate `Document` from simple tokenization in
`ProperNounCapitalizationLinter`.
Try to detect memory leak via monitoring memory usage rather than trying
to force an OOM in `does_not_oom_with_repeated_lints()`.

This significantly reduces the time it takes to run `cargo test`.
Copy link
Collaborator

@elijah-potter elijah-potter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, this is great work. There are a few questions I need answered before I can merge this. Thanks!

let pid = Pid::from_u32(std::process::id());
let mut sys = System::new();

macro_rules! refresh_sys {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a particular reason this is a macro instead of a function?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I refactored it into a function inside that test module. If there's a better place for it, please let me know.

(iirc I initially tried making it a closure that would capture pid and sys to avoid having to pass them as arguments. Capturing &mut sys made the borrow checker unhappy though, and the closest alternative that came to mind at the time was a macro.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants