Improving CVISE for large projects by RRr89 · Pull Request #483 · marxin/cvise

RRr89 · 2026-03-19T13:12:16Z

This PR focuses on large projects:
(1) The default number of threads currently ignores the available disk space. This PR takes disk space as well as CPU count into account.
(2) Currently there is only a max_improvement. For large projects, however, one does not want to waste time on passes that make little progress, so this PR introduces a min_improvement.
(3) Currently, files are ordered by size. For large projects (>500 files), this always touches the same files and leaves the others unaffected. This PR randomizes the file order.
(4) In large projects, not all files may be necessary for reproducing a bug. This PR introduces a clear pass that simply clears a file.
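To illustrate point (3), here is a minimal sketch of a randomized file order. It is not the PR's code: the function name `randomized_order` and the optional `seed` parameter are assumptions made for this example.

```python
import random
from typing import List, Optional, Sequence


def randomized_order(files: Sequence[str], seed: Optional[int] = None) -> List[str]:
    """Return a shuffled copy of `files`; the input sequence is left untouched.

    Hypothetical helper: a size-based order is deterministic, so every round
    starts with the same files. Shuffling gives every file a chance to be
    visited early.
    """
    rng = random.Random(seed)  # a fixed seed makes a run reproducible
    shuffled = list(files)
    rng.shuffle(shuffled)
    return shuffled
```

A fixed seed would keep a reduction run reproducible. Whether the PR exposes such a seed is not stated.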

Some bugs were fixed. Also, Ctrl-C now prints the statistics before exiting. The statistics output now includes the improvement per run in bytes.

Some of the command-line flags do not provide enough information (e.g., what is the default timeout?). Others don't work as expected (e.g., --list-passes fails when no TEST_CASE is provided). In addition to fixing those, this change tries to warn about possible errors right at the start, such as:

* Not enough disk space available. The default parallel setting (`-n`/ `--n`) does not take disk space into account. For large input files of several GB, we may run out of disk before we run out of CPU. Therefore, this change computes the parallel setting with disk space taken into account. If the user sets the parallelism on the command line, the disk space requirement is checked. If that check fails, a warning is printed, but execution continues.
* The initial interestingness test already exceeds the timeout. This change measures the initial run of the interestingness test. If that run already exceeds the given timeout, a warning is issued, but execution continues.
* Creating backups may fail. When `*.orig` files already exist, the TEST_CASEs are not copied. This change prints a warning if this is the case.

All warnings described above may be switched off by setting `-w` (similar to the gcc `-w` flag).
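The disk-space-aware parallelism described above could look roughly like the sketch below. It assumes that every parallel job needs one private copy of the test cases. The function names and the warning text are invented for illustration and are not taken from the PR.

```python
import os
import shutil
from typing import Optional


def default_parallelism(test_case_bytes: int, free_bytes: int, cpu_count: int) -> int:
    """Pick a job count limited by both CPU count and free disk space.

    Assumption: each job needs `test_case_bytes` of scratch space.
    """
    if test_case_bytes <= 0:
        return max(1, cpu_count)
    jobs_by_disk = free_bytes // test_case_bytes
    return max(1, min(cpu_count, jobs_by_disk))


def disk_warning(requested_jobs: int, test_case_bytes: int, free_bytes: int) -> Optional[str]:
    """Return a warning string if a user-chosen job count may exhaust the disk."""
    needed = requested_jobs * test_case_bytes
    if needed > free_bytes:
        return (f"warning: {requested_jobs} jobs may need {needed} bytes, "
                f"but only {free_bytes} bytes are free")
    return None  # enough space; nothing to report


def detect_parallelism(workdir: str, test_case_bytes: int) -> int:
    """Query the real system and apply the rule above."""
    free = shutil.disk_usage(workdir).free
    return default_parallelism(test_case_bytes, free, os.cpu_count() or 1)
```

For example, with 2 GiB of test cases, 10 GiB free and 16 CPUs, this rule yields 5 jobs, so the disk is the limit rather than the CPU count.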

This pass clears a file and checks whether that file is required at all for reproducing the observed bug. This may be helpful when building libraries with hundreds of files.
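The idea of the clear pass can be sketched as follows. This is a standalone illustration, not the pass as implemented in the PR. `try_clear` and the `is_interesting` callback are hypothetical names, and the callback stands in for running the interestingness test.

```python
from pathlib import Path
from typing import Callable


def try_clear(path: Path, is_interesting: Callable[[], bool]) -> bool:
    """Empty `path`; keep it empty only if the bug still reproduces.

    Returns True when the file was cleared, False when it was left as it was.
    """
    original = path.read_bytes()
    if not original:
        return False  # already empty, nothing to gain
    path.write_bytes(b"")
    try:
        still_interesting = is_interesting()
    except BaseException:
        path.write_bytes(original)  # never leave a half-applied change behind
        raise
    if not still_interesting:
        path.write_bytes(original)  # the file is needed; restore it
    return still_interesting
```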

When clangbinarysearch fails due to a timeout, the current implementation simply retries. For large projects, this may take minutes to hours. This change introduces a maximum timeout count (currently set to 20) and a current timeout count. Once clangbinarysearch has failed due to a timeout `max timeout count` times, it stops.
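A capped retry loop of this kind might be sketched as below. The limit of 20 comes from the description above. The function name, the use of Python's built-in `TimeoutError`, and the choice not to reset the counter between attempts are assumptions for this example.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")

MAX_TIMEOUT_COUNT = 20  # value stated in the description above


def retry_with_timeout_cap(attempt: Callable[[], T],
                           max_timeouts: int = MAX_TIMEOUT_COUNT) -> Optional[T]:
    """Call `attempt` until it succeeds or has timed out `max_timeouts` times.

    Returns the result of `attempt`, or None once the timeout budget is spent.
    """
    timeouts = 0
    while timeouts < max_timeouts:
        try:
            return attempt()
        except TimeoutError:
            timeouts += 1  # count the failure and retry
    return None  # give up instead of retrying forever
```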

By throwing an exception on KeyboardInterrupt, cvise can still print statistics when the user decides to cancel the run.
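As a rough illustration of that behaviour, the sketch below runs a sequence of passes and reports whatever statistics were collected, even if Ctrl-C interrupts the loop. `run_passes` and its arguments are hypothetical and do not mirror cvise's internal structure.

```python
from typing import Callable, Dict, Iterable, Tuple


def run_passes(passes: Iterable[Tuple[str, Callable[[], int]]],
               report: Callable[[Dict[str, int]], None]) -> bool:
    """Run each pass, then report statistics; Ctrl-C ends the loop early.

    Each pass returns the number of bytes it removed.
    Returns True if the run was interrupted by the user.
    """
    stats: Dict[str, int] = {}
    interrupted = False
    try:
        for name, run in passes:
            stats[name] = run()
    except KeyboardInterrupt:
        interrupted = True  # stop reducing, but fall through to the report
    finally:
        report(stats)  # partial statistics are still useful
    return interrupted
```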

When printing an overview at the end of the run, it makes sense to also add the progress per run (in bytes). Thus, the user may explicitly exclude passes that take a lot of time but make little progress.
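One way to track such a figure is sketched below. The class `PassStats`, its fields, and the row layout are illustrative assumptions. The actual statistics output of the PR may look different.

```python
from dataclasses import dataclass


@dataclass
class PassStats:
    """Accumulated cost and benefit of one pass (hypothetical structure)."""
    runs: int = 0
    seconds: float = 0.0
    bytes_removed: int = 0

    def record(self, seconds: float, size_before: int, size_after: int) -> None:
        """Account for a single run of the pass."""
        self.runs += 1
        self.seconds += seconds
        self.bytes_removed += size_before - size_after

    @property
    def bytes_per_run(self) -> float:
        """Average improvement per run in bytes; 0.0 if the pass never ran."""
        return self.bytes_removed / self.runs if self.runs else 0.0


def format_row(name: str, stats: PassStats) -> str:
    """Render one line of the end-of-run overview."""
    return (f"{name:<20} {stats.runs:>6} {stats.seconds:>10.1f}s "
            f"{stats.bytes_per_run:>12.1f} B/run")
```

A pass with a large time total and a small bytes-per-run value is a candidate for exclusion on the next run.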

Adding a parameter for min_improvement: Currently, only max_improvement is supported. However, in large projects it may be undesirable to spend days on a pass that only removes a few bytes with each run. With min_improvement, passes that make little progress are skipped faster and passes that make larger progress are executed sooner. Also, there seemed to be a bug in accounting for the improvement of a pass on a file; I fixed that.
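The description does not spell out the exact semantics of min_improvement. One plausible reading, used purely as an assumption here, is that a pass is abandoned as soon as a single step gains fewer bytes than the threshold. The function below simulates that rule on a list of file sizes and is not taken from the PR.

```python
from typing import Iterable, Tuple


def run_pass(sizes_after_each_step: Iterable[int],
             initial_size: int,
             min_improvement: int) -> Tuple[int, int]:
    """Apply steps until one gains fewer than `min_improvement` bytes.

    `sizes_after_each_step` simulates the test-case size each step would yield.
    Returns (final_size, accepted_steps).
    """
    current = initial_size
    accepted = 0
    for new_size in sizes_after_each_step:
        gain = current - new_size
        if gain > 0:
            current = new_size  # any real reduction is still kept
            accepted += 1
        if gain < min_improvement:
            break  # progress is too slow; move on to the next pass
    return current, accepted
```

With a threshold of 10 bytes, a pass whose steps shrink a 1000-byte file to 900, 850, 849 and 700 bytes stops after the third step, because that step gained only one byte.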

emaxx-google · 2026-03-29T22:49:25Z

Hello, thank you for publishing this. This is a big PR that squashes together many functional changes and bugfixes. Please split it into separate PRs, one per logically self-contained chunk of changes, so that they can be reviewed individually and the project's commit history remains clear.

There are also merge conflicts - those will need to be fixed. Thanks.

RRr89 added 6 commits June 3, 2025 16:20

Adding a Clear pass

05dd9d3

This pass clears a file and checks whether that file is required at all for reproducing the observed bug. This may be helpful when building libraries with hundreds of files.

Introducing KeyboardInterrupt Exception

8047fae

By throwing an exception on KeyboardInterrupt, cvise can still print statistics when the user decides to cancel the run.

Adding progress to statistics

4205064

When printing an overview at the end of the run, it makes sense to also add the progress per run (in bytes). Thus, the user may explicitly exclude passes that take a lot of time but make little progress.

RRr89 wants to merge 6 commits into marxin:master from RRr89:interface_changes
