pytest migration #61

Open
fbrausse wants to merge 21 commits into master from regr-pytest

@fbrausse fbrausse commented Mar 18, 2026

Collaborator

The test script "test_cmdline.py" was generated by a script from the smlp_regr.csv file. The logic for constructing the arguments for the command line tests has been reverse engineered from the original smlp_regr.py script.

For now, this is just the beginning. Comparing the generated files and outputs on stdout/stderr for a test run remains to be done. The commands generated by pytest exactly match those smlp_regr.py produces (modulo -out_dir, which is produced by pytest now and no longer defaults to ./).
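To illustrate how such a command line can be assembled with a per-test output directory, here is a minimal sketch. The function `build_command` and its parameters are hypothetical and not taken from lib.py; only the option names and their order follow the commands quoted in this thread.

```python
import shlex


def build_command(test_id, data, args, out_dir):
    """Assemble an SMLP command line for one regression test (illustrative only)."""
    cmd = ["../../src/run_smlp.py"]
    if data:
        # Assumption: data files live in ../data and carry a .csv suffix.
        cmd += ["-data", f"../data/{data}.csv"]
    # The output directory is passed in by the caller instead of defaulting to ./
    cmd += ["-out_dir", str(out_dir), "-pref", f"Test{test_id}"]
    # The remaining options come verbatim from the CSV row.
    cmd += shlex.split(args)
    return cmd
```

In a pytest test, `out_dir` would typically be the value of the built-in `tmp_path` fixture, so every test writes into its own temporary directory.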

The script convert-csv-pytest.py was used to convert the .csv to the new "test_cmdline.py", and cmp.py was used to check that the generated command lines exactly match those of smlp_regr.py. Those files are not part of this PR.

I intend to finish implementing the check of the result files according to the logic of the current regression script in this PR and will mark it as draft until then. In the long run, I think it would be better to replace the "compare log and output files for changes" approach with a more robust check for the properties actually supposed to be tested; we discussed this earlier. The approach taken in this PR, putting the base functionality into a separate class living in lib.py, looks extensible enough to me to support those future checks.

@fbrausse

Collaborator Author

PS: I've added two warnings while generating the command line: one about absolute paths being used (e.g., for the -solver_path), and one about non-existing -data and -new_dat files. This is the current warning output:

regr_smlp/code/test_cmdline.py: 17 warnings
  /home/kane/intel/smlp-pip/regr_smlp/code/lib.py:75: UserWarning: path for option -data does not exist: ../data/smlp_toy_num_resp_noknobs.csv
    warnings.warn(f'path for option {o} does not exist: {datapath}')

regr_smlp/code/test_cmdline.py::Test74::test
regr_smlp/code/test_cmdline.py::Test75::test
  /home/kane/intel/smlp-pip/regr_smlp/code/lib.py:75: UserWarning: path for option -new_dat does not exist: ../data/smlp_toy_num_resp_noknobs_pred_labeled.csv
    warnings.warn(f'path for option {o} does not exist: {datapath}')

regr_smlp/code/test_cmdline.py::Test130::test
regr_smlp/code/test_cmdline.py::Test131::test
regr_smlp/code/test_cmdline.py::Test132::test
regr_smlp/code/test_cmdline.py::Test133::test
regr_smlp/code/test_cmdline.py::Test134::test
regr_smlp/code/test_cmdline.py::Test135::test
regr_smlp/code/test_cmdline.py::Test138::test
  /home/kane/intel/smlp-pip/regr_smlp/code/lib.py:75: UserWarning: path for option -data does not exist: ../data/smlp_toy_const_input.csv
    warnings.warn(f'path for option {o} does not exist: {datapath}')

regr_smlp/code/test_cmdline.py::Test136::test
  /home/kane/intel/smlp-pip/regr_smlp/code/lib.py:75: UserWarning: path for option -data does not exist: ../data/smlp_toy_num_resp_mult_compressed.csv.gz.csv
    warnings.warn(f'path for option {o} does not exist: {datapath}')

regr_smlp/code/test_cmdline.py::Test137::test
  /home/kane/intel/smlp-pip/regr_smlp/code/lib.py:75: UserWarning: path for option -data does not exist: ../data/smlp_toy_num_resp_mult_compressed.csv
    warnings.warn(f'path for option {o} does not exist: {datapath}')

regr_smlp/code/test_cmdline.py::Test172::test
  /home/kane/intel/smlp-pip/regr_smlp/code/lib.py:59: UserWarning: Test #172: path given to -solver_path in args is absolute: /nfs/iil/proj/dt/eva/smlp/external/mathsat-5.6.8-linux-x86_64-reentrant/bin/mathsat
    warnings.warn(

regr_smlp/code/test_cmdline.py::Test173::test
  /home/kane/intel/smlp-pip/regr_smlp/code/lib.py:59: UserWarning: Test #173: path given to -solver_path in args is absolute: /nfs/iil/proj/dt/eva/smlp/external/mathsat-5.6.8-linux-x86_64-reentrant/bin/mathsat
    warnings.warn(

regr_smlp/code/test_cmdline.py::Test182::test
  /home/kane/intel/smlp-pip/regr_smlp/code/lib.py:59: UserWarning: Test #182: path given to -solver_path in args is absolute: /nfs/iil/proj/dt/eva/smlp/external/mathsat-5.6.8-linux-x86_64-reentrant/bin/mathsat
    warnings.warn(

regr_smlp/code/test_cmdline.py::Test197::test
  /home/kane/intel/smlp-pip/regr_smlp/code/lib.py:59: UserWarning: Test #197: path given to -solver_path in args is absolute: /nfs/iil/proj/dt/eva/smlp/external/mathsat-5.6.8-linux-x86_64-reentrant/bin/mathsat
    warnings.warn(

regr_smlp/code/test_cmdline.py::Test207::test
regr_smlp/code/test_cmdline.py::Test208::test
  /home/kane/intel/smlp-pip/regr_smlp/code/lib.py:75: UserWarning: path for option -data does not exist: ../data/smlp_toy_frontier_beta.csv
    warnings.warn(f'path for option {o} does not exist: {datapath}')

regr_smlp/code/test_cmdline.py::Test209::test
regr_smlp/code/test_cmdline.py::Test212::test
  /home/kane/intel/smlp-pip/regr_smlp/code/lib.py:75: UserWarning: path for option -data does not exist: ../data/smlp_toy_frontier_null_bounds_int.csv
    warnings.warn(f'path for option {o} does not exist: {datapath}')

regr_smlp/code/test_cmdline.py::Test210::test
regr_smlp/code/test_cmdline.py::Test211::test
regr_smlp/code/test_cmdline.py::Test213::test
regr_smlp/code/test_cmdline.py::Test214::test
  /home/kane/intel/smlp-pip/regr_smlp/code/lib.py:75: UserWarning: path for option -data does not exist: ../data/smlp_toy_frontier_null_bounds_empty.csv
    warnings.warn(f'path for option {o} does not exist: {datapath}')
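For readers who want to see the shape of these two checks, here is a minimal sketch. It is not the code in lib.py: the function name `check_paths` and the `PATH_OPTIONS` tuple are hypothetical, and only the warning texts are modelled on the output above.

```python
import os
import warnings

# Assumption: these are the options whose values should name existing files.
PATH_OPTIONS = ("-data", "-new_dat")


def check_paths(test_id, args):
    """Warn about absolute -solver_path values and about missing data files."""
    # Walk over (option, value) pairs of the already split argument list.
    for opt, val in zip(args, args[1:]):
        if opt == "-solver_path" and os.path.isabs(val):
            warnings.warn(
                f"Test #{test_id}: path given to -solver_path in args is absolute: {val}"
            )
        elif opt in PATH_OPTIONS and not os.path.exists(val):
            warnings.warn(f"path for option {opt} does not exist: {val}")
```

Using `warnings.warn` rather than failing outright lets pytest collect these messages in its warnings summary while the test itself still runs.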

@fbrausse

fbrausse commented Mar 19, 2026

Collaborator Author

With the logic pulled over from smlp_regr.py, I now get the following test result summary when running pytest -n 16:

148 failed, 43 passed, 36 skipped, 43 warnings in 493.39s (0:08:13)

I've not yet checked all the failed tests, however here is a short summary that already accounts for a substantial portion of the 148 failures I saw:

  • checking against master log file failed: the paths differ, the -out_dir is part of the log file and thus fails when comparing ./ against some /tmp path generated by pytest.
  • -model_name appears to be interpreted relative to -out_dir and not relative to the current working directory; however it is expected that this file exists in the path relative to -out_dir, which is very strange. E.g. for Test22:
    $ ../../src/run_smlp.py -model_name ../models/test22_model -out_dir /tmp/pytest-of-kane/pytest-1223/test0 -pref Test22 -mode predict -resp 'PF ,|PF |' -model poly_sklearn -save_model f -use_model t -pred_plots t -resp_plots t -data_scaler none -mrmr_pred 0 -plots f -seed 10 -log_time f -new_dat ../data/smlp_toy_num_metasymbol_mult_reg_pred_labeled.csv
    [...]
      File "/home/kane/intel/smlp-pip/src/smlp_py/smlp_data.py", line 633, in _load_data_scaler
        mm_scaler_feat = pickle.load(open(features_scaler_file, 'rb'))
                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    FileNotFoundError: [Errno 2] No such file or directory: '/tmp/pytest-of-kane/pytest-1223/test0/../models/test22_model_features_scaler.pkl'
    
  • absolute paths in -solver_path: some tests appear to hardcode the Intel path for @zurabksmlp: /nfs/iil/proj/dt/eva/smlp/external/mathsat-5.6.8-linux-x86_64-reentrant/bin/mathsat, e.g. for Test197:
    /bin/bash: line 1: /nfs/iil/proj/dt/eva/smlp/external/mathsat-5.6.8-linux-x86_64-reentrant/bin/mathsat: No such file or directory
    python3.11: ../src/ext-solver.cc:87: smlp::str smlp::ext_solver::get_info(const char*): Assertion `reply' failed.
    Aborted
    
    Fix: regression tests: use relative -solver_path to mathsat #63

@fbrausse fbrausse marked this pull request as ready for review March 19, 2026 09:58
@zurabksmlp

Collaborator

Loading saved models requires full path. In the current regression, the saved model files are at smlp/repo/smlp/regr_smlp/models/ . Here is an example of a generated command:

../../src/run_smlp.py -model_name "../models/Test5_smlp_toy_num_resp_mult" -out_dir ./ -pref Test15 -mode predict -resp y1 -feat x,p1,p2 -model dt_caret -save_model f -use_model t -mrmr_pred 0 -plots f -pred_plots f -resp_plots f -seed 10 -log_time f -new_dat "../data/smlp_toy_num_resp_mult_pred_labeled.csv"

The path ../models/Test5_smlp_toy_num_resp_mult is created using the value in the smlp_regr.csv file in the first column and corresponding row (row 15 in this case), which looks like this:
15,Test5_smlp_toy_num_resp_mult,smlp_toy_num_resp_mult_pred_labeled,"-mode predict -resp y1 -feat x,p1,p2 -model dt_caret -save_model f -use_model t -mrmr_pred 0 -plots f -pred_plots f -resp_plots f -seed 10 -log_time f ",basic dt_caret prediction test from saved model on new data with numeric labels

commenting just in case this clarifies/helps.

@zurabksmlp

Collaborator

Is this PR intended for current smlp repo?

In case the recovered test data is not yet in master, I think it is higher priority -- I mean the test input data that Franz recovered, after my renaming of the input, output and knob names to use the same convention (x, y, p).

Also, I am waiting for the new repo (with limited license) to be opened.

@fbrausse

fbrausse commented Mar 19, 2026

Collaborator Author

Loading saved models requires full path.

Does "full path" mean "absolute path"? I am observing that the -out_dir value is prepended to the -model_name. That means, those options are not independent of one another, and basically only work when -out_dir is set to ./. See the error message I posted for the generated command of Test22, where only the -out_dir is different.

The path ../models/Test5_smlp_toy_num_resp_mult is created using the value in the smlp_regr.csv file in the first column and corresponding row (row 15 in this case), [...]

Indeed, the command line is generated this way. But running SMLP somehow expects -model_name to be relative to -out_dir. Which I consider to be a bug that prevents this PR from progressing.

Is this PR intended for current smlp repo?

Yes, a proper way to test that doesn't take hours and is a bit more flexible while also using standard tools such as pytest is essential for the more invasive changes regarding pySMT, etc. I cannot use the smlp_regr.py reliably here because it randomly hangs with -w 16 and I don't have the time to wait for hours until it (hopefully) finishes with -w 1.

In case the recovered test data is not yet in master, I think it is higher priority -- I mean the test input data that Franz recovered, after my renaming of the input, output and knob names to use the same convention (x, y, p).

This will be a next step after I can reliably run the tests.

@konstantin-korovin

konstantin-korovin commented Mar 19, 2026 via email

Collaborator

@zurabksmlp

Collaborator

model_name is used both for saving models and loading models. A saved model consists of many files, and model_name is used to identify these files while loading the model, and to save these files while saving the model.

In SMLP, out_dir is used for all files generated by a run. These files include saved model files; that's why the output directory is prefixed to saved model file names to generate a full path. The full path can be absolute or relative, depending of course on whether one supplies an absolute or relative path as argument to -out_dir.

@fbrausse

Collaborator Author

Is there a way to avoid overwriting those model files?

@zurabksmlp

Collaborator

Franz, can you clarify your question:
Is there a way to avoid overwriting those model files?

@fbrausse

Collaborator Author

Sure. The problem I'm facing is that there are regression tests that read models and expect to find them in a location determined by -out_dir. Writing models should in my view be done using the -out_dir, but reading should not. With the current set of parameters, as I understand it, this might not be easy to do (think -out_dir /tmp -model_name ../models/something - what does that mean for the path the model is read from?). Alternatively, it might be cleaner if we could separate the read path from the write path. So my question is: can those two paths be separated?

@zurabksmlp

Collaborator

Loading/reading models in the regression is done from the saved models directory -- directory smlp/repo/smlp/regr_smlp/models/, located in parallel to the code, data, master (and several other) directories. When SMLP is run and saves models, the saved model is in the output directory. When SMLP runs a saved model, it takes the model from the models directory, so if the same command also writes / saves a model there will be no clash. In fact, as far as I remember, SMLP does not permit -save_model t and -use_model t in one command because it does not make sense.

@zurabksmlp

Collaborator

If for some reason a regression test model changes and the model is saved, and it is reused in some other test that reads that model, the models must be updated in the models/ directory. The smlp_regr.py script supports that: when the newly saved model in the output directory is compared to the saved model in the models directory and a mismatch is found, the regression script asks what to do with the diff: accept it and update the models/ and/or master/ directories, or ignore it.

@fbrausse

Collaborator Author

I think we're still not talking about the same thing. You describe what happens when running ../../src/run_smlp.py -out_dir ./ -model_name ../models/something ..., yes?

I am modifying only the -out_dir parameter to point not to ./ but to some directory under /tmp outside of the repo's tree.

This is what happens:

$ ../../src/run_smlp.py -model_name ../models/test22_model -out_dir /tmp/pytest-of-kane/pytest-1223/test0 -pref Test22 -mode predict -resp 'PF ,|PF |' -model poly_sklearn -save_model f -use_model t -pred_plots t -resp_plots t -data_scaler none -mrmr_pred 0 -plots f -seed 10 -log_time f -new_dat ../data/smlp_toy_num_metasymbol_mult_reg_pred_labeled.csv
[...]
  File "/home/kane/intel/smlp-pip/src/smlp_py/smlp_data.py", line 633, in _load_data_scaler
    mm_scaler_feat = pickle.load(open(features_scaler_file, 'rb'))
                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/pytest-of-kane/pytest-1223/test0/../models/test22_model_features_scaler.pkl'

Note the path in the last line of this error message. It is constructed from putting the -out_dir parameter in front of the -model_name parameter. For reading a model! That is not just counter-intuitive, but also means that the -out_dir option itself becomes a bit useless, because as soon as I change it, I need to use a different -model_name for - again - reading. For reading the given -model_name should be taken as it is, while for writing the current behaviour could stay as it is (though I would argue that writing outside of the specified -out_dir should be disallowed).
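To make the proposal concrete, here is a minimal sketch of the two resolution rules. It assumes SMLP builds the path with something like `os.path.join(out_dir, model_name)`, which matches the error message but has not been checked against the SMLP sources. Both function names are hypothetical.

```python
import os


def model_path_current(out_dir, model_name):
    """Observed behaviour: -out_dir is prepended, for reading as well as writing."""
    return os.path.join(out_dir, model_name)


def model_path_proposed(out_dir, model_name, reading):
    """Proposed behaviour: read -model_name as given, write below -out_dir."""
    if reading:
        return model_name
    return os.path.join(out_dir, model_name)
```

With the proposed rule, changing -out_dir would no longer change where an existing model is looked up, so the same -model_name works for ./ and for a temporary directory.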

@fbrausse

fbrausse commented Mar 19, 2026

Collaborator Author

Working around this -model_name reading problem by using absolute paths and fixing some issues (also one in original smlp_regr.py), I now arrive at

41 failed, 150 passed, 36 skipped, 43 warnings in 483.56s (0:08:03)

Edit: Out of the 41 failing, 13 already fail on master for me.

Investigating the remaining failed ones...
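A short sketch of why the absolute-path workaround mentioned at the top of this comment can work. It again assumes the path is built with `os.path.join`, which is a guess based on the error message. The helper `to_absolute` and the example directories are hypothetical.

```python
import os


def to_absolute(model_name, base_dir):
    """Turn a -model_name given relative to base_dir into an absolute path."""
    return os.path.normpath(os.path.join(base_dir, model_name))


abs_name = to_absolute("../models/test22_model", "/home/user/smlp/regr_smlp/code")

# os.path.join drops all earlier components once it meets an absolute one,
# so prepending any -out_dir leaves the absolute model path unchanged.
joined = os.path.join("/tmp/pytest-of-user/test0", abs_name)
```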

@zurabksmlp

Collaborator

Hi Franz, when you run:

../../src/run_smlp.py -model_name ../models/test22_model -out_dir /tmp/pytest-of-kane/pytest-1223/test0 -pref Test22 -mode predict -resp 'PF ,|PF |' -model poly_sklearn -save_model f -use_model t -pred_plots t -resp_plots t -data_scaler none -mrmr_pred 0 -plots f -seed 10 -log_time f -new_dat ../data/smlp_toy_num_metasymbol_mult_reg_pred_labeled.csv

does directory /tmp/pytest-of-kane/pytest-1223/test0 exist?
Also, what happens if you use -out_dir /tmp/ ?

@fbrausse

Collaborator Author

does directory /tmp/pytest-of-kane/pytest-1223/test0 exist?

Yes, it does exist before I run the command (and still does after).

Also, what happens if you use -out_dir /tmp/ ?

The same error, except the FileNotFoundError's message changes:

FileNotFoundError: [Errno 2] No such file or directory: '/tmp/../models/test22_model_features_scaler.pkl'

@fbrausse

fbrausse commented Mar 20, 2026

Collaborator Author

Investigating the remaining failed ones...

Outcome: All 41 that fail with this PR already fail on master for me. Here is a summary:

  • Tests 8, 13, 16, 28, 36, 59, 60, 66, 68, 69, 70, 72, 77: smlp_regr.py already says they fail.
  • Test 104: Exception: Knobs ['p1', 'p2'] are not assigned constant values as part of specification, in "verify" mode: aborting...
  • Test 107: Exception: Beta constraints are not supported in "verify" mode: aborting...
  • Tests 145, 205: Exception: DOE levels grid file ../grids/doe_two_levels_opt.csv does not eist
  • Test 146: Exception: DOE levels grid file ../grids/explore_doe_two_levels.csv does not eist
  • Tests 149, 150, 151, 152, 154, 155, 156, 157, 158, 172, 173, 174, 175, 176, 177, 178, 179, 180: keras private attribute issue, see Compatibility with keras >= 3.11.3 #64
  • Test 160: KeyError: "The path: (0,) in the `loss` argument, can't be found in either the model's output (`y_pred`) or in the labels (`y_true`)."
  • Tests 182, 197: absolute solver path issue, see regression tests: use relative -solver_path to mathsat #63
  • Tests 201, 202: Exception: Spec file ../specs/smlp_toy_num_resp_mult_no_input_beta.spec does not exist

Why the smlp_regr.py script reports the tests from the above list, except for those in the first item, as "Passed", I don't know yet. The above results come from manually running the commands printed by the tool for the respective test.

Thus, this PR performs no worse than the current master, at least on my installation.

Edit: 104 and 107 are expected to fail, this is handled as intended by pytest now.

@fbrausse fbrausse mentioned this pull request Mar 20, 2026
25 tasks
@zurabksmlp

Collaborator
  1. These tests have correct behavior -- these are sanity check tests, should be aborting:
    Test 104: Exception: Knobs ['p1', 'p2'] are not assigned constant values as part of specification, in "verify" mode: aborting...
    Test 107: Exception: Beta constraints are not supported in "verify" mode: aborting...
  2. These tests have a simple fix:
    Tests 149, 150, 151, 152, 154, 155, 156, 157, 158, 172, 173, 174, 175, 176, 177, 178, 179, 180: keras private attribute issue, see Compatibility with keras >= 3.11.3 #64
    This test too is related to Keras versions:
    Test 160: KeyError: "The path: (0,) in the loss argument, can't be found in either the model's output (y_pred) or in the labels (y_true)."
    I have a fix in nlp_text.rebase branch for these tests, I can check that these tests are indeed running without errors in nlp_text.rebased branch. As far as I know @mdmitry1 also has fixes for these issues -- maybe his fixes are implemented a little different.
  3. Are the tests with missing input files among the recovered tests?

@fbrausse

Collaborator Author

Thanks, Zurab! I've marked 104 and 107 as expected to fail.
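For readers unfamiliar with the mechanism: pytest's `xfail` marker is the standard way to record an expected failure. The sketch below is illustrative only. The class layout mirrors the `Test104::test` IDs seen in the output, but the test body and the `run_smlp_test` helper are hypothetical stand-ins, not the code in test_cmdline.py.

```python
import pytest


def run_smlp_test(test_id):
    """Hypothetical stand-in for the real runner; here it always aborts."""
    raise RuntimeError(f"Test {test_id}: aborting...")


# The marker on the class applies to every test method inside it.
@pytest.mark.xfail(reason="sanity check: SMLP is expected to abort in verify mode")
class Test104:
    def test(self):
        run_smlp_test(104)
```

A test marked this way is reported as "xfailed" when it fails, which is what the later result summaries in this thread show.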

I have a fix in nlp_text.rebase branch for these tests,

Do you remember what changes fixed this keras "(0,) in the loss argument" issue?

Are the tests with missing input files among the recovered tests?

I have not finished recovering all those missing files. It's quite some time ago by now that I worked on that. I plan to return there once I can reliably run the tests.

@zurabksmlp

Collaborator

The fixes to keras version compatibility are in src/smlp_py/train_keras.py -- changes are conditional with respect to:
if version.parse(keras.__version__) <= version.parse("3.0.0"):

It makes sense to grep on "3.0.0" or on keras.__version__ to make sure all relevant changes are in train_keras.py.
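As a small, self-contained illustration of such a version gate, assuming the `packaging` library is available: the function name `uses_legacy_keras_api` is hypothetical, and it takes the version as a string so the sketch runs without Keras installed. In real code one would pass `keras.__version__`.

```python
from packaging import version


def uses_legacy_keras_api(keras_version: str) -> bool:
    """Mirror the quoted condition: True for Keras versions up to and including 3.0.0."""
    return version.parse(keras_version) <= version.parse("3.0.0")
```

For the two versions tried in this thread, `uses_legacy_keras_api("2.15.0")` is true and `uses_legacy_keras_api("3.13.2")` is false. Parsing the versions matters here, because a plain string comparison would rank "3.13.2" below "3.2.0".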

@fbrausse

fbrausse commented Mar 20, 2026

Collaborator Author

With keras-3.13.2 installed (as before), I now get

39 failed, 150 passed, 36 skipped, 2 xfailed, 43 warnings in 518.45s (0:08:38)

Will try the keras-3 patches from your nlp_text branch next.

Edit: However, they are not in master, yet, so we'll have to see whether it might make sense to extract them from your branch as a separate PR.

@fbrausse

fbrausse commented Mar 20, 2026

Collaborator Author

With keras-2.15.0 and tensorflow<2.16:

14 failed, 172 passed, 39 skipped, 2 xfailed, 43 warnings in 465.13s (0:07:45)

(I skipped Tests 154, 155 and 157 manually because they were taking too long).

@mdmitry1

mdmitry1 commented May 6, 2026

Collaborator

I'm getting an error during pytest invocation.
@fbrausse , please advise

git branch --show-current
git status
cd $(git rev-parse --show-toplevel)
python3.11 -m venv venv
source venv/bin/activate
regr-pytest
On branch regr-pytest
Your branch is up to date with 'origin/regr-pytest'.
nothing to commit, working tree clean
pip install . pytest
pytest regr_smlp/code/test_cmdline.py 
ERROR: /home/mdmitry/github/regr-pytest/pyproject.toml: Cannot use both [tool.pytest] (native TOML types) and [tool.pytest.ini_options] (string-based INI format) simultaneously. Please use [tool.pytest] with native TOML types (recommended) or [tool.pytest.ini_options] for backwards compatibility.

@fbrausse

fbrausse commented May 6, 2026

Collaborator Author

Hi @mdmitry1. I've been using pytest-8 before, which required the markers to be put into the ini_options. If you're using pytest-9, just remove the section marker [tool.pytest.ini_options] and put the markers into [tool.pytest].

@fbrausse

fbrausse commented Jun 1, 2026

Collaborator Author

I still need to include the new regression tests which Zurab provided recently, though.

Done.

= 31 failed, 163 passed, 38 skipped, 2 xfailed, 41 warnings in 558.42s (0:09:18) =

@fbrausse

fbrausse commented Jun 1, 2026

Collaborator Author

The warnings seem helpful:

=============================== warnings summary ===============================
regr_smlp/code/test_cmdline.py: 17 warnings
  /home/kane/intel/smlp-pip/regr_smlp/code/lib.py:134: UserWarning: path for option -data does not exist: ../data/smlp_toy_num_resp_noknobs.csv
    warnings.warn(f'path for option {o} does not exist: {datapath}')

regr_smlp/code/test_cmdline.py::Test74::test
regr_smlp/code/test_cmdline.py::Test75::test
  /home/kane/intel/smlp-pip/regr_smlp/code/lib.py:134: UserWarning: path for option -new_dat does not exist: ../data/smlp_toy_num_resp_noknobs_pred_labeled.csv
    warnings.warn(f'path for option {o} does not exist: {datapath}')

regr_smlp/code/test_cmdline.py::Test5::test
regr_smlp/code/test_cmdline.py::Test92::test
regr_smlp/code/test_cmdline.py::Test87::test
  /home/kane/intel/smlp-pip/regr_smlp/code/lib.py:272: UserWarning: -save_model t specified in args but no -model_name given
    warnings.warn(f'-save_model {o} specified in args but no -model_name given')

regr_smlp/code/test_cmdline.py::Test132::test
regr_smlp/code/test_cmdline.py::Test133::test
regr_smlp/code/test_cmdline.py::Test130::test
regr_smlp/code/test_cmdline.py::Test131::test
regr_smlp/code/test_cmdline.py::Test134::test
regr_smlp/code/test_cmdline.py::Test135::test
regr_smlp/code/test_cmdline.py::Test138::test
  /home/kane/intel/smlp-pip/regr_smlp/code/lib.py:134: UserWarning: path for option -data does not exist: ../data/smlp_toy_const_input.csv
    warnings.warn(f'path for option {o} does not exist: {datapath}')

regr_smlp/code/test_cmdline.py::Test136::test
  /home/kane/intel/smlp-pip/regr_smlp/code/lib.py:134: UserWarning: path for option -data does not exist: ../data/smlp_toy_num_resp_mult_compressed.csv.gz.csv
    warnings.warn(f'path for option {o} does not exist: {datapath}')

regr_smlp/code/test_cmdline.py::Test137::test
  /home/kane/intel/smlp-pip/regr_smlp/code/lib.py:134: UserWarning: path for option -data does not exist: ../data/smlp_toy_num_resp_mult_compressed.csv
    warnings.warn(f'path for option {o} does not exist: {datapath}')

regr_smlp/code/test_cmdline.py::Test208::test
regr_smlp/code/test_cmdline.py::Test207::test
  /home/kane/intel/smlp-pip/regr_smlp/code/lib.py:134: UserWarning: path for option -data does not exist: ../data/smlp_toy_frontier_beta.csv
    warnings.warn(f'path for option {o} does not exist: {datapath}')

regr_smlp/code/test_cmdline.py::Test214::test
regr_smlp/code/test_cmdline.py::Test210::test
regr_smlp/code/test_cmdline.py::Test211::test
regr_smlp/code/test_cmdline.py::Test213::test
  /home/kane/intel/smlp-pip/regr_smlp/code/lib.py:134: UserWarning: path for option -data does not exist: ../data/smlp_toy_frontier_null_bounds_empty.csv
    warnings.warn(f'path for option {o} does not exist: {datapath}')

regr_smlp/code/test_cmdline.py::Test209::test
regr_smlp/code/test_cmdline.py::Test212::test
  /home/kane/intel/smlp-pip/regr_smlp/code/lib.py:134: UserWarning: path for option -data does not exist: ../data/smlp_toy_frontier_null_bounds_int.csv
    warnings.warn(f'path for option {o} does not exist: {datapath}')

regr_smlp/code/test_cmdline.py::Test230::test
regr_smlp/code/test_cmdline.py::Test231::test
  /home/kane/intel/smlp-pip/regr_smlp/code/lib.py:134: UserWarning: path for option -data does not exist: ../data/smlp_toy_monotone_basic.csv.csv
    warnings.warn(f'path for option {o} does not exist: {datapath}')

@mdmitry1

mdmitry1 commented Jun 5, 2026

Collaborator

Hi @fbrausse

Please, see below:

  1. I'm still getting an error
pytest -n 8 test_cmdline.py
FAILED test_cmdline.py::Test231::test - TypeError: PurePath.relative_to() got an unexpected keyword argument 'walk_up'
git show -s
commit 4c4f52e406934e9918869302b3cdec7e060c0345 (HEAD -> regr-pytest, origin/regr-pytest)
Author: Franz Brauße <fbrausse@paxle.org>
Date:   Mon Jun 1 16:39:49 2026 +0200

    regr: mark test 229 as expected to fail as well
  2. smlp_regr.py has a tolerance parameter, which is not supported by test_cmdline.py:
grep -c tolerance $(git rev-parse --show-toplevel)/regr_smlp/code/{test_cmdline.py,smlp_regr.py}
/home/mdmitry/github/smlp_regr-pytest/regr_smlp/code/test_cmdline.py:0
/home/mdmitry/github/smlp_regr-pytest/regr_smlp/code/smlp_regr.py:3

Therefore, there are more diffs than in the expected results generated by the command

smlp_regr.py -w 8 -def n -t all -tol 7

@fbrausse

fbrausse commented Jun 5, 2026

Collaborator Author
  1. There is a reason for walk_up=True (supported since python-3.12), namely that the smlp commands generated match those of the current regression script. So if you run pytest with python-3.12 (or later), that issue should disappear. Please note, as said before, that the Python version pytest is run with does not affect the Python version used for SMLP execution.
  2. Where does your -tol 7 come from? Why would that be of any significance? There is a default in the current regression script and that default is also used by pytest (or, more precisely: lib.py).
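For reference, one conceivable way to avoid a hard dependency on Python 3.12 is to fall back to `posixpath.relpath` on older interpreters. This is only a sketch with a hypothetical helper name, `relative_walk_up`; whether it yields identical command lines for every test would still need to be verified.

```python
import posixpath
import sys
from pathlib import PurePosixPath


def relative_walk_up(path, start):
    """Relative path from start to path, allowing '..' steps."""
    if sys.version_info >= (3, 12):
        # Purely lexical; available since Python 3.12.
        return PurePosixPath(path).relative_to(start, walk_up=True)
    # Older Pythons: relpath also walks up, but it normalizes its inputs
    # and resolves relative ones against the current directory.
    return PurePosixPath(posixpath.relpath(path, start))
```

For absolute, normalized inputs both branches agree, e.g. from `/repo/regr_smlp/code` to `/repo/regr_smlp/models/test22_model` the result is `../models/test22_model`. They are not identical for relative or non-normalized inputs, which is where a check against the existing command lines would be needed.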

@mdmitry1

mdmitry1 commented Jun 6, 2026

Collaborator
  1. I believe there should be a better way to handle paths than requiring installation of an extra Python version. Please let me know why using python3.12 is the preferred option.

  2. -tol 7 is coming from smlp_regr.py

The value 7 is taken from the regression command recommended in the original README.md:

git checkout fa96ab4 &> /dev/null && git blame README.md |& grep "tol 7"
e0f1e15d (Franz Brauße       2024-04-09 05:43:20 +0000 100) 	./smlp_regr.py -w 8 -def n -t all -tol 7

Anyway, there are tests that pass with smlp_regr.py and fail with test_cmdline.py.
The two most common causes of differences:

  1. DOE tests produce different results for every run, for example Test39.
for i in $(seq 1 3); do ../../src/run_smlp.py -doe_spec "../grids/doe_two_levels.csv" -out_dir ./ -pref Test39 -mode doe -doe_algo latin_hypercube -doe_prob_distr Exponential -doe_samples 30 -log_time f >& /dev/null && sum Test39_doe_two_levels_doe.csv; done
20926     2 Test39_doe_two_levels_doe.csv
23407     2 Test39_doe_two_levels_doe.csv
40043     2 Test39_doe_two_levels_doe.csv
  2. Differences between very small numbers, for example Test3.
../../src/run_smlp.py -data "../data/smlp_toy_num_resp_mult.csv" -out_dir ./ -pref Test3 -mode predict -resp y1 -feat x,p1,p2 -model poly_sklearn -save_model_config f -mrmr_pred 0 -plots f -pred_plots f -resp_plots f -seed 10 -log_time f  -new_dat "../data/smlp_toy_num_resp_mult_pred_unlabeled.csv" &> /dev/null && diff Test3_smlp_toy_num_resp_mult_smlp_toy_num_resp_mult_pred_unlabeled_training_prediction_precisions.csv ../master
2c2
< y1,7.099748146989106e-30,1.0
---
> y1,1.8341016046388524e-29,1.0
  3. References:
  4. Comparison details

All differences between pytest and ./smlp_regr.py results are due to:

  • Differences in csv files. Actually, ./smlp_regr.py results do not contain any csv file diffs:
grep Diff $(git rev-parse --show-toplevel)/tests/smlp_regression/run_smlp_regression_expected_diff_report.log | 
grep -c csv
0
  • Tests that failed under pytest with python3.12 due to a missing *.spec file are reported as having diffs by ./smlp_regr.py. However, the output *.txt file is empty.

For example:

grep Test66 pytest_312.log
FAILED test_cmdline.py::Test66::test - subprocess.CalledProcessError: Command '['smlp', '-model_name', '../models/test65_model', '-out_dir', '/tmp/pytest-of-mdmitry/pytest-7/popen-gw7/test16', ...
../../src/run_smlp.py -model_name "../models/test65_model" -out_dir ./ -pref Test66 -mode verify -resp y1,y2 -feat x0,x1,x2 -model dt_sklearn -dt_sklearn_max_depth 15 -tree_encoding nested -compress_rules f  -save_model f -use_model t -spec ../specs/smlp_toy_num_resp_noknobs_verify.spec -asrt_names asrt1,asrt2 -asrt_exprs "x0**2+y1>4.3;(y1+x2)/2<6" -mrmr_pred 0 -plots f -pred_plots f -resp_plots f -seed 10 -log_time f >& /dev/null; ls -ltr Test66_test65_model.txt
-rw-r--r-- 1 mdmitry mdmitry 0 Jun  6 11:52 Test66_test65_model.txt
  • All non-csv differences are common:
cd $(git rev-parse --show-toplevel)
grep -h content /tmp/pytest-of-mdmitry/pytest-7/popen-gw*/test*/.meta/test_log.txt  | grep -v csv | awk '{print $1}' | sed -e 's/^/grep -c /' -e 's@$@ tests/smlp_regression/run_smlp_regression_expected.log@' | source /dev/stdin
1
1
1
1
1
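The second cause listed above (tiny floating-point differences such as the Test3 precision values) is the kind of diff a tolerance is meant to absorb. How smlp_regr.py interprets -tol is not shown in this thread, so the sketch below simply assumes it is a number of decimal places. The function `rows_match` is hypothetical.

```python
import csv
import io


def rows_match(new_text, master_text, decimals=7):
    """Compare two CSV texts cell by cell, rounding numeric cells to `decimals` places."""

    def normalise(text):
        rows = []
        for row in csv.reader(io.StringIO(text)):
            cells = []
            for cell in row:
                try:
                    cells.append(round(float(cell), decimals))
                except ValueError:
                    cells.append(cell)  # non-numeric cell: compare verbatim
            rows.append(cells)
        return rows

    return normalise(new_text) == normalise(master_text)
```

Under that assumption, the two Test3 lines `y1,7.099748146989106e-30,1.0` and `y1,1.8341016046388524e-29,1.0` compare equal, since both values round to zero at seven decimal places. This does not help with the DOE tests, whose outputs differ from run to run.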

@mdmitry1

mdmitry1 commented Jun 6, 2026

Collaborator

With csv files filtered out from comparison and failures due to missing inputs ignored, the remaining diffs are:

grep -h content /tmp/pytest-of-mdmitry/pytest-3/popen-gw*/test*/.meta/test_log.txt 
test26_model_dt_sklearn_tree_rules.txt Failed -> content diff
Test7_smlp_toy_num_resp_mult_rf_sklearn_tree_rules.txt Failed -> content diff
test110_model_poly_sklearn_formula.txt Failed -> content diff
Test10_smlp_toy_num_resp_mult_et_sklearn_tree_rules.txt Failed -> content diff
Test97_smlp_toy_num_resp_mult.txt Failed -> content diff
cd $(git rev-parse --show-toplevel)/regr_smlp/code
diff test110_model_poly_sklearn_formula.txt ../master
1c1
< y2 == 0.600562379236181 + -1.304512053934559e-15 + -0.07693796044916358 * x1 + -0.2710996066404448 * x2 + -0.04262102751392794 * p1 + 0.004790427082252857 * p2 + 0.48265534804226007 * x1^2 + -0.5064660321645905 * x1 * x2 + 0.37404344140014817 * x1 * p1 + 0.12043754302436678 * x1 * p2 + 0.8027897367455045 * x2^2 + 0.11900234619303918 * x2 * p1 + 0.1908401310215638 * x2 * p2 + -0.2652720480291326 * p1^2 + -0.35917950293902495 * p1 * p2 + -0.2927244049240656 * p2^2
\ No newline at end of file
---
> y2 == -1.0269562977782698e-15 + -0.07693796044916366 * x1 + -0.2710996066404446 * x2 + -0.04262102751392881 * p1 + 0.004790427082252834 * p2 + 0.48265534804226046 * x1^2 + -0.5064660321645915 * x1 * x2 + 0.37404344140014784 * x1 * p1 + 0.12043754302436768 * x1 * p2 + 0.8027897367455041 * x2^2 + 0.1190023461930393 * x2 * p1 + 0.19084013102156372 * x2 * p2 + -0.2652720480291325 * p1^2 + -0.35917950293902456 * p1 * p2 + -0.29272440492406493 * p2^2
\ No newline at end of file
  • Master should probably be updated, as the diff does not seem significant
    Test97_smlp_toy_num_resp_mult.txt
cd $(git rev-parse --show-toplevel)/regr_smlp/code
diff Test97_smlp_toy_num_resp_mult.txt ../master/
252c252
< smlp_logger - INFO - Model operator counts for y2: {'add': 100, 'mul': 715, 'const': 2547, 'ite': 305, 'and': 408, 'prop': 713, 'sub': 713, 'var': 713}
---
> smlp_logger - INFO - Model operator counts for y2: {'add': 100, 'mul': 716, 'const': 2550, 'ite': 305, 'and': 409, 'prop': 714, 'sub': 714, 'var': 714}

With all the suggestions above implemented, the regression should pass.

@mdmitry1

mdmitry1 commented Jun 6, 2026

Collaborator

There is an issue inherited from ./smlp_regr.py:
New model files are removed from the output directory.

For example, run the commands below:

grep -h content /tmp/pytest-of-${USER}/pytest-current/*/*/.meta/test_log.txt | grep test110
test110_model_poly_sklearn_formula.txt Failed -> content diff
find /tmp/pytest-of-${USER}/pytest-current/ -name test110_model_poly_sklearn_formula.txt | wc -l
0

Proposed fix is to comment out line 410 in lib.py:

git diff $(git rev-parse --show-toplevel)/regr_smlp/code/lib.py
diff --git a/regr_smlp/code/lib.py b/regr_smlp/code/lib.py
index 0e111b6..4d468d9 100644
--- a/regr_smlp/code/lib.py
+++ b/regr_smlp/code/lib.py
@@ -407,7 +407,7 @@ def _check_outputs(test_id, smlp_args, stdout, stderr, regrdir, output_path):
                                                answer = user_input
                        if model_file:
                                master_files.remove(file_name)
-                               os.remove(new_file)
+                               #os.remove(new_file)
                                if file in master_files:
                                        master_files.remove(file)
                        else:

@mdmitry1

mdmitry1 commented Jun 7, 2026

Collaborator

Implemented all proposed changes:

git branch --show-current && git log -1 --oneline
regr-pytest_mdmitry1
43ff775 (HEAD -> regr-pytest_mdmitry1, origin/regr-pytest_mdmitry1) Proposed fixes for PR #51 'pytest migration'

Wheel:

=== 1. Run 27087972698 [regr-pytest_mdmitry1]: Build ===
  attempt 1 [by: mdmitry1] [started: 2026-06-07 11:56:50 IDT] [finished: 2026-06-07 12:10:34 IDT]:
    smlptech-1.2.1rc11.dev40+g43ff77542-cp311-cp311-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl

Regression passed
Log file: run_smlp_regression_pytest.log
Results: pytest-23.tar.gz

@mdmitry1

mdmitry1 commented Jun 7, 2026

Collaborator

One more fix: bc6ffc6 .
Now all tests that pass with ./smlp_regr.py also pass with pytest:

==== 186 passed, 36 skipped, 12 xfailed, 39 warnings in 1569.03s (0:26:09) =====

Expected log
Expected results

@mdmitry1

mdmitry1 commented Jun 8, 2026

Collaborator

Added support for compressed and gzipped data files to the legacy and pytest regressions in a939513.

==== 188 passed, 34 skipped, 12 xfailed, 37 warnings in 1694.04s (0:28:14) =====

Expected log
Expected results


4 participants