SemiBin2 sometimes fails when --environment global is combined with --sequencing-type long_read and --depth-metabat2.
It succeeds with the short_read sequencing type, or when the BAM file is passed via --input-bam instead of --depth-metabat2 (still with long_read).
Full command:
docker run -v $PWD:/data quay.io/biocontainers/semibin:2.2.1--pyhdfd78af_0 SemiBin2 single_easy_bin --threads 60 -o /data/semibin2_phbv_b2 --environment global --depth-metabat2 /data/PHBV_B2_lib4_hac_contigdepths.tsv --input-fasta /data/PHBV_B2_lib4_hac_metaMDBG_contigs.fasta --sequencing-type long_read
Complete Stacktrace:
2026-02-02 12:40:24 ffc638d94b92 SemiBin2[1] INFO Running SemiBin2 version 2.2.1
2026-02-02 12:40:24 ffc638d94b92 SemiBin2[1] INFO Binning for long_read
2026-02-02 12:40:35 ffc638d94b92 SemiBin2[1] WARNING Did not detect GPU or CUDA was not installed/supported, using CPU.
2026-02-02 12:40:35 ffc638d94b92 SemiBin2[1] INFO Generating training data...
2026-02-02 12:40:54 ffc638d94b92 SemiBin2[1] INFO Start binning.
/usr/local/lib/python3.13/site-packages/SemiBin/long_read_cluster.py:78: RuntimeWarning: divide by zero encountered in log
embedding_new = np.concatenate((embedding, np.log(depth)), axis=1)
2026-02-02 12:40:55 ffc638d94b92 SemiBin2[1] INFO Running naive ORF finder
Traceback (most recent call last):
File "/usr/local/bin/SemiBin2", line 10, in <module>
sys.exit(main2())
~~~~~^^
File "/usr/local/lib/python3.13/site-packages/SemiBin/main.py", line 1632, in main2
single_easy_binning(
~~~~~~~~~~~~~~~~~~~^
logger,
^^^^^^^
...<3 lines>...
contig_dict,
^^^^^^^^^^^^
device)
^^^^^^^
File "/usr/local/lib/python3.13/site-packages/SemiBin/main.py", line 1319, in single_easy_binning
binning_long(**binning_kwargs)
~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.13/site-packages/SemiBin/main.py", line 1195, in binning_long
cluster_long_read(logger,
~~~~~~~~~~~~~~~~~^^^^^^^^
model,
^^^^^^
...<8 lines>...
args=args,
^^^^^^^^^^
)
^
File "/usr/local/lib/python3.13/site-packages/SemiBin/long_read_cluster.py", line 103, in cluster_long_read
dist_matrix = kneighbors_graph(
embedding_new,
...<2 lines>...
p=2,
n_jobs=args.num_process)
File "/usr/local/lib/python3.13/site-packages/sklearn/utils/_param_validation.py", line 218, in wrapper
return func(*args, **kwargs)
File "/usr/local/lib/python3.13/site-packages/sklearn/neighbors/_graph.py", line 142, in kneighbors_graph
).fit(X)
~~~^^^
File "/usr/local/lib/python3.13/site-packages/sklearn/base.py", line 1365, in wrapper
return fit_method(estimator, *args, **kwargs)
File "/usr/local/lib/python3.13/site-packages/sklearn/neighbors/_unsupervised.py", line 179, in fit
return self._fit(X)
~~~~~~~~~^^^
File "/usr/local/lib/python3.13/site-packages/sklearn/neighbors/_base.py", line 526, in _fit
X = validate_data(
self,
...<3 lines>...
order="C",
)
File "/usr/local/lib/python3.13/site-packages/sklearn/utils/validation.py", line 2954, in validate_data
out = check_array(X, input_name="X", **check_params)
File "/usr/local/lib/python3.13/site-packages/sklearn/utils/validation.py", line 1105, in check_array
_assert_all_finite(
~~~~~~~~~~~~~~~~~~^
array,
^^^^^^
...<2 lines>...
allow_nan=ensure_all_finite == "allow-nan",
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
)
^
File "/usr/local/lib/python3.13/site-packages/sklearn/utils/validation.py", line 120, in _assert_all_finite
_assert_all_finite_element_wise(
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
X,
^^
...<4 lines>...
input_name=input_name,
^^^^^^^^^^^^^^^^^^^^^^
)
^
File "/usr/local/lib/python3.13/site-packages/sklearn/utils/validation.py", line 169, in _assert_all_finite_element_wise
raise ValueError(msg_err)
ValueError: Input X contains infinity or a value too large for dtype('float32').
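The RuntimeWarning at long_read_cluster.py:78 suggests that np.log(depth) is being applied to at least one depth value of exactly zero, which yields -inf and is then rejected by scikit-learn's finiteness check. A minimal diagnostic sketch for checking whether the depth file contains such contigs is below. It assumes the usual jgi_summarize_bam_contig_depths layout (contigName, contigLen, totalAvgDepth, then one depth column and one "-var" column per sample). The helper name zero_depth_contigs is hypothetical and not part of SemiBin2, and which column SemiBin2 actually reads is an assumption here.

```python
import csv
import io


def zero_depth_contigs(handle):
    """Return names of contigs with a zero in any per-sample depth column.

    Assumes a MetaBAT2-style depth table: contigName, contigLen,
    totalAvgDepth, followed by (<sample>, <sample>-var) column pairs.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    # Per-sample depth columns start at index 3; skip the variance columns.
    depth_cols = [
        i for i, name in enumerate(header)
        if i >= 3 and not name.endswith("-var")
    ]
    flagged = []
    for row in reader:
        if not row:
            continue  # tolerate blank lines
        if any(float(row[i]) == 0.0 for i in depth_cols):
            flagged.append(row[0])
    return flagged


# Tiny made-up table, for illustration only (not data from this run).
example = (
    "contigName\tcontigLen\ttotalAvgDepth\tsample.bam\tsample.bam-var\n"
    "ctg1\t5000\t12.5\t12.5\t3.1\n"
    "ctg2\t4200\t0\t0\t0\n"
)
print(zero_depth_contigs(io.StringIO(example)))
```

On the real input this would be called as zero_depth_contigs(open("PHBV_B2_lib4_hac_contigdepths.tsv", newline="")). If it reports any contigs, that would be consistent with the divide-by-zero warning above, though it does not by itself confirm the cause.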