We are using SemiBin in our MAGs workflow: https://iwc.galaxyproject.org/workflow/mags-building-main/
We found a large difference between v2.1.0 and v2.0.2 in bin creation, both in the number of bins and in their quality: in the example below, v2.1.0 yields 16 bins versus 46 for v2.0.2, and the v2.1.0 bins are generally less complete.
Here is an example from a real-life analysis of the bee microbiome, with the bins assessed by CheckM2.
Any idea why this could be the case?
v2.1.0
Name Completeness Contamination Completeness_Model_Used Translation_Table_Used Coding_Density Contig_N50 Average_Gene_Length Genome_Size GC_Content Total_Coding_Sequences Total_Contigs Max_Contig_Length Additional_Notes
SemiBin_0 29.34 3.36 Neural Network (Specific Model) 11 0.865 3922 274.72700296735906 641692 0.42 674 154 23611 None
SemiBin_1 50.36 19.25 Neural Network (Specific Model) 11 0.831 4102 268.05308588692276 2238014 0.45 2317 530 26601 None
SemiBin_1024 16.65 4.62 Neural Network (Specific Model) 11 0.761 4297 219.95294117647057 1030336 0.3 1190 235 22552 None
SemiBin_1025 46.0 8.8 Neural Network (Specific Model) 11 0.881 4617 251.60856720827178 1157997 0.36 1354 262 16146 None
SemiBin_1026 53.4 10.87 Neural Network (Specific Model) 11 0.904 8172 302.56342412451363 1289043 0.37 1285 201 51180 None
SemiBin_2 10.53 0.07 Neural Network (Specific Model) 11 0.787 6811 225.45338208409507 468390 0.36 547 84 36178 None
SemiBin_4090 79.38 32.26 Neural Network (Specific Model) 11 0.868 11804 297.6619880349747 4464082 0.35 4346 544 130205 None
SemiBin_4091 33.39 0.62 Neural Network (Specific Model) 11 0.88 6398 275.74976744186046 1009154 0.38 1075 180 33958 None
SemiBin_4327 3.93 0.0 Neural Network (Specific Model) 11 0.776 239878 320.9846153846154 239878 0.52 195 1 239878 None
SemiBin_5113 20.99 0.71 Neural Network (Specific Model) 11 0.837 6513 275.096018735363 840777 0.41 854 138 80687 None
SemiBin_5114 45.57 0.55 Neural Network (Specific Model) 11 0.893 28063 278.17007874015746 592880 0.45 635 37 61370 None
SemiBin_6133 46.26 0.0 Neural Network (Specific Model) 11 0.885 11768 296.1252093802345 2395077 0.6 2388 266 33128 None
SemiBin_6135 31.31 3.13 Neural Network (Specific Model) 11 0.877 5849 269.575104727708 1537517 0.42 1671 263 113188 None
SemiBin_6136 37.3 0.0 Neural Network (Specific Model) 11 0.907 4375 280.7223032069971 1270666 0.42 1372 296 13692 None
SemiBin_6137 17.79 1.83 Neural Network (Specific Model) 11 0.863 4201 273.00318133616116 892578 0.56 943 217 16432 None
SemiBin_6138 32.29 1.2 Neural Network (Specific Model) 11 0.881 38873 292.14128595600675 2345787 0.54 2364 124 124190 None
v2.0.2
Name Completeness Contamination Completeness_Model_Used Translation_Table_Used Coding_Density Contig_N50 Average_Gene_Length Genome_Size GC_Content Total_Coding_Sequences Total_Contigs Max_Contig_Length Additional_Notes
SemiBin_0 23.79 0.12 Neural Network (Specific Model) 11 0.817 3258 251.88888888888889 398745 0.4 432 117 6680 None
SemiBin_1 52.09 5.87 Neural Network (Specific Model) 11 0.834 6426 283.44331133773244 1696173 0.46 1667 281 30922 None
SemiBin_11 67.74 0.33 Gradient Boost (General Model) 11 0.91 4546 280.95963933018464 2152417 0.42 2329 489 13692 None
SemiBin_12 65.58 3.54 Neural Network (Specific Model) 11 0.873 5107 280.06475300400535 1439659 0.43 1498 297 17316 None
SemiBin_13 87.69 5.74 Neural Network (Specific Model) 11 0.864 46666 304.42977420589364 2757890 0.34 2613 86 195230 None
SemiBin_15 12.79 2.58 Neural Network (Specific Model) 11 0.711 3512 224.34619883040935 808039 0.42 855 223 13385 None
SemiBin_16 10.88 0.44 Neural Network (Specific Model) 11 0.768 4943 239.263862332696 487844 0.29 523 102 19894 None
SemiBin_19 12.3 1.7 Neural Network (Specific Model) 11 0.857 22925 261.6875687568757 830477 0.5 909 56 74194 None
SemiBin_2 20.56 0.1 Neural Network (Specific Model) 11 0.807 6054 260.4830028328612 681860 0.39 706 119 32058 None
SemiBin_20 13.6 0.85 Neural Network (Specific Model) 11 0.836 3764 263.53846153846155 1005195 0.56 1066 253 33621 None
SemiBin_21 100.0 0.4 Neural Network (Specific Model) 11 0.888 52033 316.9252607184241 5536488 0.56 5178 167 176966 None
SemiBin_22 53.62 3.99 Neural Network (Specific Model) 11 0.849 4708 269.8386225523295 1409261 0.42 1481 307 23489 None
SemiBin_23 37.38 0.0 Neural Network (Specific Model) 11 0.88 3904 262.1492042440318 1344765 0.43 1508 341 10874 None
SemiBin_233 6.91 0.08 Neural Network (Specific Model) 11 0.761 239878 306.86 479658 0.51 400 4 239878 None
SemiBin_24 81.74 1.94 Neural Network (Specific Model) 11 0.887 6522 275.62341078474356 2124102 0.38 2281 367 33958 None
SemiBin_29 96.74 1.68 Neural Network (Specific Model) 11 0.881 12196 293.2869528441531 4714596 0.6 4729 491 79088 None
SemiBin_3 80.58 7.3 Neural Network (Specific Model) 11 0.903 7158 293.48175725986596 1307590 0.37 1343 214 24699 None
SemiBin_30 28.56 0.04 Neural Network (Specific Model) 11 0.832 4046 260.3989417989418 885576 0.46 945 220 11911 None
SemiBin_31 86.54 3.07 Neural Network (Specific Model) 11 0.901 17391 292.4266284896206 1358660 0.34 1397 127 64643 None
SemiBin_32 22.89 0.05 Neural Network (Specific Model) 11 0.824 5774 279.81328320802004 811680 0.41 798 148 30462 None
SemiBin_33 7.85 0.0 Neural Network (Specific Model) 11 0.75 4030 247.1933962264151 209245 0.3 212 55 5922 None
SemiBin_36 23.99 0.23 Neural Network (Specific Model) 11 0.913 5113 297.3452380952381 491923 0.37 504 99 31381 None
SemiBin_37 5.63 0.02 Neural Network (Specific Model) 11 0.863 113197 290.98623853211006 220395 0.35 218 13 113197 None
SemiBin_39 61.87 3.91 Neural Network (Specific Model) 11 0.912 4470 276.1839622641509 961775 0.37 1060 213 22627 None
SemiBin_4 29.73 0.11 Neural Network (Specific Model) 11 0.902 3547 256.61322081575247 605910 0.37 711 162 10396 None
SemiBin_42 22.28 1.33 Neural Network (Specific Model) 11 0.91 4408 296.7756183745583 553209 0.36 566 126 12478 None
SemiBin_44 100.0 0.79 Neural Network (Specific Model) 11 0.861 53399 318.0313479623824 3529330 0.39 3190 110 187436 None
SemiBin_45 36.59 4.63 Neural Network (Specific Model) 11 0.883 3979 289.0352526439483 833778 0.6 851 202 28432 None
SemiBin_5 72.88 29.28 Neural Network (Specific Model) 11 0.84 4332 278.1305147058824 2696139 0.46 2720 614 26601 None
SemiBin_50 6.25 0.0 Neural Network (Specific Model) 11 0.835 4959 274.4099099099099 218478 0.44 222 49 14929 None
SemiBin_51 6.58 0.14 Neural Network (Specific Model) 11 0.865 19562 244.65814696485623 264835 0.4 313 24 34399 None
SemiBin_52 4.58 0.02 Neural Network (Specific Model) 11 0.686 4255 221.66666666666666 264129 0.39 273 65 9536 None
SemiBin_53 9.26 0.12 Neural Network (Specific Model) 11 0.744 5827 241.62303664921467 371780 0.28 382 71 18159 None
SemiBin_54 14.45 0.13 Neural Network (Specific Model) 11 0.91 4727 316.46264367816093 362811 0.36 348 77 16911 None
SemiBin_55 49.85 0.59 Neural Network (Specific Model) 11 0.875 5967 275.24599708879185 1294529 0.35 1374 241 22855 None
SemiBin_56 94.46 15.69 Neural Network (Specific Model) 11 0.867 38965 320.2335025380711 2835660 0.34 2561 149 168383 None
SemiBin_57 10.63 0.35 Neural Network (Specific Model) 11 0.862 6291 270.52452025586354 441008 0.52 469 74 19904 None
SemiBin_6 47.36 13.44 Neural Network (Specific Model) 11 0.861 5540 270.496338028169 1671343 0.33 1775 315 57389 None
SemiBin_60 7.23 0.08 Neural Network (Specific Model) 11 0.898 8503 249.22 332324 0.39 400 51 35261 None
SemiBin_69 62.21 3.3 Neural Network (Specific Model) 11 0.888 5392 260.7740885416667 1350459 0.36 1536 265 19350 None
SemiBin_7 87.95 2.29 Neural Network (Specific Model) 11 0.897 25417 287.10086004691163 1225954 0.46 1279 84 80957 None
SemiBin_70 7.88 0.02 Neural Network (Specific Model) 11 0.861 7058 259.63141993957703 298880 0.36 331 46 41450 None
SemiBin_71 11.3 0.35 Neural Network (Specific Model) 11 0.877 11107 259.26720647773277 655290 0.52 741 77 34926 None
SemiBin_77 9.55 0.21 Neural Network (Specific Model) 11 0.881 10779 317.185303514377 337566 0.36 313 35 147518 None
SemiBin_9 33.38 9.66 Neural Network (Specific Model) 11 0.862 6382 280.41838134430725 1421971 0.33 1458 247 33203 None
SemiBin_91 5.58 0.03 Neural Network (Specific Model) 11 0.806 3565 245.2053872053872 270421 0.32 297 75 8490 None
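To make the comparison between the two reports easier to reproduce, here is a minimal sketch that counts bins per quality tier from a CheckM2 report. It assumes the report is the tab-separated `quality_report.tsv` that CheckM2 writes, with `Completeness` and `Contamination` columns as in the tables above. The function name `summarize` and the tier cutoffs (MIMAG-style completeness and contamination thresholds, ignoring the rRNA/tRNA criteria) are my own choices for illustration, not something SemiBin or CheckM2 defines.

```python
import csv
import io


def summarize(report_text):
    """Count bins per quality tier in a CheckM2 report given as TSV text.

    Tier cutoffs are an assumption (MIMAG-style, without rRNA/tRNA checks):
      high:   completeness > 90 and contamination < 5
      medium: completeness >= 50 and contamination < 10
      low:    everything else
    """
    counts = {"total": 0, "high": 0, "medium": 0, "low": 0}
    reader = csv.DictReader(io.StringIO(report_text), delimiter="\t")
    for row in reader:
        completeness = float(row["Completeness"])
        contamination = float(row["Contamination"])
        counts["total"] += 1
        if completeness > 90 and contamination < 5:
            counts["high"] += 1
        elif completeness >= 50 and contamination < 10:
            counts["medium"] += 1
        else:
            counts["low"] += 1
    return counts


# Three rows taken from the v2.0.2 table above, reduced to the columns used.
sample = (
    "Name\tCompleteness\tContamination\n"
    "SemiBin_0\t23.79\t0.12\n"
    "SemiBin_1\t52.09\t5.87\n"
    "SemiBin_21\t100.0\t0.4\n"
)
print(summarize(sample))  # {'total': 3, 'high': 1, 'medium': 1, 'low': 1}
```

For the full reports, the same function can be called on the contents of each version's `quality_report.tsv` (for example `summarize(open(path).read())`, where `path` is wherever each run wrote its report) and the two dictionaries compared side by side.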