Substantially fewer bins found with v2.1.0 vs v2.0.2 #212

@paulzierep

We are using SemiBin in our MAGs workflow: https://iwc.galaxyproject.org/workflow/mags-building-main/
We found a large difference between v2.1.0 and v2.0.2 in both the number and the quality of the bins produced (16 bins vs. 46 in the example below).
The example comes from a real-life analysis of the bee microbiome, with bin quality assessed using CheckM2.
Any idea what could be causing this?

v2.1.0

Name	Completeness	Contamination	Completeness_Model_Used	Translation_Table_Used	Coding_Density	Contig_N50	Average_Gene_Length	Genome_Size	GC_Content	Total_Coding_Sequences	Total_Contigs	Max_Contig_Length	Additional_Notes
SemiBin_0	29.34	3.36	Neural Network (Specific Model)	11	0.865	3922	274.72700296735906	641692	0.42	674	154	23611	None
SemiBin_1	50.36	19.25	Neural Network (Specific Model)	11	0.831	4102	268.05308588692276	2238014	0.45	2317	530	26601	None
SemiBin_1024	16.65	4.62	Neural Network (Specific Model)	11	0.761	4297	219.95294117647057	1030336	0.3	1190	235	22552	None
SemiBin_1025	46.0	8.8	Neural Network (Specific Model)	11	0.881	4617	251.60856720827178	1157997	0.36	1354	262	16146	None
SemiBin_1026	53.4	10.87	Neural Network (Specific Model)	11	0.904	8172	302.56342412451363	1289043	0.37	1285	201	51180	None
SemiBin_2	10.53	0.07	Neural Network (Specific Model)	11	0.787	6811	225.45338208409507	468390	0.36	547	84	36178	None
SemiBin_4090	79.38	32.26	Neural Network (Specific Model)	11	0.868	11804	297.6619880349747	4464082	0.35	4346	544	130205	None
SemiBin_4091	33.39	0.62	Neural Network (Specific Model)	11	0.88	6398	275.74976744186046	1009154	0.38	1075	180	33958	None
SemiBin_4327	3.93	0.0	Neural Network (Specific Model)	11	0.776	239878	320.9846153846154	239878	0.52	195	1	239878	None
SemiBin_5113	20.99	0.71	Neural Network (Specific Model)	11	0.837	6513	275.096018735363	840777	0.41	854	138	80687	None
SemiBin_5114	45.57	0.55	Neural Network (Specific Model)	11	0.893	28063	278.17007874015746	592880	0.45	635	37	61370	None
SemiBin_6133	46.26	0.0	Neural Network (Specific Model)	11	0.885	11768	296.1252093802345	2395077	0.6	2388	266	33128	None
SemiBin_6135	31.31	3.13	Neural Network (Specific Model)	11	0.877	5849	269.575104727708	1537517	0.42	1671	263	113188	None
SemiBin_6136	37.3	0.0	Neural Network (Specific Model)	11	0.907	4375	280.7223032069971	1270666	0.42	1372	296	13692	None
SemiBin_6137	17.79	1.83	Neural Network (Specific Model)	11	0.863	4201	273.00318133616116	892578	0.56	943	217	16432	None
SemiBin_6138	32.29	1.2	Neural Network (Specific Model)	11	0.881	38873	292.14128595600675	2345787	0.54	2364	124	124190	None

v2.0.2

Name	Completeness	Contamination	Completeness_Model_Used	Translation_Table_Used	Coding_Density	Contig_N50	Average_Gene_Length	Genome_Size	GC_Content	Total_Coding_Sequences	Total_Contigs	Max_Contig_Length	Additional_Notes
SemiBin_0	23.79	0.12	Neural Network (Specific Model)	11	0.817	3258	251.88888888888889	398745	0.4	432	117	6680	None
SemiBin_1	52.09	5.87	Neural Network (Specific Model)	11	0.834	6426	283.44331133773244	1696173	0.46	1667	281	30922	None
SemiBin_11	67.74	0.33	Gradient Boost (General Model)	11	0.91	4546	280.95963933018464	2152417	0.42	2329	489	13692	None
SemiBin_12	65.58	3.54	Neural Network (Specific Model)	11	0.873	5107	280.06475300400535	1439659	0.43	1498	297	17316	None
SemiBin_13	87.69	5.74	Neural Network (Specific Model)	11	0.864	46666	304.42977420589364	2757890	0.34	2613	86	195230	None
SemiBin_15	12.79	2.58	Neural Network (Specific Model)	11	0.711	3512	224.34619883040935	808039	0.42	855	223	13385	None
SemiBin_16	10.88	0.44	Neural Network (Specific Model)	11	0.768	4943	239.263862332696	487844	0.29	523	102	19894	None
SemiBin_19	12.3	1.7	Neural Network (Specific Model)	11	0.857	22925	261.6875687568757	830477	0.5	909	56	74194	None
SemiBin_2	20.56	0.1	Neural Network (Specific Model)	11	0.807	6054	260.4830028328612	681860	0.39	706	119	32058	None
SemiBin_20	13.6	0.85	Neural Network (Specific Model)	11	0.836	3764	263.53846153846155	1005195	0.56	1066	253	33621	None
SemiBin_21	100.0	0.4	Neural Network (Specific Model)	11	0.888	52033	316.9252607184241	5536488	0.56	5178	167	176966	None
SemiBin_22	53.62	3.99	Neural Network (Specific Model)	11	0.849	4708	269.8386225523295	1409261	0.42	1481	307	23489	None
SemiBin_23	37.38	0.0	Neural Network (Specific Model)	11	0.88	3904	262.1492042440318	1344765	0.43	1508	341	10874	None
SemiBin_233	6.91	0.08	Neural Network (Specific Model)	11	0.761	239878	306.86	479658	0.51	400	4	239878	None
SemiBin_24	81.74	1.94	Neural Network (Specific Model)	11	0.887	6522	275.62341078474356	2124102	0.38	2281	367	33958	None
SemiBin_29	96.74	1.68	Neural Network (Specific Model)	11	0.881	12196	293.2869528441531	4714596	0.6	4729	491	79088	None
SemiBin_3	80.58	7.3	Neural Network (Specific Model)	11	0.903	7158	293.48175725986596	1307590	0.37	1343	214	24699	None
SemiBin_30	28.56	0.04	Neural Network (Specific Model)	11	0.832	4046	260.3989417989418	885576	0.46	945	220	11911	None
SemiBin_31	86.54	3.07	Neural Network (Specific Model)	11	0.901	17391	292.4266284896206	1358660	0.34	1397	127	64643	None
SemiBin_32	22.89	0.05	Neural Network (Specific Model)	11	0.824	5774	279.81328320802004	811680	0.41	798	148	30462	None
SemiBin_33	7.85	0.0	Neural Network (Specific Model)	11	0.75	4030	247.1933962264151	209245	0.3	212	55	5922	None
SemiBin_36	23.99	0.23	Neural Network (Specific Model)	11	0.913	5113	297.3452380952381	491923	0.37	504	99	31381	None
SemiBin_37	5.63	0.02	Neural Network (Specific Model)	11	0.863	113197	290.98623853211006	220395	0.35	218	13	113197	None
SemiBin_39	61.87	3.91	Neural Network (Specific Model)	11	0.912	4470	276.1839622641509	961775	0.37	1060	213	22627	None
SemiBin_4	29.73	0.11	Neural Network (Specific Model)	11	0.902	3547	256.61322081575247	605910	0.37	711	162	10396	None
SemiBin_42	22.28	1.33	Neural Network (Specific Model)	11	0.91	4408	296.7756183745583	553209	0.36	566	126	12478	None
SemiBin_44	100.0	0.79	Neural Network (Specific Model)	11	0.861	53399	318.0313479623824	3529330	0.39	3190	110	187436	None
SemiBin_45	36.59	4.63	Neural Network (Specific Model)	11	0.883	3979	289.0352526439483	833778	0.6	851	202	28432	None
SemiBin_5	72.88	29.28	Neural Network (Specific Model)	11	0.84	4332	278.1305147058824	2696139	0.46	2720	614	26601	None
SemiBin_50	6.25	0.0	Neural Network (Specific Model)	11	0.835	4959	274.4099099099099	218478	0.44	222	49	14929	None
SemiBin_51	6.58	0.14	Neural Network (Specific Model)	11	0.865	19562	244.65814696485623	264835	0.4	313	24	34399	None
SemiBin_52	4.58	0.02	Neural Network (Specific Model)	11	0.686	4255	221.66666666666666	264129	0.39	273	65	9536	None
SemiBin_53	9.26	0.12	Neural Network (Specific Model)	11	0.744	5827	241.62303664921467	371780	0.28	382	71	18159	None
SemiBin_54	14.45	0.13	Neural Network (Specific Model)	11	0.91	4727	316.46264367816093	362811	0.36	348	77	16911	None
SemiBin_55	49.85	0.59	Neural Network (Specific Model)	11	0.875	5967	275.24599708879185	1294529	0.35	1374	241	22855	None
SemiBin_56	94.46	15.69	Neural Network (Specific Model)	11	0.867	38965	320.2335025380711	2835660	0.34	2561	149	168383	None
SemiBin_57	10.63	0.35	Neural Network (Specific Model)	11	0.862	6291	270.52452025586354	441008	0.52	469	74	19904	None
SemiBin_6	47.36	13.44	Neural Network (Specific Model)	11	0.861	5540	270.496338028169	1671343	0.33	1775	315	57389	None
SemiBin_60	7.23	0.08	Neural Network (Specific Model)	11	0.898	8503	249.22	332324	0.39	400	51	35261	None
SemiBin_69	62.21	3.3	Neural Network (Specific Model)	11	0.888	5392	260.7740885416667	1350459	0.36	1536	265	19350	None
SemiBin_7	87.95	2.29	Neural Network (Specific Model)	11	0.897	25417	287.10086004691163	1225954	0.46	1279	84	80957	None
SemiBin_70	7.88	0.02	Neural Network (Specific Model)	11	0.861	7058	259.63141993957703	298880	0.36	331	46	41450	None
SemiBin_71	11.3	0.35	Neural Network (Specific Model)	11	0.877	11107	259.26720647773277	655290	0.52	741	77	34926	None
SemiBin_77	9.55	0.21	Neural Network (Specific Model)	11	0.881	10779	317.185303514377	337566	0.36	313	35	147518	None
SemiBin_9	33.38	9.66	Neural Network (Specific Model)	11	0.862	6382	280.41838134430725	1421971	0.33	1458	247	33203	None
SemiBin_91	5.58	0.03	Neural Network (Specific Model)	11	0.806	3565	245.2053872053872	270421	0.32	297	75	8490	None
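
To make the comparison easier to reproduce, here is a minimal sketch that counts the bins in a CheckM2 report and lists those passing a quality cutoff. The function name `summarize_checkm2` and the default thresholds (completeness >= 50, contamination < 10, a commonly used medium-quality cutoff) are assumptions for illustration and not part of SemiBin or CheckM2. The sample rows are copied from the v2.0.2 table above and trimmed to the three columns used.

```python
import csv
import io


def summarize_checkm2(tsv_text, min_completeness=50.0, max_contamination=10.0):
    """Count bins in a CheckM2 report and list those passing the thresholds.

    Hypothetical helper: the thresholds are illustrative defaults,
    not values taken from the issue.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    total = 0
    passing = []
    for row in reader:
        total += 1
        completeness = float(row["Completeness"])
        contamination = float(row["Contamination"])
        if completeness >= min_completeness and contamination < max_contamination:
            passing.append(row["Name"])
    return {"total_bins": total, "passing_bins": passing}


# Three rows from the v2.0.2 table, reduced to the columns the helper reads.
sample = (
    "Name\tCompleteness\tContamination\n"
    "SemiBin_0\t23.79\t0.12\n"
    "SemiBin_21\t100.0\t0.4\n"
    "SemiBin_56\t94.46\t15.69\n"
)

summary = summarize_checkm2(sample)
print(summary)
```

Running the same function on the full report of each version (read from the file as text) gives a per-version bin count and list of passing bins that can be compared directly.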
