Current behavior:
Recently, I have been working on evaluating essential genes, and I found issues with the current evaluation workflow around `estimateEssentialGenes` (the same workflow is used in the automated tasks on GitHub). This is what I ran:
ihuman = readYAMLmodel('model/Human-GEM.yml');
taskStruct = parseTaskList('data/metabolicTasks/metabolicTasks_Essential.txt');
[eGenes, INIT_output] = estimateEssentialGenes(ihuman, 'Hart2015_RNAseq.txt', taskStruct);
results = evaluateHart2015Essentiality(eGenes);
The resulting context-specific models look wrong: all five are identical and contain very little content, as shown below.

| cell_type | genes | rxns | mets |
|-----------|-------|------|------|
| DLD1      | 475   | 250  | 339  |
| GBM       | 475   | 250  | 339  |
| HCT116    | 475   | 250  | 339  |
| HELA      | 475   | 250  | 339  |
| RPE1      | 475   | 250  | 339  |

Further investigation showed that the cause is the fourth parameter of `estimateEssentialGenes`, `useGeneSymbol`, which defaults to `true` and converts the genes in the template model to gene symbols. However, the genes in `Hart2015_RNAseq.txt` are in the 'ENSG0000' (Ensembl ID) format, so no genes match and no gene expression is detected by default.
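The mismatch can be illustrated with a minimal, self-contained sketch. The gene lists below are toy stand-ins (assumptions for illustration, not read from the actual model or from `Hart2015_RNAseq.txt`); the point is only that intersecting symbol-style IDs with Ensembl-style IDs yields nothing.

```matlab
% Toy stand-ins (hypothetical): in the real workflow these would be the
% template model genes after symbol conversion and the gene column of
% the RNA-seq file.
modelGenes = {'TP53'; 'GAPDH'; 'ACTB'};               % gene symbols
dataGenes  = {'ENSG00000141510'; 'ENSG00000111640'};  % Ensembl gene IDs

% Genes present in both lists; empty when the ID formats differ.
shared = intersect(modelGenes, dataGenes);

% Simple format check: true only if every ID starts with 'ENSG'.
isEnsembl = @(ids) all(startsWith(ids, 'ENSG'));

fprintf('%d of %d model genes found in the expression data\n', ...
    numel(shared), numel(modelGenes));
if isempty(shared) && isEnsembl(dataGenes) && ~isEnsembl(modelGenes)
    disp('Model genes look like symbols, but the data uses Ensembl IDs.');
end
```

A check of this kind before model extraction would make the problem visible immediately instead of silently producing near-empty models.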
So I manually set the fourth parameter to `false`, and the content of the resulting models looks much more normal:

| cell_type | genes | rxns | mets |
|-----------|-------|------|------|
| DLD1      | 1734  | 6870 | 5680 |
| GBM       | 1731  | 6265 | 4986 |
| HCT116    | 1772  | 6888 | 5649 |
| HELA      | 1743  | 6902 | 5665 |
| RPE1      | 1669  | 6097 | 4845 |

However, the essential gene evaluation then returned all zeros, because the genes in `Hart2015_TableS2.xlsx` (the experimental results) are in gene symbol format. So I believe that, after the models are generated, all genes in the models (including the template model) need to be converted to gene symbols before the essential gene evaluation is performed.
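A rough sketch of the conversion step I have in mind is below. It is not the actual implementation: the ID-to-symbol table, the variable names, and the toy gene lists are all assumptions for illustration, and the real output structure of `estimateEssentialGenes` is not reproduced here.

```matlab
% Hypothetical Ensembl-to-symbol lookup; in practice this would come from
% the model's gene annotation.
ensemblIds = {'ENSG00000141510'; 'ENSG00000111640'; 'ENSG00000075624'};
symbols    = {'TP53'; 'GAPDH'; 'ACTB'};
id2symbol  = containers.Map(ensemblIds, symbols);

% Toy prediction in Ensembl format; the last entry is a made-up ID with
% no symbol, to show how unmapped genes are handled.
predictedEssential = {'ENSG00000141510'; 'ENSG00000075624'; 'ENSG00000999999'};

% Convert only the IDs that have a known symbol.
hasSymbol        = isKey(id2symbol, predictedEssential);
predictedSymbols = values(id2symbol, predictedEssential(hasSymbol));

% Toy experimental list in symbol format, as in Hart2015_TableS2.xlsx.
experimentalEssential = {'TP53'; 'GAPDH'};

% Now both lists share one ID format, so the comparison is meaningful.
truePositives = intersect(predictedSymbols, experimentalEssential);
fprintf('%d true positives after conversion\n', numel(truePositives));
```

With the conversion applied after model generation, the expression matching can stay on Ensembl IDs while the comparison against the experimental data uses gene symbols.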