
Difficulty Matching Code Results with the Leaderboard #118

Description

@ZEXUANW

I find that my code run produces more metrics (and more informative ones) than the leaderboard reports, and I am wondering how to reconcile the two.

Could you please advise on how to match the results produced by the code to the leaderboard? Each metric appears to have three versions in the code, whereas only one value is reported on the leaderboard. The full output from the code is below:

```yaml
- dataset_id: op
  date_created: 17-12-2025
  file_size: 23048
  method_id: pearson_corr
  metric_ids:
    - sem_grn
    - sem_hvg
    - sem
  metric_values:
    - '0.0987688622048502'
    - '0.07334421631793887'
    - '0.08605653926139453'
- dataset_id: op
  date_created: 17-12-2025
  file_size: 23048
  method_id: pearson_corr
  metric_ids:
    - tfb_precision
    - tfb_recall
    - tfb_f1
  metric_values:
    - '0.04776565481165778'
    - '0.02756733797997705'
    - '0.03495870537547806'
- dataset_id: op
  date_created: 17-12-2025
  file_size: 23048
  method_id: pearson_corr
  metric_ids:
    - gs_precision
    - gs_recall
    - gs_f1
    - gs_n_active
  metric_values:
    - '0.70174168789988'
    - '0.3279813455378404'
    - '0.40692143797401853'
    - '348.6666666666667'
- dataset_id: op
  date_created: 17-12-2025
  file_size: 23048
  method_id: pearson_corr
  metric_ids:
    - r2_raw
    - r_precision
    - r_recall
    - r_f1
  metric_values:
    - '0.6934894982541678'
    - '0.49783144316145583'
    - '0.273573899861934'
    - '0.35310538256265267'
- dataset_id: op
  date_created: 17-12-2025
  file_size: 23048
  method_id: pearson_corr
  metric_ids:
    - vc_grn
    - vc_hvg
    - vc
  metric_values:
    - '0.6184723973274231'
    - '0.6489825248718262'
    - '0.6337274610996246'
- dataset_id: op
  date_created: 17-12-2025
  file_size: 23048
  method_id: pearson_corr
  metric_ids:
    - rc_tf_act
  metric_values:
    - '0.35853068099355406'
```
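
One pattern I noticed while trying to reconcile the three versions (my own arithmetic against the dump above, not documented benchmark behavior): the unsuffixed metric appears to be the plain mean of its `_grn` and `_hvg` variants, so the leaderboard may be reporting that combined value. A minimal check:

```python
# Sanity check against the dump above: does the unsuffixed metric equal the
# plain mean of its _grn and _hvg variants? (my guess, not documented behavior)
sem_grn, sem_hvg, sem = 0.0987688622048502, 0.07334421631793887, 0.08605653926139453
vc_grn, vc_hvg, vc = 0.6184723973274231, 0.6489825248718262, 0.6337274610996246

assert abs((sem_grn + sem_hvg) / 2 - sem) < 1e-12
assert abs((vc_grn + vc_hvg) / 2 - vc) < 1e-12
print("unsuffixed metric == mean(_grn, _hvg) for sem and vc")
```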

In addition, we replicated the Pearson correlation–based evaluation. However, although TF binding precision, recall, and F1 score are intended to reflect transcription factor binding relevance, none of the reproduced values match the leaderboard result (reported as 0.09). Specifically, we obtained TF binding precision = 0.0478, recall = 0.0276, and F1 score = 0.0350.
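
For what it's worth, our reproduced numbers are at least internally consistent: the F1 we obtained is exactly the harmonic mean of our precision and recall, so the gap to 0.09 is not an aggregation error on our side. A minimal check:

```python
# Cross-check: F1 should be the harmonic mean of precision and recall.
# Values are our reproduced TF-binding metrics from the run above.
precision = 0.04776565481165778
recall = 0.02756733797997705

f1 = 2 * precision * recall / (precision + recall)
assert abs(f1 - 0.03495870537547806) < 1e-12
print(f"tfb_f1 = {f1:.17f}")  # consistent with our reported value
```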
