Shooting the Moon
alizat/README.md

Lead/Staff Data Scientist — ML Systems, NLP/LLM, RecSys

  • 🔭 Current Role: Chief Data Scientist at Synapse Analytics
  • 🌱 Education: PhD in Computer Science & MSc in Bioinformatics from Nanyang Technological University

✨ Featured Repos

These are my HOT repos:

  • dbparser
    • Open-source R package for parsing and integrating heterogeneous public drug datasets
    • 49K+ downloads!
  • Chemogenomic DTI Prediction Methods
    • Companion website for a survey paper I wrote on computational methods for predicting drug-target interactions
    • Contains organized links to other works in the field and their source code, as well as essential third-party feature extraction tools
  • CAFA 6 Protein Function Prediction
    • Kaggle competition for predicting protein function from protein sequences and gene ontology (GO) information
    • Competition type: extreme multi-label classification. There are 26K GO terms, and the task is to predict which of them apply to each protein
    • Contains my EDA on the data as well as my feature engineering pipeline for generating features
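
To make the multi-label framing above concrete, here is a minimal sketch in plain Python. It is not taken from the competition repo: the protein names, the helper functions `build_label_matrix` and `predict_terms`, and the 0.5 threshold are all assumptions chosen for illustration. Each protein becomes one row of a binary indicator matrix with one column per GO term, and a prediction is the set of terms whose score clears the threshold.

```python
from typing import Dict, List, Set, Tuple


def build_label_matrix(
    annotations: Dict[str, Set[str]],
) -> Tuple[List[str], List[str], List[List[int]]]:
    """Turn {protein: {GO terms}} into a binary protein-by-term matrix."""
    proteins = sorted(annotations)
    terms = sorted({t for ts in annotations.values() for t in ts})
    term_index = {term: j for j, term in enumerate(terms)}

    # One row per protein, one column per GO term, 1 where the term applies.
    matrix = [[0] * len(terms) for _ in proteins]
    for i, protein in enumerate(proteins):
        for term in annotations[protein]:
            matrix[i][term_index[term]] = 1
    return proteins, terms, matrix


def predict_terms(scores: Dict[str, float], threshold: float = 0.5) -> List[str]:
    """Keep every GO term whose (hypothetical) model score reaches the threshold."""
    return sorted(term for term, score in scores.items() if score >= threshold)


# Toy annotations: the protein names are made up for illustration.
toy_annotations = {
    "protein_A": {"GO:0005515", "GO:0005634"},
    "protein_B": {"GO:0005515"},
}
proteins, terms, matrix = build_label_matrix(toy_annotations)
```

With 26K terms the real matrix is very wide and mostly zeros, so a sparse representation would likely be preferable to the nested lists used here.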

Nowadays, I am mostly working on the "Life with" repos here on my GitHub:

I also have my own pet projects that I work on from time to time:

  • HEROIC Surfer
    • HEROIC is a self-development platform that provides lots of content (book summaries, daily wisdom videos, meditations, etc.)
    • I scraped the HEROIC website. With the scraped content, I intend to build an LLM-powered Shiny app that lets you explore the HEROIC database.
  • BGG Scraper
    • BoardGameGeek (BGG) is an encyclopedic website that has all kinds of information on all board games in existence.
    • I developed many functions for scraping different kinds of board game info. Most of these functions make use of BGG's XML API.
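
As a rough illustration of what parsing an XML API response can look like, here is a small Python sketch. The repo itself is written in R, so this is not its code. The `SAMPLE_XML` payload is hand-written and only modeled on the general shape of a BGG item listing, and the `parse_items` helper is hypothetical; the real API's fields and structure may differ.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Optional, Union

# Hand-written sample payload (an assumption, not a captured API response).
SAMPLE_XML = """<items>
  <item type="boardgame" id="13">
    <name type="primary" value="Catan"/>
    <name type="alternate" value="The Settlers of Catan"/>
    <yearpublished value="1995"/>
  </item>
</items>"""


def parse_items(xml_text: str) -> List[Dict[str, Union[str, int, None]]]:
    """Extract id, primary name, and publication year from each <item>."""
    root = ET.fromstring(xml_text)
    games = []
    for item in root.findall("item"):
        primary = item.find("name[@type='primary']")
        year = item.find("yearpublished")
        year_value: Optional[int] = (
            int(year.get("value")) if year is not None else None
        )
        games.append(
            {
                "id": item.get("id"),
                # Compare against None explicitly: empty Elements are falsy.
                "name": primary.get("value") if primary is not None else None,
                "year": year_value,
            }
        )
    return games


games = parse_items(SAMPLE_XML)
```

The sketch parses a local string rather than calling the network, so it runs offline and avoids any assumption about endpoints or rate limits.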

📫 Let's Connect

Feel free to explore my repositories and reach out if you'd like to collaborate on exciting data science projects!

📈 GitHub Stats

Ali's GitHub Stats

Pinned

  1. Life-with-LLMs Public

    Pet projects involving Gen AI

    Jupyter Notebook 1

  2. Chemogenomic-DTI-Prediction-Methods Public

    Algorithms for prediction of drug-target interactions via computational (chemogenomic) methods

    MATLAB 48 14

  3. CAFA-6-Protein-Function-Prediction Public

    Kaggle competition at https://www.kaggle.com/competitions/cafa-6-protein-function-prediction

    Jupyter Notebook

  4. ropensci/dbparser Public

    Source code for the R package, "dbparser" (i.e. DrugBank Parser)

    R 63 20

  5. bggscraper Public

    Scripts for scraping all sorts of (publicly accessible) board games data from boardgamegeek.com

    R

  6. my_r_snippets Public

    My own R snippets. Feel free to copy and use.

    Vim Snippet