add working files by rearustagi · Pull Request #1 · GoldenPlanetaryHealthLab/NepalSnakebite

rearustagi · 2026-03-09T16:51:54Z

Adds my initial visualization and data files to project

TinasheMTapera

Hey @rearustagi, this is AWESOME so far 🥳. You're using the cluster just as intended and tracking your work with git!

A couple of changes I'd like to make before approving:

Please remove the source data from git tracking

Tracking individual data files in git is not a recommended pattern because it quickly slows down git and overwhelms memory. Git should be used only for plain text files (and notebooks). So, in your repo, add the "raw data" folder to your .gitignore.
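For illustration, here is a minimal sketch of adding an ignore rule from Python. The helper name `add_ignore_pattern` is hypothetical, and the pattern `raw data/` is an assumption based on the folder name quoted above; adjust it to the actual folder in the repo.

```python
from pathlib import Path


def add_ignore_pattern(repo_root: Path, pattern: str) -> bool:
    """Append `pattern` to repo_root/.gitignore unless it is already listed.

    Returns True if the file was changed, False if the pattern was present.
    """
    gitignore = repo_root / ".gitignore"
    existing = gitignore.read_text().splitlines() if gitignore.exists() else []
    if pattern in (line.strip() for line in existing):
        return False
    existing.append(pattern)
    gitignore.write_text("\n".join(existing) + "\n")
    return True


# Hypothetical usage; "raw data/" is an assumed folder name.
# add_ignore_pattern(Path("."), "raw data/")
#
# Files that are already committed stay tracked until they are removed
# from the index, for example with: git rm -r --cached "raw data"
```

Note that an ignore rule only affects untracked files, so files that are already committed also need to be removed from the index (they stay on disk).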

Delineate between source and intermediate data

I notice a proc_data repo with data that I'm not aware of. First, the same principle as in point 1 applies: do not track data in git repositories. It will slow down git and you may accidentally expose PII or PHI 😬. Second, make sure that the code that generates these intermediate objects is similarly tracked and reproducible. I can't immediately tell from the notebooks, but it looks like this is the case. My rule of thumb is that if I were to remove the intermediate data, I should still be able to reproduce it without modifying any code. So just make sure of that.
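The rule of thumb above could be checked with something like the sketch below. It is only an illustration: `is_reproducible` and `file_digest` are hypothetical helpers, and `rebuild` stands in for whatever script or notebook step writes the intermediate file. The check deletes the file before rebuilding it, so it should only be run on data that really is generated.

```python
import hashlib
from pathlib import Path
from typing import Callable


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def is_reproducible(output: Path, rebuild: Callable[[], None]) -> bool:
    """Delete `output`, rerun `rebuild`, and compare against the original bytes.

    `rebuild` is assumed to regenerate `output` from tracked code and
    source data without any manual edits.
    """
    before = file_digest(output)
    output.unlink()  # remove the intermediate file
    rebuild()        # regenerate it from code
    return output.exists() and file_digest(output) == before
```

A byte-for-byte comparison is strict: formats that embed timestamps or unordered metadata can differ even when the data are the same, in which case comparing the loaded tables would be the gentler test.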

Delineate cont'd.

If the proc_data does in fact contain RAW data (i.e., not generated; original files you got from Chris/Meghnath, etc.), please help me out by adding it to our data catalog: https://docs.google.com/forms/d/e/1FAIpQLSdzeBquqe_4ghFDu7QN-ChzXgCBsnHLty3is8yR1VOMADet3w/viewform?usp=sharing&ouid=106438662307402236405 This is how I make sure to track and document all of the data that goes through the lab.

Organisation

This is less urgent, but I would recommend putting a little bit of organization into your repo with, at minimum, folders for scripts, notebooks (or nbs), and outputs, under which you can put outputs/figures, etc., just so I know what I'm looking at.
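As a rough sketch, the suggested layout could be scaffolded like this. The folder names come from the comment above; the helper `create_layout` is hypothetical.

```python
from pathlib import Path

# Folder names taken from the suggestion above; adjust to taste.
LAYOUT = ("scripts", "notebooks", "outputs/figures")


def create_layout(repo_root: Path) -> list[Path]:
    """Create the suggested folders under repo_root, skipping existing ones."""
    created = []
    for rel in LAYOUT:
        folder = repo_root / rel
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created


# Hypothetical usage from the repository root:
# create_layout(Path("."))
```

Git does not track empty directories, so the folders only show up in the repo once they contain at least one file.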

Great work again!

ETA: 5. You can use the Google Drive mapped source data

The source data for the files I sent are actually already mapped; they are in /n/holylabs/LABS/cgolden_lab/Lab/data_freeze/golden_googledrive_rclone/Climate-Smart Public Health - Nepal/4. Datasets/snake_bites
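A notebook could then read straight from the mapped location instead of from a copy inside the repo. This is a minimal sketch: the path is the one given above, while `SOURCE_DIR` and `list_source_files` are hypothetical names, and the file names inside the folder are not assumed.

```python
from pathlib import Path

# Mapped source data location, as given in the comment above.
SOURCE_DIR = Path(
    "/n/holylabs/LABS/cgolden_lab/Lab/data_freeze/golden_googledrive_rclone/"
    "Climate-Smart Public Health - Nepal/4. Datasets/snake_bites"
)


def list_source_files(source_dir: Path = SOURCE_DIR) -> list[Path]:
    """Return the files directly inside source_dir, sorted by path.

    Returns an empty list if the directory is not available, for example
    when running somewhere other than the cluster.
    """
    if not source_dir.is_dir():
        return []
    return sorted(p for p in source_dir.iterdir() if p.is_file())
```

Keeping the path in one constant means the notebooks never need their own copy of the raw files.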

add working files

07082dd

TinasheMTapera requested changes Mar 9, 2026

rearustagi wants to merge 1 commit into GoldenPlanetaryHealthLab:main from rearustagi:add-files

2 participants
