lanethefox/alf-analysis

Labrys Project

This project consists of two discrete packages:

  • LabrysPlatform: Data platform for ALF calculations (swappable between local and GCP implementations)
  • LabrysAnalytics: Analytics and ML development package

Project Structure

labrys_platform/
├── LabrysPlatform/          # Data platform package
│   ├── __init__.py
│   ├── setup.py
│   ├── src/                 # Core platform code
│   ├── scripts/             # Data processing scripts
│   ├── dbt/                 # dbt models
│   └── data/                # Data storage
├── LabrysAnalytics/         # Analytics package
│   ├── __init__.py
│   ├── setup.py
│   └── notebooks/           # Jupyter notebooks
├── requirements.txt         # Combined requirements
├── venv/                    # Virtual environment (create this)
└── README.md               # This file

Setup Instructions

1. Create Virtual Environment

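# Install the venv module if needed (Debian/Ubuntu, requires sudo).
# Match the package to your Python version, e.g. python3.13-venv for Python 3.13.
sudo apt install python3.13-venv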

# Create virtual environment
python3 -m venv venv

# Activate virtual environment
source venv/bin/activate  # On Windows: venv\Scripts\activate
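
To confirm that activation worked before installing anything, a small check along these lines can help. It is a minimal sketch that relies only on the standard library; the function name `in_virtualenv` is illustrative and not part of either Labrys package.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("virtual environment active:", in_virtualenv())
print("interpreter prefix:", sys.prefix)
```

If the first line prints `False`, packages installed with pip will go into the base interpreter rather than `venv/`.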

2. Install Requirements

# Install all dependencies
pip install -r requirements.txt

# If you encounter errors, use minimal requirements:
pip install -r requirements-minimal.txt

# Install local packages in editable mode
pip install -e ./LabrysPlatform
pip install -e ./LabrysAnalytics

Note: Some packages may cause installation problems:

  • pandas-profiling has been renamed to ydata-profiling
  • rdkit requires specific system dependencies on some platforms
  • Use requirements-minimal.txt if you encounter installation issues
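
After installing, one way to see which of the packages mentioned above actually made it into the environment is to look them up without importing them. The sketch below uses only the standard library. The module names are the import names (`ydata_profiling`, `rdkit`), and the helper `missing_modules` is an illustrative name, not part of this project.

```python
from importlib.util import find_spec

# Import names (not pip names) of the packages the note above mentions.
OPTIONAL_MODULES = ["ydata_profiling", "rdkit"]


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    # find_spec returns None for a top-level module that is not installed.
    return [name for name in names if find_spec(name) is None]


missing = missing_modules(OPTIONAL_MODULES)
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All optional modules were found.")
```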

3. Verify Installation

# Test LabrysPlatform (run in a Python interpreter with the virtual environment active)
from labrys_platform import calculate_alf
alf = calculate_alf(epsilon=100000, quantum_yield=0.1, lifetime=5.0)
print(f"ALF: {alf}")

# Test LabrysAnalytics
import labrys_analytics
print(labrys_analytics.__version__)

Development Workflow

Using LabrysPlatform

from labrys_platform import get_connection, calculate_alf
from labrys_platform.src.molecular.utils import standardize_smiles

# Connect to database
conn = get_connection()

# Calculate ALF
alf_value = calculate_alf(epsilon=100000, quantum_yield=0.1, lifetime=5.0)

# Standardize SMILES
canonical_smiles = standardize_smiles("CCO")

Using LabrysAnalytics

# In LabrysAnalytics notebooks or scripts
import pandas as pd
from labrys_platform import get_connection

# Get data from platform
conn = get_connection()
df = pd.read_sql("SELECT * FROM mart_compound_alf", conn)

# Perform analytics
# ... ML model development, visualizations, etc.
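
For larger tables it is often preferable to select only the rows and columns an analysis needs, using a parameterized query rather than `SELECT *`. The sketch below illustrates the pattern with an in-memory SQLite database standing in for `get_connection()`. The column names `compound_id` and `alf` and the demo rows are assumptions made up for illustration, since the real schema of `mart_compound_alf` is not described here.

```python
import sqlite3


def make_demo_connection():
    """Hypothetical stand-in for labrys_platform.get_connection()."""
    conn = sqlite3.connect(":memory:")
    # Assumed schema and made-up rows, for demonstration only.
    conn.execute("CREATE TABLE mart_compound_alf (compound_id TEXT, alf REAL)")
    conn.executemany(
        "INSERT INTO mart_compound_alf VALUES (?, ?)",
        [("cmpd-1", 10.0), ("cmpd-2", 250.0), ("cmpd-3", 40.0)],
    )
    return conn


def compounds_above(conn, min_alf):
    """Return (compound_id, alf) rows with alf >= min_alf, highest first."""
    query = (
        "SELECT compound_id, alf FROM mart_compound_alf "
        "WHERE alf >= ? ORDER BY alf DESC"
    )
    # The threshold is passed as a bound parameter, not formatted into the SQL.
    return conn.execute(query, (min_alf,)).fetchall()


conn = make_demo_connection()
print(compounds_above(conn, 40.0))
```

The same query string and parameter tuple can be passed to `pd.read_sql` through its `params` argument when a DataFrame is wanted instead of a list of tuples.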

Swapping Implementations

The packages are designed to be loosely coupled. To swap LabrysPlatform's local implementation for a GCP implementation:

  1. Create a new package labrys-platform-gcp with the same interface
  2. Update LabrysAnalytics' dependency from labrys-platform to labrys-platform-gcp
  3. No code changes are needed in LabrysAnalytics, as long as the new package exposes the same import names and function signatures
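
What "the same interface" means can be made explicit with a structural type. The following is a minimal sketch, not the project's actual design: the names `Platform`, `LocalPlatform`, `GcpPlatform`, and `open_connection` are hypothetical, and the connections are placeholder strings rather than real database handles.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Platform(Protocol):
    """The surface that analytics code is allowed to depend on (assumed)."""

    def get_connection(self) -> object: ...


class LocalPlatform:
    """Hypothetical local backend."""

    def get_connection(self) -> object:
        return "local-connection"  # placeholder for a real connection object


class GcpPlatform:
    """Hypothetical GCP backend exposing the same method."""

    def get_connection(self) -> object:
        return "gcp-connection"  # placeholder for a real connection object


def open_connection(platform: Platform) -> object:
    """Analytics-side code: works with any backend that satisfies Platform."""
    return platform.get_connection()


for backend in (LocalPlatform(), GcpPlatform()):
    # isinstance against a runtime_checkable Protocol checks method presence.
    assert isinstance(backend, Platform)
    print(type(backend).__name__, "->", open_connection(backend))
```

Because `open_connection` depends only on the protocol, neither backend class has to inherit from a shared base, which mirrors the dependency swap described in the steps above.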

Key Commands

Data Processing (LabrysPlatform)

# Run preprocessing pipeline
cd LabrysPlatform
python scripts/preprocess_scidata2020.py
python scripts/preprocess_photochemcad.py
python scripts/preprocess_chembl.py
python scripts/cross_dataset_integration.py

# Run dbt models
cd dbt
dbt run
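
The same sequence can be wrapped in a small runner that stops at the first failing step. This is a sketch under the assumption that it is run from inside LabrysPlatform/. The script paths and the `dbt run` command are taken from the commands above, while `build_commands` and `run_pipeline` are illustrative names.

```python
import subprocess
import sys

# Preprocessing scripts, in the order listed above.
PREPROCESS_SCRIPTS = [
    "scripts/preprocess_scidata2020.py",
    "scripts/preprocess_photochemcad.py",
    "scripts/preprocess_chembl.py",
    "scripts/cross_dataset_integration.py",
]


def build_commands(python=sys.executable):
    """Return (argv, working_directory) pairs for every pipeline step."""
    commands = [([python, script], ".") for script in PREPROCESS_SCRIPTS]
    # dbt is run from the dbt/ subdirectory, as in the commands above.
    commands.append((["dbt", "run"], "dbt"))
    return commands


def run_pipeline(run=subprocess.run):
    """Run each step in order; a failing step aborts the remaining ones."""
    for argv, cwd in build_commands():
        # check=True makes subprocess.run raise CalledProcessError
        # when a step exits with a non-zero status.
        run(argv, cwd=cwd, check=True)
```

Calling `run_pipeline()` from LabrysPlatform/ executes the five steps. The `run` parameter exists so the sequence can be inspected or tested without launching any processes.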

Analytics (LabrysAnalytics)

# Start Jupyter
cd LabrysAnalytics
jupyter notebook

Testing

# Test LabrysPlatform (starting from the repository root)
cd LabrysPlatform
pytest tests/

# Test LabrysAnalytics (continuing from inside LabrysPlatform)
cd ../LabrysAnalytics
pytest tests/

About

Data platform and ML analytics for ALF calculations, with swappable local/GCP backends and dbt models
