This project consists of two discrete packages:
- LabrysPlatform: Data platform for ALF calculations (swappable between local and GCP implementations)
- LabrysAnalytics: Analytics and ML development package
labrys_platform/
├── LabrysPlatform/ # Data platform package
│ ├── __init__.py
│ ├── setup.py
│ ├── src/ # Core platform code
│ ├── scripts/ # Data processing scripts
│ ├── dbt/ # dbt models
│ └── data/ # Data storage
├── LabrysAnalytics/ # Analytics package
│ ├── __init__.py
│ ├── setup.py
│ └── notebooks/ # Jupyter notebooks
├── requirements.txt # Combined requirements
├── venv/ # Virtual environment (create this)
└── README.md # This file
# Install the venv module if needed (Debian/Ubuntu, requires sudo); match the package to your Python version
sudo apt install python3.13-venv
# Create virtual environment
python3 -m venv venv
# Activate virtual environment
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install all dependencies
pip install -r requirements.txt
# If you encounter errors, use minimal requirements:
pip install -r requirements-minimal.txt
# Install local packages in editable mode
pip install -e ./LabrysPlatform
pip install -e ./LabrysAnalytics
Note: Some packages may have issues:
- ydata-profiling has been renamed from pandas-profiling
- rdkit requires specific system dependencies on some platforms
- Use requirements-minimal.txt if you encounter installation issues
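Before falling back to requirements-minimal.txt, it can help to see which of the optional packages actually failed to install. The following is a minimal sketch, assuming the import names `ydata_profiling` and `rdkit`; the helper `find_missing` is hypothetical and not part of either Labrys package.

```python
from importlib.util import find_spec


def find_missing(module_names):
    """Return the names from module_names that cannot be located for import."""
    missing = []
    for name in module_names:
        try:
            spec = find_spec(name)
        except (ImportError, ValueError):
            # Raised for broken parent packages or modules without a spec.
            spec = None
        if spec is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Assumed import names: ydata-profiling installs as ydata_profiling.
    optional = ["ydata_profiling", "rdkit"]
    print("Missing optional packages:", find_missing(optional) or "none")
```

Run it inside the activated virtual environment, since the result depends on which interpreter executes it.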
# Test LabrysPlatform
from labrys_platform import calculate_alf
alf = calculate_alf(epsilon=100000, quantum_yield=0.1, lifetime=5.0)
print(f"ALF: {alf}")
# Test LabrysAnalytics
import labrys_analytics
print(labrys_analytics.__version__)
from labrys_platform import get_connection, calculate_alf
from labrys_platform.src.molecular.utils import standardize_smiles
# Connect to database
conn = get_connection()
# Calculate ALF
alf_value = calculate_alf(epsilon=100000, quantum_yield=0.1, lifetime=5.0)
# Standardize SMILES
canonical_smiles = standardize_smiles("CCO")
# In LabrysAnalytics notebooks or scripts
import pandas as pd
from labrys_platform import get_connection
# Get data from platform
conn = get_connection()
df = pd.read_sql("SELECT * FROM mart_compound_alf", conn)
# Perform analytics
# ... ML model development, visualizations, etc.
The packages are designed to be loosely coupled. To swap LabrysPlatform's local implementation with a GCP implementation:
- Create a new package labrys-platform-gcp with the same interface
- Update LabrysAnalytics' dependency from labrys-platform to labrys-platform-gcp
- No code changes needed in LabrysAnalytics
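The swap only works if both packages expose the same callables. One way to make that contract explicit is a `typing.Protocol` that each implementation satisfies. The sketch below is an assumption about how this could look, not the project's actual code: `PlatformBackend`, `LocalBackend`, and `load_alf_rows` are hypothetical names, the column names are invented for illustration, and an in-memory SQLite database with a single placeholder row stands in for the real storage.

```python
import sqlite3
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class PlatformBackend(Protocol):
    """Hypothetical contract every platform implementation must satisfy."""

    def get_connection(self) -> Any:
        """Return a DB-API style connection."""
        ...


class LocalBackend:
    """Stand-in for a local implementation, backed by in-memory SQLite."""

    def get_connection(self) -> sqlite3.Connection:
        conn = sqlite3.connect(":memory:")
        # Hypothetical schema and placeholder row, for illustration only.
        conn.execute("CREATE TABLE mart_compound_alf (smiles TEXT, alf REAL)")
        conn.execute("INSERT INTO mart_compound_alf VALUES ('CCO', 1.0)")
        return conn


def load_alf_rows(backend: PlatformBackend):
    """Analytics-side code: depends on the protocol, not on a concrete backend."""
    conn = backend.get_connection()
    try:
        return conn.execute("SELECT smiles, alf FROM mart_compound_alf").fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    backend = LocalBackend()
    print(isinstance(backend, PlatformBackend))
    print(load_alf_rows(backend))
```

A GCP-backed class exposing the same `get_connection` method could be passed to `load_alf_rows` unchanged, which is the property the steps above rely on.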
# Run preprocessing pipeline
cd LabrysPlatform
python scripts/preprocess_scidata2020.py
python scripts/preprocess_photochemcad.py
python scripts/preprocess_chembl.py
python scripts/cross_dataset_integration.py
# Run dbt models
cd dbt
dbt run
# Start Jupyter
cd LabrysAnalytics
jupyter notebook
# Test LabrysPlatform
cd LabrysPlatform
pytest tests/
# Test LabrysAnalytics
cd ../LabrysAnalytics
pytest tests/
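The preprocessing commands shown earlier appear to be order-dependent (the cross-dataset integration step comes last, followed by dbt), so a small driver that stops at the first failure can keep a partially processed dataset from reaching dbt. This is a sketch under that assumption, meant to be run from the LabrysPlatform directory; `run_steps` and `pipeline_steps` are hypothetical helpers, not something the package ships.

```python
import subprocess
import sys

# Script order copied from the commands above; paths are relative to LabrysPlatform/.
SCRIPTS = [
    "scripts/preprocess_scidata2020.py",
    "scripts/preprocess_photochemcad.py",
    "scripts/preprocess_chembl.py",
    "scripts/cross_dataset_integration.py",
]


def run_steps(steps):
    """Run (argv, cwd) pairs in order; return the first failing argv, or None."""
    for argv, cwd in steps:
        result = subprocess.run(argv, cwd=cwd)
        if result.returncode != 0:
            return argv
    return None


def pipeline_steps():
    """Build the full step list: each script, then `dbt run` inside dbt/."""
    steps = [([sys.executable, script], ".") for script in SCRIPTS]
    steps.append((["dbt", "run"], "dbt"))
    return steps
```

Calling `run_steps(pipeline_steps())` from the LabrysPlatform directory runs everything in sequence and returns the command that failed, or `None` when every step exits with status 0.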