Stardag

Declarative and composable DAGs for Python.

Stardag provides a clean Python API for representing persistently stored assets, the code that produces them, and their dependencies as a declarative Directed Acyclic Graph (DAG). It is a spiritual—but highly modernized—descendant of Luigi, designed for iterative data and ML workflows.

Built on Pydantic, Stardag uses expressive type annotations to reduce boilerplate and make task I/O contracts explicit—enabling composable tasks and pipelines while maintaining a fully declarative specification of every produced asset.

Quick Example

import stardag as sd

@sd.task
def get_range(limit: int) -> list[int]:
    return list(range(limit))

@sd.task
def get_sum(integers: sd.Depends[list[int]]) -> int:
    return sum(integers)

# Declarative DAG specification - no computation yet
sum_task = get_sum(integers=get_range(limit=4))

# Materialize all tasks' targets
sd.build(sum_task)

# Load results
assert sum_task.load() == 6
assert sum_task.integers.load() == [0, 1, 2, 3]

Installation

pip install stardag

Or with uv:

uv add stardag

Optional extras:

# Quote the requirement so shells such as zsh do not expand the brackets
pip install "stardag[s3]"      # S3 storage support
pip install "stardag[prefect]" # Prefect integration
pip install "stardag[modal]"   # Modal integration

Documentation

📚 Read the docs for tutorials, guides, and API reference.

Stardag Cloud

Stardag Cloud provides optional services for team collaboration and monitoring:

  • Web UI — Dashboard for build monitoring and task inspection
  • API Service — Task tracking and coordination across distributed builds

The SDK works fully standalone—the platform adds value for teams needing shared visibility and coordination.

Why Stardag?

  • Composability — Task instances as first-class parameters enable loose coupling and reusability
  • Declarative — Full DAG specification before execution; inspect, serialize, and reason about pipelines
  • Deterministic — Parameter hashing gives each task a unique, reproducible ID and output path
  • Pydantic-native — Tasks are Pydantic models with full validation and serialization support
  • Framework-agnostic — Integrate with Prefect, Modal, or run standalone
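
To make the "Declarative" and "Deterministic" points concrete, here is a minimal, library-independent sketch of the general idea: a task specification is plain data, and hashing its canonical serialization yields a stable ID. It is not Stardag's actual implementation. `TaskSpec`, `to_dict`, and `task_id` are hypothetical names invented for this illustration, and the hashing scheme (SHA-256 over sorted JSON, truncated to 16 hex characters) is an assumption.

```python
import hashlib
import json
from dataclasses import dataclass, field


@dataclass
class TaskSpec:
    """Hypothetical stand-in for a declared task (not the Stardag API)."""

    name: str
    params: dict = field(default_factory=dict)
    deps: dict = field(default_factory=dict)  # argument name -> upstream TaskSpec

    def to_dict(self) -> dict:
        # The full specification, including upstream tasks, is plain data
        # that can be inspected or serialized before anything runs.
        return {
            "name": self.name,
            "params": self.params,
            "deps": {key: dep.to_dict() for key, dep in self.deps.items()},
        }

    @property
    def task_id(self) -> str:
        # sort_keys makes the serialization independent of dict ordering,
        # so equal specifications always produce the same hash.
        payload = json.dumps(self.to_dict(), sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]


# Two identical specifications share an ID; changing an upstream
# parameter changes the ID of the downstream task as well.
spec_a = TaskSpec("get_sum", deps={"integers": TaskSpec("get_range", {"limit": 4})})
spec_b = TaskSpec("get_sum", deps={"integers": TaskSpec("get_range", {"limit": 4})})
spec_c = TaskSpec("get_sum", deps={"integers": TaskSpec("get_range", {"limit": 5})})

print(spec_a.task_id == spec_b.task_id)
print(spec_a.task_id == spec_c.task_id)
```

Because the ID depends only on the specification, it can also serve as part of an output path, which lets a build skip tasks whose outputs already exist.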

Contributing

We welcome contributions! Please see CONTRIBUTING.md for guidelines.

License

  • SDK & Examples (lib/): Apache License 2.0
  • API & UI (app/): BSL 1.1 — free for self-hosted use, converts to Apache 2.0 in 2029

See LICENSE for details.

Links