One query: which community college courses transfer (via ASSIST) and are offered this term (via live schedule lookups).
Current status (v0.1): tuned for UCLA CS; ASSIST ingest fairly complete, schedule coverage pilot (~8 community colleges).
Prereqs: Python 3.12 + uv
Clone repo, then install deps.
git clone https://github.com/FYC23/CC-course-finder
cd CC-course-finder
uv python install 3.12
uv venv --python 3.12
source .venv/bin/activate
uv sync
First-time (required): ingest ASSIST articulation rows into data/assist.sqlite3.
uv run python -m src.assist.cli ingest \
--target-school "University of California, Los Angeles" \
--target-major "Computer Science" \
--max-cc 100
Run web UI (joins articulation + schedule availability).
uv run uvicorn src.web.app:app --reload
Open http://127.0.0.1:8000.
Optional: run schedule CLI directly (example uses term "Summer 2026" and college name substring "West Valley").
uv run python -m src.schedule.cli query \
--target-school "University of California, Los Angeles" \
--target-major "Computer Science" \
--term "Summer 2026" \
--cc-name "West Valley" \
--requirement "MATH 31B"
Finding community college courses that transfer to a specific university (e.g., UCLA CS) for a given term requires manually cross-referencing two separate systems:
- ASSIST.org — tells you which CC courses transfer to your target school/major
- Each CC's class schedule — tells you whether that course is actually being offered this term
No existing tool does both in one query. Result: students hand-check school-by-school, term-by-term.
Two components:
1. ASSIST layer (implemented here as a first-pass ingest pipeline)
- Input: target school + major (e.g., UCLA CS)
- Query ASSIST for agreements across CCs, download the corresponding PDF artifacts, and parse simple direct mappings
- Output: normalized articulation rows in SQLite, queryable by target requirement/equivalent (see CLI examples below)
2. Schedule layer (the novel part)
- For each CC in the result set, hit their class schedule search
- ~80% of CA CCs run on Banner or PeopleSoft — predictable URL/form patterns
- Parse: is this course offered in the target term? Online or in-person?
- Output: filter ASSIST results to only currently-offered courses
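The filtering step described above can be sketched as a join between articulation rows and schedule sections. This is a minimal illustration, not the repo's actual code: the `ArticulationRow` and `Section` types, their field names, and `offered_rows` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArticulationRow:
    # Hypothetical shape of one normalized ASSIST row.
    cc_name: str
    cc_course: str
    target_requirement: str


@dataclass(frozen=True)
class Section:
    # Hypothetical shape of one schedule section returned by an adapter.
    cc_name: str
    course: str
    term: str
    online: bool


def offered_rows(rows, sections, term):
    """Keep only articulation rows that have at least one section in `term`."""
    offered = {(s.cc_name, s.course) for s in sections if s.term == term}
    return [r for r in rows if (r.cc_name, r.cc_course) in offered]
```

Keying the lookup on `(college, course)` means a course offered at one college does not mark the same course code at another college as available.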
Stack: Python; ASSIST ingest via requests + pypdf; schedule lookups via per-system adapters.
Final output: "Here are 12 sections of Calc II transferable to UCLA CS, offered Summer 2026 — 5 are online."
Only pull in an LLM if one of these specific problems comes up:
- Messy course name matching — ASSIST says "Calculus II" but CC lists "Calculus for Life Sciences II." LLM fuzzy-matches better than regex.
- Non-standard CC portals — a handful of CCs don't use Banner/PeopleSoft and have custom HTML. LLM-based extractor can parse arbitrary schedule pages without writing one-off scrapers.
- Natural language query interface — e.g., "find me an online async stats course this summer under 3 units that transfers to UCLA" — that's where an LLM earns its place.
Build the dumb version first. Add an LLM only after hitting a wall that rule-based code cannot get past.
This repo now includes a first-pass ASSIST ingestion pipeline under src/assist.
- src/assist/discovery.py resolves institutions and agreement references.
- src/assist/fetch.py downloads and caches agreement artifacts.
- src/assist/parser.py runs a minimal deterministic parser for direct mappings.
- src/assist/store.py persists normalized articulation rows in SQLite.
- src/assist/cli.py provides ingest/query commands.
Artifacts are cached under data/assist_artifacts/, and the local SQLite database lives at data/assist.sqlite3.
This project uses uv with a repo-local .venv. Install steps in Quick start.
uv run pytest
This repo includes a small FastAPI-backed web UI under src/web that joins ASSIST articulation rows with schedule availability for a given term.
uv run uvicorn src.web.app:app --reload
Open http://127.0.0.1:8000 and search by university, major, term, and optional requirement filter.
Results UX notes:
- Grouped by UC requirement.
- Sorted within each group by availability: Offered → Not offered → Articulation only.
- Availability filter lets you show only one status.
- "Articulation only" means the course is articulated in ASSIST, but this term's schedule availability wasn't found for that CC/course.
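The grouping, ordering, and filtering rules above could be implemented roughly as follows. This is a sketch under assumptions: the status strings, the dict keys `requirement` and `status`, and the function name `group_and_sort` are illustrative and may not match what src/web uses.

```python
from collections import defaultdict

# Display order within a requirement group: Offered, Not offered, Articulation only.
STATUS_ORDER = {"offered": 0, "not_offered": 1, "articulation_only": 2}


def group_and_sort(results, only_status=None):
    """Group results by requirement and sort each group by availability.

    `only_status` mimics the availability filter: when set, only rows
    with that status are kept.
    """
    groups = defaultdict(list)
    for row in results:
        if only_status is None or row["status"] == only_status:
            groups[row["requirement"]].append(row)
    # sorted() is stable, so rows with equal status keep their input order.
    return {
        req: sorted(rows, key=lambda r: STATUS_ORDER[r["status"]])
        for req, rows in groups.items()
    }
```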
uv run python -m src.assist.cli ingest \
--target-school "University of California, Los Angeles" \
--target-major "Computer Science" \
--max-cc 100
uv run python -m src.assist.cli query \
--target-school "University of California, Los Angeles" \
--target-major "Computer Science" \
--requirement "MATH 31B"
--max-cc caps processing by unique community colleges, not raw ASSIST agreement candidate rows.
If ASSIST changes endpoint routing, you can override --api-prefix.
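To illustrate what capping by unique community colleges means, here is a small sketch. The `cc_id` key and the function `cap_by_unique_cc` are hypothetical, and the real ingest code may choose which colleges to keep differently; this version keeps colleges in order of first appearance.

```python
def cap_by_unique_cc(candidates, max_cc):
    """Keep every candidate row belonging to the first `max_cc` distinct colleges."""
    seen = set()
    kept = []
    for row in candidates:
        cc_id = row["cc_id"]
        if cc_id not in seen:
            if len(seen) >= max_cc:
                # Cap already reached: skip rows from any further college.
                continue
            seen.add(cc_id)
        kept.append(row)
    return kept
```

With this rule, a college that has several agreement candidate rows still counts once toward the cap.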
The schedule layer now includes a pilot query command under src/schedule.
uv run python -m src.schedule.cli query \
--target-school "University of California, Los Angeles" \
--target-major "Computer Science" \
--term "Summer 2026" \
--cc-id 2 \
--requirement "MATH 31B"
College selection: Default --cc-id is 0 (omit the flag): query all catalog-backed community colleges that appear in the articulation result. Use a nonzero --cc-id to pin one college. --cc-name accepts a case-insensitive substring of a catalog college name and must match exactly one entry (otherwise the CLI errors). --cc-name cannot be used together with a nonzero --cc-id.
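The selection rules above can be expressed as a small resolver. This is a sketch, not the CLI's actual implementation: `resolve_colleges`, the catalog shape, and the choice of `ValueError` are assumptions made for illustration.

```python
def resolve_colleges(catalog, cc_id=0, cc_name=None):
    """Return the catalog entries to query, following the --cc-id / --cc-name rules."""
    if cc_name is not None and cc_id != 0:
        raise ValueError("--cc-name cannot be combined with a nonzero --cc-id")
    if cc_name is not None:
        needle = cc_name.casefold()
        matches = [c for c in catalog if needle in c["name"].casefold()]
        if len(matches) != 1:
            raise ValueError(
                f"--cc-name {cc_name!r} matched {len(matches)} colleges, expected exactly 1"
            )
        return matches
    if cc_id != 0:
        # Pin a single college by id.
        return [c for c in catalog if c["cc_id"] == cc_id]
    # Default (cc_id == 0): every catalog-backed college.
    return list(catalog)
```

For example, against the pilot catalog "West Valley" resolves to one college, while "Valley" would match several names and be rejected.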
uv run python -m src.schedule.cli query \
--target-school "University of California, Los Angeles" \
--target-major "Computer Science" \
--term "Summer 2026" \
--cc-name "West Valley" \
--requirement "MATH 31B"
Current v1 scope:
- Canonical term input is a human label like "Summer 2026" (strict Spring|Summer|Fall YYYY).
- Schedule request failures are fail-soft per course (offered=false, error marker in raw_summary).
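A strict parser for the term label format could look like the sketch below. The function name `parse_term` is hypothetical, and it assumes "strict" means exact capitalization, a single space, and a four-digit year; the repo's validation may be looser or tighter.

```python
import re

# Exactly one of the three season names, one space, then a four-digit year.
TERM_RE = re.compile(r"(Spring|Summer|Fall) (\d{4})")


def parse_term(label):
    """Split a label such as "Summer 2026" into (season, year) or raise ValueError."""
    match = TERM_RE.fullmatch(label)
    if match is None:
        raise ValueError(f"term must look like 'Summer 2026', got {label!r}")
    return match.group(1), int(match.group(2))
```

Using `fullmatch` rejects leading or trailing text, so inputs such as "summer 2026" or "Summer 2026 (late start)" fail early instead of producing a wrong term code downstream.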
Supported colleges and adapters (current):
Pilot set only; expect this list to expand.
| College | cc_id | Adapter | Status |
|---|---|---|---|
| Evergreen Valley College | 2 | banner: Ellucian COLSS (PostSearchCriteria / Sections) | works |
| West Valley College | 80 | wvm_static: schedule.wvm.edu static JSON | works |
| Diablo Valley College | 114 | vsb_4cd: VSB api/class-data XML | works |
| Los Medanos College | 61 | vsb_4cd: VSB api/class-data XML | works |
| Contra Costa College | 28 | vsb_4cd: VSB api/class-data XML | works |
| Mount San Antonio College | 62 | banner_ssb_classic: old SSB REST API | works |
| City College of San Francisco | 33 | banner_ssb_classic: old SSB REST API (port 8105) | works |
| Los Angeles City College | 3 | banner (LACCD schedule likely not Banner) | broken |
| College of Marin | 4 | marin_colleague: public ASP.NET schedule grid | works |
| College of San Mateo | 5 | smcccd_colleague: SMCCD schedule API (/courses) | needs creds |
banner_ssb_classic resolves term codes dynamically via getTerms (each institution uses a different numeric suffix scheme). Raw snippets only when SCHEDULE_DEBUG_RAW_SUMMARY=1.
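Dynamic term resolution might work along these lines once the getTerms response has been parsed. This is a sketch under assumptions: it presumes the response is a JSON list of objects with `code` and `description` fields, and the function `resolve_term_code` and the sample codes in the usage note are made up for illustration.

```python
def resolve_term_code(terms, label):
    """Pick the institution-specific term code whose description starts with `label`.

    `terms` is assumed to be the parsed getTerms payload: a list of
    dicts with "code" and "description" keys.
    """
    wanted = label.casefold()
    matches = [
        t["code"] for t in terms if t["description"].casefold().startswith(wanted)
    ]
    if len(matches) != 1:
        raise LookupError(f"expected one term matching {label!r}, found {len(matches)}")
    return matches[0]
```

Given a hypothetical payload such as `[{"code": "202650", "description": "Summer 2026"}]`, the label "Summer 2026" would resolve to that institution's code without hard-coding its suffix scheme.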
vsb_4cd uses the Visual Schedule Builder (vsb.4cd.edu) shared by Diablo Valley, Los Medanos, and Contra Costa colleges. Term codes are derived deterministically (YYYY + 10/20/30 for Summer/Fall/Spring). Campus filtering is applied per-block using the locations field.
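The deterministic derivation can be sketched as below. Two assumptions are baked in and should be checked against the adapter: that "YYYY + 10/20/30" means appending the two-digit suffix to the year as a string, and that the year is the calendar year in the term label. The name `vsb_term_code` is hypothetical.

```python
# Season suffixes as stated above: Summer -> 10, Fall -> 20, Spring -> 30.
SEASON_SUFFIX = {"Summer": "10", "Fall": "20", "Spring": "30"}


def vsb_term_code(season, year):
    """Build a VSB term code by appending the season suffix to the year.

    Assumes string concatenation and the calendar year from the label.
    """
    if season not in SEASON_SUFFIX:
        raise ValueError(f"unsupported season: {season!r}")
    return f"{year}{SEASON_SUFFIX[season]}"
```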
smcccd_colleague uses the documented SMCCD API surface. The public docs expose /courses, but live responses require Basic Auth credentials; configure SMCCD_API_USERNAME and SMCCD_API_PASSWORD to enable live schedule pulls.
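Reading the credentials and building a Basic Auth header might look like this. It is a sketch only: the helper name `smcccd_auth_header` is hypothetical, and returning `None` when credentials are missing is an assumed behavior, not necessarily what the adapter does.

```python
import base64
import os


def smcccd_auth_header(env=None):
    """Return a Basic Auth header built from SMCCD_API_USERNAME / SMCCD_API_PASSWORD.

    Returns None when either variable is missing or empty, so a caller
    can skip live schedule pulls instead of sending a request that will fail.
    """
    env = os.environ if env is None else env
    username = env.get("SMCCD_API_USERNAME")
    password = env.get("SMCCD_API_PASSWORD")
    if not username or not password:
        return None
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}
```

Passing `env` explicitly keeps the helper testable without touching the real process environment.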
