[Bug]: load_all_projects() never uses _projects_cache — cache is dead code, file re-read on every call #709

@sujitsingh8

Description

What happened?

utils/data_loader.py defines a _projects_cache variable and a clear_cache() function, and there's even a benchmark_cache.py script plus tests that call clear_cache(), so caching is clearly intended. But load_all_projects() never reads or writes _projects_cache: it opens and re-reads projects.json from disk on every single call. The cache is dead code, and clear_cache() resets a variable that is never populated.

Steps to reproduce

  1. Open utils/data_loader.py
  2. Look at load_all_projects() — it always does open(DATA_FILE) and json.load, no cache check
  3. _projects_cache (line 52) is defined but never read or assigned anywhere except being reset to None in clear_cache()
  4. During one /api/recommend call, the file is read multiple times instead of once

Expected behaviour

load_all_projects() should load the file once, store it in _projects_cache, and return the cached copy on later calls. clear_cache() should then actually force a fresh reload. This matches what the benchmark script and tests already expect.

Area of the app affected

Recommendation results

Python version

3.14

Operating system

Windows 11

Relevant error output or logs

No error — silent performance/dead-code bug. _projects_cache is defined at line 52 but load_all_projects() (line 11) never references it.

Before submitting

  • I searched existing issues and this has not been reported before.
  • I can reproduce this bug consistently with the steps above.
  • I am running the latest version of the main branch.
