[Bug]: Duplicate skills inflate recommendation score — parse_skills() doesn't deduplicate #710

@sujitsingh8

Description

What happened?

parse_skills() in recommender.py doesn't remove duplicate skills. If a user enters the same skill multiple times (e.g. "python, python, python"), each copy is counted as a separate match in score_single_project, which inflates the skill score. A project's score can therefore be raised just by repeating a skill, and that distorts the ranking. Aliases are affected too: "py" and "python" both normalize to "python" but are still counted as two separate skills.

Steps to reproduce

  1. Call get_recommendations("python", "Beginner", "Web", "Low") and note the scores
  2. Call get_recommendations("python,python,python", "Beginner", "Web", "Low")
  3. The repeated skill inflates the score (a single match gives 8, three copies give 14)
  4. parse_skills("python, python, py") returns ['python', 'python', 'python'] instead of ['python']

Expected behaviour

Duplicate skills should be counted only once. parse_skills() should return a deduplicated list so repeating a skill can't inflate the recommendation score or change the ranking.
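
A minimal sketch of what a deduplicating parse_skills() could look like. The real code in recommender.py isn't shown in this issue, so the alias table (SKILL_ALIASES), the normalize_skill helper, and the comma-splitting signature are assumptions for illustration only. The one mapping taken from the report is "py" to "python".

```python
# Hypothetical alias table; the issue only confirms that "py" normalizes to "python".
SKILL_ALIASES = {"py": "python"}


def normalize_skill(raw: str) -> str:
    """Trim, lowercase, and map known aliases to a canonical skill name."""
    skill = raw.strip().lower()
    return SKILL_ALIASES.get(skill, skill)


def parse_skills(skills_input: str) -> list[str]:
    """Split a comma-separated string into unique, normalized skills."""
    normalized = (normalize_skill(part) for part in skills_input.split(","))
    # dict.fromkeys drops repeats while keeping first-seen order,
    # since dicts preserve insertion order.
    return list(dict.fromkeys(skill for skill in normalized if skill))
```

Deduplicating after normalization (rather than before) is what makes "py" and "python" collapse into a single entry. Keeping first-seen order means the returned list stays predictable for callers such as score_single_project.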

Area of the app affected

Recommendation results

Python version

3.14

Operating system

Windows 11

Relevant error output or logs

No error is thrown; this is a silent ranking bug. parse_skills("python, python, py") returns ['python', 'python', 'python'].

Before submitting

  • I searched existing issues and this has not been reported before.
  • I can reproduce this bug consistently with the steps above.
  • I am running the latest version of the main branch.
