Clarify attention output projection #84

Open
cat0825 wants to merge 1 commit into poloclub:main from cat0825:clarify-attention-output-projection

cat0825 commented May 31, 2026

Summary

  • Clarify that the outputs of all attention heads are concatenated and then passed through GPT-2's attention output projection matrix (c_proj).
  • Update the article and guided textbook copy so learners understand why c_proj is present in the code but not drawn as a separate matrix.
  • Fix the attention output tooltip to refer to the selected head instead of always saying Head 1.
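
For reviewers who want the concatenate-then-project step in concrete terms, here is a minimal sketch in plain Python. It is not code from this PR or from the GPT-2 source: the helper names (concat_heads, output_projection), the toy dimensions, and the weight and bias values are all assumptions chosen for illustration. It assumes the projection is applied as merged @ weight + bias, with the weight indexed as (input feature, output feature).

```python
# Toy dimensions (assumed for illustration only; real GPT-2 models are much larger).
N_HEADS = 2
HEAD_DIM = 2
D_MODEL = N_HEADS * HEAD_DIM


def concat_heads(head_outputs):
    """Concatenate the per-head outputs for one token into a single vector."""
    merged = []
    for head in head_outputs:
        merged.extend(head)
    return merged


def output_projection(merged, weight, bias):
    """Compute merged @ weight + bias, with weight indexed as [input][output]."""
    return [
        sum(merged[i] * weight[i][j] for i in range(len(merged))) + bias[j]
        for j in range(len(bias))
    ]


# Hypothetical attention outputs for a single token, one list per head.
head_outputs = [[1.0, 2.0], [3.0, 4.0]]

# Hypothetical c_proj parameters: a scaled identity matrix and a constant bias.
c_proj_weight = [
    [2.0 if i == j else 0.0 for j in range(D_MODEL)] for i in range(D_MODEL)
]
c_proj_bias = [0.5] * D_MODEL

merged = concat_heads(head_outputs)
projected = output_projection(merged, c_proj_weight, c_proj_bias)

print(merged)     # [1.0, 2.0, 3.0, 4.0]
print(projected)  # [2.5, 4.5, 6.5, 8.5]
```

The projected vector has the same length as the concatenated one, which is why the step is easy to miss in a diagram that draws only the per-head outputs.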

Fixes #71

Verification

  • npx prettier --check src/components/Attention.svelte src/components/article/Article.svelte src/utils/textbookPages.ts
  • git diff --check
  • npm run check could not complete because the current project baseline reports 499 existing errors unrelated to this PR.
  • SASS_PATH=. npm run build compiles the app bundles, then fails during prerender because sharp was installed with scripts disabled and its native binary is unavailable in this local environment.

Development

Successfully merging this pull request may close these issues.

Output Projection missing (c_proj from Python code)