20 changes: 19 additions & 1 deletion backend/backend.md
@@ -565,7 +565,10 @@ response:
```text
POST /v1/judge
```
make sure to add your `GROQ_API_KEY` to `.env` .

Uses another LLM (Llama 3.3 70B served via Groq) to evaluate whether the predicted token is factually correct and plausible given the input.

Add your `GROQ_API_KEY` to `.env`.

request body:

@@ -586,6 +589,21 @@ response:
"passed": true
}
```
**Parameters:**

- `input_text` (required): input text
- `generated_text` (required): text generated by the pre-existing model
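
As a rough illustration of the two required parameters, the sketch below builds (but does not send) a `POST /v1/judge` request using only the Python standard library. The `BASE_URL` value and the `build_judge_request` helper are assumptions made for this example; the document does not say where the backend is served or how clients are expected to call it.

```python
import json
import urllib.request

# Hypothetical base URL: the document does not state where the backend runs.
BASE_URL = "http://localhost:8000"


def build_judge_request(input_text: str, generated_text: str) -> urllib.request.Request:
    """Build (without sending) a POST /v1/judge request.

    Both fields are documented as required, so empty values are rejected here.
    """
    if not input_text or not generated_text:
        raise ValueError("input_text and generated_text are both required")

    body = json.dumps(
        {"input_text": input_text, "generated_text": generated_text}
    ).encode("utf-8")

    return urllib.request.Request(
        f"{BASE_URL}/v1/judge",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_judge_request("The capital of France is", " Paris")
print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it, assuming the backend is actually running at the assumed address.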

**Interpreting Results:**

- `score`: float between 0–1 representing hallucination risk (higher = more likely hallucinated)
- `conclusion`: `"low"` / `"medium"` / `"high"` based on score thresholds
  - **Low (score < 0.3)** ✅ Prediction is factually correct and plausible
  - **Medium (score 0.3–0.7)** ⚠️ Prediction is uncertain
  - **High (score > 0.7)** 🚨 Prediction is likely incorrect or hallucinated
- `reason`: explanation from the judge model
- `passed`: `true` if score is below the threshold (0.5), meaning the prediction is acceptable
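
The thresholds above can be sketched as a small client-side helper. This is only an illustration of the documented bands, not the backend's actual implementation; the function names and constants are hypothetical, and treating exactly 0.3 and 0.7 as "medium" is an assumption read from the "0.3–0.7" range.

```python
# Thresholds taken from the documentation above; names are hypothetical.
LOW_UPPER_BOUND = 0.3    # score < 0.3  -> "low"
HIGH_LOWER_BOUND = 0.7   # score > 0.7  -> "high"
PASS_THRESHOLD = 0.5     # score < 0.5  -> passed


def conclusion_for(score: float) -> str:
    """Map a hallucination-risk score in [0, 1] to low / medium / high."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be between 0 and 1, got {score}")
    if score < LOW_UPPER_BOUND:
        return "low"
    if score > HIGH_LOWER_BOUND:
        return "high"
    # Assumption: the boundary values 0.3 and 0.7 fall in the medium band.
    return "medium"


def has_passed(score: float) -> bool:
    """A prediction is acceptable when its score is below the pass threshold."""
    return score < PASS_THRESHOLD


for example_score in (0.1, 0.4, 0.6, 0.9):
    print(example_score, conclusion_for(example_score), has_passed(example_score))
```

Note that `conclusion` and `passed` use different cut-offs, so a "medium" result can either pass (for example 0.4) or fail (for example 0.6).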


### SGI (Semantic Grounding Index)

1 change: 1 addition & 0 deletions notes/language-support-implementation.md
@@ -1,5 +1,6 @@
# Multilingual support for Transformer Visualizer

This summarises how we would implement multilingual support without TransformerLens.
To generalize to different languages, we have to:
1. In the backend (for Transformer Explainer, in src/utils/model, change `modelname`), export the model to ONNX if we want to run it in the browser. If not, and we want to do everything in the backend, use AutoModelForCausalLM.
2. Use a separate tokenizer as well, loaded with AutoTokenizer (part of the Transformers library).
10 changes: 3 additions & 7 deletions notes/multilingual-LLM.md
@@ -1,9 +1,8 @@
# Language Support in Transformer Lens

## Quick Summary
- TransformerLens just uses model weights as-is;it doesn't modify how the model handles multilingualism, it just lets you inspect it!
- MWork: LLMs process non-English input in three stages: convert to English-centric representations --> reason in English --> convert back to original language
- It's not actual translation
- TransformerLens uses the model weights as-is; it doesn't modify how the model handles multilingualism, it just lets you inspect it.
- MWork: LLMs process non-English input in three stages: convert to English-centric representations --> reason in English --> convert back to the original language. Note that the model doesn't actually translate the input.
- Self-attention handles reasoning (English-centric) and FFNs handle factual knowledge retrieval (multilingual); the two are separable.
- Deactivating just 0.13% of neurons can destroy multilingual capabilities entirely. This shows how concentrated the language-switching work is.
- Neuron behaviour is input-dependent, not fixed. A "language-specific" neuron per MWork might behave differently on a different input.
@@ -19,7 +18,7 @@ So basically, Transformer Lens doesn't deal with how the models interact with mu

## MWork - Multilingual Workflow [2]
High-level: translate to English-centric (not exactly English) --> reason/task-solving (English-centric + some non-English tokens) --> translate back to the original language
LLMs used: Mistral, Vicuna, BLOOMZ, Chinese Llama
LLMs used in the paper: Mistral, Vicuna, BLOOMZ, Chinese Llama

The authors hypothesize that the process, known as MWork, is how multilingual models work.
They test the hypothesis using PLND (Parallel Language-specific Neuron Detection), which finds language-specific neurons that are consistently activated when processing documents in a specific language.
@@ -63,9 +62,6 @@ Also, they found that the model is doing roughly the same amount of work per lay
Although all-shared neurons make up only ~20% of neurons in BLOOM (~30% in BLOOMZ; BLOOM-560m, the model we're using, may have a lower percentage since it doesn't have IFT), they are the top contributing neurons to the outputs at every layer, regardless of the input language. Specifically, "they contribute 91.6% to the generation of the correct output in the German test set". So, language-specific neurons are more about surface-level switching than actual correctness.
- They then propose at the end that increasing all-shared neurons (via replacing or IFT, instruction fine-tuning) can "significantly enhance the accuracy of an LLM in multilingual tasks."


### another paper to explore (if time permits): https://arxiv.org/pdf/2502.15603

Sources: \
[1] https://transformerlensorg.github.io/TransformerLens/ \
[2] https://arxiv.org/pdf/2402.18815 \