diff --git a/backend/backend.md b/backend/backend.md
index 758fcd3..ab5318b 100644
--- a/backend/backend.md
+++ b/backend/backend.md
@@ -565,7 +565,10 @@ response:
 ```text
 POST /v1/judge
 ```
-make sure to add your `GROQ_API_KEY` to `.env` .
+
+Uses another LLM (Llama 3.3 70B served via Groq) to evaluate whether the predicted token is factually correct and plausible given the input.
+
+Add your `GROQ_API_KEY` to `.env`.
 
 request body:
 
@@ -586,6 +589,21 @@ response:
   "passed": true
 }
 ```
+**Parameters:**
+
+- `input_text` (required): input text
+- `generated_text` (required): text generated by the pre-existing model
+
+**Interpreting Results:**
+
+- `score`: float between 0 and 1 representing hallucination risk (higher = more likely hallucinated)
+- `conclusion`: `"low"` / `"medium"` / `"high"` based on score thresholds
+  - **Low (score < 0.3)** ✅ Prediction is factually correct and plausible
+  - **Medium (score 0.3–0.7)** ⚠️ Prediction is uncertain
+  - **High (score > 0.7)** 🚨 Prediction is likely incorrect or hallucinated
+- `reason`: explanation from the judge model
+- `passed`: `true` if score is below the threshold (0.5), meaning the prediction is acceptable
+
 
 ### SGI (Semantic Grounding Index)
 
diff --git a/notes/language-support-implementation.md b/notes/language-support-implementation.md
index 5c89401..608f20d 100644
--- a/notes/language-support-implementation.md
+++ b/notes/language-support-implementation.md
@@ -1,5 +1,6 @@
 # Multilingual support for Transformer Visualizer
 
+This summarises how we would implement multilingual support without TransformerLens.
 To generalize, for different langauges, we have to:
 1. In the backend (for Transformer Explainer, src/utils/model, change `modelname`), export the model to ONNX if we want to run on the browser. If not, and if we want to do everything in the backend, use AutoModelForCausalLM .
 2. we would have to use a separate tokenizer too using AutoTokenizer (usually included in the Transformers library)
diff --git a/notes/multilingual-LLM.md b/notes/multilingual-LLM.md
index 41c731d..dc68a89 100644
--- a/notes/multilingual-LLM.md
+++ b/notes/multilingual-LLM.md
@@ -1,9 +1,8 @@
 # Language Support in Transformer Lens
 
 ## Quick Summary
-- TransformerLens just uses model weights as-is;it doesn't modify how the model handles multilingualism, it just lets you inspect it!
-- MWork: LLMs process non-English input in three stages: convert to English-centric representations --> reason in English --> convert back to original language
-  - It's not actual translation
+- TransformerLens just uses model weights as-is; it doesn't modify how the model handles multilingualism, it just lets you inspect it.
+- MWork: LLMs process non-English input in three stages: convert to English-centric representations --> reason in English --> convert back to original language. Note that the model doesn't actually translate the input.
 - Self-attention handles reasoning (English-centric), FFNs handle factual knowledge retrieval (multilingual), these are separable
 - Just 0.13% of neurons being deactivated can destroy multilingual capabilities entirely. This shows how concentrated the language-switching work is
 - Neuron behaviour is input dependent, not fixed. A "language-specific" neuron per MWork might behave differently on a different input.
@@ -19,7 +18,7 @@ So basically, Transformer Lens doesn't deal with how the models interact with mu
 ## MWork - Multilingual Workflow [2]
 High-level: translate to English-centric (not exactly English) --> reason/task-solving (English-centric + some non-English tokens) --> translate back to the original language
 
-LLMs used: Mistral, Vicuna, BLOOMZ, Chinese Llama
+LLMs used in the paper: Mistral, Vicuna, BLOOMZ, Chinese Llama
 
 The authors hypothesize that the process, known as MWork, is how multilingual models work. They test the hypothesis using PLND (Parallel Language-specific Neuron Detection), which finds language-specific neurons that are consistently activated when processing documents in a specific language.
 
@@ -63,9 +62,6 @@ Also, they found that the model is doing roughly the same amount of work per lay
 Although all-shared only makes up of ~20% of neurons in BLOOM (~30% in BLOOMZ, BLOOM-560m, the model we're using may have a lower percentage since it doesn't have IFT), regardless the language inputs, all-shared neurons are the top contributing neurons to the outputs at every layer. Specifically, "they contribute 91.6% to the generation of the correct output in the German test set". So, language-specific neurons are more about surface-level switching than actual correctness.
 
 - They then propose at the end that increasing all-shared neurons (via replacing or IFT, instruction fine-tuning) can "significantly enhance the accuracy of an LLM in multilingual tasks."
-
-### another paper to explore (if time permits): https://arxiv.org/pdf/2502.15603
-
 Sources: \
 [1] https://transformerlensorg.github.io/TransformerLens/ \
 [2] https://arxiv.org/pdf/2402.18815 \