Merged
13 changes: 2 additions & 11 deletions experiment/aim.md
@@ -1,12 +1,3 @@
-To understand and demonstrate the application of the Viterbi algorithm for Part-of-Speech (POS) tagging in Natural Language Processing. This experiment provides hands-on experience with the Viterbi decoding process, which is a fundamental dynamic programming algorithm used to find the most likely sequence of hidden states (POS tags) given observable sequences (words) in Hidden Markov Models.
+**To understand and practice sequence decoding for Part-of-Speech (POS) tagging using the Viterbi algorithm in Natural Language Processing.**

-The Viterbi algorithm is crucial in statistical NLP for solving the decoding problem: given a sequence of words and pre-computed emission and transition probabilities from a training corpus, determine the most probable sequence of POS tags that generated those words. This experiment allows learners to practice filling Viterbi tables step-by-step and understand how dynamic programming efficiently finds optimal tag sequences.
-
-For example, given the sentence "Book a park", the algorithm determines whether "Book" should be tagged as a noun or verb, considering both:
-
-- **Emission probabilities**: How likely each word is to be generated by each POS tag
-- **Transition probabilities**: How likely each POS tag is to follow another in sequence
-
-Through interactive simulation, learners will master the mathematical foundations of the Viterbi algorithm and its practical application in modern POS tagging systems.
-
-<img src="images/viterbi-4.gif" alt="Viterbi Decoding Animation" style="display:block;margin:auto;max-width:400px;">
+This experiment aims to help students develop proficiency in applying the Viterbi algorithm to find the most probable sequence of POS tags for a given sentence, using emission and transition probabilities. Through interactive exercises, learners will gain hands-on experience with dynamic programming and sequence labeling in NLP.
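
For reference, the dynamic-programming recurrence behind the table-filling exercise can be sketched as follows. The notation is an assumption and does not appear in the experiment files: $\pi_j$ is the initial probability of tag $j$, $a_{ij} = P(\text{tag}_j \mid \text{tag}_i)$ is a transition probability, $b_j(w_t) = P(w_t \mid \text{tag}_j)$ is an emission probability, $v_t(j)$ is the score of the best tag sequence ending in tag $j$ at word $t$, and $\mathrm{bp}_t(j)$ is the backpointer used to recover that sequence.

$$
\begin{aligned}
v_1(j) &= \pi_j \, b_j(w_1) \\
v_t(j) &= \max_{i} \; v_{t-1}(i) \, a_{ij} \, b_j(w_t), \qquad t = 2, \dots, T \\
\mathrm{bp}_t(j) &= \arg\max_{i} \; v_{t-1}(i) \, a_{ij}
\end{aligned}
$$

The most probable tag sequence is read off by starting from $\arg\max_j v_T(j)$ and following the backpointers back to the first word.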
8 changes: 4 additions & 4 deletions experiment/assignment.md
@@ -14,21 +14,21 @@

**Emission Matrix P(word|tag):**

-```
+<pre>
         The    dog    runs
Noun     0.1    0.6    0.1
Verb     0.0    0.1    0.8
Det      0.9    0.0    0.0
-```
+</pre>

**Transition Matrix P(tag_j|tag_i):**

-```
+<pre>
         Noun   Verb   Det
Noun     0.3    0.4    0.1
Verb     0.4    0.1    0.2
Det      0.7    0.2    0.1
-```
+</pre>

Assume equal initial probabilities π[tag] = 1/3 for all tags.
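
One way to check a hand-filled Viterbi table for this exercise is a minimal Python sketch like the one below. It is not part of the experiment's code: the function name `viterbi` and the dictionary layout are assumptions made for illustration. The numbers are copied from the emission and transition matrices above, with rows of the transition matrix read as the previous tag and columns as the next tag.

```python
# Hypothetical helper for checking a hand-filled Viterbi table.
# Values are copied from the assignment's matrices; names are illustrative.
TAGS = ["Noun", "Verb", "Det"]

# EMISSION[tag][word] = P(word | tag)
EMISSION = {
    "Noun": {"The": 0.1, "dog": 0.6, "runs": 0.1},
    "Verb": {"The": 0.0, "dog": 0.1, "runs": 0.8},
    "Det":  {"The": 0.9, "dog": 0.0, "runs": 0.0},
}

# TRANSITION[prev_tag][next_tag] = P(next_tag | prev_tag)
TRANSITION = {
    "Noun": {"Noun": 0.3, "Verb": 0.4, "Det": 0.1},
    "Verb": {"Noun": 0.4, "Verb": 0.1, "Det": 0.2},
    "Det":  {"Noun": 0.7, "Verb": 0.2, "Det": 0.1},
}

# Equal initial probabilities, as stated in the assignment.
INITIAL = {tag: 1.0 / 3.0 for tag in TAGS}


def viterbi(words, tags=TAGS, initial=INITIAL,
            transition=TRANSITION, emission=EMISSION):
    """Return (best_tag_sequence, probability) for a list of words."""
    # First column: initial probability times emission of the first word.
    scores = [{t: initial[t] * emission[t][words[0]] for t in tags}]
    backpointers = []

    for word in words[1:]:
        prev = scores[-1]
        column, pointers = {}, {}
        for t in tags:
            # Best previous tag for reaching tag t at this position.
            best_prev = max(tags, key=lambda p: prev[p] * transition[p][t])
            column[t] = prev[best_prev] * transition[best_prev][t] * emission[t][word]
            pointers[t] = best_prev
        scores.append(column)
        backpointers.append(pointers)

    # Pick the best final tag, then follow backpointers to the first word.
    last = max(tags, key=lambda t: scores[-1][t])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, scores[-1][last]


if __name__ == "__main__":
    best_path, best_prob = viterbi(["The", "dog", "runs"])
    print(best_path, best_prob)
```

Under these assumptions the sketch decodes "The dog runs" as Det, Noun, Verb with a probability of about 0.04032, which can be compared against the final column of the hand-filled table. Ties are broken in favour of the tag listed first in `TAGS`, a choice the assignment does not specify.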
