Home

Márton Makrai edited this page May 30, 2016 · 17 revisions

Overview of the project

determining the number of word senses

(Gábor) from monolingual dictionaries (Longman, Collins)
- the number of senses attributed by the lexicographers
- comparing 4lang graphs of different meanings
(Gábor) from bilingual dictionaries
(Dávid and Marci) with non-uniform multi-prototype embeddings (MPE)
attic: (Dávid and Marci) for different sense counts k, train embeddings that are k-uniform or contain at most k senses, and examine whether more senses describe the data better

further steps

correlation between the ambiguity levels of each word computed from different sources. Dávid: KL divergence or Spearman? Both make some sense after all, though the latter perhaps not that much
compare the distributions of word sense numbers
- parametric families of distributions: gamma and beta distribution
translation from MPE
- test whether they are better than single-prototype embeddings
- attic: annotate a corpus, for example the UMBC WebBase corpus, with disambiguating labels, then train a plain embedding on the disambiguated corpus
Alexander Panchenko (2016) Best of Both Worlds: Making Word Sense Embeddings Interpretable
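
The two candidate measures for comparing ambiguity levels across sources (Spearman and KL) can be sketched as follows. This is a minimal illustration, not the project's code: the sense counts are made-up toy data, the function names are hypothetical, and the add-one smoothing in the KL computation is an assumption chosen only to keep the divergence finite. Spearman compares the per-word rankings, while KL compares the two histograms of sense counts and ignores which word has which count. The Spearman function assumes neither list is constant.

```python
from collections import Counter
from math import log, sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # average of 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman's rho: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


def kl_divergence(counts_p, counts_q, smoothing=1.0):
    """KL(P || Q) between the histograms of two lists of sense counts.

    Additive smoothing (an assumption) keeps Q nonzero on the joint support.
    """
    support = sorted(set(counts_p) | set(counts_q))
    hist_p, hist_q = Counter(counts_p), Counter(counts_q)
    total_p = len(counts_p) + smoothing * len(support)
    total_q = len(counts_q) + smoothing * len(support)
    kl = 0.0
    for k in support:
        p = (hist_p[k] + smoothing) / total_p
        q = (hist_q[k] + smoothing) / total_q
        kl += p * log(p / q)
    return kl


# Toy sense counts for the same five words from two hypothetical sources.
dictionary_senses = [3, 9, 4, 1, 12]
embedding_senses = [2, 5, 2, 1, 6]

rho = spearman(dictionary_senses, embedding_senses)
kl = kl_divergence(dictionary_senses, embedding_senses)
print(f"Spearman rho = {rho:.3f}, KL = {kl:.3f}")
```

Because KL only sees the two marginal distributions, it fits the "compare the distributions" step better than the per-word correlation step, where a rank correlation uses the word-level pairing.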

graph of word senses

H Youn, L Sutton, E Smith, C Moore, JF Wilkins, I Maddieson, W Croft and T Bhattacharya (2016) On the universal structure of human lexical semantics
large connected component: Johannes Dellert () Evaluating Cross-Linguistic Polysemies as a Model of Semantic Change for Cognate Finding
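
As a rough illustration of the connected-component note above, the largest connected component of a word-sense graph can be found with a breadth-first search. This is a sketch under stated assumptions: the edge list is invented toy data, the concept labels are hypothetical, and the function name is not taken from any of the cited works.

```python
from collections import deque


def largest_connected_component(edges):
    """Return the largest connected component of an undirected graph as a set."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)

    seen = set()
    best = set()
    for start in adjacency:
        if start in seen:
            continue
        # Breadth-first search collects every node reachable from `start`.
        component = {start}
        seen.add(start)
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    component.add(neighbour)
                    queue.append(neighbour)
        if len(component) > len(best):
            best = component
    return best


# Toy polysemy links (hypothetical): two concepts are joined when some
# language expresses both with the same word.
polysemy_edges = [
    ("SUN", "DAY"),
    ("DAY", "SKY"),
    ("MOON", "MONTH"),
    ("SKY", "HEAVEN"),
]
giant = largest_connected_component(polysemy_edges)
print(sorted(giant))
```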