

RANLP’2009 Workshop: A Knowledge-Rich Approach to Measuring the Similarity between Bulgarian and Russian Words

Today I presented a paper on measuring a modified orthographic similarity between Bulgarian and Russian words at the workshop “Multilingual Resources, Technologies and Evaluation for Central and Eastern European Languages”, held in conjunction with the RANLP’2009 conference. The paper, titled “A Knowledge-Rich Approach to Measuring the Similarity between Bulgarian and Russian Words”, is a small part of my PhD thesis.

Abstract

We propose a novel knowledge-rich approach to measuring the similarity between a pair of words. The algorithm is tailored to Bulgarian and Russian and takes into account the orthographic and the phonetic correspondences between the two Slavic languages: it combines lemmatization, hand-crafted transformation rules, and weighted Levenshtein distance. The experimental results show an 11-pt interpolated average precision of 90.58%, which represents a significant improvement over two classic rivaling approaches.
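
To give a feel for the weighted Levenshtein component mentioned in the abstract, here is a minimal Python sketch. It is not the implementation from the paper: the substitution weights in `SUB_COST`, the uniform cost of 1.0 for insertions and deletions, and the normalization by the longer word's length are all assumptions chosen for illustration. Lemmatization and the hand-crafted transformation rules are left out entirely.

```python
from typing import Dict, Tuple

# Hypothetical substitution costs for letter pairs that often correspond
# between Bulgarian and Russian. These values are illustrative assumptions,
# not the weights used in the paper.
SUB_COST: Dict[Tuple[str, str], float] = {
    ("ъ", "о"): 0.5,
    ("е", "э"): 0.5,
    ("и", "ы"): 0.5,
}


def sub_cost(a: str, b: str) -> float:
    """Cost of replacing letter a with letter b (symmetric lookup)."""
    if a == b:
        return 0.0
    return SUB_COST.get((a, b), SUB_COST.get((b, a), 1.0))


def weighted_levenshtein(s: str, t: str) -> float:
    """Edit distance with per-pair substitution costs.

    Insertions and deletions cost 1.0 each (an assumption of this sketch).
    """
    # prev[j] holds the distance between the processed prefix of s and t[:j].
    prev = [float(j) for j in range(len(t) + 1)]
    for i, a in enumerate(s, start=1):
        curr = [float(i)]
        for j, b in enumerate(t, start=1):
            curr.append(min(
                prev[j] + 1.0,               # delete a from s
                curr[j - 1] + 1.0,           # insert b into s
                prev[j - 1] + sub_cost(a, b) # substitute a with b
            ))
        prev = curr
    return prev[-1]


def similarity(s: str, t: str) -> float:
    """Map the distance to a score in [0, 1] by dividing by the longer length."""
    longest = max(len(s), len(t))
    if longest == 0:
        return 1.0
    return 1.0 - weighted_levenshtein(s, t) / longest


if __name__ == "__main__":
    # Bulgarian "сън" and Russian "сон" differ only in a cheap ъ/о substitution.
    print(similarity("сън", "сон"))
```

With these assumed weights, a pair that differs only by a known letter correspondence scores higher than it would under the plain, unweighted edit distance, which is the intuition behind weighting the substitutions.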

Download

Download the article: RANLP2009-Workshop-Nakov-Paskaleva-Nakov-MMEDR-Similarity-Bulgarian-Russian-Words.pdf

Download the presentation: RANLP-2009-Workshop-Nakov-Paskaleva-Nakov-MMEDR-Similarity-Bulgarian-Russian.ppt
