Lemmagen slovenian lemmatizer available for python

Jernej Virag

November 6, 2013

In last days I've managed to finish my wrestling with Pythons awful packaging systems and have managed to publish Lemmagen lemmatizer to Python Pypi repository.

To install it just run

pip install Lemmagen

inside your favorite Python/virtualenv environment. Note that installation requires a working C++ compiler for your platform.

Then to use it, instantiate Lemmatizer class and call lemmatize() on it. By default the lemmatizer is instantiated with slovenian dictionary, others are avaiable via dictionary keyword argument to the constructor.

Example:


import lemmagen.lemmatizer
from lemmagen.lemmatizer import Lemmatizer

lemmatizer = Lemmatizer(dictionary=lemmagen.DICTIONARY_SLOVENE)
print lemmatizer.lemmatize("hodimo")

The source of the bindings is available as a part of Lemmagen slovene_lemmatizer project. The project is licensed under LGPLv2.