Prototypes
Main prototypes, designed in 2009-2014:
- Lithuanian Large Vocabulary Continuous Speech Recognition (LVCSR) engine, able to recognize over 1.2 million word forms of the Lithuanian language.
- Lithuanian Speech corpus (accessible to the public at http://donelaitis.vdu.lt/garsynas).
- Lithuanian word stress assignment web service, a.k.a. “Accentuator” (accessible to the public at http://tekstynas.vdu.lt).
- Lexical ontology of the Lithuanian language (WordNet-type), containing Lithuanian words and their paradigmatic and syntagmatic links.
- Lithuanian Internet corpus of about 800 million running words.
- Prototype for automated sentiment analysis in Lithuanian texts.
- Lithuanian Soundex (an algorithm for phonetic encoding of names).
- Lithuanian Stemmer (an algorithm for reducing inflected Lithuanian word forms to their stems).
- Language identification for the Lithuanian, Azeri, Russian, Belarusian, and Ukrainian languages.
- Prototype for automated morphological analysis and lemmatization of Lithuanian texts.
- Named entity recognition (NER) annotated corpora in the Lithuanian, Azeri, and Russian languages.
- BHave – prototype simulator for Behavioural Hybrid Process Calculus.
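
For readers unfamiliar with the phonetic encoding of names mentioned in the Soundex item above, here is a minimal sketch of the classic English (American) Soundex. It is not the Lithuanian variant listed here, whose rules are not described in this list; the function name `soundex` and the letter table are illustrative assumptions, and letters outside a-z (including Lithuanian letters such as š or ž) are simply ignored.

```python
# Classic English Soundex, shown only to illustrate phonetic name encoding.
# Assumption: this is NOT the Lithuanian Soundex prototype from the list.

# Map each consonant to its Soundex digit.
CODES = {}
for digit, letters in {"1": "bfpv", "2": "cgjkqsxz", "3": "dt",
                       "4": "l", "5": "mn", "6": "r"}.items():
    for ch in letters:
        CODES[ch] = digit


def soundex(name: str) -> str:
    """Return the 4-character Soundex code of a name (ASCII letters only)."""
    letters = [ch for ch in name.lower() if "a" <= ch <= "z"]
    if not letters:
        return ""
    result = letters[0].upper()
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same code
        code = CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # vowels reset prev, so a repeated code is kept after them
    return (result + "000")[:4]  # pad with zeros, then cut to 4 characters


print(soundex("Robert"), soundex("Rupert"))  # both names get the same code
```

Names that sound alike in English receive the same code, which is what makes such encodings useful for fuzzy name matching; a language-specific variant would need its own letter table.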