.Prototypes

Main prototypes, designed in 2009-2014:

 

  • Lithuanian Large Vocabulary Continuous Speech Recognition (LVCSR) engine able to recognize over 1.2 million Lithuanian word forms.
  • Lithuanian Speech corpus (accessible to the public at http://donelaitis.vdu.lt/garsynas).
  • Lithuanian word stress assignment web service, also known as “Accentuator” (accessible to the public at http://tekstynas.vdu.lt).
  • Lexical Lithuanian language ontology (Wordnet-type), containing Lithuanian language words and their paradigmatic and syntagmatic links.
  • Lithuanian Internet corpus of about 800 million running words.
  • Prototype for automated sentiment analysis in Lithuanian texts.
  • Lithuanian Soundex (an algorithm for phonetic encoding of names).
  • Lithuanian Stemmer (an algorithm for reducing inflected Lithuanian word forms to their stems).
  • Language identification for the Lithuanian, Azeri, Russian, Belarusian, and Ukrainian languages.
  • Prototype for automated morphological analysis and lemmatization of Lithuanian texts.
  • Named entity recognition (NER) annotated corpora in the Lithuanian, Azeri, and Russian languages.
  • BHave – prototype simulator for Behavioural Hybrid Process Calculus.
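
For readers unfamiliar with the phonetic name encoding mentioned in the list above, the following is a minimal sketch of the classic (American English) Soundex scheme. It is only an illustration of the general idea: the Lithuanian Soundex prototype presumably uses its own letter groups and rules, which are not described here. The function name `soundex` and the choice to ignore non-ASCII letters are assumptions made for this example.

```python
# Classic American Soundex, shown only to illustrate phonetic encoding.
# NOT the Lithuanian variant; its rules are not given in this document.

# Consonant groups of the classic scheme, mapped to their digit codes.
_GROUPS = (
    ("bfpv", "1"),
    ("cgjkqsxz", "2"),
    ("dt", "3"),
    ("l", "4"),
    ("mn", "5"),
    ("r", "6"),
)
_CODES = {ch: digit for letters, digit in _GROUPS for ch in letters}


def soundex(name: str) -> str:
    """Return the 4-character classic Soundex code of `name`."""
    # Assumption for this sketch: keep ASCII letters only, so letters
    # with diacritics are simply ignored.
    letters = [c for c in name.lower() if c.isascii() and c.isalpha()]
    if not letters:
        return ""

    first = letters[0]
    result = [first.upper()]
    prev = _CODES.get(first, "")  # the first letter's code also blocks repeats

    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w are skipped and do not separate equal codes
        code = _CODES.get(ch, "")  # vowels (and y) have no code
        if code and code != prev:
            result.append(code)
        prev = code  # a vowel resets prev, so a repeated code is kept

    # Pad with zeros and truncate to one letter plus three digits.
    return ("".join(result) + "000")[:4]
```

Under this scheme, similar-sounding names such as "Robert" and "Rupert" receive the same code, which is what makes the encoding useful for matching names despite spelling variation.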