Petro Orynycz

Blog

  • Lemko земля ⟨zemlja⟩ 'earth’

    Meaning

    The Lemko noun земля ⟨zemlja⟩ is translatable into English as „earth”, „ground”, or „floor”, depending on the context. It translates into Polish as ziemia.

    How to Pronounce and Memorize

    The first syllable is pronounced like English zen, but with an ⟨m⟩ sound at the end. The second syllable is pronounced as in „la la la”. To memorize, imagine a zen monk meditating in mud outside and saying „La la la, I can’t hear you!”

    Etymology

    The Lemko noun земля ⟨zemlja⟩ 'earth’ comes from Proto-Slavic *zemļà (Derksen, 2008, p. 542). Cognates include Old Church Slavonic землꙗ (ⰸⰵⰿⰾⱑ) ⟨zemlja⟩, Avestan 𐬰𐬃‎ ⟨zā̊⟩ 'earth’ (accusative form 𐬰𐬆𐬨‎ ⟨zəm⟩), Sanskrit क्ष ⟨kṣá⟩ 'earth’, Persian زمین‎ ⟨zamin⟩ 'earth’, Ancient Greek χθών ⟨khthṓn⟩ 'earth’, Hittite 𒋼𒂊𒃷 ⟨tēkan⟩ (genitive 𒁖𒈾𒀸 ⟨taknas⟩), Latin humus 'ground’, and Ancient Greek χαμαί ⟨khamaí⟩ 'on the ground’ (Vasmer, 1953, pp. 452–453; see also Derksen, 2008, p. 542, and Pokorny, 1959, p. 415).

    Declension

    Lemko земля ⟨zemlja⟩ is a soft, first-declension noun that declines as follows:

    Singular

    Case | Lemko | Polish | Ukrainian | Russian
    Nom | зе́мля ⟨zémlja⟩ | ziemia | земля́ | земля́
    Gen | зе́млі ⟨zémli⟩ | ziemi | землі́ | земли́
    Dat | зе́мли ⟨zémly⟩ᵃ | ziemi | землі́ | земле́
    Acc | зе́млю ⟨zémlju⟩ | ziemię | зе́млю | зе́млю
    Ins | зе́мльом ⟨zémlʹom⟩ | ziemią | земле́ю | землёй
    Loc | зе́мли ⟨zémly⟩ᵃ | ziemi | землі́ | земле́
    Voc | зе́мльо ⟨zémlʹo⟩ᵇ | ziemio | зе́мле |
    The singular declension of the Lemko soft first declension noun земля ⟨zemlja⟩ 'earth’ compared to its Polish, Ukrainian, and Russian cognates.

    a Pyrtej (2013, p. 38) gives зе́млі ⟨zémli⟩ as the dative and locative singular forms, yet Fontański and Chomiak (2000, p. 64) provide зе́мли ⟨zémly⟩.

    b Fontański and Chomiak (2000, p. 64) provide земле ⟨zemle⟩ as an alternative vocative singular form.

    Plural

    Case | Lemko | Polish | Ukrainian | Russian
    Nom | зе́млі ⟨zémli⟩ | ziemie | зе́млі | зе́мли
    Gen | зе́мель ⟨zémelʹ⟩ | ziem | земе́ль | земе́ль
    Dat | зе́млям ⟨zémljam⟩ | ziemiom | зе́млям | зе́млям
    Acc | зе́млі ⟨zémli⟩ | ziemie | зе́млі | зе́мли
    Ins | земля́ми ⟨zemljámy⟩ | ziemiami | зе́млями | зе́млями
    Loc | зе́млях ⟨zémljax⟩ | ziemiach | зе́млях | зе́млях
    Voc | зе́млі ⟨zémli⟩ | ziemie | зе́млі |
    The plural declension of the Lemko soft first declension noun земля ⟨zemlja⟩ 'earth’ compared to its Polish, Ukrainian, and Russian cognates.

    References

    Derksen, Rick. (2008). In Lubotsky, A. (Ed.), Leiden Indo-European Etymological Dictionary Series: Vol. 4. Etymological Dictionary of the Slavic Inherited Lexicon. Koninklijke Brill NV. https://brill.com/view/title/12607

    Fontański, H., Chomiak, M. (2000). Ґраматыка лемківского языка [Grammar of the Lemko Language]. Śląsk.

    Pokorny, Julius. (1959). Indogermanisches Etymologisches Wörterbuch [Indo-Germanic Etymological Dictionary]. A. Francke AG Verlag Bern.

    Pyrtej, P. (2013). Лемківські говірки. Фонетика і морфологія [Lemko Dialects. Phonetics and Morphology]. Обʼєднання лемків [Lemko Union].

    Vasmer, M. (1953). Russisches Etymologisches Wörterbuch, Erster Band: A – K [Russian Etymological Dictionary, Volume One: A – K]. Carl Winter Universitätsverlag.

  • Lemko рік ⟨rik⟩ 'year’

    Learn the meaning, origin, and morphology of the Lemko masculine noun рік ⟨rik⟩, as well as how to memorize it.

    Translation

    The forms of the Lemko word рік ⟨rik⟩ listed below are translatable into English as „year” or „years”.

    Mnemonic

    To memorize the Lemko word рік ⟨rik⟩, English speakers might imagine something reeking at a New Year’s Eve party (Lemko rik and English reek are pronounced practically the same).

    Etymology

    From Proto-Slavic *rokŭ 'time’, itself a deverbal noun from *rekti 'say’, whose cognates include Old Church Slavonic рокъ (ⱃⱁⰽⱏ) ⟨rokŭ⟩ 'time, term’, as well as possibly English reckon, Sanskrit रचयति ⟨racáyati⟩ „construct, work”, Gothic 𐍂𐌰𐌷𐌽𐌾𐌰𐌽 ⟨rahnjan⟩ 'reckon’ (Pokorny 1959, p. 863, see also Vasmer, 1955, p. 532) and Welsh rhegi 'curse’ (Derksen, 2008, pp. 433, 438).

    The entry for the Proto-Slavic noun *rokъ on page 438 of Derksen’s Etymological Dictionary of the Slavic Inherited Lexicon.
    The entry for the Proto-Slavic verb *rekti on page 433 of Derksen’s Etymological Dictionary of the Slavic Inherited Lexicon.
    The entry rē̆k- on page 863 of Pokorny’s Indo-Germanic Etymological Dictionary (1959), which mentions Old Church Slavonic rokъ.
    The entry for the Muscovite Russian noun рок ⟨rok⟩ in Vasmer’s Russian Etymological Dictionary (1955, p. 532), which mentions Ukrainian rik.

    Declension

    Singular

    Case | Lemko | Polish | Ukrainian | Russian
    Nominative | рік ⟨rik⟩ᵃ | rok | рік | го́д
    Genitive | ро́ка ⟨róka⟩ᵇ | roku | ро́ку | го́да
    Dative | роко́ви ⟨rókovy⟩ᵇ | rokowi | ро́кові, ро́ку | го́ду
    Accusative | рік ⟨rik⟩ᵃ | rok | рік | го́д
    Instrumental | ро́ком ⟨rókom⟩ᶜ | rokiem | ро́ком | го́дом
    Locative | ро́ці ⟨róci⟩ᶜ | roku | ро́ці | го́де
    Vocative | року ⟨róku⟩ | roku | ро́ку | го́д

    a The nominative and accusative form of Lemko рік ⟨rik⟩ 'year’ is the same as the genitive plural of ріка ⟨rika⟩ 'river’. Horoszczak (2004, p. 330) provides the nominative and accusative singular as „рик ⟨ryk⟩, рік ⟨rik⟩”.

    b See Pyrtej (2013, p. 46) for the genitive and dative singular forms of Lemko рік ⟨rik⟩ 'year’. Photograph below.

    Table on page 46 of Pyrtej’s Lemko Dialects. Phonetics and Morphology

    c See Pyrtej (2013, p. 47) for the instrumental and locative singular forms of Lemko рік ⟨rik⟩ 'year’. Photograph below.

    Table on page 47 of Pyrtej’s Lemko Dialects. Phonetics and Morphology

    Plural

    Case | Lemko | Polish | Ukrainian | Russian
    Nominative | ро́кы ⟨rókŷ⟩ | lata | ро́ки́ | го́ды, года́, лета́
    Genitive | ро́ків ⟨rókiv⟩ | lat | ро́кі́в | годо́в, ле́т
    Dative | ро́кам ⟨rókam⟩ᵇ | latom | ро́ка́м | года́м, лета́м
    Accusative | ро́кы ⟨rókŷ⟩ | lata | ро́ки́ | го́ды, года́, лета́
    Instrumental | рока́ми ⟨rokámy⟩ | latami | ро́ка́ми | года́ми, лета́ми
    Locative | ро́ках ⟨rókach⟩ | latach | ро́ка́х | года́х, лета́х
    Vocative | ро́кы ⟨rókŷ⟩ | lata | ро́ки́ | го́ды, года́, лета́

    References

    Derksen, Rick. (2008). In Lubotsky, A. (Ed.), Leiden Indo-European Etymological Dictionary Series: Vol. 4. Etymological Dictionary of the Slavic Inherited Lexicon. Koninklijke Brill NV. https://brill.com/view/title/12607

    Fontański, H., Chomiak, M. (2000). Ґраматыка лемківского языка [Grammar of the Lemko Language]. Śląsk.

    Horoszczak, J. (2004). Słownik łemkowsko-polski, polsko-łemkowski [Lemko-Polish and Polish-Lemko Dictionary]. Rutenika.

    Pokorny, Julius. (1959). Indogermanisches Etymologisches Wörterbuch [Indo-Germanic Etymological Dictionary]. A. Francke AG Verlag Bern.

    Pyrtej, P. (2013). Лемківські говірки. Фонетика і морфологія [Lemko Dialects. Phonetics and Morphology]. Обʼєднання лемків [Lemko Union].

    Vasmer, M. (1955). Russisches Etymologisches Wörterbuch, Zweiter Band: L–Ssuda [Russian Etymological Dictionary, Volume Two: L–Ssuda]. Carl Winter Universitätsverlag.

  • Lemko Demonstrative Pronouns

    Please find below the translation, etymology, full declension tables, and references for the Lemko demonstrative pronouns тот ⟨tot⟩ meaning „this” or „these”, and тамтот ⟨tamtot⟩ meaning „that” or „those”.

    Translation

    The Lemko demonstrative pronoun of dictionary (masculine singular) form тот ⟨tot⟩ is translatable into English as „this” in the singular and „these” in the plural. When prefixed with там ⟨tam⟩ (for example, тамтот ⟨tamtot⟩), it is translatable as „that” in the singular and „those” in the plural.

    Etymology

    The Lemko demonstrative pronoun of dictionary (masculine singular) form тот ⟨tot⟩ derives from reconstructed Proto-Slavic *. Further afield, it is related to the English word that and Sanskrit तत् ⟨tat⟩ (Vasmer, 1958, p. 128), translatable as „this” and appearing in the famous line तत्त्वमसि ⟨tat tvam asi⟩, meaning „That thou art”.

    The entry for the Muscovite Russian demonstrative pronoun тот ⟨tot⟩ in Vasmer’s Russisches Etymologisches Wörterbuch, Dritter Band: Sta–Ÿ (1958, p. 128).

    Nearby („this” and „these”)

    Singular („This”)

    All of the following forms are translatable into English as „this”.

    Masculine

    Case | Lemko | Polish | Ukrainian | Russian
    Nominative | тот ⟨tot⟩ᵃ | ten | цей | э́тот
    Genitive | то́го ⟨tóho⟩ | tego | цього́ | э́того
    Dative | то́му ⟨tómu⟩ | temu | цьому́ | э́тому
    Accusative (inanimate) | тот ⟨tot⟩ᵃ | ten | цей | э́тот
    Accusative (animate) | то́го ⟨tóho⟩ | tego | цього́ | э́того
    Instrumental | тым ⟨tŷm⟩ᵇ | tym | цим | э́тим
    Locative | тым ⟨tŷm⟩ᶜ | tym | цьо́му, цім | э́том

    a Pyrtej (2013) gives той ⟨toj⟩ as an alternative form of the Lemko masculine nominative (as well as accusative inanimate) singular demonstrative pronoun (p. 107). That form is absent in Fontański & Chomiak (2000, p. 97).

    b Pyrtej (2013) gives тим ⟨tym⟩ as the Lemko form of the masculine instrumental singular demonstrative pronoun (p. 107), in contrast to the form тым ⟨tŷm⟩ appearing in Fontański & Chomiak (2000, p. 97).

    c Pyrtej (2013) gives тім ⟨tim⟩ as the Lemko form of the masculine locative singular demonstrative pronoun (p. 107), in contrast to the form тым ⟨tŷm⟩ appearing in Fontański & Chomiak (2000, p. 97).

    Feminine

    Case | Lemko | Polish | Ukrainian | Russian
    Nominative | то́та ⟨tóta⟩ᵃ | ta | ця | э́та
    Genitive | той ⟨toj⟩ | tej | ціє́ї | э́той
    Dative | тій ⟨tij⟩ | tej | цій | э́той
    Accusative | то́ту ⟨tótu⟩ᵇ | | цю | э́ту
    Instrumental | том ⟨tom⟩ | | ціє́ю | э́той, э́тою
    Locative | тій ⟨tij⟩ | tej | цій | э́той

    a Pyrtej (2013) gives та ⟨ta⟩ and та́я ⟨tája⟩ as alternative forms of the Lemko feminine nominative singular demonstrative pronoun (p. 107). Those forms are absent in Fontański & Chomiak (2000, p. 97).

    b Pyrtej (2013) gives ту ⟨tu⟩ and ту́ю ⟨túju⟩ as alternative forms of the Lemko feminine accusative singular demonstrative pronoun (p. 107). Those forms are absent in Fontański & Chomiak (2000, p. 97).

    Neuter

    Case | Lemko | Polish | Ukrainian | Russian
    Nominative | то́то ⟨tóto⟩ᵃ | to | це | э́то
    Genitive | то́го ⟨tóho⟩ | tego | цього́ | э́того
    Dative | то́му ⟨tómu⟩ | temu | цьому́ | э́тому
    Accusative | то́то ⟨tóto⟩ | to | це | э́то
    Instrumental | тым ⟨tŷm⟩ᵇ | tym | цим | э́тим
    Locative | тым ⟨tŷm⟩ᶜ | tym | цьо́му, цім | э́том

    a Pyrtej (2013) gives то ⟨to⟩ and то́є ⟨tóje⟩ as alternative forms of the Lemko neuter nominative singular demonstrative pronoun (p. 107). Those forms are absent in Fontański & Chomiak (2000, p. 97).

    b Pyrtej (2013) gives тим ⟨tym⟩ as the Lemko form of the neuter instrumental singular demonstrative pronoun (p. 107), in contrast to the form тым ⟨tŷm⟩ appearing in Fontański & Chomiak (2000, p. 97).

    c Pyrtej (2013) gives тім ⟨tim⟩ as the Lemko form of the neuter locative singular demonstrative pronoun (p. 107), in contrast to the form тым ⟨tŷm⟩ appearing in Fontański & Chomiak (2000, p. 97).

    References
    Fontański & Chomiak (2000, p. 97).
    Pyrtej (2013, p. 107).

    Plural („These”)

    The following forms are used regardless of grammatical gender and are translatable into English as „these”.

    Case | Lemko | Polish | Ukrainian | Russian
    Nominative | то́ты ⟨tótŷ⟩ | te/ci | ці | э́ти
    Genitive | тых ⟨tŷch⟩ | tych | цих | э́тих
    Dative | тым ⟨tŷm⟩ | tym | цим | э́тим
    Accusative (inanimate) | то́ты ⟨tótŷ⟩ | te | ці | э́ти
    Accusative (animate) | тых ⟨tŷch⟩ | tych | цих | э́тих
    Instrumental | ты́ма ⟨tŷma⟩ | tymi | ци́ми | э́тими
    Locative | тых ⟨tŷch⟩ | tych | цих | э́тих

    Distant („that”, „those”)

    To communicate distance from the speaker, simply prefix all of the above pronouns with Lemko там ⟨tam⟩. This is equivalent to saying „that” instead of „this” or „those” instead of „these” in English.
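The prefixing rule is mechanical enough to be sketched in a few lines of Python. This is only an illustration: the function name `distal` and the sample list are assumptions made here, while the forms themselves are the nominative ones from the tables above.

```python
# Proximal nominative forms taken from the tables above:
# masculine, feminine, and neuter singular, then the plural.
PROXIMAL = ["тот", "то́та", "то́то", "то́ты"]

def distal(form: str) -> str:
    """Prefix a proximal demonstrative with там ⟨tam⟩ to get the distal form."""
    return "там" + form

for form in PROXIMAL:
    # Prints each proximal form next to its distal counterpart.
    print(form, "->", distal(form))
```

The stress mark stays where it was, since the prefix is simply attached to the front of the existing form.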

    Singular („That”)

    CaseMasculineFeminineNeuter
    Nominative | тамтот ⟨tamtot⟩ᵃ | тамто́та ⟨tamtóta⟩ᵈ | тамто́то ⟨tamtóto⟩ᶠ
    Genitive | тамто́го ⟨tamtóho⟩ | тамтой ⟨tamtoj⟩ | тамто́го ⟨tamtóho⟩
    Dative | тамто́му ⟨tamtómu⟩ | тамтій ⟨tamtij⟩ | тамто́му ⟨tamtómu⟩
    Accusative (inanimate) | тамтот ⟨tamtot⟩ᵃ | тамто́ту ⟨tamtótu⟩ᵉ | тамто́то ⟨tamtóto⟩
    Accusative (animate) | тамто́го ⟨tamtóho⟩ | тамто́ту ⟨tamtótu⟩ᵉ | тамто́то ⟨tamtóto⟩
    Instrumental | тамтым ⟨tamtŷm⟩ᵇ | тамтом ⟨tamtom⟩ | тамтым ⟨tamtŷm⟩ᵇ
    Locative | тамтым ⟨tamtŷm⟩ᶜ | тамтій ⟨tamtij⟩ | тамтым ⟨tamtŷm⟩ᶜ

    a Pyrtej (2013) gives той ⟨toj⟩ as an alternative form of the Lemko masculine nominative (as well as accusative inanimate) singular demonstrative pronoun (p. 107). That form is absent in Fontański & Chomiak (2000, p. 97).

    b Pyrtej (2013) gives тим ⟨tym⟩ as the Lemko form of the masculine and neuter instrumental singular demonstrative pronoun (p. 107), in contrast to the form тым ⟨tŷm⟩ appearing in Fontański & Chomiak (2000, p. 97).

    c Pyrtej (2013) gives тім ⟨tim⟩ as the Lemko form of the masculine and neuter locative singular demonstrative pronoun (p. 107), in contrast to the form тым ⟨tŷm⟩ appearing in Fontański & Chomiak (2000, p. 97).

    d Pyrtej (2013) gives та ⟨ta⟩ and та́я ⟨tája⟩ as alternative forms of the Lemko feminine nominative singular demonstrative pronoun (p. 107). Those forms are absent in Fontański & Chomiak (2000, p. 97).

    e Pyrtej (2013) gives ту ⟨tu⟩ and ту́ю ⟨túju⟩ as alternative forms of the Lemko feminine accusative singular demonstrative pronoun (p. 107). Those forms are absent in Fontański & Chomiak (2000, p. 97).

    f Pyrtej (2013) gives то ⟨to⟩ and то́є ⟨tóje⟩ as alternative forms of the Lemko neuter nominative singular demonstrative pronoun (p. 107). Those forms are absent in Fontański & Chomiak (2000, p. 97).

    Plural („Those”)

    Case | Lemko | Polish | Ukrainian | Russian
    Nominative | тамто́ты ⟨tamtótŷ⟩ | tamte/tamci | ті | те
    Genitive | тамтых ⟨tamtŷch⟩ | tamtych | тих | тех
    Dative | тамтым ⟨tamtŷm⟩ | tamtym | тим | тем
    Accusative (inanimate) | тамто́ты ⟨tamtótŷ⟩ | tamte | ті | те
    Accusative (animate) | тамтых ⟨tamtŷch⟩ | tamtych | тих | тех
    Instrumental | тамты́ма ⟨tamtŷma⟩ | tamtymi | ти́ми | те́ми
    Locative | тамтых ⟨tamtŷch⟩ | tamtych | тих | тех

    References

    1. Fontański, H., Chomiak, M. (2000). Ґраматыка лемківского языка [Grammar of the Lemko Language]. Śląsk.

    2. Pyrtej, P. (2013). Лемківські говірки. Фонетика і морфологія [Lemko Dialects. Phonetics and Morphology]. Обʼєднання лемків [Lemko Union].

    3. Vasmer, M. (1958). Russisches Etymologisches Wörterbuch, Dritter Band: Sta–Ÿ [Russian Etymological Dictionary, Volume Three: Sta–Ÿ]. Carl Winter Universitätsverlag.

  • Say It Right: AI Neural Machine Translation Empowers New Speakers To Revitalize Lemko

    Abstract

    Artificial-intelligence powered neural machine translation might soon resuscitate endangered languages by empowering new speakers to communicate in real time using sentences quantifiably closer to the literary norm than those of native speakers, and starting from day one of their language reclamation journey. While Silicon Valley has been investing enormous resources into neural translation technology capable of superhuman speed and accuracy for the world’s most widely used languages, 98% have been left behind, for want of corpora: neural machine translation models train on millions of words of bilingual text, which simply do not exist for most languages, and cost upwards of a hundred thousand United States dollars per tongue to assemble.

    For low-resource languages, there is a more resourceful approach, if not a more effective one: transfer learning, which enables lower-resource languages to benefit from achievements among higher-resource ones. In this experiment, Google’s English-Polish neural translation service was coupled with my classical, rule-based engine to translate from English into the endangered, low-resource, East Slavic language of Lemko. The system achieved a bilingual evaluation understudy (BLEU) quality score of 6.28, several times better than Google Translate’s English to Standard Ukrainian (BLEU 2.17), Russian (BLEU 1.10), and Polish (BLEU 1.70) services. Finally, the fruit of this experiment, the world’s first English to Lemko translation service, was made available at the web address www.LemkoTran.com to empower new speakers to revitalize their language.

    New speakers are key to language revitalization, and the power to “say it right” in Lemko is now at their fingertips.

    Keywords: Human-Centered AI, Language Revitalization, Lemko.

    Please cite as: Orynycz, P. (2022). Say It Right: AI Neural Machine Translation Empowers New Speakers to Revitalize Lemko. In: Degen, H., Ntoa, S. (eds) Artificial Intelligence in HCI. HCII 2022. Lecture Notes in Computer Science, vol 13336. Springer, Cham. https://doi.org/10.1007/978-3-031-05643-7_37

    This version of the contribution has been accepted for publication after peer review but is not the Version of Record and does not reflect post-acceptance improvements, or any corrections. The Version of Record is available online at https://doi.org/10.1007/978-3-031-05643-7_37. Use of this Accepted Version is subject to the publisher’s Accepted Manuscript terms of use: https://www.springernature.com/gp/open-research/policies/accepted-manuscript-terms.

    1 Introduction

    1.1 Problems

    This experiment aims to contribute at the local level to the global challenge of language loss, which may be occurring at the rate of one language per day, with as few as one tongue in ten set to survive [1, p. 1329]. At press time, SIL International’s Ethnologue uses Lewis and Simons’ 2010 Expanded Graded Intergenerational Disruption Scale to estimate that 3,018 languages are endangered [2], which is 43% of the 7,001 individual living ones tallied in International Organization for Standardization standard ISO 639-3 [3]. Meanwhile, Google Translate only serves 108 [4], and Facebook, 112 [5], which is a start. Nevertheless, one less language is now underserved, as the fruit of this experiment has been deployed to a web server as a public translation service.

    New artificial intelligence technologies beckon with the promise of an aid that instantly compensates for language loss via human-computer interaction. In my previous experiment, next-generation neural engines achieved higher quality scores translating from Russian and Polish into English than the human control [6, p. 9]. Meanwhile, Facebook and Google¹ have invested enormous resources into delivering better-than-human automatic translation systems at zero cost to the consumer.

    1 Disclosure: I work as a paid Russian, Polish, and Ukrainian linguist and translation quality control specialist for the Google Translate project; headquarters are in San Francisco.

    Superhuman artificial intelligence does not come cheap: training neural language models requires bilingual corpora with word counts in the hundreds of thousands, and ideally, millions, which would cost hundreds of thousands of dollars to translate, sums beyond the means of most low-resource language communities. Fortunately, this experiment shows that there are more resourceful and effective ways to respond to the challenge of creating translation aids for revitalizing endangered languages in low-resource settings.

    1.2 Work So Far

    I built the world’s first Lemko to English machine translation system and have made it available to the public. Its objective translation quality scores have been improving: the engine achieved a bilingual evaluation understudy (BLEU) score of 14.57 in the summer of 2021, as presented to professionals at the National Defense Industrial Association’s Interservice/Industry Training, Simulation and Education Conference and published in its proceedings [6]. For reference, I scored BLEU 28.66 as a human translator working in field conditions, cut off from the outside world. By the autumn of 2021, the engine had reached BLEU 15.74, as reported to linguists, academics, and the wider community at an unveiling event hosted by the University of Pittsburgh.²

    2 Disclosure: the event was sponsored by the Carpatho-Rusyn Society (Pennsylvania), and I was paid by the University of Pittsburgh for my presentation.

    1.3 System Under Study

    Lemko is a definitely to severely endangered [6, p. 3, 7, pp. 177-178], low-resource [8], officially recognized minority language [9] presumably indigenous to transborder highlands south of the Cracow, Tarnów, and Rzeszów metropolitan areas; historical demarcating isoglosses will hopefully be the topic of a future paper. Poland’s census bureau tallied 6,279 residents for whom Lemko was a language “usually used at home” (even if in addition to Polish) in 2011 [10, p. 3], a 12% increase from the 5,605 for whom Lemko was a “language spoken most often at home” in 2002 [11, p. 6, 12, p. 7]. At press time, the results of a fresh count are being tabulated.

    Lemko is classifiable as an East Slavic language as it fits the customary genetic structural feature criteria, the most significant of which is pleophony [13, p. 20], whereby a vowel is assumed to have arisen in Proto-Slavic sequences of consonant C followed by mid or low vowel V (*e, or *o, with which *a had merged [14, p. 366]), followed by liquid R (that is, *l or *r), followed by another consonant C, that is, CVRC > CVRVC. To illustrate, compare the Old English word for “melt”, meltan (CVRC) [15, p. 718] to its putative Lemko cognate mołódyj [16, p. 92, 17, p. 150] (CVRVC), meaning “young”. Other East Slavic cognates include Ukrainian mołodýj and Russian mołodój [17], both exhibiting a vowel after the liquid (CVRVC). Meanwhile, West Slavic languages lack a vowel before the liquid; compare Polish młody and Slovak mladý (both CRVC) [17]. Further afield, kinship has been posited for other words translatable as “mild”, including Sanskrit mṛdú (CRC) [18, p. 830] and Latin mollis (CVRC if from *moldvis) [15, 17, 19, p. 323].
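The consonant-vowel-liquid skeletons used in this comparison can be derived mechanically. The following is a minimal sketch rather than a linguistic tool: the function names and the letter inventories are assumptions made for illustration, and it naively treats every letter of the transliteration as one segment.

```python
# Illustrative letter classes for the Latin transliterations quoted above.
VOWELS = set("aeiouyóáéíúý")
LIQUIDS = set("lłr")

def skeleton(word: str) -> str:
    """Map each letter to V (vowel), R (liquid), or C (any other consonant)."""
    return "".join(
        "V" if ch in VOWELS else "R" if ch in LIQUIDS else "C"
        for ch in word.lower()
    )

def has_pleophony(word: str) -> bool:
    """True if the word contains the East Slavic CVRVC shape."""
    return "CVRVC" in skeleton(word)

for word in ["meltan", "mołódyj", "mołodýj", "mołodój", "młody", "mladý"]:
    print(word, skeleton(word), has_pleophony(word))
```

Under these assumptions, the Lemko, Ukrainian, and Russian forms are flagged as pleophonic, while the Old English, Polish, and Slovak forms are not.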

    How well Lemko meets customary, modern Ukrainian genetic structural feature criteria was not evaluated in this experiment. However, similarity between Lemko and Standard Ukrainian was quantified, for the first time in print of which I am aware. Below, my Lemko engine scored BLEU 6.28, nearly three times the score of Google Translate’s Ukrainian at BLEU 2.17. Further experiments could be performed for the purposes of quantification of similarity between Lemko, Standard Ukrainian, Polish, and Rusyn as codified in Slovakia, as well as a fresh take on the typological classification of Lemko.

    The quantity and quality of resources have been improving, as has resourcefulness empowered by technology. All known bilingual corpora, comprising fewer than seventy thousand Lemko words, were mustered for this experiment. I have been cleaning a bilingual corpus of transcriptions of interviews conducted with native speakers in Poland and my translations into English, which a United States client paid me to perform and permitted me to use. I am also compiling monolingual corpora, which total 534,512 words at press time.

    1.4 Hypothesis

    Based on my subjective impression as a professional translator that Lemko native speakers interviewed in Poland were more likely to use words with obvious Polish cognates than Standard Ukrainian ones, I hypothesized that, all else being equal, a machine could be configured to translate into Lemko from English and achieve BLEU objective quality scores higher than those of Google Translate’s Ukrainian and Russian services.

    1.5 Predictions

    Lemko Translation System. I predicted that the aforementioned translation system would achieve a BLEU score of 15 translating into Lemko from English against the bilingual corpus.

    Google Translate.

    English to Ukrainian service. I predicted that Google Translate’s English to Ukrainian service would achieve a BLEU score of 10 against the bilingual corpus.

    English to Russian service. I predicted that Google Translate’s English to Russian service would achieve a BLEU score of 1 against the bilingual corpus.

    1.6 Methods and Justification

    In the interest of speed, resource conservation, and ruggedizability, a laptop computer discarded as obsolete by my employer was configured to translate into Lemko and make calls to the Google Cloud Platform Google Translate service, as well as configured to evaluate said translations using the industry standard BLEU metric.

    1.7 Principal Results

    The English to Lemko translation system achieved a cumulative BLEU score of 6.28431824990417. Meanwhile, Google Translate’s Ukrainian service scored BLEU 2.16830846776652, its Russian service BLEU 1.10424105952048, and the control of Polish transliterated into the Cyrillic alphabet BLEU 1.70036447680114.
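To put these figures side by side, the ratios between the Lemko engine and each baseline can be worked out directly. The snippet below is a small sketch using only the scores reported above; the dictionary keys and the helper name `ratio_to` are labels chosen here for illustration.

```python
# Cumulative BLEU scores as reported in the principal results.
SCORES = {
    "LemkoTran.com": 6.28431824990417,
    "Ukrainian": 2.16830846776652,
    "Russian": 1.10424105952048,
    "Polish (control)": 1.70036447680114,
}

def ratio_to(baseline: str, system: str = "LemkoTran.com") -> float:
    """How many times higher the system's BLEU score is than the baseline's."""
    return SCORES[system] / SCORES[baseline]

for name in ("Ukrainian", "Russian", "Polish (control)"):
    print(f"{name}: {ratio_to(name):.2f}x")
```

The Ukrainian ratio comes out just under three, which is the sense in which the Lemko engine scored nearly three times higher than the runner-up.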

    2 Materials and Methods

    The above hypothesis was tested by calculating BLEU quality scores for each translation system set up in the manner detailed below.

    2.1 Setup

    Hardware. The experiment was conducted on an HP Elitebook 850 G2 laptop with a Core i7-5600U 2.6GHz processor, and 16 gigabytes of random-access memory. It had been discarded by my employer as obsolete and listed for sale at USD 450 at time of press.

    Configuration. In the basic input/output system (BIOS) menu, the device was configured to enable Virtualization Technology (VTx).

    Operating System. Windows 10 Professional 64 bit had been installed on bare metal. It was ensured that Virtual Machine Platform and Windows Subsystem for Linux Windows features were enabled. Next, the WSL2 Linux kernel update for x64 machines (wsl_update_x64.msi) available from Microsoft at https://aka.ms/wsl2kernel was installed.

    Software. The Docker Desktop for Windows version 4.4.3 (73365) installer was downloaded from https://www.docker.com/get-started and run with the option to Install required Windows components for WSL 2 selected.

    Packages. The experiment depended on the below packages from the Python Package Index.

    SacreBLEU. Version 2.0.0 was installed using the Python package documented at the following uniform resource locator (URL):
    https://pypi.org/project/sacrebleu/2.0.0/

    Google Cloud Translation API client library. Version 2.0.1 was installed using the Python package documented at the URL https://pypi.org/project/google-cloud-translate/2.0.1/

    The above dependencies were specified in the requirements file as follows:
    google-cloud-translate==2.0.1
    sacrebleu==2.0.0

    Container.

    Build. The experiment was run in a Docker container featuring the latest version of the Python programming language, which was version 3.10.2 at the time, running on the Debian Bullseye 11 Linux operating system of AMD64 architecture, of Secure Hash Algorithm 2 shortened digest bcb158d5ddb6, obtainable via the following command:
    docker pull python@sha256:bcb158d5ddb636fa3aa567c987e7fcf61113307820d466813527ca90d60fedc7

    Runtime. The container was configured to save raw experiment data files to a local bind-mounted volume.

    Translation Quality Scoring.
    Translation quality scores were calculated according to the BLEU metric using version 2.0.0 of the SacreBLEU tool invented by Post [20].

    Case sensitivity. The evaluation was performed in a case-sensitive manner.

    Tokenization. Segments were tokenized using version 13a of the Workshop on Statistical Machine Translation standard scoring script metric internal tokenization procedure.

    Smoothing Method. The smoothing technique developed at the National Institute of Standards and Technology by United States Federal Government employees for their Multimodal Information Group BLEU toolkit, being the third technique described by Chen and Cherry [21, p. 363], was employed by default.

    Signature. The above settings produced the following signature:
    nrefs:1|case:mixed|eff:no|tok:13a|smooth:exp|version:2.0.0

    Calibration. Configured as above, the machine produces the following output:

    Segment 1031.
    English source | Everything was there.
    Lemko reference and transliteration | Вшытко там было. | Všŷtko tam bŷlo.
    Lemkotran.com hypothesis and transliteration | Вшытко там было. | Všŷtko tam bŷlo.
    Score | BLEU = 100.00 100.0/100.0/100.0/100.0 (BP = 1.000 ratio = 1.000 hyp_len = 4 ref_len = 4)

    Explanation. The hypothesis segment was identical to the reference one and the machine achieved a perfect score of BLEU 100.

    Segment 179.
    English source | I don't remember what year.
    Lemko reference and transliteration | Не памятам в котрым році. | Ne pamjatam v kotrŷm roci.
    Lemkotran.com hypothesis and transliteration | Ні памятам, в котрым році. | Ni pamjatam, v kotrŷm roci.
    Score | BLEU = 43.47 71.4/50.0/40.0/25.0 (BP = 1.000 ratio = 1.167 hyp_len = 7 ref_len = 6)

    Explanation. The hypothesis was different from the reference by two characters. The machine mistranslated the particle negating the verb, using the word for “no” (ni) instead of the expected word for “not” (ne). This has since been largely fixed. The machine also added a comma after pamjatam, which means “I remember”. That dropped the score from what would have been a perfect score of 100 to 43.47.
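The arithmetic behind the Segment 179 score can be retraced with a short, self-contained sketch. It is not the SacreBLEU implementation used in the experiment: the tokenizer is a simplified stand-in for the 13a procedure, the smoothing follows my reading of the exponential technique, and the function names are chosen for illustration only.

```python
import math
import re
from collections import Counter

def tokenize(text: str) -> list[str]:
    # Simplified stand-in for 13a: words and single punctuation marks.
    return re.findall(r"\w+|[^\w\s]", text)

def ngrams(tokens: list[str], n: int) -> Counter:
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def sentence_bleu(hypothesis: str, reference: str, max_order: int = 4) -> float:
    hyp, ref = tokenize(hypothesis), tokenize(reference)
    log_sum, smooth = 0.0, 1.0
    for n in range(1, max_order + 1):
        hyp_counts, ref_counts = ngrams(hyp, n), ngrams(ref, n)
        total = sum(hyp_counts.values())
        if total == 0:
            return 0.0  # hypothesis too short for this order (simplification)
        # Clipped matches: each n-gram counts at most as often as in the reference.
        matched = sum((hyp_counts & ref_counts).values())
        if matched == 0:
            smooth *= 2  # exponential smoothing for orders with no matches
            precision = 1.0 / (smooth * total)
        else:
            precision = matched / total
        log_sum += math.log(precision)
    # Brevity penalty applies only when the hypothesis is shorter than the reference.
    bp = 1.0 if len(hyp) >= len(ref) else math.exp(1 - len(ref) / len(hyp))
    return 100 * bp * math.exp(log_sum / max_order)

print(round(sentence_bleu("Ні памятам, в котрым році.", "Не памятам в котрым році."), 2))
```

For Segment 179, the n-gram precisions are 5/7, 3/6, 2/5, and 1/4, whose geometric mean gives the reported 43.47. The extra comma alone costs several n-gram matches because it breaks every n-gram that spans it.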

    Control. As the corpus is based on interviews conducted in Poland, translations into Polish were used as a control. They were transliterated into the Cyrillic alphabet by reversing the rules for transliterating Lemko names established by Poland’s Ministry of the Interior and Administration [22, p. 6564]. Polish nasal vowels were decomposed into a vowel plus a nasal stop, except before approximants, where they were directly denasalized. Word finally, the front nasal vowel /ę/ was simply denasalized, and the back one /ą/ was transliterated as if followed by a dental stop.
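The treatment of nasal vowels described above can be sketched as follows. This is a hedged illustration, not the transliterator used in the experiment: it works on lowercase Latin-script Polish before any conversion to Cyrillic, the function name is hypothetical, and two details are assumptions of mine, namely that the nasal stop is m before labial stops and n elsewhere, and that l and ł stand in for the approximants.

```python
LABIAL_STOPS = "pb"      # assumption: nasal surfaces as m before these
APPROXIMANTS = "lł"      # assumption: stand-ins for the approximants
ORAL = {"ę": "e", "ą": "o"}

def decompose_nasals(word: str) -> str:
    """Rewrite Polish nasal vowels as an oral vowel plus an optional nasal stop."""
    out = []
    for i, ch in enumerate(word):
        if ch not in ORAL:
            out.append(ch)
            continue
        nxt = word[i + 1] if i + 1 < len(word) else ""
        if nxt == "":
            # Word-finally: ę is denasalized; ą is treated as before a dental stop.
            out.append("e" if ch == "ę" else "on")
        elif nxt in APPROXIMANTS:
            out.append(ORAL[ch])            # directly denasalized
        elif nxt in LABIAL_STOPS:
            out.append(ORAL[ch] + "m")
        else:
            out.append(ORAL[ch] + "n")
    return "".join(out)

for word in ["się", "są", "ręka", "ząb", "wzięli"]:
    print(word, "->", decompose_nasals(word))
```

The output of such a step would then be passed through the reversed Latin-to-Cyrillic rules, which are not reproduced here.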

    3 Results

    The engine available to the public at www.LemkoTran.com took first place with a cumulative translation quality score of BLEU 6.28, nearly three times that of the runner-up, Google Translate’s English-Ukrainian service (BLEU 2.17). Next was its English-Polish service (BLEU 1.70), with its English-Russian service in last place (BLEU 1.10).

    Table 1. English to Lemko Translation Quality: LemkoTran.com versus Google Translate

    3.1 Results by machine translation service

    Control. When transliterated into the Cyrillic alphabet, Google Translate’s translations into Standard Polish achieved a corpus-level BLEU score of 1.70. Samples of its performances are as follows:

    Segment 2174.
    English source | We had still been in Izby, right.
    Lemko reference and transliteration | То мы іщы были в Ізбах, так. | To mŷ iščŷ bŷly v Izbach, tak.
    Polish hypothesis and transliteration | Билісьми єще в Ізбах, так. | Byliśmy jeszcze w Izbach, tak.
    Score | BLEU = 46.20

    Segment 854.
    English source | And that's what it's all about.
    Lemko reference and transliteration | І о то ходит. | I o to chodyt.
    Polish hypothesis and transliteration | І о то власьнє ходзі. | I o to właśnie chodzi.
    Score | BLEU = 32.47

    Segment 217.
    English source | That's what he said to me.
    Lemko reference and transliteration | Так мі повіл. | Tak mi povil.
    Polish hypothesis and transliteration | Так мі повєдзял. | Tak mi powiedział.
    Score | BLEU = 35.36

    Hybrid English-Lemko Engine. The engine freely available to the public at the URL www.LemkoTran.com achieved a corpus-level BLEU score of 6.28.

    Segment 1031.
    English source | Everything was there.
    Lemko reference and transliteration | Вшытко там было. | Všŷtko tam bŷlo.
    Lemkotran.com hypothesis and transliteration | Вшытко там было. | Všŷtko tam bŷlo.
    Score | BLEU = 100.00

    Segment 1445.
    English source | But that officer took that medal and said,
    Lemko reference and transliteration | Але тот офіцер взял тот медаль і повідат: | Ale tot oficer vzial tot medal' i povidat:
    Lemkotran.com hypothesis and transliteration | Але тот офіцер взял тот медаль і повіл: | Ale tot oficer vzial tot medal' i povil:
    Score | BLEU = 75.06

    Segment 217.
    English source | That's what he said to me.
    Lemko reference and transliteration | Так мі повіл. | Tak mi povil.
    Lemkotran.com hypothesis and transliteration | Так мі повіл. | Tak mi povil.
    Score | BLEU = 100.00

    Ukrainian. Google Translate’s translations into Standard Ukrainian achieved a corpus-level BLEU score of 2.17.

    Segment 2419.
    English source: Where and when?
    Lemko reference and transliteration: Де і коли? ⟨De i koly?⟩
    Ukrainian hypothesis and transliteration: Де і коли? ⟨De i koly?⟩
    Score: BLEU = 100.00

    Segment 1096.
    English source: We were there for three months.
    Lemko reference and transliteration: Там зме были три місяці. ⟨Tam zme bŷly try misiaci.⟩
    Ukrainian hypothesis and transliteration: Ми були там три місяці. ⟨My buly tam try misjaci.⟩
    Score: BLEU = 30.21

    Segment 2513.
    English source: Well, here to the west.
    Lemko reference and transliteration: Но то ту на захід. ⟨No to tu na zachid.⟩
    Ukrainian hypothesis and transliteration: Ну, тут на захід. ⟨Nu, tut na zachid.⟩
    Score: BLEU = 30.21

    Russian. Google Translate’s English to Russian service achieved a corpus-level BLEU score of 1.10.

    Segment 432.
    English source: Nobody knew.
    Lemko reference and transliteration: Нихто не знал. ⟨Nychto ne znal.⟩
    Russian hypothesis and transliteration: Никто не знал. ⟨Nikto ne znal.⟩
    Score: BLEU = 59.46

    Segment 2751.
    English source: What did they expel us for?
    Lemko reference and transliteration: За што нас выгнали? ⟨Za što nas vŷhnaly?⟩
    Russian hypothesis and transliteration: За что нас выгнали? ⟨Za čto nas vygnali?⟩
    Score: BLEU = 42.73

    Segment 2164.
    English source: Brother went off to war.
    Lemko reference and transliteration: Брат пішол на войну. ⟨Brat pišol na vojnu.⟩
    Russian hypothesis and transliteration: Брат ушел на войну. ⟨Brat ušel na vojnu.⟩
    Score: BLEU = 42.73

    4 Discussion

    The Lemko translation system’s corpus-level BLEU score of 6.28 indicates that while there is much still to be done, things are on track. The Standard Russian score of BLEU 1.10 indicates that Lemko is less similar to Russian than it is to Polish (BLEU 1.70). Perhaps using pre-revolutionary orthography could boost Russian’s score, but that would be an expensive experiment with little obvious benefit.

    The transliterated Standard Polish control similarity score of BLEU 1.70 indicates less interference from the dominant language in Poland than might be expected. It would be interesting to redesign the experiment so that a handful of computationally inexpensive and obvious sound correspondences (for example, denasalization of *ę to /ja/ and *ǫ to /u/, retraction of *i to /y/, and change of *g to /h/ [23]) were applied to Polish, to see whether it then scored higher than Standard Ukrainian (BLEU 2.35).
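
    A toy sketch of what applying such correspondences to Polish text might look like follows. The rules here are assumptions for illustration only, not a tested design: the sketch guesses that Polish spellings ię/ią continue *ę and that any other ę/ą continues *ǫ, it leaves out the retraction of *i, and it ignores capital letters and every exception. The name apply_rules is hypothetical.

```python
import re

# Ordered (pattern, replacement) pairs; earlier rules run first.
# Assumption: Polish "i" + nasal vowel stands for the reflex of *ę,
# and any nasal vowel left over stands for the reflex of *ǫ.
RULES = [
    (r"i[ęą]", "ja"),  # *ę to /ja/
    (r"[ęą]", "u"),    # *ǫ to /u/
    (r"g", "h"),       # *g to /h/
]


def apply_rules(word):
    """Apply each substitution in turn to a lowercase Polish word."""
    for pattern, replacement in RULES:
        word = re.sub(pattern, replacement, word)
    return word


for polish in ["ręka", "noga", "mąka"]:
    print(polish, "->", apply_rules(polish))
```

    With these rules, Polish ręka, noga, and mąka come out as ruka, noha, and muka. A real experiment would also need the result transliterated into Cyrillic before scoring.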

    In summary, Lemko has been synthesized in the lab and the power to produce it placed in the hands of speakers both new and native. After a thorough engine overhaul and glossary ramp-up, the next step is to objectively measure, and if feasible, have speakers subjectively rate, the quality of synthetic Lemko versus that produced by native speakers. The day when new speakers of low-resource languages can use machine translation to start communicating in their language overnight is closer, as is the day the Lemko language joins the ranks of those previously endangered, but now revitalized.

    Acknowledgements. I would like to thank my colleague Ming Qian of Peraton Labs for inspiring me to conduct this experiment, and Brian Stensrud of Soar Technology, Inc. for introducing us, as well as his encouragement.

    I would also like to thank my friend Corinna Caudill for her encouragement and personal interest in the project, as well as for introducing me to Carpatho-Rusyn Society President Maryann Sivak of the University of Pittsburgh, whom I would like to thank for the opportunity to present my work.

    I would also like to thank Maria Silvestri of the John and Helen Timo Foundation for conducting interviews with Lemko native speakers and donating the transcripts and my translations of them to research and development.

    I would like to thank Achim Rabus of the University of Freiburg and Yves Scherrer of the University of Helsinki for their interest in the project and ideas.

    I would also like to thank Myhal’ Lŷžečko of the minority-language technology blog InterFyisa for his early interest in the project and community outreach.

    I would also like to thank fellow son of Zahoczewie Marko Łyszyk for his interest in the project and community outreach.

    Finally, I would like to thank my co-author and Antech Systems Inc. colleague Tom Dobry for his encouragement and guidance.

    References

    1. ^ Graddol, D.: The future of language. Science, 303(5662), 1329-1331 (2004). https://doi.org/10.1126/science.1096546

    2. ^ Eberhard, D. M., Simons, G. F., & Fennig, C. D.: Ethnologue: Languages of the World, SIL International. Twenty-fourth edition. SIL International, Dallas (2021). Online version: How many languages are endangered?, https://www.ethnologue.com/guides/how-many-languages-endangered, last accessed 2022/02/11.

    3. ^ ISO 639 Code Tables, https://iso639-3.sil.org/code_tables/639/data, last accessed 2022/02/11.

    4. ^ Language support, https://cloud.google.com/translate/docs/languages, last accessed 2022/02/11.

    5. ^ Select language, https://m.facebook.com/language.php, last accessed 2022/02/11.

    6. ^ ^ Orynycz, P., Dobry, T., Jackson, A., & Litzenberg, K.: Yes I Speak… AI Neural Machine Translation in Multi-Lingual Training. In: Proceedings of the Interservice/Industry Training, Simulation, and Education Conference (I/ITSEC) 2021, Paper no. 21176. National Training and Simulation Association, Orlando (2021). https://www.xcdsystem.com/iitsec/proceedings/index.cfm?Year=2021&AbID=96953&CID=862

    7. ^ Duć-Fajfer, O.: Literatura a proces rozwoju i rewitalizacja tożsamości językowej na przykładzie literatury łemkowskiej. In: Olko, J., Wicherkiewicz, T., Borges, R. (eds.), Integral Strategies for Language Revitalization, pp. 175–200. First edition. Faculty of „Artes Liberales”, University of Warsaw, Warsaw (2016).

    8. ^ Scherrer, Y., Rabus, A.: Neural morphosyntactic tagging for Rusyn. In: Mitkov, R., Tait, J., Boguraev, B. (eds.), Natural Language Engineering, 25(5), 633–650. Cambridge University Press, Cambridge (2019). https://doi.org/10.1017/S1351324919000287

    9. ^ Reservations and Declarations for Treaty No.148 – European Charter for Regional or Minority Languages (ETS No. 148), https://www.coe.int/en/web/conventions/full-list?module=declarations-by-treaty&numSte=148&codeNature=1&codePays=POL, last accessed 2022/02/11.

    10. ^ Formularz indywidualny, https://stat.gov.pl/download/gfx/portalinformacyjny/pl/defaultstronaopisowa/5781/1/1/nsp_2011_badanie__pelne_wykaz_pytan.pdf, last accessed 2022/02/11.

    11. ^ Narodowy Spis Powszechny Ludności i Mieszkań 2002 r. z 20 maja (formularz A) https://stat.gov.pl/gfx/portalinformacyjny/userfiles/_public/spisy_powszechne/nsp2002-form-a.pdf, last accessed 2022/02/11.

    12. ^ IV Raport dotyczący sytuacji mniejszości narodowych i etnicznych oraz języka regionalnego w Rzeczypospolitej Polskiej – 2013, http://mniejszosci.narodowe.mswia.gov.pl/download/86/14637/TekstIVRaportu.pdf, last accessed 2022/02/11.

    13. ^ Vaňko, J.: The Language of Slovakia’s Rusyns. East European Monographs, New York (2000).

    14. ^ Fortson, B., IV: Indo-European Language and Culture. Blackwell Publishing, Oxford (2004).

    15. ^ ^ Pokorny, J.: Indogermanisches etymologisches Wörterbuch, Bern, 1959.

    16. ^ Horoszczak, J.: Słownik łemkowsko-polski, polsko-łemkowski. Rutenika, Warsaw (2004).

    17. ^ ^ ^ ^ Vasmer, M. Russisches etymologisches Wörterbuch. Zweiter Band. Carl Winter, Universitätsverlag, Heidelberg (1955).

    18. ^ Monier-Williams, M.: A Sanskrit-English Dictionary Etymologically and Philologically Arranged with Special Reference to Cognate Indo-European Languages, The Clarendon Press, Oxford (1899).

    19. ^ Derksen, R.: Etymological Dictionary of the Slavic Inherited Lexicon. In: Lubotsky, A. (ed.) Leiden Indo-European Etymological Dictionary Series, vol. 4, Koninklijke Brill, Leiden (2008).

    20. ^ Post, M.: A Call for Clarity in Reporting BLEU Scores. In: Proceedings of the Third Conference on Machine Translation (WMT), vol. 1, pp. 186–191. Association for Computational Linguistics, Brussels (2018). https://aclanthology.org/W18-63

    21. ^ Chen B., Cherry, C.: A Systematic Comparison of Smoothing Techniques for Sentence-Level BLEU. In: Proceedings of the Ninth Workshop on Statistical Machine Translation, pp. 362–367. Association for Computational Linguistics, Baltimore (2014). http://dx.doi.org/10.3115/v1/W14-33

    22. ^ Ministerstwo Spraw Wewnętrznych i Administracji: Rozporządzenie Ministra Spraw Wewnętrznych i Administracji z dnia 30 maja 2005 r. w sprawie sposobu transliteracji imion i nazwisk osób należących do mniejszości narodowych i etnicznych zapisanych w alfabecie innym niż alfabet łaciński. In: Dziennik Ustaw Nr 102, pp. 6560–6573. Rządowe Centrum Legislacji, Warsaw (2005).

    23. ^ Shevelov, G.: On the Chronology of H and the New G in Ukrainian. In: Harvard Ukrainian Studies, vol. 1, no. 2, pp. 137–152. Harvard Ukrainian Research Institute, Cambridge (1977). https://www.jstor.org/stable/40999942

  • Lemko быти ⟨bŷty⟩ 'be’

    Lemko быти ⟨bŷty⟩ 'be’

    To be or not to be? Быти або не быти? That is the question, and now you can conjugate the infinitives made famous by the opening line of Hamlet’s soliloquy in Lemko using the automatic translation service LemkoTran, or craft your own copulae using this handy DIY guide.

    Translations

    The Lemko verb быти (scientific transliteration: ⟨bŷty⟩) means „to be” in English, być in Polish, бути ⟨buty⟩ in Standard Ukrainian, and быть ⟨byt’⟩ in Muscovite Russian.

    English | Lemko | Polish | Ukrainian | Russian
    be | быти ⟨bŷty⟩ | być | бути | быть
    Translations of the Lemko verb быти into English, Polish, Ukrainian, and Russian.
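
    For readers who want to produce transliterations like ⟨bŷty⟩ themselves, here is a small sketch. The table is partial and assembled only from letter pairs visible in the examples on this page, so it is an assumption rather than an official standard; я, ю, and the soft sign are left out on purpose because the examples render them in more than one way. The names LEMKO_TO_LATIN and transliterate are hypothetical.

```python
# Partial Cyrillic-to-Latin table drawn from the examples on this page.
LEMKO_TO_LATIN = {
    "а": "a", "б": "b", "в": "v", "г": "h", "ґ": "g", "д": "d", "е": "e",
    "є": "je", "з": "z", "и": "y", "і": "i", "й": "j", "к": "k", "л": "l",
    "м": "m", "н": "n", "о": "o", "п": "p", "р": "r", "с": "s", "т": "t",
    "у": "u", "ф": "f", "х": "ch", "ц": "c", "ч": "č", "ш": "š", "щ": "šč",
    "ы": "ŷ",
}


def transliterate(text):
    """Transliterate letter by letter, keeping the case of the first Latin letter."""
    out = []
    for char in text:
        latin = LEMKO_TO_LATIN.get(char.lower())
        if latin is None:
            out.append(char)  # unmapped letters and punctuation pass through unchanged
        elif char.isupper():
            out.append(latin.capitalize())
        else:
            out.append(latin)
    return "".join(out)


print(transliterate("быти"))
print(transliterate("Вшытко там было."))
```

    The two calls print bŷty and Všŷtko tam bŷlo., matching the transliterations given on this page.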

    Etymology

    The Lemko infinitive быти ⟨bŷty⟩, meaning „to be”, comes from the Proto-Slavic athematic verb *byti, and is related to Sanskrit भूति ⟨bhūtíṣ⟩ „wellbeing” (Vasmer 1953, p. 159; Pokorny 1959, p. 147), Persian بودن ⟨būdan⟩ „be” (Pokorny, p. 147), Latin futūrus „future” (Vasmer, p. 159; Pokorny, p. 149), and, via Old English bēon, English be (Pokorny, p. 149).

    Attestation

    Hamlet’s famous opening line „To be or not to be, that is the question” is alluded to in the following published pieces found in the wild:

    „Для дакотрых орґанізаций є то быти або не быти, значыт, без тых грошів не сут в силі нич зреализувати.” (LEM.fm 2021)

    Transliteration
    dl'a dakotrŷch organizacyj je to bŷty abo ne bŷty, značŷt, bez tŷch hrošiv ne sut v syl'i nyč zrealyzuvaty.

    Translation
    For some organizations, it's to be or not to be, meaning they will not be able to achieve anything without those funds.

    „От нашых діл и нашой віры буде рішатися вопрос: ци нам лемкам быти, ци не быти?….” (Цисляк 1964, p. 162)

    Transliteration
    Ot našŷch dil y našoj virŷ bude rišatysia vopros: cy nam lemkam bŷty, cy ne bŷty?…
    Translation
    Our affairs and our faith will decide the question of whether we Lemkos are to be or not to be.

    Inflection

    Future Tense

    Root: буд– ⟨bud-⟩

    The future tense of the Lemko verb for to be, быти ⟨bŷty⟩, is formed by adding personal endings to the root bud-, equivalent to will in English.

    Etymology

    Lemko bud- comes from the Proto-Slavic root *bǫd-. Compare the suffix -bund in English moribund from Latin moribundus (Pokorny, p. 150, Vasmer, p. 136).

    Conjugation Table

    English | Lemko | Polish | Ukrainian | Russian
    I will | буду ⟨búdu⟩ | będę | буду | буду
    you will | будеш ⟨búdeš⟩ | będziesz | будеш | будешь
    (s)he will | буде ⟨búde⟩ | będzie | буде | будет
    we will | будеме ⟨budéme⟩ | będziemy | будемо | будем
    you all will | будете ⟨budéte⟩ | będziecie | будете | будете
    they will | будут ⟨búdut⟩ | będą | будуть | будут
    Forms of the future tense conjugation of Lemko verb быти ⟨bŷty⟩ translated into English, Polish, Standard Ukrainian, and Russian.
    Reference
    Fontański & Chomiak (2000, p. 106).
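
    The root-plus-ending pattern described above can be sketched in a few lines. The endings are read off the Lemko column of the table; the names FUTURE_ENDINGS and future_forms are hypothetical and exist only for this illustration.

```python
FUTURE_ROOT = "буд"

# Personal endings read off the Lemko column of the conjugation table.
FUTURE_ENDINGS = {
    "I": "у",
    "you": "еш",
    "(s)he": "е",
    "we": "еме",
    "you all": "ете",
    "they": "ут",
}


def future_forms(root=FUTURE_ROOT):
    """Attach each personal ending to the future root."""
    return {person: root + ending for person, ending in FUTURE_ENDINGS.items()}


for person, form in future_forms().items():
    print(f"{person} will: {form}")
```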

    Present Tense

    Root: є– ⟨je-⟩, с– ⟨s-⟩

    In Lemko, the present tense of the verb to be is formed in the singular from the root є- ⟨je-⟩, and in the plural from the root с- ⟨s-⟩.

    Etymology

    All the below forms trace back to the ancestor of the Proto-Slavic root *es-, to which personal endings were affixed. Compare to English is, German ist, Latin est, Ancient Greek ἐστί ⟨estí⟩, Persian است ⟨ast⟩, and Sanskrit अस्ति ⟨ásti⟩ (Pokorny, pp. 340-341; Vasmer, p. 405).

    Conjugation Table

    English | Lemko | Polish | Ukrainian | Russian
    I am | єм ⟨jem⟩ | jestem | є | есть
    you are | єс ⟨jes⟩ | jesteś | є | есть
    (s)he is | єст ⟨jest⟩ ᵃ | jest | є | есть
    we are | сме ⟨sme⟩ ᵇ | jesteśmy | є | есть
    you all are | сте ⟨ste⟩ ᶜ | jesteście | є | есть
    they are | сут ⟨sut⟩ | są | є | есть
    Forms of the present tense conjugation of the Lemko verb быти ⟨bŷty⟩ translated into English, Polish, Standard Ukrainian, and Russian.

    a The Lemko third-person singular form єст ⟨jest⟩ is now being replaced by є ⟨je⟩, though this is still rare (Fontański & Chomiak 2000, p. 109).

    b Fontański & Chomiak (2000, p. 109) give the Lemko first-person plural form as (єсме)сме/зме ⟨(jesme)sme/zme⟩.

    c Fontański & Chomiak (2000, p. 109) give the Lemko second-person plural form as (єсте)сте ⟨(jeste)ste⟩.

    Reference
    Fontański & Chomiak (2000, p. 106).

    Past Tense

    Root: был- ⟨bŷl-⟩

    The past tense of the verb „to be” is formed in Lemko by adding any appropriate gender and plural markers to the stem был- ⟨bŷl-⟩, translatable into English as was or were.
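
    As a rough illustration of the rule just stated, the sketch below builds past forms from the stem and a marker. The marker set is taken from the forms listed in the tables on this page (был, была, было, были); the names PAST_MARKERS, past_form, and past_phrase are hypothetical.

```python
PAST_STEM = "был"

# Gender and plural markers, as seen in the forms был, была, было, были.
PAST_MARKERS = {
    "masculine": "",
    "feminine": "а",
    "neuter": "о",
    "plural": "и",
}


def past_form(category):
    """Return the past form of 'to be' for a gender or for the plural."""
    return PAST_STEM + PAST_MARKERS[category]


def past_phrase(pronoun, category):
    """Combine a subject pronoun with the matching past form."""
    return f"{pronoun} {past_form(category)}"


print(past_phrase("я", "masculine"))
print(past_phrase("она", "feminine"))
print(past_phrase("мы", "plural"))
```

    The same plural form serves all three genders, which is why the plural gets a single entry in the marker table.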

    Etymology

    Lemko был ⟨bŷl⟩ is undoubtedly the continuation of the Proto-Slavic resultative participle *bylŭ. Compare Ancient Greek φῦλον ⟨phylon⟩ (Vasmer, p. 159), whence English phylum.

    Conjugation Tables

    Masculine

    Use the following to refer to males or mixed parties of males and females, as well as objects of grammatically masculine gender. Unlike Polish, Lemko does not mark virility (masculine-personal gender) as a grammatical category.

    English | Lemko | Polish | Ukrainian | Russian
    I was | я был ⟨ja bŷl⟩ ᵃ | byłem | я був | я был
    you were | ты был ⟨tŷ bŷl⟩ ᵇ | byłeś | ти був | ты был
    he was | він был ⟨vin bŷl⟩ | był | він був | он был
    we were | мы были ⟨mŷ bŷly⟩ ᶜ | byliśmy | ми були | мы были
    you guys were | вы были ⟨vŷ bŷly⟩ ᵈ | byliście | ви були | вы были
    those guys were | они были ⟨ony bŷly⟩ | byli | вони були | они были
    Forms of the masculine past tense conjugation of the Lemko verb быти ⟨bŷty⟩ translated into English, Polish, Standard Ukrainian, and Russian.

    a Fontański & Chomiak (2000, p. 109) cite был єм ⟨bŷl em⟩ as an alternative masculine first person singular form of the past of the verb „to be”.

    b Fontański & Chomiak (2000, p. 109) cite был єс ⟨bŷl es⟩ as an alternative masculine second person singular form of the past of the verb „to be”.

    c Fontański & Chomiak (2000, p. 109) cite были сме ⟨bŷly sme⟩ as an alternative first person plural form of the past of the verb „to be”.

    d Fontański & Chomiak (2000, p. 109) cite были сте ⟨bŷly ste⟩ as an alternative second person plural form of the past of the verb „to be”.

    Reference
    Fontański & Chomiak (2000, p. 106).
    Feminine

    Use the below to refer to females and objects of grammatically feminine gender.

    English | Lemko | Polish | Ukrainian | Russian
    I was | я была ⟨ja bŷla⟩ ᵃ | byłam | я була | я была
    you were | ты была ⟨tŷ bŷla⟩ ᵇ | byłaś | ти була | ты была
    she was | она была ⟨ona bŷla⟩ | była | вона була | она была
    we were | мы были ⟨mŷ bŷly⟩ ᶜ | byłyśmy | ми були | мы были
    you gals were | вы были ⟨vŷ bŷly⟩ ᵈ | byłyście | ви були | вы были
    those gals were | они были ⟨ony bŷly⟩ | były | вони були | они были
    Forms of the feminine past tense conjugation of the Lemko verb быти ⟨bŷty⟩ translated into English, Polish, Standard Ukrainian, and Russian.

    a Fontański & Chomiak (2000, p. 109) cite была єм ⟨bŷla em⟩ and былам ⟨bŷlam⟩ as alternative feminine first person singular forms of the past of the verb „to be”.

    b Fontański & Chomiak (2000, p. 109) cite была єс ⟨bŷla es⟩ and былас ⟨bŷlas⟩ as alternative feminine second person singular forms of the past of the verb „to be”.

    c Fontański & Chomiak (2000, p. 109) cite были сме ⟨bŷly sme⟩ as an alternative first person plural form of the past of the verb „to be”.

    d Fontański & Chomiak (2000, p. 109) cite были сте ⟨bŷly ste⟩ as an alternative second person plural form of the past of the verb „to be”.

    Reference
    Fontański & Chomiak (2000, p. 106).
    Neuter

    Use the below to refer to objects of grammatically neuter gender.

    English | Lemko | Polish | Ukrainian | Russian
    it was | оно было ⟨ono bŷlo⟩ | było | воно було | оно было
    they were | они были ⟨ony bŷly⟩ | były | вони були | они были
    Forms of the neuter past tense conjugation of the Lemko verb быти ⟨bŷty⟩ translated into English, Polish, Standard Ukrainian, and Russian.
    Reference
    Fontański & Chomiak (2000, p. 106).

    References

    1. Fontański, H., Chomiak, M.  (2000). Ґраматыка лемківского языка [Grammar of the Lemko Language]. Śląsk.
    2. Vasmer, M. (1953). Russisches Etymologisches Wörterbuch, Erster Band: A – K [Russian Etymological Dictionary, Volume One: A – K]. Carl Winter Universitätsverlag.
    3. Pokorny, J. (1959). Indogermanisches etymologisches Wörterbuch, I. Band [Indo-Germanic Etymological Dictionary, Volume One]. A. Francke AG Verlag.
    4. Цисляк, А. (1964). Нашы Родны Бескиды [Our Ancestral Beskid Mountains]. In: Карпаторусский Календарь Лемко-Союза На Год 1964. Типография Лемко-Союза.
    5. Lem.fm (2021). Хто робит, а хто… но власні, што? [He Who Does, and He Who… Well, What?], www.Lem.fm.

  • New Experiment: Lab-Made Lemko?

    New Experiment: Lab-Made Lemko?

    I will be conducting an experiment this month to see if machines can be made to translate into Lemko better than Google Translate or humans.

    Hypothesis

    A machine can be configured to translate from English into the endangered Slavic language of Lemko and achieve quality scores higher than those of Google Translate’s Ukrainian service, but not yet higher than those of humans.

    Predictions

    • My English to Lemko rule-based machine translation (RBMT) engine will achieve a bilingual evaluation understudy (BLEU) score of 15 against a clean bilingual corpus.
    • The above engine will achieve a BLEU score that is a third higher (e.g. 20) when coupled with an improvised dictionary-based machine translation (DBMT) created from Lemko-Polish unit-test assertion pairs.
    • Google Translate’s English to Ukrainian translation service will achieve a BLEU score of 10 against the above corpus.
    • I, a human, will achieve a higher BLEU score than all the above machines against the above corpus.

    The experiments will be conducted over the next week or so, for subsequent publication.

  • AI Neural Machine Translation in Multilingual Training Presented at I/ITSEC 2021

    AI Neural Machine Translation in Multilingual Training Presented at I/ITSEC 2021

    ORLANDO, Dec 2 (Orynycz.com) – It was an honor to present the breakthroughs in our paper Yes I Speak… AI Neural Machine Translation in Multilingual Training at the National Defense Industrial Association’s (NDIA) I/ITSEC 2021 conference, the world’s largest modeling, simulation, and training event. The conference drew 13,000 in-person attendees from 47 countries representing governments, universities, corporations, and militaries, including United States Marine Corps Commandant General David H. Berger and Chief of Naval Operations Admiral Michael Gilday.

    Special thanks to Emerging Concepts and Innovative Technologies (ECIT) Session 7 Chair Brian Stensrud, Ph.D. and Session Deputy Neil Stagner of United States Marine Corps Systems Command for all the support that made it possible.

    Breakthroughs

    On low-end, air-gapped laptops in secure, field conditions, our translation engines achieved:

    • Translation quality BLEU scores 59% better than those of professional linguists for the Russian to English language pair
    • The world’s first usable Lemko to English machine translations
    • 1,170% faster-than-human (real-time) Russian-to-English translation speeds

    For details, check out the full paper.

  • Watch AI Empower New Speakers of Endangered Languages Like Lemko

    Watch AI Empower New Speakers of Endangered Languages Like Lemko

    Engineer Petro Orynycz unveils AI technology that empowers new speakers of an endangered language (Lemko) to read their language immediately. Watch and follow along with this interactive seminar.

    Watch

    https://www.orynycz.com/show/watch-ai-empower-new-speakers

    On YouTube

    Watch on YouTube

    On Facebook

    Watch on Facebook

    Try It Yourself

    1. Copy Lemko Text Below

    130 років тому вродил ся Теофіль Курилло, передовый представник лемківской інтеліґенциі
    Записал обставины поневоліня в початковым періоді од 14. вересня/септембра 1914 р. до 22. серпня/авґуста 1915 р.
    130 років тому в Розділю під Ґорлицями вродил ся єден з передовых представників лемківской інтеліґенциі поч. ХХ ст. – Теофіль Курилло (1891-1945).

    Source: LEM.FM – 130 років тому вродил ся Теофіль Курилло, передовый представник лемківской інтеліґенциі

    2. Paste Text Into Translator:

    3. Press „Go!” Button Above.

    Description

    In a hands-on demo attended by over 50 people worldwide, Natural Language Processing Engineer Petro Orynycz and Carpatho-Rusyn Society President Maryann Sivak unveil hybrid artificial intelligence technology that empowers new speakers of Lemko to read in the language immediately. Implications for endangered, low-resource language revitalization are discussed.

    Promotional Flyer by the University of Pittsburgh

    See here for the official announcement on the website of the Nationality Rooms of the University of Pittsburgh.

    Thank You Sponsors

    University of Pittsburgh
    University of Pittsburgh Center for Russian, East European & Eurasian Studies
    Carpatho-Rusyn Society
  • AI Empowers Speakers of Endangered Languages like Lemko Event Watchable On-Demand

    AI Empowers Speakers of Endangered Languages like Lemko Event Watchable On-Demand

    https://www.orynycz.com/lemko/watch-ai-empower-new-speakers

    The October 26, 2021 product launch and interactive demonstration was taped and can now be watched in its entirety here.

    To watch and follow along, visit the event page.

    Interactive Demo: AI Empowers Speakers of Endangered Languages like Lemko
    https://www.orynycz.com/lemko/watch-ai-empower-new-speakers

    YouTube and Facebook

    You can also watch, like, comment, and share the presentation on Facebook or on YouTube.

  • Event: Artificial Intelligence Empowers Speakers of Endangered Languages like Lemko

    Event: Artificial Intelligence Empowers Speakers of Endangered Languages like Lemko

    October 26, 2021 | 6 p.m. EST

    In a demonstration sponsored by the University of Pittsburgh Center for Russian, East European, and Eurasian Studies, Pittsburgh Nationality Rooms and Intercultural Exchange Programs, and University Center for International Studies, as well as the Carpatho-Rusyn Society, engineer and linguist Petro Orynycz unveils hybrid artificial intelligence technology that empowers new speakers of Lemko to read in the language immediately. Implications for endangered, low-resource language revitalization are discussed.

    See here for the official announcement on the website of the Nationality Rooms of the University of Pittsburgh.

    Information can also be found on Facebook and LinkedIn.

    Official Zoom link:

    https://pitt.zoom.us/meeting/register/tJckdemprTMiGdClNhkIz9n1bexGm6ANZZkF