Peer-Reviewed Scientific Publications

I am a peer-reviewed author and a speaker at the world’s largest defense training conference (I/ITSEC), as well as at international human-computer interaction (HCI) conferences whose proceedings are published by Springer Nature, one of the most prestigious, highest-impact, and oldest continuously operating academic publishers.


2023

Winning Hearts & Tongues: A Polish to Lemko Case Study

Language loss isn’t just cultural—it’s operational. This paper builds and evaluates Polish ↔ Lemko machine translation (expert rule-based + Transformer NMT) and benchmarks both directions with DARPA-backed metrics (BLEU, TER), including a Google Translate proxy baseline.

Key contributions

  • Builds a Polish ↔ Lemko translation stack combining an expert rule-based engine and Transformer NMT for a low-resource setting.
  • Evaluates both directions (PL→LEM, LEM→PL) with standardized, reproducible metrics (BLEU + TER).
  • Connects minority-language MT to real outcomes: training effectiveness, access, and resilience in contested information environments.

Key results

  • PL → LEM: expert system BLEU 29.49 / TER 53.73; reported as ~6.5× a Google Translate Polish→Ukrainian proxy on BLEU.
  • LEM → PL: expert system BLEU 31.13 / TER 54.10.
  • Transformer (PL → LEM): BLEU 15.90 (30k steps)—above the proxy baseline, below the expert system in this setup.
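
The BLEU and TER figures above are corpus-level scores of the kind the open-source sacrebleu toolkit computes. As a minimal, purely illustrative sketch (the sentences here are placeholders, not the paper's test data):

    import sacrebleu  # pip install sacrebleu

    # Illustrative system outputs and references -- NOT data from the paper.
    hypotheses = ["the system translated sentence one",
                  "the system translated sentence two"]
    references = [["the reference for sentence one",
                   "the reference for sentence two"]]  # one reference stream

    bleu = sacrebleu.corpus_bleu(hypotheses, references)  # higher is better
    ter = sacrebleu.corpus_ter(hypotheses, references)    # lower is better
    print(f"BLEU {bleu.score:.2f}  TER {ter.score:.2f}")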

BLEU Skies for Endangered Language Revitalization: Lemko Rusyn and Ukrainian Neural AI Translation Accuracy Soars

Minority-language loss isn’t just cultural—it’s measurable harm. This paper reports a major upgrade to LemkoTran.com, combining rule-based generation with neural MT so Lemko speakers and new learners can read and write instantly. I add morphology-aware noun/verb/adjective generators, expand the lexicon, enforce 9,518 must-pass QC tests, and benchmark translation quality with BLEU, TER, and chrF against multiple Google Translate services.
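
To make "morphology-aware generators" and "must-pass tests" concrete, here is a deliberately tiny sketch of the idea: a rule table produces inflected forms from a stem, and hard assertions refuse to pass unless every expected form comes out exactly right. The paradigm and test pairs below are simplified placeholders, not the actual LemkoTran rules or its codification-referenced tests:

    # Toy sketch of morphology-aware generation guarded by must-pass checks.
    PARADIGM = {          # case -> ending for one (hard masculine) noun class
        "nom_sg": "",
        "gen_sg": "а",
        "dat_sg": "ови",
        "ins_sg": "ом",
    }

    def generate_forms(stem):
        """Generate every inflected form of a noun stem from the rule table."""
        return {case: stem + ending for case, ending in PARADIGM.items()}

    # Must-pass checks: each expected form has to be produced exactly,
    # or the build fails. The real system enforces thousands of these.
    MUST_PASS = {
        ("брат", "gen_sg"): "брата",
        ("брат", "ins_sg"): "братом",
    }
    for (stem, case), expected in MUST_PASS.items():
        got = generate_forms(stem)[case]
        assert got == expected, f"{stem} {case}: expected {expected}, got {got}"
    print("all must-pass checks satisfied")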

Key contributions

  • Upgrades LemkoTran’s hybrid stack (rule-based + neural) to translate into and out of Lemko with morphology-aware generation.
  • Hardens correctness with 9,518 codification-referenced tests, turning translation quality into enforceable quality control.
  • Expands linguistic coverage with ~1,585 rule-based vocabulary items plus generators fueled by 877 lemmata + 708 glossary entries.
  • Benchmarks rigorously using SacreBLEU defaults and three complementary metrics (BLEU / TER / chrF) for reproducible comparisons.
  • Shows an engineering path to de-interference: rule-based modules enable purging loanwords / dominant-language bleed-through (where desired).

Key results

  • EN → LEM: BLEU rises to 8.48 (+35% vs prior publication), reported as ~4× Google Translate’s best service on BLEU.
  • LEM → EN: BLEU reaches 17.95 (+23% vs prior work), reported as ~16% higher than Google Translate’s Ukrainian service (best-performing baseline).
  • Across metrics: LemkoTran beats Google across BLEU + TER + chrF, with Google often mis-identifying Lemko (frequent Ukrainian/Russian/Belarusian detection).

2022

Say It Right: AI Neural Machine Translation Empowers New Speakers To Revitalize Lemko

AI can give endangered languages leverage: new speakers can produce sentences closer to the literary norm from day one. Say It Right (2022) presents a low-resource pipeline (transfer learning + rule-based MT), ships a public English→Lemko system, and evaluates quality with BLEU.
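
For readers unfamiliar with transfer learning in this setting, here is a minimal sketch of the general idea: start from a model pretrained on a related high-resource language pair and continue training it on a small parallel corpus. The toolkit (Hugging Face Transformers), the parent model name, and the example sentence pair are illustrative assumptions, not the paper's actual setup:

    from datasets import Dataset
    from transformers import (DataCollatorForSeq2Seq, MarianMTModel,
                              MarianTokenizer, Seq2SeqTrainer,
                              Seq2SeqTrainingArguments)

    # Stand-in parent model: any related high-resource pair could serve here.
    parent = "Helsinki-NLP/opus-mt-en-ru"
    tokenizer = MarianTokenizer.from_pretrained(parent)
    model = MarianMTModel.from_pretrained(parent)

    # Tiny illustrative corpus (pair not vetted against the codification);
    # a real run would load thousands of sentence pairs.
    pairs = [{"en": "Good day!", "lem": "Добрый ден!"}]
    ds = Dataset.from_list(pairs)

    def preprocess(example):
        enc = tokenizer(example["en"], truncation=True, max_length=128)
        enc["labels"] = tokenizer(text_target=example["lem"],
                                  truncation=True, max_length=128)["input_ids"]
        return enc

    ds = ds.map(preprocess, remove_columns=["en", "lem"])

    trainer = Seq2SeqTrainer(
        model=model,
        args=Seq2SeqTrainingArguments(output_dir="en-lem-finetune",
                                      per_device_train_batch_size=8,
                                      num_train_epochs=3,
                                      learning_rate=2e-5),
        train_dataset=ds,
        data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
    )
    trainer.train()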

Key contributions

  • Transfer learning + rule-based engine for a low-resource language.
  • Quantitative evaluation (BLEU), not vibes.
  • Deployed as a public tool (LemkoTran).

Key results

  • English→Lemko system: BLEU 6.28 (reported).
  • Compared against Google’s Ukrainian/Russian/Polish outputs (reported).
  • Built for resource-constrained execution (laptop/offline-friendly workflow).

2021

Yes I Speak… AI Neural Machine Translation in Multi-Lingual Training (2021)


This paper shows how neural machine translation (NMT) can crush localization bottlenecks for coalition training: instead of waiting months for human translation, you can deploy multilingual content in days or weeks.
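
For a flavour of what that rapid-localization workflow can look like in practice, here is a minimal sketch of an offline batch-translation pass on a laptop; the model path, file names, and toolkit are assumptions for illustration, not the setup used in the paper:

    # Once an NMT model has been copied onto the (air-gapped) machine, a file
    # of English training sentences is translated locally, no network needed.
    from transformers import MarianMTModel, MarianTokenizer

    model_dir = "models/opus-mt-en-ru"          # hypothetical local model copy
    tokenizer = MarianTokenizer.from_pretrained(model_dir)
    model = MarianMTModel.from_pretrained(model_dir)

    with open("training_material.en", encoding="utf-8") as f:
        lines = [line.strip() for line in f if line.strip()]

    translations = []
    for i in range(0, len(lines), 16):          # small, CPU-friendly batches
        batch = tokenizer(lines[i:i + 16], return_tensors="pt",
                          padding=True, truncation=True)
        out = model.generate(**batch, max_new_tokens=256)
        translations.extend(tokenizer.batch_decode(out, skip_special_tokens=True))

    with open("training_material.ru", "w", encoding="utf-8") as f:
        f.write("\n".join(translations) + "\n")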

Key contributions

  • Frames localization as an operational bottleneck and targets “training tonight / next week” timelines.
  • Builds and evaluates NMT engines on NATO training materials using BLEU as the evaluation metric.
  • Demonstrates a practical workflow on an inexpensive, air-gapped laptop (realistic deployment constraints).

Key results

  • Russian: +1,169.51% faster and +58.37% more accurate vs a professional human linguist baseline.
  • Polish: +488.45% faster and +17.29% more accurate vs human.
  • Lemko: “world’s first” engine, BLEU 14.57 reported.