I am a peer-reviewed author and speaker at the world’s largest defense training conference (I/ITSEC), as well as at international human-computer interaction (HCI) conferences whose proceedings are published by Springer Nature, one of the most prestigious, highest-impact, and oldest continuously operating academic publishers.
2023
Winning Hearts & Tongues: A Polish to Lemko Case Study
Language loss isn’t just cultural—it’s operational. This paper builds and evaluates Polish ↔ Lemko machine translation (expert rule-based + Transformer NMT) and benchmarks both directions with DARPA-backed metrics (BLEU, TER), including a Google Translate proxy baseline.
Key contributions
- Builds a Polish ↔ Lemko translation stack combining an expert rule-based engine and Transformer NMT for a low-resource setting.
- Evaluates both directions (PL→LEM, LEM→PL) with standardized, reproducible metrics (BLEU + TER).
- Connects minority-language MT to real outcomes: training effectiveness, access, and resilience in contested information environments.
Key results
- PL → LEM: expert system BLEU 29.49 / TER 53.73; reported as ~6.5× a Google Translate Polish→Ukrainian proxy on BLEU.
- LEM → PL: expert system BLEU 31.13 / TER 54.10.
- Transformer (PL → LEM): BLEU 15.90 (30k steps)—above the proxy baseline, below the expert system in this setup.
BLEU Skies for Endangered Language Revitalization: Lemko Rusyn and Ukrainian Neural AI Translation Accuracy Soars
Minority-language loss isn’t just cultural—it’s measurable harm. This paper reports a major upgrade to LemkoTran.com, combining rule-based generation with neural MT so Lemko speakers and new learners can read and write instantly. I add morphology-aware noun/verb/adjective generators, expand the lexicon, enforce 9,518 must-pass QC tests, and benchmark translation quality with BLEU, TER, and chrF against multiple Google Translate services.
Key contributions
- Upgrades LemkoTran’s hybrid stack (rule-based + neural) to translate into and out of Lemko with morphology-aware generation.
- Hardens correctness with 9,518 codification-referenced tests, turning translation quality into enforceable quality control.
- Expands linguistic coverage with ~1,585 rule-based vocabulary items plus generators fueled by 877 lemmata + 708 glossary entries.
- Benchmarks rigorously using SacreBLEU defaults and three complementary metrics (BLEU / TER / chrF) for reproducible comparisons.
- Shows an engineering path to de-interference: rule-based modules enable purging loanwords / dominant-language bleed-through (where desired).
Key results
- EN → LEM: BLEU rises to 8.48 (+35% vs prior publication), reported as ~4× Google Translate’s best service on BLEU.
- LEM → EN: BLEU reaches 17.95 (+23% vs prior work), reported as ~16% higher than Google Translate’s Ukrainian service (best-performing baseline).
- Across metrics: LemkoTran beats Google on BLEU, TER, and chrF; Google often misidentifies Lemko as Ukrainian, Russian, or Belarusian.
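The relative gains quoted in these results can be checked against the BLEU scores listed for the earlier papers on this page. The sketch below is only an arithmetic check, not code from the papers. It assumes that the "prior" English→Lemko score is the 6.28 reported in 2022 and that the "prior" Lemko→English score is the 14.57 reported in 2021 (the page does not state the direction of that score, so treat the second pairing as an assumption). The function name `relative_gain_percent` is hypothetical.

```python
def relative_gain_percent(new_score: float, old_score: float) -> float:
    """Return the percentage change from old_score to new_score."""
    if old_score <= 0:
        raise ValueError("old_score must be positive")
    return (new_score - old_score) / old_score * 100.0


# BLEU scores as listed on this page.
en_to_lem_2022 = 6.28    # Say It Right (2022), English -> Lemko
en_to_lem_2023 = 8.48    # BLEU Skies (2023), EN -> LEM

# Assumption: the 14.57 reported in 2021 is the LEM -> EN baseline.
lem_to_en_2021 = 14.57
lem_to_en_2023 = 17.95   # BLEU Skies (2023), LEM -> EN

en_lem_gain = relative_gain_percent(en_to_lem_2023, en_to_lem_2022)
lem_en_gain = relative_gain_percent(lem_to_en_2023, lem_to_en_2021)

print(f"EN -> LEM: {en_lem_gain:+.1f}%")
print(f"LEM -> EN: {lem_en_gain:+.1f}%")
```

Under those assumptions the computed gains round to the +35% and +23% figures quoted above.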
2022
Say It Right: AI Neural Machine Translation Empowers New Speakers To Revitalize Lemko
AI can give endangered languages leverage: new speakers can produce sentences closer to the literary norm from day one. Say It Right (2022) presents a low-resource pipeline (transfer learning + rule-based MT), ships a public English→Lemko system, and evaluates quality with BLEU.
Key contributions
- Transfer learning + rule-based engine for a low-resource language.
- Quantitative evaluation (BLEU), not vibes.
- Deployed as a public tool (LemkoTran).
Key results
- English→Lemko system: BLEU 6.28 (reported).
- Compared against Google’s Ukrainian/Russian/Polish outputs (reported).
- Built for resource-constrained execution (laptop/offline-friendly workflow).
2021
Yes I Speak… AI Neural Machine Translation in Multi-Lingual Training
This paper shows how neural machine translation (NMT) can crush localization bottlenecks for coalition training: instead of waiting months for human translation, teams can deploy multilingual content in days or weeks.
Key contributions
- Frames localization as an operational bottleneck and targets “training tonight / next week” timelines.
- Builds and evaluates NMT engines on NATO training materials using BLEU as the evaluation metric.
- Demonstrates a practical workflow on an inexpensive, air-gapped laptop (realistic deployment constraints).
Key results
- Russian: 1,169.51% faster and 58.37% more accurate than a professional human linguist baseline.
- Polish: 488.45% faster and 17.29% more accurate than the human baseline.
- Lemko: “world’s first” engine, BLEU 14.57 reported.
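The "percent faster" figures reported for this paper are easier to compare as speed multipliers. The sketch below converts them, assuming "X% faster" means an X% increase in translation speed (throughput) over the human baseline; the page does not say how speed was measured, so this reading is an assumption. The function name `percent_faster_to_factor` is hypothetical.

```python
def percent_faster_to_factor(percent_faster: float) -> float:
    """Convert 'X% faster' into a speed multiplier.

    Assumes the percentage is an increase in throughput,
    so 100% faster means twice the speed.
    """
    return 1.0 + percent_faster / 100.0


# Percentages as listed on this page.
russian_factor = percent_faster_to_factor(1169.51)
polish_factor = percent_faster_to_factor(488.45)

print(f"Russian: about {russian_factor:.1f}x the human baseline speed")
print(f"Polish: about {polish_factor:.1f}x the human baseline speed")
```

On that reading, the Russian engine runs at roughly 12.7 times the human baseline speed and the Polish engine at roughly 5.9 times.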
