CLDR-19605: Add Nogai (nog) Latin keyboard layout#5861

Open
murza-enikeeff wants to merge 1 commit into unicode-org:main from murza-enikeeff:nog-latn-keyboard

@murza-enikeeff

CLDR-19605

  • This PR completes the ticket.

Sociolinguistic and Technical Justification for Nogai Layout (nog-Latn)

Depends on CLDR-19605 (Core Data)

1. UNESCO Status and Current Linguistic Peril

The Nogai language (nog) is officially classified by the UNESCO Atlas of the World's Languages in Danger as "Definitely Endangered." The language faces severe existential pressure due to a historical lack of institutional support, a critical shortage of native-language schools, and systematic displacement from official and educational spheres. Providing native digital input mechanisms is a critical, non-negotiable step toward preventing total language extinction.

2. Historical Context: Forced Script Transitions as Structural Assimilation

The orthographic history of the Nogai language is a record of forced linguistic engineering and "voluntary-compulsory" (nominally voluntary, in practice mandatory) Russification of minoritized indigenous peoples:

  • Pre-1928: The Nogai people utilized a highly functional Arabic-based script, maintaining deep cultural and historical ties with their heritage.
  • 1928–1938: The Arabic script was officially replaced by a Latin-based alphabet.
  • 1938–Present: As part of a centralized policy of forced cultural assimilation, the Latin script was abruptly abolished and replaced with a modified Cyrillic alphabet.

These rapid, politically driven script disruptions fractured intergenerational literacy, isolated the population from their historical literature, and acted as structural elements of linguistic ethnocide.

3. Digital Marginalization as Ongoing Assimilation

Currently, major operating systems and input engines (including Google Gboard, iOS, and Windows) completely lack native support for Nogai layouts. This absence forces Nogai speakers into absolute digital dependency on surrogate layouts:

  • Forced Substitution: Speakers are systematically forced to use either standard Russian or Kazakh keyboards.
  • Technical Fragmentation: On the Russian layout, each native digraph (Аь, Оь, Уь, Нъ) has to be typed as two separate, unrelated characters. Software then treats each digraph as two independent letters, which breaks digital text processing, renders spell-check and predictive text impossible, and corrupts corpus linguistics data.
  • Digital Colonialism: Forcing an endangered language community to adopt the dominant state language's layout (Russian) functions as an ongoing mechanism of digital assimilation, stripping the language of its visual autonomy.
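
To make the fragmentation point concrete, here is a minimal Python sketch. It assumes only the four digraphs listed above; the function name `split_letters` and the sample string are illustrative and are not taken from this PR or from CLDR data. It contrasts naive per-character splitting with a digraph-aware split that keeps each digraph together as one orthographic letter.

```python
# Digraphs named in this PR description, lowercased (Cyrillic script).
NOGAI_DIGRAPHS = ("аь", "оь", "уь", "нъ")


def split_letters(text: str) -> list[str]:
    """Split text into orthographic letters, keeping known digraphs together.

    Hypothetical helper for illustration only.
    """
    letters = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair.lower() in NOGAI_DIGRAPHS:
            letters.append(pair)  # treat the digraph as a single letter
            i += 2
        else:
            letters.append(text[i])
            i += 1
    return letters


sample = "куьн"  # sample string chosen only because it contains the digraph "уь"

naive = list(sample)           # what generic, digraph-unaware software sees
aware = split_letters(sample)  # what a Nogai-aware tool would need to see

print(naive)  # ['к', 'у', 'ь', 'н']
print(aware)  # ['к', 'уь', 'н']
```

A tool that only sees the naive split counts four letters where the orthography has three, which is the kind of mismatch that affects spell-checking and corpus statistics.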

4. Technical Philosophy of the Latin Layout (nog-Latn.xml)

The proposed nog-Latn.xml layout is designed to serve as a functional bridge for future orthographic modernization and seamless integration with the wider Turkic digital space. It aligns closely with the principles of the Common Turkic Alphabet, ensuring cross-compatibility with neighboring Turkic languages while catering precisely to Nogai phonetics.
Rather than relying purely on longpress modifiers over a standard English QWERTY base, this layout provides a native, extended Turkic typing experience. Key architectural decisions include:

  • Extended Base Layer for Native Phonetics: All critical Nogai-specific characters are elevated to the primary UI. This includes front vowels ('ä', 'ö', 'ü'), specific consonants ('ş', 'ç', 'ğ'), and the nasal 'ñ'. Placing these directly on the main board allows for rapid, fluid typing of native vocabulary without the constant friction of longpress menus.
  • Strict Turkic Vowel Distinction: The layout explicitly separates the dotless 'ı' and the dotted 'i' on the main user interface. This is a fundamental requirement for Turkic languages to prevent semantic ambiguity and ensure correct capitalization behavior.
  • Smart Longpress Fallbacks & Loanword Support: The longpress functionality is strategically reserved for secondary orthographic needs:
      • Alternative Standards: Providing 'ŋ' alongside 'ñ' under the 'n' key to accommodate variations in dialectal transcription or preferences of different linguistic schools.
      • Classical Transcriptions: Including circumflexed vowels ('â', 'î', 'û') under their respective base vowels ('a', 'i'/'ı', 'u') to support the accurate transcription of historical texts or specific Arabic/Persian loanwords.
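
The decisions above can be sketched in a few lines of Python. This is not the contents of nog-Latn.xml and not the LDML keyboard format; it is a plain data model built only from the characters named in this description. The names `BASE_LAYER_EXTRAS`, `LONGPRESS`, `longpress_candidates`, `turkic_upper`, and `turkic_lower` are hypothetical. Two assumptions are made: that 'î' is offered under both 'i' and 'ı', and that Nogai Latin uses the Turkish-style case pairs i/İ and ı/I.

```python
# Characters this description places directly on the base layer (beyond a-z).
BASE_LAYER_EXTRAS = ["ä", "ö", "ü", "ş", "ç", "ğ", "ñ", "ı"]

# Longpress alternatives as described: base key -> secondary candidates.
# Assumption: 'î' is reachable from both 'i' and 'ı'.
LONGPRESS = {
    "n": ["ñ", "ŋ"],
    "a": ["â"],
    "i": ["î"],
    "ı": ["î"],
    "u": ["û"],
}


def longpress_candidates(key: str) -> list[str]:
    """Return the longpress alternatives for a key, or an empty list."""
    return LONGPRESS.get(key, [])


def turkic_upper(text: str) -> str:
    """Uppercase with Turkish-style pairs: i -> İ and ı -> I (assumed for Nogai)."""
    return text.replace("i", "İ").replace("ı", "I").upper()


def turkic_lower(text: str) -> str:
    """Lowercase with Turkish-style pairs: İ -> i and I -> ı (assumed for Nogai)."""
    return text.replace("İ", "i").replace("I", "ı").lower()


# Default, locale-independent casing collapses both letters into 'I'.
print("ıi".upper())              # II
# Turkic-aware casing keeps the two vowels distinct.
print(turkic_upper("ıi"))        # Iİ
print(turkic_lower("Iİ"))        # ıi
print(longpress_candidates("n"))  # ['ñ', 'ŋ']
```

The first print shows why the distinction matters for capitalization: without Turkic-aware casing, 'ı' and 'i' become indistinguishable once uppercased.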

This architecture ensures that the Nogai Latin layout is not a compromised adaptation, but a fully realized, modern digital tool built for the pan-Turkic web.

Conclusion

By combining a native base layer with longpress access to secondary and historical forms, this layout empowers a marginalized speech community to bypass structural barriers, reclaim their graphic history, and democratically determine the future trajectory of their language.

ALLOW_MANY_COMMITS=true
