ECura Multilingual Parallel Corpus of Yunnan Ethnic Minority Music

Working in collaboration with members of three villages in China’s multi-ethnic Southwest, the research team seeks ways to empower community members to take up new digital technologies and become active collectors and curators of their traditional music and dance. The research includes the development of a spoken-language-oriented database for their Indigenous musical traditions, together with an associated software program for use on digital social media platforms that enables all villagers to readily record and share their own songs and other musical heritage digitally, and the creation of an online database dedicated to their artistry. The resulting research framework will be transferable to a broad cross-section of endangered cultures globally.

In year 3, the ECura project will set up the dataset ‘ECura Multilingual Parallel Corpus of Yunnan Ethnic Minority Music’. The dataset aims to make use of modern information technologies such as multimodal data storage, multilingual corpus retrieval and analysis, and natural language processing. It collects the commonly used words found in the oral musical traditions of the Yi, Bai and Miao ethnic minority communities as its main reference, with considerable potential to add spoken language from the musical traditions of other ethnic minority groups in the near future. The project team draws on existing audio and video recordings of ancient songs sung by Yi, Bai and Miao people (including epics, creation songs, ancestor songs, migration songs, etc.) and takes the audio (the sound of the words) as the central point. Each audio item is linked to its corresponding written script (Yi and Miao only, since Bai has no written form), written Chinese, English, and International Phonetic Alphabet (IPA) transcription, along with related annotations, images, performance videos, and other multimodal content. The annotations cover, for instance, references to the names of people and locations, words with special meanings in local tradition, and words that have no lexical meaning but carry specific meanings in traditional singing. The goal of the dataset is to establish a multimodal, multilingual parallel corpus centered on the ancient songs of Yunnan ethnic minority communities, which exist mainly in oral tradition.
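As a rough illustration of the audio-centered structure described above, one corpus entry might be modeled as in the following Python sketch. The class name `CorpusEntry`, its field names, the helper `parallel_view`, and the placeholder values are all assumptions made for illustration; they are not the project's actual schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class CorpusEntry:
    """One word or phrase keyed by its audio recording (hypothetical schema)."""
    audio_path: str                       # the sound of the word, the central point
    language: str                         # "Yi", "Bai" or "Miao"
    ipa: str                              # IPA transcription
    chinese: str                          # written Chinese
    english: str                          # English
    native_script: Optional[str] = None   # Yi or Miao script; left as None for Bai
    annotations: List[str] = field(default_factory=list)  # e.g. names, special meanings
    media: List[str] = field(default_factory=list)        # images, performance videos


def parallel_view(entry: CorpusEntry) -> Dict[str, str]:
    """Return the parallel text columns that are available for one entry."""
    columns = {
        "ipa": entry.ipa,
        "chinese": entry.chinese,
        "english": entry.english,
    }
    if entry.native_script is not None:
        columns["native_script"] = entry.native_script
    return columns


# Placeholder values only; no real corpus data is shown here.
example = CorpusEntry(
    audio_path="audio/bai/example_0001.wav",
    language="Bai",
    ipa="<ipa transcription>",
    chinese="<Chinese text>",
    english="<English text>",
    annotations=["<note on a place name>"],
)
print(sorted(parallel_view(example)))
```

Keeping the audio path as the required key and the native script as optional mirrors the description above, in which every entry starts from a recording and only Yi and Miao entries carry a written script.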

News announced by project PI Lijuan Qian

Literature and References

Yi Language Dialect Areas and the Standardization of Yi Script

The Yi people, China’s seventh-largest ethnic group with a population of 6.1 million, live primarily in Sichuan, Guizhou, Yunnan, and Guangxi. The Yi language is characterized by significant dialectal diversity and a historically fragmented writing system, making it under-resourced within China’s linguistic landscape. To address these challenges, the Chinese government launched the Yi Script Standardization Scheme in 1980, selecting the Northern Yi dialect’s Nuosu language as the base. This initiative introduced 1,165 core characters, standardized the writing direction as left to right, and incorporated both Arabic and traditional Yi numerals. In 2010, further efforts were made to standardize the Yi script and pronunciation, resulting in a universal script with 5,598 Yi characters, 49 consonants, and 10 vowels.

Through this report, you can access information about:

  • Basic information on the six major Yi dialect areas in China.
  • The detailed process of, and applicable cases in, the informatization of Yi languages within each dialect area.

Map of the Distribution of the Six Yi Language Dialect Areas
Source: Language Atlas of China (2nd edition): Minority languages volume. Beijing: The Commercial Press

Written and edited by Jin Dai


Collaborative Efforts Drive the Informatization of Yi Script

The informatization of the Yi script has been a collaborative effort, involving contributions from government authorities, academic experts, and public initiatives. This report highlights the combined efforts of these three pathways in advancing the informatization of the Yi script, particularly in the development of input methods and speech recognition technologies.

The report provides a chronological overview of the milestones in the informatization process, showcasing how governmental support, scholarly research, and community involvement have collectively promoted the integration of the Yi script into the digital age. You will gain insight into the three pathways of efforts in preserving and modernizing the Yi script through technological innovation.

For some of the Yi script input methods currently popular in China, please visit: http://www.962.net/k/zgywsrf/

Written and edited by Jin Dai


Miao Language: Dialects and Script

Miao (苗) is the Chinese name and the one used by the Miao in China, while Hmong is more familiar in the West because of Hmong emigration. The Miao language comprises several dialects, including Central, Eastern, and Western, each further divided into numerous sub-dialects. The ECura project focuses on the Northeastern sub-dialect of the Western dialect, spoken around Kunming.

This area uses the traditional Old Miao Script (滇东北老苗文), also known as the Pollard Script or Shimenkan (石门坎) Miao Script. This script was devised in 1905 by Samuel Pollard (1864-1915), a British missionary, with the assistance of Han Chinese preacher Sitifan Li (李斯提反) and Miao intellectual Yage Yang (杨雅各). In the Pollard Script, each syllable is represented by a capital letter, which serves as the initial consonant forming the main body of the word, and a smaller letter that indicates the rhyme and is placed above or to the right of the main letter. The position of the smaller letter denotes the tone. For more information, please see this article: Letters, Pronunciation, and Tones of the Pollard Script 

A portrait of Samuel Pollard
Letters of Traditional Old Miao Script

Article by Yunhui Yang, translated and edited by Keyi Liu


Miao Musical Notation

After Samuel Pollard created the Old Miao Script, he developed a Miao musical notation system. Since Christian ceremonies feature a large number of hymns, this notation was designed to help Miao believers quickly learn Western hymns. Pollard based the system on the principles of the British tonic sol-fa system, using the initials and finals from the Old Miao Script. Below is some music theory knowledge related to Miao musical notation. For more details, please click the link below according to your needs.

Hymns with Pollard Script and Miao Musical Notation (photo provided by Yunhui Yang)
Samuel Pollard preaching in the Miao regions of Southwest China in the early 20th century

Article by Yunhui Yang, translated and edited by Keyi Liu


Audiovisual Self-Representation via WeChat: A Report on Minority Music in Yunnan Province

Focusing on Yunnan Province, where 25 of China’s 55 recognized ethnic minorities reside, the study delves into how these minorities, specifically the Bai, Miao, and Yi, utilize WeChat Channels to share their cultural expressions and disseminate their musical heritage.

This report is divided into two main sections:

  • The first section provides a broader context on the use of digital social media in China, highlighting WeChat’s pivotal role in facilitating social interactions and content sharing among ethnic minorities.
  • The second section offers a detailed visual content analysis of music videos created by users from the Bai, Miao, and Yi communities.

Through this analysis, the report uncovers the frequency, themes, and characteristics of the visual representations of these groups, demonstrating how WeChat serves as a platform for both preserving and transforming ethnic minority cultures in the digital age. The findings emphasize the blend of traditional and modern elements in these audiovisual narratives, contributing to the ongoing discourse on ethnic identity and media representation in China. For more information, please see the full report.

The logo of WeChat video channels
WeChat video channels and other short video platforms have become important channels for spreading the traditional culture of ethnic minorities in China.

Report by Leonardo D’Amico; edited by Jin Dai


Digital Application and Online Data

New Intelligent Collection Feature on YiCorpus: Build a High-Quality, Personalized Monolingual Corpus

YiCorpus has launched a new intelligent collection tool designed to automatically gather and clean large volumes of data based on user needs. Whether you’re navigating vast datasets, searching for precise information, or visualizing the collected data, YiCorpus is here to help you efficiently build a high-quality, personalized monolingual corpus. The intelligent collection tool offers three modes: full web collection, single-site collection, and link list collection, catering to your diverse data-gathering needs. More details about the newly launched function are available on the YiCorpus website.

YiCorpus has launched a new intelligent collection tool to help users efficiently build high-quality, personalized monolingual corpora. Please visit: http://www.yicorpus.com

Written and edited by Jin Dai


China Language Resources Protection Research Center (Yubao 语宝)

Founded in March 2015, the China Language Resources Protection Research Center is a research center under the National Language Commission and is affiliated with Beijing Language and Culture University (BLCU). The center’s development goal is to become China’s leading center for language resources survey and research, preservation and display, development and application, and talent training. The center aims to establish a new type of think tank that serves society and the country.

Through the center, you can access information about:

  • Standards for language resources protection, developed under the guidance of linguistic theory.
  • The latest information on the organized survey and protection of language resources using modern technology.
  • Research results from the first phase of the “Language Protection Project” (2015 to 2020), which includes tasks on:
    • A survey of language resources in China.
    • Construction of the Chinese Language Resources Platform.
    • Chinese Language Resources Protection Research.

More information about the center is available on its official website.

Written and edited by Jin Dai


Yunnan Minority Languages and Cultures Website/Application

The Yunnan Minority Languages and Cultures Website, operating under the Yunnan Provincial Committee for Ethnic Minority Language Work, is dedicated to promoting and safeguarding minority languages and cultures. It undertakes various roles such as disseminating laws and policies on minority languages, safeguarding language rights, supervising standard language usage, and fostering cross-disciplinary collaborations.

The website undertakes extensive data collection on 23 ethnic minority languages, including Yi, Dai, Miao, Bai, and Hani, offering a vast repository of lexical materials, audio recordings, and multimedia content. Building on the website’s resources, a mini-program app named “Ethnic Language Inheritance and Protection Voice Electronic Dictionary” has been developed. The app provides direct access to these resources, with content reviewed by experts.

To access the app and use the available resources, search for “民族语言传承保护有声电子词典” on WeChat or visit the official website, the Yunnan Minority Languages and Cultures Website (云南少数民族语言文化网).

Ethnic Language Inheritance and Protection Voice Electronic Dictionary Interface

Written and edited by Keyi Liu


Intellectual Property and ICH

Copyright Management of ICH on Social Media

The report explores the intersection of intellectual property (IP) rights and intangible cultural heritage (ICH) with a focus on protecting the traditions, knowledge, and cultural expressions of ethnic minorities and Indigenous peoples.

Through the report, you can access information about:

  • Terminology and definitions related to Indigenous peoples and ethnic minorities, highlighting differences between international and Chinese contexts.
  • Examination of efforts by organizations like UNESCO and the challenges of copyright management in protecting traditional knowledge and cultural expressions.
  • Review of China’s strategies for safeguarding intangible cultural heritage, including legal frameworks, registries, protection zones, and the complexities of implementing international conventions on cultural preservation.

UNESCO logo, representing the United Nations Educational, Scientific and Cultural Organization.

Article by Keyi Liu; edited by Jin Dai