ECura Multilingual Parallel Corpus of Yunnan Ethnic Minority Music

Working in collaboration with members of three villages in China’s multi-ethnic Southwest, the research team seeks ways to empower community members to take up new digital technologies and become active collectors and curators of their traditional music and dance. The research includes the development of a spoken-language-oriented database for their Indigenous musical traditions, together with associated software for use on digital social media platforms that enables all villagers to readily record and share their own songs and other musical heritage digitally, and the creation of an online database dedicated to their artistry. The resulting research framework will be transferable to a broad cross-section of endangered cultures globally.

In year 3, the ECura project is going to set up the dataset ‘ECura Multilingual Parallel Corpus of Yunnan Ethnic Minority Music’. This dataset aims to utilize modern information technologies such as multimodal data storage, multilingual corpus retrieval and analysis, and natural language processing. The dataset takes as its main reference the commonly used words that exist in the oral musical traditions of the Yi, Bai and Miao ethnic minority communities, and it has great potential to add spoken language from the musical traditions of other ethnic minority groups in the near future. The project team uses existing audio and video recordings of ancient songs sung by Yi, Bai and Miao people (including epics, creation songs, ancestor songs, migration songs, etc.). Taking the audio (the sound of the words) as the central point, the team links each recording to its corresponding written script (only for Yi and Miao, as Bai has no written form), written Chinese, English, and International Phonetic Alphabet (IPA) transcription, along with related annotations, images, performance videos, and other multimodal content. The annotations cover, for instance, references to the names of people and locations, words with special meanings in local tradition, and words that have no actual meaning on their own but carry specific meanings in traditional singing. The goal of the dataset is to establish a multimodal, multilingual parallel corpus centered on the theme of ancient songs among Yunnan ethnic minority communities, songs which mainly exist in oral tradition.
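As an illustration only, one audio-centred entry of such a parallel corpus might be modelled as in the sketch below. The project has not published a schema, so every name here (`CorpusEntry`, `audio_path`, `native_script`, `search_english`, and so on) is a hypothetical assumption, not the ECura implementation. The sketch only encodes what the description states: each entry is anchored on a recording, carries Chinese, English and IPA text, and has a native written script only for Yi and Miao.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Assumption drawn from the description: the corpus currently covers Yi, Bai
# and Miao, and only Yi and Miao entries carry a native written script.
SUPPORTED_LANGUAGES = {"Yi", "Bai", "Miao"}
LANGUAGES_WITH_SCRIPT = {"Yi", "Miao"}


@dataclass
class CorpusEntry:
    """One word or phrase, anchored on its audio recording (hypothetical schema)."""
    audio_path: str                      # the recording is the central point
    language: str                        # "Yi", "Bai" or "Miao"
    chinese: str                         # written Chinese rendering
    english: str                         # English rendering
    ipa: str                             # IPA transcription
    native_script: Optional[str] = None  # absent for Bai
    annotations: List[str] = field(default_factory=list)  # e.g. names, places
    media_paths: List[str] = field(default_factory=list)  # images, videos

    def __post_init__(self) -> None:
        if self.language not in SUPPORTED_LANGUAGES:
            raise ValueError(f"unsupported language: {self.language}")
        if self.native_script is not None and self.language not in LANGUAGES_WITH_SCRIPT:
            raise ValueError(f"{self.language} entries cannot carry a native script")


def search_english(entries: List[CorpusEntry], term: str) -> List[CorpusEntry]:
    """Return entries whose English rendering contains `term`, ignoring case."""
    needle = term.lower()
    return [entry for entry in entries if needle in entry.english.lower()]
```

Keeping the audio path as the required first field mirrors the audio-centred design described above, while the optional `native_script` field lets Bai entries exist without a written form.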

News announced by project PI Lijuan Qian
