Nianwen Xue (薛念文)

[About the author] Nianwen Xue is an Associate Professor in the Computer Science Department and the Language & Linguistics Program at Brandeis University, Editor-in-Chief of ACM Transactions on Asian and Low-Resource Language Information Processing, and Vice Chair/Chair-elect of SIGHAN, the ACL Special Interest Group on Chinese Language Processing. He was previously an assistant professor in the Department of Linguistics at the University of Colorado Boulder, and a postdoctoral fellow in cognitive science and in the Department of Computer and Information Science at the University of Pennsylvania. His research areas are computational linguistics and natural language processing, with a focus on developing corpora annotated with syntactic, semantic, temporal, and discourse information. He leads the construction of the Chinese Treebank, the Chinese Proposition Bank, and the Chinese Discourse Treebank, and has published more than 100 papers on Chinese word segmentation, syntactic and semantic parsing, coreference, discourse analysis, machine translation, and biomedical natural language processing.

From Universal Dependency to Uniform Meaning Representation

Universal Dependencies has seen wide adoption in recent years. Maybe it is just the Stanford magic, but I would argue it reflects a thirst for uniform representations across languages as multilingual processing has become increasingly important. The benefits of universal representations are easy to understand: NLP researchers do not have to develop a different model for each individual language, and developers of linguistic resources have a common standard to follow. This is especially important for low-resource languages, where expertise in building linguistic resources for NLP research is often lacking; having a universal scheme to consult or follow improves the situation to some extent. Universal Dependencies addresses the need for a multilingual standard for syntactic representation, but there is still no comparably shared standard for meaning representation. In this talk I will argue that, with the knowledge accumulated over the past decade or so of annotating semantic components, the time is ripe for developing a uniform meaning representation that is cross-linguistically valid and practical for NLP purposes. The need for uniform representations is magnified in the context of the Belt and Road Initiative, where multiple countries and languages are involved.
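As a concrete illustration (not part of the original talk abstract), a Universal Dependencies analysis is conventionally serialized in the CoNLL-U format, whose ten tab-separated columns (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC) are the same for every language. A minimal sketch in plain Python, with a hand-annotated example sentence, reads such a record and recovers the dependency tree's root:

```python
# Minimal sketch of reading a Universal Dependencies parse in CoNLL-U format.
# The example sentence and its annotation are illustrative, hand-written here;
# the column layout follows the UD guidelines:
# ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC.

CONLLU = (
    "1\tThe\tthe\tDET\tDT\t_\t2\tdet\t_\t_\n"
    "2\tcat\tcat\tNOUN\tNN\t_\t3\tnsubj\t_\t_\n"
    "3\tsleeps\tsleep\tVERB\tVBZ\t_\t0\troot\t_\t_\n"
)

def parse_conllu(block: str):
    """Return a list of token dicts from one CoNLL-U sentence block."""
    tokens = []
    for line in block.strip().splitlines():
        cols = line.split("\t")
        tokens.append({
            "id": int(cols[0]),       # 1-based token index
            "form": cols[1],          # surface word form
            "lemma": cols[2],
            "upos": cols[3],          # universal POS tag
            "head": int(cols[6]),     # index of syntactic head (0 = root)
            "deprel": cols[7],        # universal dependency relation
        })
    return tokens

tokens = parse_conllu(CONLLU)
# By convention, the root of the dependency tree has head 0.
root = next(t for t in tokens if t["head"] == 0)
print(root["form"], root["deprel"])  # sleeps root
```

Because the column layout and the inventory of relations (`nsubj`, `det`, `root`, ...) are shared across languages, the same reader works unchanged on, say, a Chinese or Hindi treebank, which is the practical payoff of a universal scheme described above.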