Frontier Science Forum on Mental Health (VII), September 13: Improving Sentiment Analysis by Using Cognition Grounded Data
Talk Title: Improving Sentiment Analysis by Using Cognition Grounded Data
Research in cognitive studies has indicated that not all words are created equal: some words are more important than others in conveying the message of a sentence, and likewise some sentences are more important than others in a document. Based on these premises, attention models have been proposed to assign different weights to different words in a text, and they are used in sentiment analysis as well as many other classification-based NLP tasks. Attention models have also been incorporated into deep-learning-based sentiment analysis models. Previous attention models for sentiment classification are built from information embedded in the text, including users, products, and the local textual context. However, although attention was originally motivated by its cognitive basis, attention models built from local context through distributional similarity lack a theoretical foundation that reflects that cognitive basis.
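The weighting scheme described above can be illustrated with a minimal sketch of a standard attention layer (not the authors' implementation; the function and variable names here are hypothetical): each word vector receives a relevance score against a learned query vector, the scores are normalized with a softmax, and the sentence representation is the resulting weighted average.

```python
import numpy as np

def softmax(x):
    # numerically stable softmax over a 1-D score vector
    e = np.exp(x - np.max(x))
    return e / e.sum()

def word_attention(word_vecs, query):
    """Weight each word by its relevance to a learned query vector.

    word_vecs: (n_words, d) matrix of word embeddings
    query:     (d,) learned context/query vector
    Returns the attention weights and the weighted sentence vector.
    """
    scores = word_vecs @ query          # one relevance score per word
    weights = softmax(scores)           # normalize scores to a distribution
    sentence_vec = weights @ word_vecs  # attention-weighted average of words
    return weights, sentence_vec

# toy example: 3 words embedded in 4 dimensions
rng = np.random.default_rng(0)
vecs = rng.normal(size=(3, 4))
q = rng.normal(size=4)
w, s = word_attention(vecs, q)
```

In real models the query vector and the embeddings are trained jointly, which is precisely why the resulting weights reflect distributional similarity in the local context rather than any measured cognitive signal.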
In this work, we propose a novel cognition grounded attention (CGA) model for sentiment analysis, learned from cognition grounded eye-tracking data. Eye-tracking is the process of measuring either the point of gaze or the motion of an eye relative to the head. Readers indeed fixate longer on words that play significant semantic roles, in addition to infrequent words, ambiguous words, and morphologically complex words. Since reading time can be learned from an eye-tracking dataset, the predicted reading time of a word in its context can serve as an indicator of its attention weight.
Obtaining eye-tracking data is very time consuming, and thus available eye-tracking data has very limited word coverage. In this work, we first devise a reading time prediction model. The model predicts the reading time of each word in the sentiment analysis text, using eye-tracking data as the dependent variable and traditional text features in the context as independent variables. The predicted reading time is then used to build a cognition grounded attention layer for neural sentiment analysis. Our model can capture attention at the word-to-sentence level as well as the sentence-to-document level. Other attention mechanisms can also be incorporated to capture other aspects of attention, such as local attention and affective lexicons.
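The hierarchical use of predicted reading time described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the talk's actual model: `cga_pool` is a hypothetical helper, the reading times are made-up numbers standing in for the regression model's predictions, and sentence-level reading time is assumed here to be the sum of its words' times.

```python
import numpy as np

def softmax(x):
    # numerically stable softmax over a 1-D score vector
    e = np.exp(x - np.max(x))
    return e / e.sum()

def cga_pool(vectors, reading_times):
    """Pool a sequence of vectors with cognition-grounded attention:
    predicted reading times act as the unnormalized attention scores,
    so words (or sentences) read longer contribute more to the result."""
    weights = softmax(np.asarray(reading_times, dtype=float))
    return weights @ np.asarray(vectors, dtype=float)

# word-to-sentence level: pool word vectors within each sentence
sent1 = cga_pool([[1.0, 0.0], [0.0, 1.0]], reading_times=[120.0, 300.0])
sent2 = cga_pool([[0.5, 0.5], [1.0, 1.0]], reading_times=[200.0, 200.0])

# sentence-to-document level: pool the resulting sentence vectors,
# scoring each sentence by the total reading time of its words
doc = cga_pool([sent1, sent2], reading_times=[420.0, 400.0])
```

The key difference from the standard attention sketch is that the scores come from an external, cognitively grounded predictor rather than from parameters trained on the classification objective itself.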
Evaluation on five sentiment analysis benchmark datasets shows that our proposed model achieves significant performance improvements over state-of-the-art attention methods. Evaluation also shows that the eye-tracking-based CGA model yields a higher performance gain than attention models built from other lexical sentiment resources. This is mainly because our attention model is based on context-dependent information, whereas lexicon-based resources are static and do not include context information. This work gives insight into how cognition grounded data can be integrated into natural language processing (NLP) tasks to improve performance.
Professor Qin Lu is currently Head of the Department of Computing at The Hong Kong Polytechnic University. She received her Ph.D. in Computer Science from the University of Illinois at Urbana-Champaign (UIUC) in 1988 and her Bachelor of Engineering degree in radio electronics from the Department of Physics at Beijing Normal University in 1982. Her main research interests include natural language processing, information extraction, collocation extraction, and ontology construction, with particular emphasis on Chinese information processing and the construction of Chinese ontology resources. In recent years, Professor Lu and her team have focused on sentiment analysis, extending existing work on opinion mining and analysis and addressing the effect of data imbalance on sentiment classification. She is currently an executive editorial board member of the Chinese Information Processing Society of China and an editorial board member of the Journal of Chinese Information Processing. She served as Editor-in-Chief of the International Journal of Computer Processing Of Languages (IJCPOL) from 2010 to 2014 and has been Editor-in-Chief of the IJCPOL Book Series since 2014. As principal investigator, she has successfully obtained five research grants from the Hong Kong Research Grants Council and six projects under the Innovation and Technology Fund of the Hong Kong Innovation and Technology Commission, and she has also participated in a number of mainland Chinese and international collaborative research projects. In addition, Professor Lu has long been devoted to Chinese character encoding standardization and software internationalization, a field in which she is a pioneer and expert; she helped the Hong Kong Government plan its first Digital 21 Strategy for information technology development and received commendations from her university and the government for this work. In recognition of her great contributions to the development of Chinese electronic communication technology in Greater China, the Hong Kong Government awarded her the Medal of Honour in 2011.