DAFS: a domain aware few shot generative model for event detection

Publisher:
SPRINGER
Publication Type:
Journal Article
Citation:
Machine Learning, 2022
Issue Date:
2022-01-01
Large-scale pre-trained models show clear advantages in event detection (ED), the task of classifying events by identifying their trigger words. However, such models depend heavily on labeled training data, which is scarce in particular domains such as finance because annotation is costly. Moreover, manually labeled training data suffers from problems such as uneven sampling distributions, poor diversity, and a massive long tail. Recently, some researchers have used generative models to label data, but training a generative model requires rich domain knowledge, which cannot be obtained in a few-shot setting. We therefore propose a Domain-Aware Few-Shot (DAFS) generative model that generates domain-specific training data from a relatively small amount of labeled data. First, DAFS uses self-supervised information from various categories of sentences to compute word transition probabilities under different domains and to retain the key triggers in each sentence. Then, we apply a joint algorithm to generate labeled training data that balances diversity and effectiveness. Experimental results demonstrate that the training data generated by DAFS significantly improves ED performance on real financial data; even with no more than 20 training examples, DAFS still maintains generation quality to a certain extent. It also achieves new state-of-the-art results on the ACE2005 multilingual corpora.
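The abstract does not give the exact formulation, but the first step it describes, estimating word transition probabilities separately for each domain, can be illustrated with a minimal bigram sketch. This is an assumption-laden illustration, not the authors' method: the function name `domain_transition_probs`, the bigram model, and the toy corpus are all hypothetical.

```python
from collections import Counter

def domain_transition_probs(sentences_by_domain):
    """Estimate bigram transition probabilities P(curr | prev) per domain.

    `sentences_by_domain` maps a domain label to a list of tokenized
    sentences (lists of words). Returns {domain: {(prev, curr): prob}}.
    This is a simplified stand-in for DAFS's domain-conditioned
    transition statistics.
    """
    probs = {}
    for domain, sentences in sentences_by_domain.items():
        bigrams = Counter()   # counts of (prev, curr) word pairs
        prev_counts = Counter()  # counts of each word in the prev position
        for tokens in sentences:
            for prev, curr in zip(tokens, tokens[1:]):
                bigrams[(prev, curr)] += 1
                prev_counts[prev] += 1
        probs[domain] = {
            pair: count / prev_counts[pair[0]]
            for pair, count in bigrams.items()
        }
    return probs

# Hypothetical toy corpus; in DAFS the statistics would come from
# category-labeled sentences in the target domain (e.g., finance).
corpus = {
    "finance": [
        ["the", "bank", "acquired", "the", "firm"],
        ["the", "firm", "acquired", "assets"],
    ],
}
p = domain_transition_probs(corpus)
print(p["finance"][("the", "bank")])  # 1/3: "the" precedes a word 3 times
```

Words whose transitions are highly domain-specific (such as trigger words like "acquired" in finance) could then be kept fixed while the rest of the sentence is regenerated, in the spirit of the trigger-retention step the abstract describes.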