Abstract
Reading scientific articles is more time-consuming than reading news because readers need to search for and read many citations. This paper proposes a citation-guided method for summarizing multiple scientific papers. We observe that citation sentences within one paragraph or section usually discuss a common fact, which is typically represented as a set of noun phrases co-occurring in the citation texts and is discussed from different aspects. We design a multi-document summarization system based on common fact detection. One challenge is that citations may not use the same terms to refer to a common fact. We therefore use a term association discovery algorithm to expand terms, based on a large set of scientific article abstracts. Citations can then be clustered based on common facts. Each common fact is used as a salient term set to retrieve relevant sentences from the corresponding cited articles to form a summary. Experiments show that our method outperforms three baseline methods on the ROUGE metric.
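The pipeline described in the abstract (term expansion, clustering citations by common fact, sentence selection) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `associations` dictionary standing in for the output of a term association step, the greedy clustering rule, the `min_shared` threshold, and the substring-based sentence scoring are all assumptions made for illustration.

```python
def expand_terms(terms, associations):
    """Add associated terms so differently worded citations can still match.
    `associations` is a hypothetical term -> related-terms mapping."""
    expanded = set(terms)
    for term in terms:
        expanded |= associations.get(term, set())
    return expanded


def cluster_citations(citation_terms, associations, min_shared=2):
    """Greedy clustering (an assumed rule): a citation joins the first cluster
    whose common fact shares at least `min_shared` expanded terms with it."""
    clusters = []  # each cluster: {"fact": set of terms, "members": list of ids}
    for cid, terms in citation_terms.items():
        expanded = expand_terms(terms, associations)
        for cluster in clusters:
            shared = cluster["fact"] & expanded
            if len(shared) >= min_shared:
                cluster["fact"] = shared  # narrow the fact to the shared terms
                cluster["members"].append(cid)
                break
        else:
            clusters.append({"fact": expanded, "members": [cid]})
    return clusters


def select_sentences(fact, sentences, top_k=1):
    """Rank cited-article sentences by how many common-fact terms they contain
    (simple substring matching, chosen only for brevity)."""
    ranked = sorted(sentences, key=lambda s: -sum(t in s.lower() for t in fact))
    return ranked[:top_k]


# Toy data: noun-phrase terms per citation sentence (hypothetical).
associations = {"summarization": {"summary"}, "summary": {"summarization"}}
citation_terms = {
    "c1": {"citation", "summarization"},
    "c2": {"citation", "summary"},
    "c3": {"parsing", "grammar"},
}
clusters = cluster_citations(citation_terms, associations)

cited_sentences = [
    "We parse sentences with a grammar.",
    "Citation text improves the summary of a paper.",
]
summary = select_sentences(clusters[0]["fact"], cited_sentences)
```

In this toy run, `c1` and `c2` fall into the same cluster only because term expansion links "summarization" and "summary", which is the role the abstract assigns to term association.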
Original language | English |
---|---|
Pages (from-to) | 246-252 |
Number of pages | 7 |
Journal | Future Generation Computer Systems |
Volume | 32 |
Early online date | 22 Aug 2013 |
Publication status | Published - Mar 2014 |
Keywords
- natural language processing
- semantic link network
- summarization