Academic Journal
JADS: A Framework for Self-supervised Joint Aspect Discovery and Summarization ...
Title: | JADS: A Framework for Self-supervised Joint Aspect Discovery and Summarization ... |
---|---|
Authors: | Guo, Xiaobo; Desai, Jay; Sengamedu, Srinivasan H. |
Publication Information: | arXiv |
Publication Year: | 2024 |
Collection: | DataCite Metadata Store (German National Library of Science and Technology) |
Subject Terms: | Artificial Intelligence (cs.AI), Computation and Language (cs.CL), FOS: Computer and information sciences |
Description: | To generate summaries that cover multiple aspects or topics of text documents, most approaches use clustering or topic modeling to group relevant sentences and then generate a summary for each group. These approaches struggle to optimize the summarization and clustering algorithms jointly. Aspect-based summarization, on the other hand, requires the aspects to be known in advance. Our solution integrates topic discovery and summarization into a single step. Given text data, our Joint Aspect Discovery and Summarization algorithm (JADS) discovers aspects from the input and generates a summary of the topics in one step. We propose a self-supervised framework that creates a labeled dataset by first mixing sentences from multiple documents (e.g., CNN/DailyMail articles) to form the input and then using the article summaries from the mixture as the labels. The JADS model outperforms the two-step baselines. With pretraining, the model achieves better performance and stability. Furthermore, embeddings derived from JADS exhibit superior ... : preprint ... |
Document Type: | article in journal/newspaper report |
Language: | unknown |
DOI: | 10.48550/arxiv.2405.18642 |
Availability: | https://doi.org/10.48550/arxiv.2405.18642 ; https://arxiv.org/abs/2405.18642 |
Rights: | Creative Commons Attribution 4.0 International ; https://creativecommons.org/licenses/by/4.0/legalcode ; cc-by-4.0 |
Accession Number: | edsbas.5C032AD |
Database: | BASE |
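
The self-supervised data construction mentioned in the description (mixing sentences from several articles to form the input and using those articles' summaries as the labels) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `split_sentences` and `make_mixed_example`, the regex-based sentence splitter, the random shuffling, and the choice to return the labels as a list of summaries are all assumptions made for illustration.

```python
import random
import re


def split_sentences(text):
    """Naive splitter (assumption); a real pipeline would likely use a proper tokenizer."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def make_mixed_example(articles, k=2, seed=0):
    """Build one (input, labels) pair from k randomly chosen articles.

    articles: list of (article_text, summary_text) tuples.
    Returns the shuffled sentence mixture and the summaries of the chosen articles.
    """
    rng = random.Random(seed)
    chosen = rng.sample(articles, k)

    # Pool the sentences of all chosen articles and shuffle them (assumed mixing strategy).
    sentences = [s for text, _ in chosen for s in split_sentences(text)]
    rng.shuffle(sentences)

    mixed_input = " ".join(sentences)
    labels = [summary for _, summary in chosen]
    return mixed_input, labels


if __name__ == "__main__":
    toy_articles = [
        ("The team won the final. Fans celebrated downtown.", "Team wins final."),
        ("Rates were raised again. Markets reacted calmly.", "Rates go up."),
    ]
    mixed, labels = make_mixed_example(toy_articles, k=2, seed=0)
    print(mixed)
    print(labels)
```

Under these assumptions, a model trained on such pairs has to separate the interleaved topics and summarize each one from a single input, which is the joint behavior the description attributes to JADS.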