Conference
A study on the relevance of generic word embeddings for sentence classification in hepatic surgery
Title: | A study on the relevance of generic word embeddings for sentence classification in hepatic surgery |
---|---|
Authors: | Oukelmoun, Achir; Semmar, Nasredine; de Chalendar, Gaël; Habran, Enguerrand; Vibert, Eric; Goblet, Emma; Oukelmoun, Mariame; Allard, Marc-Antoine |
Contributors: | Laboratoire Analyse Sémantique Textes et Images (LASTI), Département Intelligence Ambiante et Systèmes Interactifs (DIASI (CEA, LIST)), Laboratoire d'Intégration des Systèmes et des Technologies (LIST (CEA)), Direction de Recherche Technologique (CEA) (DRT (CEA)), Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Direction de Recherche Technologique (CEA) (DRT (CEA)), Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Université Paris-Saclay-Laboratoire d'Intégration des Systèmes et des Technologies (LIST (CEA)), Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Université Paris-Saclay, Chaire innovation Bloc OPératoire Augmenté (BOPA), Institut Mines-Télécom Business School (IMT-BS), Institut Mines-Télécom Paris (IMT)-Institut Mines-Télécom Paris (IMT), Department of Laboratory Medicine, Medical laboratory of Cheikh Zaid hospital, Abulcasis International University of Health Sciences, Morocco |
Source: | Proceedings of the 20th ACS/IEEE International Conference on Computer Systems and Applications (AICCSA 2023) ; AICCSA 2023 - 20th ACS/IEEE International Conference on Computer Systems and Applications ; https://cea.hal.science/cea-04559674 ; AICCSA 2023 - 20th ACS/IEEE International Conference on Computer Systems and Applications, Dec 2023, Gizeh, Egypt. ⟨10.1109/AICCSA59173.2023.10479342⟩ ; https://www.computer.org/csdl/proceedings/aiccsa/2023/1VOAizMiU2k |
Publisher information: | HAL CCSD |
Publication year: | 2023 |
Subject terms: | Natural Language Processing, Word embeddings, Gradient Boosting, hepatic, surgery, transformers, classifiers, supervised learning, [INFO.INFO-LG]Computer Science [cs]/Machine Learning [cs.LG] |
Subject geographic: | Gizeh, Egypt |
Description: | International audience ; Fine-tuning large contextual language models often demands substantial computational capacity, while using generic pre-trained models in highly specialized domains can yield suboptimal results. This paper explores an approach for deriving pertinent, domain-specific word embeddings with limited computational resources, in a setting where computational limitations prohibit fine-tuning large language models. The proposed methods are tested on French-language text in the domain of hepatic surgery. A new embedding, referred to as FTW2V, that combines Word2Vec and FastText is introduced; it addresses the challenge of incorporating terms absent from Word2Vec's vocabulary. Furthermore, a novel method is used to evaluate the significance of word embeddings within a specialized corpus: it compares the distributions of classification scores of classifiers (Gradient Boosting) trained on word embeddings derived from benchmarked Natural Language Processing (NLP) models. According to this evaluation, the FTW2V model, trained from scratch with limited computational resources, outperforms generic contextual models in terms of word embedding quality. Additionally, a computationally efficient contextual model built on FTW2V is introduced. This modified model replaces Gradient Boosting with a transformer and integrates part-of-speech labels. |
Document type: | conference object |
Language: | English |
ISBN: | 979-83-503-1943-9 |
Relation: | cea-04559674; https://cea.hal.science/cea-04559674; https://cea.hal.science/cea-04559674/document; https://cea.hal.science/cea-04559674/file/AICCSA_2023_Paper_IEEE_Achir_Oukelmoun_NoteIEEE.pdf |
DOI: | 10.1109/AICCSA59173.2023.10479342 |
Availability: | https://doi.org/10.1109/AICCSA59173.2023.10479342 https://cea.hal.science/cea-04559674 https://cea.hal.science/cea-04559674/document https://cea.hal.science/cea-04559674/file/AICCSA_2023_Paper_IEEE_Achir_Oukelmoun_NoteIEEE.pdf |
Rights: | info:eu-repo/semantics/OpenAccess |
Accession number: | edsbas.E17E0E18 |
Database: | BASE |
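
The abstract in this record says that FTW2V combines Word2Vec and FastText so that terms absent from Word2Vec's vocabulary still receive an embedding. The record does not say how the two are combined, so the following is only a minimal sketch of one possible fallback scheme, not the authors' implementation. It assumes a word-level table for known words and a subword (character n-gram) table for everything else. All names (`char_ngrams`, `subword_vector`, `lookup`) and the toy vectors are hypothetical and chosen for illustration.

```python
from typing import Dict, List

Vector = List[float]


def char_ngrams(word: str, n_min: int = 3, n_max: int = 5) -> List[str]:
    """Character n-grams of the word padded with '<' and '>' boundary markers."""
    padded = f"<{word}>"
    return [
        padded[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(padded) - n + 1)
    ]


def subword_vector(word: str, ngram_vectors: Dict[str, Vector], dim: int) -> Vector:
    """Average the vectors of the word's known n-grams (zero vector if none are known)."""
    known = [g for g in char_ngrams(word) if g in ngram_vectors]
    if not known:
        return [0.0] * dim
    return [sum(ngram_vectors[g][k] for g in known) / len(known) for k in range(dim)]


def lookup(word: str, word_vectors: Dict[str, Vector],
           ngram_vectors: Dict[str, Vector], dim: int) -> Vector:
    """Use the word-level vector when available, else fall back to subwords."""
    if word in word_vectors:
        return word_vectors[word]
    return subword_vector(word, ngram_vectors, dim)


# Toy, hand-written vectors (assumption: real tables would come from trained models).
DIM = 2
WORD_VECTORS = {"foie": [1.0, 0.0]}                       # word-level table
NGRAM_VECTORS = {"<hé": [0.0, 2.0], "pat": [0.0, 4.0]}    # subword table

in_vocab = lookup("foie", WORD_VECTORS, NGRAM_VECTORS, DIM)
out_of_vocab = lookup("hépatique", WORD_VECTORS, NGRAM_VECTORS, DIM)
print(in_vocab, out_of_vocab)
```

In this toy setup, "foie" is served directly from the word-level table, while "hépatique" is absent from it and is instead built from the two n-grams it shares with the subword table.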