PyGraft: Configurable Generation of Synthetic Schemas and Knowledge Graphs at Your Fingertips

Bibliographic Details
Title: PyGraft: Configurable Generation of Synthetic Schemas and Knowledge Graphs at Your Fingertips
Authors: Hubert, Nicolas; Monnin, Pierre; D’aquin, Mathieu; Monticolo, Davy; Brun, Armelle
Contributors: Equipe de Recherche sur les Processus Innovatifs (ERPI), Université de Lorraine (UL), Building artificial Intelligence between trust, Responsibility and Decision (BIRD), Department of Complex Systems, Artificial Intelligence & Robotics (LORIA - AIS), Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA), Institut National de Recherche en Informatique et en Automatique (Inria)-Université de Lorraine (UL)-Centre National de la Recherche Scientifique (CNRS)-Institut National de Recherche en Informatique et en Automatique (Inria)-Université de Lorraine (UL)-Centre National de la Recherche Scientifique (CNRS)-Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA), Institut National de Recherche en Informatique et en Automatique (Inria)-Université de Lorraine (UL)-Centre National de la Recherche Scientifique (CNRS)-Institut National de Recherche en Informatique et en Automatique (Inria)-Université de Lorraine (UL)-Centre National de la Recherche Scientifique (CNRS), Web-Instrumented Man-Machine Interactions, Communities and Semantics (WIMMICS), Inria Sophia Antipolis - Méditerranée (CRISAM), Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-Scalable and Pervasive softwARe and Knowledge Systems (Laboratoire I3S - SPARKS), Laboratoire d'Informatique, Signaux, et Systèmes de Sophia Antipolis (I3S), Université Nice Sophia Antipolis (1965 - 2019) (UNS)-Centre National de la Recherche Scientifique (CNRS)-Université Côte d'Azur (UniCA)-Université Nice Sophia Antipolis (1965 - 2019) (UNS)-Centre National de la Recherche Scientifique (CNRS)-Université Côte d'Azur (UniCA)-Laboratoire d'Informatique, Signaux, et Systèmes de Sophia Antipolis (I3S), Université Nice Sophia Antipolis (1965 - 2019) (UNS)-Centre National de la Recherche Scientifique (CNRS)-Université Côte d'Azur (UniCA)-Université Nice Sophia Antipolis (1965 - 2019) (UNS)-Centre National de la Recherche Scientifique (CNRS)-Université Côte d'Azur (UniCA), K team (Data Science, Knowledge, Reasoning and Engineering), Department of Natural Language Processing & Knowledge Discovery (LORIA - NLPKD), ANR-22-CMAS-0004, EFELIA Côte d'Azur, Ecole Française de l'Intelligence Artificielle - Site Côte d'Azur (2022)
Source: Lecture notes in computer science ; ESWC 2024 - 21st International Conference on Semantic Web ; https://inria.hal.science/hal-04491258 ; ESWC 2024 - 21st International Conference on Semantic Web, May 2024, Hersonissos, Greece. ⟨10.5281/zenodo.10243209⟩
Publication Information: HAL CCSD
Publication Year: 2024
Collection: Université de Lorraine: HAL
Subject Terms: Knowledge Graph, Schema, Semantic Web, Synthetic Data Generator, [INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI]
Subject Geographic: Hersonissos, Greece
Description: International audience ; Knowledge graphs (KGs) have emerged as a prominent data representation and management paradigm. Being usually underpinned by a schema (e.g., an ontology), KGs capture not only factual information but also contextual knowledge. In some tasks, a few KGs established themselves as standard benchmarks. However, recent works outline that relying on a limited collection of datasets is not sufficient to assess the generalization capability of an approach. In some data-sensitive fields such as education or medicine, access to public datasets is even more limited. To remedy the aforementioned issues, we release PyGraft, a Python-based tool that generates highly customized, domain-agnostic schemas and KGs. The synthesized schemas encompass various RDFS and OWL constructs, while the synthesized KGs emulate the characteristics and scale of real-world KGs. Logical consistency of the generated resources is ultimately ensured by running a description logic (DL) reasoner. By providing a way of generating both a schema and KG in a single pipeline, PyGraft's aim is to empower the generation of a more diverse array of KGs for benchmarking novel approaches in areas such as graph-based machine learning (ML), or more generally KG processing. In graph-based ML in particular, this should foster a more holistic evaluation of model performance and generalization capability, thereby going beyond the limited collection of available benchmarks. PyGraft is available at: https://github.com/nicolas-hbt/pygraft.
Document Type: conference object
Language: English
Relation: hal-04491258; https://inria.hal.science/hal-04491258; https://inria.hal.science/hal-04491258/document; https://inria.hal.science/hal-04491258/file/Hubert_et_al-ESWC2024-PyGraft.pdf
DOI: 10.5281/zenodo.10243209
Availability: https://doi.org/10.5281/zenodo.10243209
https://inria.hal.science/hal-04491258
https://inria.hal.science/hal-04491258/document
https://inria.hal.science/hal-04491258/file/Hubert_et_al-ESWC2024-PyGraft.pdf
Rights: http://creativecommons.org/licenses/by/ ; info:eu-repo/semantics/OpenAccess
Accession Number: edsbas.B7F0B6AD
Database: BASE