LLM-based Extraction of Contradictions from Patents

التفاصيل البيبلوغرافية
العنوان: LLM-based Extraction of Contradictions from Patents
المؤلفون: Trapp, Stefan, Warschat, Joachim
سنة النشر: 2024
المجموعة: Computer Science
مصطلحات موضوعية: Computer Science - Computation and Language, I.2.7, H.3.1
الوصف: Already since the 1950s TRIZ shows that patents and the technical contradictions they solve are an important source of inspiration for the development of innovative products. However, TRIZ is a heuristic based on a historic patent analysis and does not make use of the ever-increasing number of latest technological solutions in current patents. Because of the huge number of patents, their length, and, last but not least, their complexity there is a need for modern patent retrieval and patent analysis to go beyond keyword-oriented methods. Recent advances in patent retrieval and analysis mainly focus on dense vectors based on neural AI Transformer language models like Google BERT. They are, for example, used for dense retrieval, question answering or summarization and key concept extraction. A research focus within the methods for patent summarization and key concept extraction are generic inventive concepts respectively TRIZ concepts like problems, solutions, advantage of invention, parameters, and contradictions. Succeeding rule-based approaches, finetuned BERT-like language models for sentence-wise classification represent the state-of-the-art of inventive concept extraction. While they work comparatively well for basic concepts like problems or solutions, contradictions - as a more complex abstraction - remain a challenge for these models. This paper goes one step further, as it presents a method to extract TRIZ contradictions from patent texts based on Prompt Engineering using a generative Large Language Model (LLM), namely OpenAI's GPT-4. Contradiction detection, sentence extraction, contradiction summarization, parameter extraction and assignment to the 39 abstract TRIZ engineering parameters are all performed in a single prompt using the LangChain framework. Our results show that "off-the-shelf" GPT-4 is a serious alternative to existing approaches.
Comment: 10 pages, 2 tables
نوع الوثيقة: Working Paper
الوصول الحر: http://arxiv.org/abs/2403.14258Test
رقم الانضمام: edsarx.2403.14258
قاعدة البيانات: arXiv