LOLAMEME: Logic, Language, Memory, Mechanistic Framework

Bibliographic Details
Title: LOLAMEME: Logic, Language, Memory, Mechanistic Framework
Authors: Desai, Jay; Guo, Xiaobo; Sengamedu, Srinivasan H.
Publication Year: 2024
Collection: Computer Science
Subject Terms: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computation and Language
Description: The performance of Large Language Models has achieved superhuman breadth with unprecedented depth. At the same time, language models are mostly black-box models, and the underlying mechanisms for their performance have been evaluated using synthetic or mechanistic schemes. We extend current mechanistic schemes to incorporate Logic, Memory, and nuances of Language such as latent structure. The proposed framework is called LOLAMEME, and we provide two instantiations of LOLAMEME: the LoLa and MeMe languages. We then consider two generative language model architectures: transformer-based GPT-2 and convolution-based Hyena. We propose the hybrid architecture THEX and use the LOLAMEME framework to compare the three architectures. THEX outperforms GPT-2 and Hyena on select tasks.
Comment: https://openreview.net/pdf?id=73dhbcXxtV
Document Type: Working Paper
Open Access: http://arxiv.org/abs/2406.02592
Accession Number: edsarx.2406.02592
Database: arXiv