RetroGFN: Diverse and Feasible Retrosynthesis using GFlowNets

Bibliographic Details
Title: RetroGFN: Diverse and Feasible Retrosynthesis using GFlowNets
Authors: Gaiński, Piotr; Koziarski, Michał; Maziarz, Krzysztof; Segler, Marwin; Tabor, Jacek; Śmieja, Marek
Publication Year: 2024
Collection: Computer Science
Subject Terms: Computer Science - Machine Learning
Description: Single-step retrosynthesis aims to predict a set of reactions that lead to the creation of a target molecule, which is a crucial task in molecular discovery. Although a target molecule can often be synthesized with multiple different reactions, it is not clear how to verify the feasibility of a reaction, because the available datasets cover only a tiny fraction of the possible solutions. Consequently, the existing models are not encouraged to explore the space of possible reactions sufficiently. In this paper, we propose a novel single-step retrosynthesis model, RetroGFN, that can explore outside the limited dataset and return a diverse set of feasible reactions by leveraging a feasibility proxy model during training. We show that RetroGFN achieves competitive results on standard top-k accuracy while outperforming existing methods on round-trip accuracy. Moreover, we provide empirical arguments in favor of using round-trip accuracy, which expands the notion of feasibility with respect to the standard top-k accuracy metric.
Document Type: Working Paper
Open Access: http://arxiv.org/abs/2406.18739
Accession Number: edsarx.2406.18739
Database: arXiv