Showing 1 - 10 of 1,016 results for search '"Hamilton, William L"', query time: 1.15s
  1.
    Report

    Description: Despite substantial progress in machine learning for scientific discovery in recent years, truly de novo design of small molecules which exhibit a property of interest remains a significant challenge. We introduce LambdaZero, a generative active learning approach to search for synthesizable molecules. Powered by deep reinforcement learning, LambdaZero learns to search over the vast space of molecules to discover candidates with a desired property. We apply LambdaZero with molecular docking to design novel small molecules that inhibit the enzyme soluble Epoxide Hydrolase 2 (sEH), while enforcing constraints on synthesizability and drug-likeness. LambdaZero provides an exponential speedup in terms of the number of calls to the expensive molecular docking oracle, and LambdaZero de novo designed molecules reach docking scores that would otherwise require the virtual screening of a hundred billion molecules. Importantly, LambdaZero discovers novel scaffolds of synthesizable, drug-like inhibitors for sEH. In in vitro experimental validation, a series of ligands from a generated quinazoline-based scaffold were synthesized, and the lead inhibitor N-(4,6-di(pyrrolidin-1-yl)quinazolin-2-yl)-N-methylbenzamide (UM0152893) displayed sub-micromolar enzyme inhibition of sEH.

    Open Access: http://arxiv.org/abs/2405.01616

  2.
    Report

    Description: Graphs are a ubiquitous data representation, as they are flexible and compact. For instance, the 3D structure of RNA can be efficiently represented as $\textit{2.5D graphs}$, graphs whose nodes are nucleotides and edges represent chemical interactions. In this setting, we have biological evidence of the similarity between the edge types, as some chemical interactions are more similar than others. Machine learning on graphs has recently experienced a breakthrough with the introduction of Graph Neural Networks. This algorithm can be framed as a message passing algorithm between graph nodes over graph edges. These messages can depend on the edge type they are transmitted through, but no method currently constrains how a message is altered when the edge type changes. Motivated by the RNA use case, in this project we introduce a graph neural network layer which can leverage prior information about similarities between edges. We show that despite the theoretical appeal of including this similarity prior, the empirical performance is not enhanced on the tasks and datasets we include here.

    Open Access: http://arxiv.org/abs/2109.09432

  3.
    Report

    Description: RNA 3D architectures are stabilized by sophisticated networks of (non-canonical) base pair interactions, which can be conveniently encoded as multi-relational graphs and efficiently exploited by graph theoretical approaches and recent progress in machine learning techniques. RNAglib is a library that eases the use of this representation, by providing clean data, methods to load it in machine learning pipelines and graph-based deep learning models suited for this representation. RNAglib also offers other utilities to model RNA with 2.5D graphs, such as drawing tools, comparison functions or baseline performances on RNA applications. The method and data are distributed as a fully documented pip package. Availability: https://rnaglib.cs.mcgill.ca

    Open Access: http://arxiv.org/abs/2109.04434

  4.
    Academic Journal

    Authors: Ashford, Fiona, Best, Angus, Dunn, Steven J, Ahmed, Zahra, Siddiqui, Henna, Melville, Jordan, Wilkinson, Samuel, Mirza, Jeremy, Cumley, Nicola, Stockton, Joanne, Ferguson, Jack, Wheatley, Lucy, Ratcliffe, Elizabeth, Casey, Anna, Plant, Tim, Aggarwal, Dinesh, Blane, Beth, Brooks, Ellena, Carabelli, Alessandro M, Churcher, Carol M, Galai, Katerina, Girgis, Sophia T, Gupta, Ravi K, Hadjirin, Nazreen F, Leek, Danielle, Ludden, Catherine, McManus, Georgina M, Palmer, Sophie, Peacock, Sharon J, Smith, Kim S, Allara, Elias, Bibby, David, Bishop, Chloe, Bosworth, Andrew, Bradshaw, Daniel, Chalker, Vicki, Chand, Meera, Dabrera, Gavin, Ellaby, Nicholas, Gallagher, Eileen, Groves, Natalie, Harrison, Ian, Hartman, Hassan, Hopes, Richard, Hubb, Jonathan, Hutchings, Stephanie, Lackenby, Angie, Ledesma, Juan, Lee, David, Manesis, Nikos, Manso, Carmen, Mbisa, Tamyo, Miah, Shahjahan, Muir, Peter, Myers, Richard, Osman, Husam, Patel, Vineet, Pearson, Clare, Platt, Steven, Pymont, Hannah M, Ramsay, Mary, Robinson, Esther, Schaefer, Ulf, Thornton, Alicia, Twohig, Katherine A, Vipond, Ian B, Williams, David, Hamilton, William L, Warne, Ben, Aigrain, Louise, Alderton, Alex, Amato, Roberto, Ariani, Cristina V, Barrett, Jeff, Bassett, Andrew R, Beale, Mathew A, Beaver, Charlotte, Bellis, Katherine L, Betteridge, Emma, Bonfield, James, Bronner, Iraad F, Chapman, Michael HS, Danesh, John, Davies, Robert, Dorman, Matthew J, Drury, Eleanor, Durham, Jillian, Farr, Ben W, Foulser, Luke, Goncalves, Sonia, Goodwin, Scott, Gourtovaia, Marina, Harrison, Ewan M, Jackson, David K, James, Keith, Jamrozy, Dorota, Johnston, Ian, Kane, Leanne, Kay, Sally, Keatley, Jon-Paul

    Source: Journal of Clinical Microbiology. 60(4)

    Description: Genome sequencing is a powerful tool for identifying SARS-CoV-2 variant lineages; however, there can be limitations due to sequence dropout when used to identify specific key mutations. Recently, ThermoFisher Scientific has developed genotyping assays to help bridge the gap between testing capacity and sequencing capability to generate real-time genotyping results based on specific variants. Over a 6-week period during the months of April and May 2021, we set out to assess the ThermoFisher TaqMan mutation panel genotyping assay, initially for three mutations of concern and then for an additional two mutations of concern, against SARS-CoV-2-positive clinical samples and the corresponding COVID-19 Genomics UK Consortium (COG-UK) sequencing data. We demonstrate that genotyping is a powerful in-depth technique for identifying specific mutations, is an excellent complement to genome sequencing, and has real clinical health value potential, allowing laboratories to report and take action on variants of concern much more quickly.

    File Description: application/pdf

  5.
    Report

    Source: ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2021, pp. 8523-8527

    Subject Terms: Computer Science - Machine Learning

    Description: Graph neural networks (GNNs) have achieved remarkable success as a framework for deep learning on graph-structured data. However, GNNs are fundamentally limited by their tree-structured inductive bias: the WL-subtree kernel formulation bounds the representational capacity of GNNs, and polynomial-time GNNs are provably incapable of recognizing triangles in a graph. In this work, we propose to augment the GNN message-passing operations with information defined on ego graphs (i.e., the induced subgraph surrounding each node). We term these approaches Ego-GNNs and show that Ego-GNNs are provably more powerful than standard message-passing GNNs. In particular, we show that Ego-GNNs are capable of recognizing closed triangles, which is essential given the prominence of transitivity in real-world graphs. We also motivate our approach from the perspective of graph signal processing as a form of multiplex graph convolution. Experimental results on node classification using synthetic and real data highlight the achievable performance gains using this approach.
    Comment: Submitted to a special session of IEEE-ICASSP 2021

    Open Access: http://arxiv.org/abs/2107.10957

  6.
    Report

    Description: Conventional representation learning algorithms for knowledge graphs (KG) map each entity to a unique embedding vector. Such a shallow lookup results in a linear growth of memory consumption for storing the embedding matrix and incurs high computational costs when working with real-world KGs. Drawing parallels with subword tokenization commonly used in NLP, we explore the landscape of more parameter-efficient node embedding strategies with possibly sublinear memory requirements. To this end, we propose NodePiece, an anchor-based approach to learn a fixed-size entity vocabulary. In NodePiece, a vocabulary of subword/sub-entity units is constructed from anchor nodes in a graph with known relation types. Given such a fixed-size vocabulary, it is possible to bootstrap an encoding and embedding for any entity, including those unseen during training. Experiments show that NodePiece performs competitively in node classification, link prediction, and relation prediction tasks while retaining less than 10% of explicit nodes in a graph as anchors and often having 10x fewer parameters. To this end, we show that a NodePiece-enabled model outperforms existing shallow models on a large OGB WikiKG 2 graph while having 70x fewer parameters.
    Comment: Accepted to ICLR 2022

    Open Access: http://arxiv.org/abs/2106.12144

  7.
    Report

    Subject Terms: Computer Science - Machine Learning

    Description: In recent years, the Transformer architecture has proven to be very successful in sequence processing, but its application to other data structures, such as graphs, has remained limited due to the difficulty of properly defining positions. Here, we present the $\textit{Spectral Attention Network}$ (SAN), which uses a learned positional encoding (LPE) that can take advantage of the full Laplacian spectrum to learn the position of each node in a given graph. This LPE is then added to the node features of the graph and passed to a fully-connected Transformer. By leveraging the full spectrum of the Laplacian, our model is theoretically powerful in distinguishing graphs, and can better detect similar sub-structures from their resonance. Further, by fully connecting the graph, the Transformer does not suffer from over-squashing, an information bottleneck of most GNNs, and enables better modeling of physical phenomena such as heat transfer and electric interaction. When tested empirically on a set of 4 standard datasets, our model performs on par with or better than state-of-the-art GNNs, and outperforms any attention-based model by a wide margin, becoming the first fully-connected architecture to perform well on graph benchmarks.
    Comment: Accepted in Proceedings of NeurIPS 2021

    Open Access: http://arxiv.org/abs/2106.03893

  8.
    Report

    Source: Artificial Intelligence in the Life Sciences (2022): 100036

    Description: Knowledge Graphs (KG) and associated Knowledge Graph Embedding (KGE) models have recently begun to be explored in the context of drug discovery and have the potential to assist in key challenges such as target identification. In the drug discovery domain, KGs can be employed as part of a process which can result in lab-based experiments being performed, or impact on other decisions, incurring significant time and financial costs and most importantly, ultimately influencing patient healthcare. For KGE models to have impact in this domain, a better understanding not only of performance, but also of the various factors which determine it, is required. In this study we investigate, over the course of many thousands of experiments, the predictive performance of five KGE models on two public drug discovery-oriented KGs. Our goal is not to focus on the best overall model or configuration; instead, we take a deeper look at how performance can be affected by changes in the training setup, choice of hyperparameters, model parameter initialisation seed and different splits of the datasets. Our results highlight that these factors have significant impact on performance and can even affect the ranking of models. Indeed, these factors should be reported along with model architectures to ensure complete reproducibility and fair comparisons of future work, and we argue this is critical for the acceptance of use, and impact of KGEs in a biomedical setting.

    Open Access: http://arxiv.org/abs/2105.10488

  9.
    Report

    Description: Adversarial attacks expose important vulnerabilities of deep learning models, yet little attention has been paid to settings where data arrives as a stream. In this paper, we formalize the online adversarial attack problem, emphasizing two key elements found in real-world use-cases: attackers must operate under partial knowledge of the target model, and the decisions made by the attacker are irrevocable since they operate on a transient data stream. We first rigorously analyze a deterministic variant of the online threat model by drawing parallels to the well-studied $k$-secretary problem in theoretical computer science and propose Virtual+, a simple yet practical online algorithm. Our main theoretical result shows Virtual+ yields provably the best competitive ratio over all single-threshold algorithms for $k<5$ -- extending the previous analysis of the $k$-secretary problem. We also introduce the \textit{stochastic $k$-secretary} -- effectively reducing online blackbox transfer attacks to a $k$-secretary problem under noise -- and prove theoretical bounds on the performance of Virtual+ adapted to this setting. Finally, we complement our theoretical results by conducting experiments on MNIST, CIFAR-10, and ImageNet classifiers, revealing the necessity of online algorithms in achieving near-optimal performance and also the rich interplay between attack strategies and online attack selection, enabling simple strategies like FGSM to outperform stronger adversaries.
    Comment: ICLR 2022

    Open Access: http://arxiv.org/abs/2103.02014

  10.
    Report

    Source: Briefings in Bioinformatics, 2022

    Subject Terms: Computer Science - Artificial Intelligence

    Description: Drug discovery and development is a complex and costly process. Machine learning approaches are being investigated to help improve the effectiveness and speed of multiple stages of the drug discovery pipeline. Of these, those that use Knowledge Graphs (KG) have promise in many tasks, including drug repurposing, drug toxicity prediction and target gene-disease prioritisation. In a drug discovery KG, crucial elements including genes, diseases and drugs are represented as entities, whilst relationships between them indicate an interaction. However, to construct high-quality KGs, suitable data is required. In this review, we detail publicly available sources suitable for use in constructing drug discovery focused KGs. We aim to help guide machine learning and KG practitioners who are interested in applying new techniques to the drug discovery field, but who may be unfamiliar with the relevant data sources. The datasets are selected via strict criteria, categorised according to the primary type of information contained within and are considered based upon what information could be extracted to build a KG. We then present a comparative analysis of existing public drug discovery KGs and an evaluation of selected motivating case studies from the literature. Additionally, we raise numerous and unique challenges and issues associated with the domain and its datasets, whilst also highlighting key future research directions. We hope this review will motivate the use of KGs in solving key and emerging questions in the drug discovery domain.

    Open Access: http://arxiv.org/abs/2102.10062