Differential filtering of genetic data

Bibliographic Details
Title: Differential filtering of genetic data
Patent Number: 9,798,855
Publication Date: October 24, 2017
Appl. No: 12/986986
Application Filed: January 07, 2011
Abstract: Computer software products, methods, and systems are described which provide functionality to a user conducting experiments designed to detect and/or identify genetic sequences and other characteristics of a genetic sample, such as, for instance, gene copy number and aberrations thereof. The presently described software allows the user to interact with a graphical user interface which depicts the genetic information obtained from the experiment. The presently disclosed methods and software are related to bioinformatics and biological data analysis. Specifically, provided are methods, computer software products and systems for analyzing and visually depicting genotyping data on a screen or other visual projection. The presently disclosed methods and software allow the user conducting the experiment to differentially filter complex genetic data and information by varying genetic parameters and removing or highlighting visually various regions of genetic data of interest (CytoRegions). These differential filters may be applied by the user to the entire set of genetic data and/or only to the specific CytoRegions of interest.
Inventors: Dowds, Carl A. (Sunnyvale, CA, US); McIntyre, Jody C. (Pacifica, CA, US); Erwin, Edgar E. (Berkeley, CA, US); Wilson, Garret D. (San Francisco, CA, US); Parmar, Pragna B. (Cupertino, CA, US); Ohlson, Breck S. (Gilroy, CA, US); Shippy, Richard D. (Scottsdale, AZ, US); Cifuentes, Francisco J. (San Francisco, CA, US)
Assignees: Affymetrix, Inc. (Santa Clara, CA, US)
Claim: 1. A computer implemented method of differentially filtering genetic data, which comprises: accessing, by a computer comprising a processor and a memory, data of intensity measurements corresponding to hybridization of target nucleic acids to an array of single nucleotide polymorphism nucleic acid probes and copy number nucleic acid probes, wherein the single nucleotide polymorphism nucleic acid probes are designed to identify one or more single nucleotide polymorphisms in the target nucleic acids and the copy number nucleic acid probes are designed to identify one or more copy number variations in the target nucleic acids; applying, by the processor, one or more algorithms stored in memory to the data of the intensity measurements to analyze and convert the data of the intensity measurements to genetic data selectable by a user on at least one input interface window; displaying the genetic data on a visual display device connected to the computer, wherein the computer comprises a computer program configured to visually display the genetic data to a user on the visual display device in a display area comprising a plurality of user configurable windows displayed on the visual display device, wherein the plurality of user configurable windows comprises a multiple chromosomes view, a selected chromosome view, and a segment filters view, wherein each view comprises at least one genomic map obtained from the genetic data; presenting the user with an input interface via the computer, wherein the input interface is presented on the visual display device and receives input from the user via an interface including the at least one genomic map to modify the views of the user configurable windows; receiving, by the computer and from the input interface, a selection of a subset of genetic data designated as one or more regions of genetic data characterized by loss of heterozygosity, long continuous stretches of homozygosity, copy number mosaicism, or copy number variation; receiving, by the computer and from the user through the at least one input interface window, a first set of parameters selected by the user for filtering the one or more regions of genetic data; differentially filtering, by a processor of the computer, the subset of genetic data in response to receiving the first set of parameters by determining a first filtered subset of the genetic data that corresponds to the first set of parameters; and displaying on the visual display device, the first filtered subset of the genetic data corresponding to the first set of parameters nearly simultaneously as the first set of parameters is received by the computer, wherein the first filtered subset of genetic data is visualized using various colors, icons or other visual markers in the user configurable windows to distinguish from the genetic data that does not correspond to the first set of parameters.
Claim: 2. The method according to claim 1, which further comprises: receiving, by the computer and from the user, a second set of parameters for the computer program, which is different from the first set of parameters, wherein the second set of parameters pertains to all genetic data other than the selected subset of one or more regions of genetic data characterized by loss of heterozygosity, long continuous stretches of homozygosity, copy number mosaicism, or copy number variation; differentially filtering the genetic data by determining a second filtered subset of the genetic data that corresponds to the second set of parameters, wherein the second set of parameters are entered in the computer program by the user through the at least one input interface window; and displaying on the visual display device nearly instantaneously as the second set of parameters is received from the computer and from the user, the second filtered subset of genetic data corresponding to the second set of parameters, wherein the second filtered subset of genetic data is visualized on the visual display device using various colors, icons or other visual markers in the user configurable windows to distinguish from the genetic data that does not correspond to the second set of parameters.
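Illustrative sketch (not part of the patent record): the differential filtering recited in claims 1 and 2 can be pictured as applying one parameter set to segments inside user-selected regions and a different set to everything else. The Python below is a minimal sketch under stated assumptions: segments are simple records, each parameter set is only a minimum marker count and a minimum length, and the names `Segment`, `FilterParams`, `in_any_region`, and `differential_filter` are hypothetical rather than taken from the patent or any Affymetrix software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    chrom: str
    start: int
    end: int
    markers: int  # number of array markers supporting the segment


@dataclass(frozen=True)
class FilterParams:
    min_markers: int
    min_length: int


def in_any_region(seg, regions):
    """True if the segment overlaps any (chrom, start, end) region of interest."""
    return any(seg.chrom == c and seg.start < e and s < seg.end
               for c, s, e in regions)


def passes(seg, params):
    """True if the segment meets both thresholds of one parameter set."""
    return (seg.markers >= params.min_markers
            and (seg.end - seg.start) >= params.min_length)


def differential_filter(segments, regions, region_params, genome_params):
    """Keep segments that pass the parameter set matching their location."""
    kept = []
    for seg in segments:
        params = region_params if in_any_region(seg, regions) else genome_params
        if passes(seg, params):
            kept.append(seg)
    return kept


# Hypothetical data: one region of interest with looser thresholds.
segments = [
    Segment("chr1", 1_000, 60_000, 12),     # inside the region
    Segment("chr1", 500_000, 520_000, 8),   # outside, too few markers
    Segment("chr2", 0, 900_000, 150),       # outside, large segment
]
regions = [("chr1", 0, 100_000)]
region_params = FilterParams(min_markers=5, min_length=10_000)
genome_params = FilterParams(min_markers=50, min_length=400_000)

kept = differential_filter(segments, regions, region_params, genome_params)
print([(s.chrom, s.start) for s in kept])  # [('chr1', 1000), ('chr2', 0)]
```

In this sketch the small chr1 segment survives only because it lies in the region of interest, where the looser first parameter set applies; the same segment would be dropped under the genome-wide set.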
Claim: 3. The method according to claim 1, wherein the first set of parameters are entered into the computer program by the user.
Claim: 4. The method according to claim 1, wherein the genetic data comprises genetic copy number data.
Claim: 5. The method of claim 1 further comprising differentially filtering the subset of genetic data by at least one of a number of markers, length of genetic sequence, overlap map, or confidence value.
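Illustrative sketch (not part of the patent record): claim 5 lists four filter criteria: number of markers, length, overlap map, and confidence value. The Python below shows one way such criteria might be combined. It assumes that an "overlap map" is a set of intervals per chromosome and that a segment is hidden when too large a fraction of it is covered by those intervals; that interpretation, along with the names `overlap_fraction` and `keep_segment` and the dictionary layout, is an assumption made for illustration.

```python
def overlap_fraction(start, end, intervals):
    """Fraction of [start, end) covered by the union of the given intervals."""
    covered = 0
    cursor = start  # rightmost position already counted
    for s, e in sorted(intervals):
        s, e = max(s, cursor), min(e, end)
        if s < e:
            covered += e - s
            cursor = e
    return covered / (end - start)


def keep_segment(seg, overlap_map, min_markers=0, min_length=0,
                 min_confidence=0.0, max_overlap=1.0):
    """Apply all four thresholds; defaults leave each criterion switched off."""
    length = seg["end"] - seg["start"]
    frac = overlap_fraction(seg["start"], seg["end"],
                            overlap_map.get(seg["chrom"], []))
    return (seg["markers"] >= min_markers
            and length >= min_length
            and seg["confidence"] >= min_confidence
            and frac <= max_overlap)


# Hypothetical segment and overlap map; half of the segment is covered.
seg = {"chrom": "chr3", "start": 0, "end": 100, "markers": 40, "confidence": 0.92}
overlap_map = {"chr3": [(10, 30), (20, 50), (90, 200)]}

print(overlap_fraction(0, 100, overlap_map["chr3"]))       # 0.5
print(keep_segment(seg, overlap_map, min_markers=25,
                   min_confidence=0.9, max_overlap=0.6))   # True
print(keep_segment(seg, overlap_map, max_overlap=0.4))     # False
```

Keeping each threshold as an independent keyword argument mirrors the "at least one of" wording of the claim: any subset of the criteria can be active at a time.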
Claim: 6. The method of claim 1 further comprising identifying signal intensities from a label associated with the hybridization of the target nucleic acids to the array.
Claim: 7. The method of claim 1 further comprising displaying, on the visual display device, the filtered subset of genetic data corresponding to the first set of parameters in multiple colors in order to signify different functionalities, genetic features, and alleles of the genetic data.
Claim: 8. The method of claim 1 further comprising displaying the filtered subset of genetic data corresponding to the first set of parameters on at least one of a computer screen, visual projector, screen, or board.
Claim: 9. The method of claim 1 further comprising displaying the filtered subset of genetic data corresponding to the first set of parameters in multiple windows on a computer screen.
Claim: 10. The method of claim 1 further comprising displaying the filtered subset of genetic data corresponding to the first set of parameters as a genetic map, wherein the genetic map represents a genome of a human, mouse, insect, plant, or bacteria.
Claim: 11. The method of claim 1 further comprising displaying the filtered subset of genetic data corresponding to the first set of parameters as a genetic map having a user interface accessible to the user, wherein the genetic map indicates different characteristics of the genetic data to the user.
Claim: 12. The method of claim 1 further comprising displaying the filtered subset of genetic data corresponding to the first set of parameters as a genetic map depicted in the form of chromosomes.
Claim: 13. The method of claim 1 further comprising displaying the filtered subset of genetic data corresponding to the first set of parameters as a genetic map depicted as a chromosome in a window on a computer screen in a computer software application or browser environment.
Claim: 14. The method of claim 1 further comprising displaying one or more subsets of genetic data in multiple windows on a computer screen in a computer software application or browser environment, wherein each window depicts data selected by the user.
Claim: 15. The method of claim 1 further comprising displaying the filtered subset of genetic data corresponding to the first set of parameters as segments of a genome on a computer screen in a computer software application or browser environment, wherein the segments of the genome comprise visual or tabular representations of genetic events.
Claim: 16. The method of claim 1 further comprising: displaying the filtered subset of genetic data corresponding to the first set of parameters as a selected segment of a chromosome on a computer screen in a computer software application or browser environment, wherein the segment is selected by the user through the input interface, the segment comprising a visual or tabular representation of genetic events.
Claim: 17. The method according to claim 1, wherein the genetic data comprises genotypes, and wherein applying one or more algorithms further comprises: applying a dynamic modeling algorithm to the data of the intensity measurements by fitting values of the intensity measurements to dynamic models and determining the genotypes by a best fit of the values of the intensity measurements for each dynamic model.
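Illustrative sketch (not part of the patent record): claim 17 describes choosing a genotype by the best fit of intensity values to a set of models. The record does not give the models, so the Python below substitutes a deliberately simple stand-in: three fixed models (AA, AB, BB) that each predict the fraction of signal coming from the A allele, scored by sum of squared residuals. The `MODELS` table and `call_genotype` function are hypothetical and should not be read as the patented dynamic modeling algorithm.

```python
# Expected A-allele signal fraction under each candidate genotype model
# (an assumption made for illustration only).
MODELS = {"AA": 1.0, "AB": 0.5, "BB": 0.0}


def call_genotype(a_intensities, b_intensities):
    """Return the best-fitting genotype and the residual score of every model.

    Assumes each probe pair has a non-zero total intensity.
    """
    fractions = [a / (a + b) for a, b in zip(a_intensities, b_intensities)]
    scores = {
        genotype: sum((f - expected) ** 2 for f in fractions)
        for genotype, expected in MODELS.items()
    }
    best = min(scores, key=scores.get)  # lowest residual is the best fit
    return best, scores


# Hypothetical probe intensities for one SNP.
genotype, scores = call_genotype([900, 1100, 1000], [100, 90, 120])
print(genotype)  # AA
```

Returning the full score table alongside the call leaves room for a confidence measure, for example the gap between the best and second-best residual, which could then feed a confidence-based filter.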
Claim: 18. A computer program product embedded in a non-transitory computer readable medium comprising instructions executable by a computer processor to perform differential filtering of genetic data and interactively display the genetic data to a user, the instructions comprising: accessing, by a computer comprising a processor and a memory, data of intensity measurements corresponding to hybridization of target nucleic acids to an array of single nucleotide polymorphism nucleic acid probes and copy number nucleic acid probes, wherein the single nucleotide polymorphism nucleic acid probes are designed to identify one or more single nucleotide polymorphisms in the target nucleic acids and the copy number nucleic acid probes are designed to identify one or more copy number variations in the target nucleic acids; applying, by the processor, one or more algorithms stored in memory to the data of the intensity measurements to analyze and convert the data of the intensity measurements to genetic data selectable by a user on at least one input interface window displayed on the visual display device; displaying the genetic data on a visual display device connected to the computer, wherein the computer comprises a computer program configured to visually display the genetic data to a user on the visual display device in a display area comprising a plurality of user configurable windows displayed on the visual display device, wherein the plurality of user configurable windows comprises a multiple chromosomes view, a selected chromosome view, and a segment filters view wherein each view comprises at least one genomic map obtained from the genetic data; presenting the user with an input interface via the computer, wherein the input interface is presented on the visual display device and receives input from the user via an interface including the at least one genomic map to modify the views of the user configurable windows; receiving, by the computer and from the input interface, a selection of a subset of genetic data designated as one or more regions of genetic data characterized by loss of heterozygosity, long continuous stretches of homozygosity, copy number mosaicism, or copy number variation; receiving, by the computer and from the user through the at least one input interface window, a first set of parameters selected by the user for filtering the one or more regions of genetic data; differentially filtering, by a processor of the computer, the subset of genetic data in response to receiving the first set of parameters by determining a first filtered subset of the genetic data that corresponds to the first set of parameters; and displaying on the visual display device, the first filtered subset of genetic data corresponding to the first set of parameters nearly simultaneously as the first set of parameters is received by the computer, wherein the filtered subset of genetic data is visualized on the visual display device using various colors, icons or other visual markers in the user configurable windows to distinguish from the genetic data that does not correspond to the first set of parameters.
Claim: 19. The computer program product of claim 18, the instructions further comprising: receiving, by the computer and from the user, a second set of parameters for the computer program, which is different from the first set of parameters, wherein the second set of parameters pertains to all genetic data other than the selected subset of one or more regions of genetic data characterized by loss of heterozygosity, long continuous stretches of homozygosity, copy number mosaicism, or copy number variation; differentially filtering the genetic data by determining a second filtered subset of genetic data that corresponds to the second set of parameters, wherein the second set of parameters are entered in the computer program by the user through the at least one input interface window; and displaying on the visual display device nearly instantaneously as the second set of parameters is received from the computer and from the user, the filtered subset of genetic data corresponding to the second set of parameters, wherein the second filtered subset of genetic data is visualized on the visual display device using various colors, icons or other visual markers in the user configurable windows to distinguish from the genetic data that does not correspond to the second set of parameters.
Claim: 20. A genetic data differential filtering system, comprising: a visual display device; a network-enabled computer connected to the visual display device, the computer comprising: a memory storing data of intensity measurements corresponding to hybridization of target nucleic acids to an array of single nucleotide polymorphism nucleic acid probes and copy number nucleic acid probes, wherein the single nucleotide polymorphism nucleic acid probes are designed to identify one or more single nucleotide polymorphisms in the target nucleic acids and the copy number nucleic acid probes are designed to identify one or more copy number variations in the target nucleic acids; and a processor configured for applying one or more algorithms stored in memory to the data of the intensity measurements to analyze and convert the data of the intensity measurements to genetic data selected by a user on at least one input interface window; wherein the computer further comprises a computer program configured to: visually display the genetic data to a user on the visual display device in a display area comprising a plurality of user configurable windows displayed on the visual display device, wherein the plurality of user configurable windows comprises a multiple chromosomes view, a selected chromosome view, and a segment filters view wherein each view comprises at least one genomic map obtained from the genetic data; and receive input from a user via an input interface, wherein the input interface is presented via the computer on the visual display device and is configured to: receive input from the user via an interface including the at least one genomic map to modify the views of the user configurable windows; receive a selection of a subset of genetic data designated as one or more regions of genetic data characterized by loss of heterozygosity, long continuous stretches of homozygosity, copy number mosaicism, or copy number variation; and receive, by the computer and from the user through the at least one input interface window, a first set of parameters selected by the user for filtering the one or more regions of genetic data; differentially filter, by a processor of the computer, the subset of genetic data in response to receiving the first set of parameters by determining a filtered subset of the genetic data that corresponds to the first set of parameters; and display on the visual display device, the filtered subset of genetic data corresponding to the first set of parameters nearly simultaneously as the first set of parameters is received by the computer, wherein the filtered subset of genetic data is visualized using various colors, icons or other visual markers in the user configurable windows to distinguish from the genetic data that does not correspond to the first set of parameters.
Patent References Cited: 5593839 January 1997 Hubbell et al.
5733729 March 1998 Lipshutz et al.
5795716 August 1998 Chee
5858659 January 1999 Sapolsky et al.
5974164 October 1999 Chee
6066454 May 2000 Lipshutz et al.
6090555 July 2000 Fiekowsky et al.
6185561 February 2001 Balaban et al.
6188783 February 2001 Balaban et al.
6223127 April 2001 Berno
6308170 October 2001 Balaban
6309822 October 2001 Fodor
6420108 July 2002 Mack et al.
6611767 August 2003 Fiekowsky et al.
6687692 February 2004 Balaban et al.
6816867 November 2004 Jevons et al.
6829376 December 2004 Bartell
6954699 October 2005 Jevons et al.
7031846 April 2006 Kaushikkar et al.
7130458 October 2006 Bartell
7280922 October 2007 Mei et al.
7451047 November 2008 Jevons et al.
7992098 August 2011 Kaushikkar et al.
8027823 September 2011 Barrett et al.
8392355 March 2013 Kennedy et al.
8855935 October 2014 Myres et al.
2002/0183936 December 2002 Kulp et al.
2003/0100995 May 2003 Loraine et al.
2003/0157545 August 2003 Jevons et al.
2004/0006431 January 2004 Bartell et al.
2004/0126840 July 2004 Cheng et al.
2004/0138821 July 2004 Chiles
2004/0199544 October 2004 Balaban et al.
2004/0220897 November 2004 Bernhart et al.
2005/0123971 June 2005 Di et al.
2005/0287575 December 2005 Di et al.
2006/0111849 May 2006 Schadt
2006/0142949 June 2006 Helt
2006/0184038 August 2006 Smith et al.
2006/0241868 October 2006 Sun et al.
2008/0287308 November 2008 Hubbell et al.
2009/0098547 April 2009 Ghosh
2009/0137417 May 2009 Fu
2009/0222400 September 2009 Kupershmidt
2010/0281401 November 2010 Tebbs
2011/0250602 October 2011 Rosenow et al.
2012/0214704 August 2012 Huang et al.
2013/0169645 July 2013 Mack et al.
Other References: Parks, D. H. Genome Research 2009 vol. 19 pp. 1896-1904. cited by examiner
Wang et al., “Large-Scale Identification, Mapping and Genotyping of Single-Nucleotide Polymorphisms in the Human Genome,” Science, vol. 280, May 15, 1998, pp. 1077-1082. cited by applicant
Gingeras, et al., “Simultaneous Genotyping and Species Identification Using Hybridization Pattern Recognition Analysis of Generic Mycobacterium DNA Arrays,” Genome Research, 8, Feb. 17, 1998, pp. 435-448. cited by applicant
Halushka, et al., “Patterns of single-nucleotide polymorphisms in candidate genes for blood-pressure homeostasis,” Nature Genetics, vol. 22, Jul. 1999, pp. 239-247. cited by applicant
Eddy, Sean R., “What is a hidden Markov model?,” Nature Biotechnology, vol. 22, No. 10, Oct. 2004, pp. 1315-1316. cited by applicant
Rabiner, L., “A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition,” Proceedings of the IEEE, vol. 77, Feb. 1989, pp. 257-286. cited by applicant
Mei et al., “Genome-wide Detection of Allelic Imbalance Using Human SNPs and High-density DNA Arrays,” Genome Research, 10, Jun. 2000, pp. 1126-1137. cited by applicant
Lindblad-Toh et al., “Loss-of-heterozygosity analysis of small-cell lung carcinomas using single-nucleotide polymorphism arrays,” Nature Biotechnology, vol. 18, Sep. 2000, pp. 1001-1005. cited by applicant
Primary Examiner: Zeman, Mary
Attorney, Agent or Firm: Mauriel Kapouytian Woods LLP
Lee, Elaine
Mauriel, Michael
Accession Number: edspgr.09798855
Database: USPTO Patent Grants