HAP Webserver: Early step towards personalized medicine

by Grace Shaw

Updated August 1, 2005

Science magazine recently published "125 big questions that face scientific inquiry over the next quarter-century." The question, "To What Extent Are Genetic Variation and Personal Health Linked?" is raised among the top 25.

Genetic variation explains individuals’ response to drugs and susceptibility to specific diseases, including cancer and several mental illnesses. Researchers suggest that genetic variation can explain why some people develop a disease while others do not. Carriers of certain genetic variants may have up to twice the risk of disease compared to non-carriers. Some of these variants occur in genes that increase the risk of type 2 diabetes or Alzheimer’s.

By identifying variation that affects drug response, doctors could potentially prescribe personalized medicine based on an individual’s genetic variation, which would be more beneficial to each individual than standardized medicine. The individual’s risk of dangerous side effects, or of receiving a drug that does not work, would likely decrease.

Genetic association studies determine the relations between diseases and genetic variations. These studies involve the determination of variant frequencies among healthy and diseased populations. However, these studies currently face several limitations, including the high cost of collecting data and the lack of tools for analysis. To confront these limitations, groups including the International HapMap Project are developing techniques for genotyping analysis. In addition, several biotech companies including Perlegen Sciences, Inc., Illumina, and Affymetrix are developing high-throughput genotyping technologies.
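To make the idea of comparing variant frequencies concrete, here is a minimal Python sketch of a basic allelic association test. It is not the method used by the HAP webserver or any of the groups named above. The function names, the coding of genotypes as 0, 1, or 2 copies of a variant, and the toy data are all assumptions made for illustration.

```python
from math import erfc, sqrt


def allele_counts(genotypes):
    """Return (variant, reference) allele counts.

    Assumes each genotype is coded as the number of variant
    copies an individual carries at one SNP: 0, 1, or 2.
    """
    variant = sum(genotypes)
    reference = 2 * len(genotypes) - variant
    return variant, reference


def allelic_association(case_genotypes, control_genotypes):
    """Compare variant allele frequency in cases and controls.

    Builds a 2x2 table of allele counts and computes a Pearson
    chi-square statistic (1 degree of freedom) and an odds ratio.
    """
    a, b = allele_counts(case_genotypes)     # cases: variant, reference
    c, d = allele_counts(control_genotypes)  # controls: variant, reference
    n = a + b + c + d
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    if margins == 0:
        raise ValueError("an empty row or column makes the test undefined")

    chi2 = n * (a * d - b * c) ** 2 / margins
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = erfc(sqrt(chi2 / 2))
    odds_ratio = (a * d) / (b * c) if b * c else float("inf")

    return {
        "case_freq": a / (a + b),
        "control_freq": c / (c + d),
        "chi2": chi2,
        "p_value": p_value,
        "odds_ratio": odds_ratio,
    }


# Toy data, invented for illustration only.
cases = [2, 1, 1, 0, 1, 2, 1, 0]
controls = [0, 0, 1, 0, 1, 0, 0, 0]
result = allelic_association(cases, controls)
print(result)
```

A real study would involve thousands of individuals and many SNPs, and would need to correct for the number of tests performed. The sketch only shows the core comparison for a single SNP.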

Researchers at UCSD are working on these tools, including the HAP webserver, funded by the California Institute for Telecommunications and Information Technology (Calit2). These researchers include graduate and undergraduate students under the direction of Dr. Eleazar Eskin, assistant professor in the Computer Science and Engineering (CSE) Department and a Calit2 researcher.

The students, although working in the CSE Department, come from many disciplines including computer science, bioinformatics, biochemistry and other biology-related fields.

Whole genome association studies are more effective than candidate gene analysis at relating genetic variation to disease. Dr. Eskin coauthored a paper on the HapMap and human variation with researchers at UC Berkeley and Perlegen Sciences, Inc. The paper, "Whole Genome Patterns of Common DNA Variation in Three Diverse Human Populations," featured on the front cover of the February 28, 2005 issue of Science, describes genetic variation in European Americans, African Americans, and Han Chinese. The study analyzed genetic variation between the different population groups and also within each population group. Additionally, it showed that a whole-genome association study is possible.

“HAP” is derived from the term haplotype. A haplotype is a set of single nucleotide polymorphisms (SNPs) on a chromosome that aids the investigation of diseases and disease susceptibility. Haplotypes are usually inherited as a unit. A SNP is a small genetic change that can occur within a person’s DNA sequence. SNPs are important because they can change a protein’s biological function. However, haplotype analysis often gives a clearer picture of the variation than individual SNPs do. Whole genome association studies can show which haplotypes are associated with disease or disease response.
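The relationship between SNPs and haplotypes can be illustrated with a small Python sketch. It assumes that the alleles on each chromosome are already known in order (phased data), which is not what genotyping produces directly. The function name and the four-SNP toy sequences are invented for illustration.

```python
from collections import Counter


def haplotype_frequencies(chromosomes):
    """Return the frequency of each distinct haplotype.

    Assumes each chromosome is given as a string with one
    allele per SNP position, in the same order for everyone.
    """
    counts = Counter(chromosomes)
    total = len(chromosomes)
    return {hap: count / total for hap, count in counts.items()}


# Six toy chromosomes typed at four two-allele SNPs (invented data).
chromosomes = ["ACGT", "ACGT", "GCAT", "ACGT", "GCAT", "GTAC"]

freqs = haplotype_frequencies(chromosomes)
possible = 2 ** len(chromosomes[0])  # each SNP has two alleles

for hap, freq in sorted(freqs.items()):
    print(f"{hap}: {freq:.2f}")
print(f"{len(freqs)} of {possible} possible haplotypes observed")
```

In this toy example only 3 of the 16 possible allele combinations appear. Because SNPs tend to be inherited together, a handful of haplotypes can summarize a region more compactly than each SNP considered on its own.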

The HAP webserver would provide preliminary work in the identification of human variation related to disease. Hyun Min Kang, a PhD student in Eskin’s lab, describes the project as “an integrated tool for genetic association analysis which leverages interplay between haplotype reconstruction, statistical analysis, and predictions of functional SNPs.” The user inputs genotype, phenotype, and optional SNP data into the webserver. The data is partitioned into haplotype blocks based on the SNPs, and haplotype predictions are returned to the user.

The haplotype map deduced from technologies including the webserver would, as Science claims, “further accelerate the search for disease genes.” Kang additionally commented that the webserver “is efficient enough to be scaled up to high-density genome wide association studies.” Most existing tools are designed for analysis of a single candidate gene and are not applicable to the current large datasets.

Researchers at UCSD anticipate the release of the HAP webserver by the end of summer 2005.