A combined bioinformatics and molecular biological approach for the identification of post-transcriptional recoding events in the human transcriptome.

Team Maas/Lopresti

Prof. Stefan Maas, Biological Sciences

Prof. Dan Lopresti, Computer Science and Engr.

Written By: Emaan Abdul-Majid

The human genome has only recently been sequenced, and with it have come many questions. One question being researched by the Maas lab is how prevalent RNA editing is in the human transcriptome. A-to-I RNA editing is the process by which an adenosine nucleotide (A) is changed to an inosine nucleotide (I) in RNA by enzymes called adenosine deaminases acting on RNA (ADARs). The significance of this research is that editing sites may help explain the complexity of humans. RNA editing can alter splicing and can change the protein that an mRNA encodes, so it may account for some of the variation in human gene products. Some research has linked RNA editing to disorders such as schizophrenia, amyotrophic lateral sclerosis, and depression. Researching editing sites can therefore be very helpful in attaining a better understanding of genetic coding in humans.

The human genome is so extensive that bioinformatics tools must be used to filter through the available data. The genome has proved to be complex, and bioinformatics allows research on it to be done in the most time-efficient way. We used a number of web-based bioinformatics tools (the UCSC Genome Browser, MFOLD, OligoCalc, and Nucleotide BLAST) as well as a computer program, RNA Editing Data Flow Systems (REDS). These programs were used to select sites that would be researched further in the lab. Using the tools, we generated a list of the top 24 hits to evaluate. The online databases contained up-to-date information regarding the human genome and prevented unnecessary lab work. They also allowed us to focus on RNA editing and to make sure that a candidate site was not a SNP (single nucleotide polymorphism).
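The filtering idea in this step can be sketched in a few lines of Python. Because inosine is read as guanosine during reverse transcription and sequencing, an edited site shows up as an A in the genomic sequence and a G in the cDNA. The function name, the toy sequences, and the set of SNP positions below are hypothetical and are not taken from REDS or from any of the databases named above. The sketch also assumes the two sequences have already been aligned to the same length.

```python
def candidate_editing_sites(genomic, cdna, known_snp_positions=frozenset()):
    """Return 0-based positions where the genome has A but the cDNA reads G.

    Positions listed in known_snp_positions are skipped, since an A/G
    difference there is more likely a polymorphism than an editing event.
    """
    if len(genomic) != len(cdna):
        raise ValueError("sequences must be pre-aligned to the same length")
    sites = []
    for pos, (g_base, c_base) in enumerate(zip(genomic.upper(), cdna.upper())):
        if g_base == "A" and c_base == "G" and pos not in known_snp_positions:
            sites.append(pos)
    return sites


# Toy example (made-up sequences): positions 2 and 7 differ A -> G,
# but position 7 is treated as a known SNP and is filtered out.
genomic = "TTAGCATACGA"
cdna = "TTGGCATGCGA"
print(candidate_editing_sites(genomic, cdna, known_snp_positions={7}))
```

In practice the SNP positions would come from a curated database rather than a hand-written set, and real alignments contain gaps that this sketch does not handle.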

Once we found our top 24 hits, we began to investigate them. Using the bioinformatics websites, we designed primers to amplify our DNA. Our next step was to amplify the target, or hit, by PCR. If the PCR was successful and amplified the target correctly, according to our knowledge from the databases we used, we continued to evaluate the specific site by completing a phenol-chloroform extraction and precipitation, thereby purifying the nucleic acid. This was followed by an agarose gel extraction of the product, allowing us to isolate the DNA fragment that we needed. The sites were subcloned to determine how many colonies could be produced and also as a means of discovering whether editing sites could be found. Once this was all completed, the DNA could be sent for sequencing. The sequences were then checked to confirm that the right DNA fragment had been sequenced. If the DNA sequence was correct, it could be evaluated for editing sites. If double peaks were found in or around predicted editing sites, we knew that our predictions and our lab work had been successful.
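The double-peak check at the end of this workflow can be illustrated with a small calculation. At an edited position, the sequencing trace carries both an A signal (unedited) and a G signal (edited, since inosine is read as G). The sketch below assumes the two peak heights have already been read off the chromatogram. The function names and the 10% cutoff for the minor peak are assumptions chosen for illustration, not values used in this project.

```python
def editing_fraction(a_height, g_height):
    """Estimate the edited fraction at one position as G / (A + G)."""
    total = a_height + g_height
    if total <= 0:
        raise ValueError("no A or G signal at this position")
    return g_height / total


def is_double_peak(a_height, g_height, min_minor_fraction=0.1):
    """Report a double peak when the smaller of the two signals
    makes up at least min_minor_fraction of the combined A + G signal."""
    fraction = editing_fraction(a_height, g_height)
    return min(fraction, 1 - fraction) >= min_minor_fraction


# Made-up peak heights: a mixed A/G position and a nearly pure A position.
print(editing_fraction(300, 100))
print(is_double_peak(300, 100))
print(is_double_peak(990, 10))
```

A fixed cutoff is a simplification. Background noise varies across a trace, so a real analysis would compare the minor peak against the local baseline.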

The first sites we searched for were in SMAP, SNX, FZD, and VO. For SMAP, the sequence returned from sequencing was not the correct gene. SNX was subcloned and 58 inserts were sent for sequencing, but none of the sites found were edited. FZD had 12 inserts sent for sequencing, which returned the correct DNA sequence but no editing sites. VO was found to have 5 editing sites in the cDNA and genomic DNA.

I contributed by taking part in every step of this work. I evaluated and chose sites using the bioinformatics tools available to me, and then took part in the lab work. REDS was used and evaluated during the course of the research, and many improvements were made. Our goal was to make REDS as effective as possible at finding editing sites and foldbacks in RNA. In the future, REDS should reduce the time spent in the lab and make finding editing sites faster and more accurate. The human genome is so extensive that a tool such as REDS will help filter through the immense amount of data and remove the need to shuffle through several different tools to find the information needed. During the research, a folding heuristic was developed and added to REDS. The REDS folding heuristic was used to find known sites as well as others. MFOLD is a folding program that uses thermodynamics to identify foldbacks and is more accurate than REDS, so it was used to evaluate REDS. Although REDS did not always prove to be accurate, it was still helpful in allowing us to gain additional knowledge regarding sequencing and editing.
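To give a sense of what a simple folding heuristic might look like, the sketch below searches an RNA sequence for foldbacks: short stretches whose reverse complement occurs further downstream, so that the two could pair into a stem. This is not the heuristic implemented in REDS, whose details are not described here. The function names, the stem length, the minimum loop size, and the search window are all hypothetical. Unlike MFOLD, the sketch uses no thermodynamics and ignores G-U wobble pairs and mismatches.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Return the reverse complement of an RNA string (A, C, G, U only)."""
    return rna.translate(COMPLEMENT)[::-1]


def find_foldbacks(rna, stem_len=6, min_loop=3, max_span=200):
    """Return (left_start, right_start) pairs of perfectly complementary
    stretches of length stem_len that could pair into a stem.

    The right arm must begin at least min_loop bases after the left arm
    ends and must lie within max_span bases of the left arm's start.
    """
    rna = rna.upper().replace("T", "U")
    hits = []
    for i in range(len(rna) - stem_len + 1):
        target = reverse_complement(rna[i:i + stem_len])
        search_from = i + stem_len + min_loop
        search_to = min(len(rna), i + max_span)
        j = rna.find(target, search_from, search_to)
        while j != -1:
            hits.append((i, j))
            j = rna.find(target, j + 1, search_to)
    return hits


# Made-up hairpin: stem GGGAAA, loop CACA, complementary arm UUUCCC.
print(find_foldbacks("GGGAAACACAUUUCCC"))
```

An exact-match search like this is fast enough to scan long sequences, but it will miss imperfect stems, which is one reason a heuristic of this kind would be checked against a thermodynamic tool such as MFOLD.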