Names: ______
Group: ______
Connect DNA to DISEASE Using BLAST
Introduction
We’ve learned that DNA is the genetic material that organisms inherit from their parents, but have you ever thought about what exactly this DNA encodes? How do our cells use DNA as a set of instructions for life? How is the information in our DNA/genes used by our bodies? And what happens when the DNA is mutated or not used properly?
Materials (per group)
DNA sequence
Computer with an internet connection
Setup
Using DNA chips, a researcher inferred that a gene sequence is differentially expressed in a patient’s tissue. To figure out which disease is affecting the patient, you must first determine which disease is associated with the gene. To do this, you will first use the program BLAST to identify the protein associated with the gene. Next, you will perform a Google search to find the disease associated with that protein.
Procedure
1. Obtain your DNA sequence from your teacher.
2. Convert your DNA sequence into a complementary mRNA sequence.
EXAMPLE: DNA: T A C G G C T A G
↓
mRNA: A U G C C G A U C
Your DNA sequence:
mRNA sequence:
3. Determine the codons.
EXAMPLE: mRNA: A U G C C G A U C
↓
Codons: AUG CCG AUC
Codons:
______
4. Translate the codon sequence into an amino acid sequence. Use the chart provided.
Codons: AUG CCG AUC
↓
Amino Acids: Methionine Proline Isoleucine
Amino Acid Sequence:
______
______
______
5. Write out the one-letter abbreviations for the amino acids in the sequence. Use the chart provided.
______
6. Go to http://www.ncbi.nlm.nih.gov/BLAST/ and choose Protein-Protein BLAST (top of the second column).
7. Enter the one-letter abbreviations for your amino acid sequence in the SEARCH box – be sure to enter them in the correct order!
8. Click on the “BLAST” button.
9. At the next page, click on the “FORMAT” button. It may take a few minutes to process your sequence.
10. At the next page, scroll down to the list of proteins that matched your sequence. Choose the one that matches a protein on the list of possible proteins given to you.
11. The protein our DNA sequence encodes is (should be in the list provided): ______
12. Now search www.google.com with the name of your protein to find out the disease your protein is involved in.
13. This protein is involved in the following disease: ______
14. Write a brief paragraph explaining the disease caused by this protein or a mutation in this protein.
15. List 3 things you learned in this activity (either technical concepts, such as using the computer, or scientific concepts).
(1)
(2)
(3)
16. You can also identify nucleotide sequences directly using the BLAST server, and then use the gene name to find the associated disease. See if you can identify the disease associated with the following nucleotide sequence:
ATGGCGACCCTGGAAAAGCTGATGAAGGCCTTCGAGTCCCTCAAGTCCTTCC
AGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGC
(1) Abbreviation of gene:
(2) Genetic disease associated with defective gene:
(3) Chromosome number (location of gene):
(4) Describe the effects/symptoms (phenotype) of the genetic disease:
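Optional check: the hand conversion in the procedure (DNA to complementary mRNA, mRNA to codons, codons to one-letter amino acid codes) can also be sketched as a short Python script. This is only a minimal sketch for checking your work. The function names are made up for this example, it assumes the standard genetic code (which should agree with the chart provided), and it assumes translation stops at the first stop codon.

```python
from itertools import product

# Template DNA base -> complementary mRNA base (DNA-to-mRNA step).
COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}

# Standard genetic code, listed in U, C, A, G order for each codon position.
# "*" marks a stop codon. (Assumption: this matches the chart you were given.)
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(dna):
    """Return the complementary mRNA for a DNA strand (spaces are ignored)."""
    cleaned = "".join(dna.split()).upper()
    return "".join(COMPLEMENT[base] for base in cleaned)


def split_codons(mrna):
    """Split mRNA into three-letter codons, dropping any incomplete tail."""
    usable = len(mrna) - len(mrna) % 3
    return [mrna[i:i + 3] for i in range(0, usable, 3)]


def translate(codons):
    """Return one-letter amino acid codes, stopping at the first stop codon."""
    letters = []
    for codon in codons:
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":
            break
        letters.append(amino_acid)
    return "".join(letters)


# The worksheet's own example sequence.
mrna = transcribe("T A C G G C T A G")
codons = split_codons(mrna)
print(mrna)               # AUGCCGAUC
print(" ".join(codons))   # AUG CCG AUC
print(translate(codons))  # MPI
```

The final string (MPI for the example) is what you would paste into the BLAST search box. Work the steps by hand first, then use the script to compare.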
Challenge Questions
1. Explain how changes in DNA coding sequence can lead to disease.
2. What do you think the ramifications are for insurance companies knowing DNA sequences of individuals?
3. Why do you think pharmaceutical companies are patenting gene sequences?
4. If you were a scientist working with mice and discovered a gene that had something to do with obesity in mice, how might you find out whether a similar gene is known to exist in humans?
5. If you had more nucleotides in your sequence to enter into BLAST (say 1000 instead of 100), do you think it would find more specific or less specific matches? Explain your answer. How would you conduct an experiment using the sequences you’ve been given and the BLAST server to provide evidence for your answer?
6. How would scientists all over the world check to see what a newly sequenced region of DNA is similar to? What do you think they do with the new DNA sequence if it is unknown? Explain.
7. Describe how mutations affect BLAST results. How would you conduct an experiment using the sequences you’ve been given and the BLAST server to provide evidence for your answer? Why is this important? Explain.
8. How could a scientist use BLAST to get a rough estimate of how closely related two organisms are?
9. Does running BLAST using nucleotides or amino acids yield more specific matches? Explain.
AMINO ACID CHARTS AND PROTEIN NAMES
AMINO ACID / abbreviation
Alanine / A
Arginine / R
Asparagine / N
Aspartic acid / D
Cysteine / C
Glutamine / Q
Glutamic acid / E
Glycine / G
Histidine / H
Isoleucine / I
Leucine / L
Lysine / K
Methionine / M
Phenylalanine / F
Proline / P
Serine / S
Threonine / T
Tryptophan / W
Tyrosine / Y
Valine / V
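If you have written out the full amino acid names and want to double-check the one-letter abbreviations, the chart above can be turned into a lookup table. This is a minimal Python sketch. The names CHART, ONE_LETTER, and abbreviate are made up for this example, and the chart text is copied from this worksheet.

```python
# The amino acid chart from this worksheet, one "name / letter" pair per line.
CHART = """Alanine / A
Arginine / R
Asparagine / N
Aspartic acid / D
Cysteine / C
Glutamine / Q
Glutamic acid / E
Glycine / G
Histidine / H
Isoleucine / I
Leucine / L
Lysine / K
Methionine / M
Phenylalanine / F
Proline / P
Serine / S
Threonine / T
Tryptophan / W
Tyrosine / Y
Valine / V"""

# Build a lookup from lowercase amino acid name to one-letter abbreviation.
ONE_LETTER = {}
for line in CHART.splitlines():
    name, letter = line.split(" / ")
    ONE_LETTER[name.strip().lower()] = letter.strip()


def abbreviate(names):
    """Convert a list of amino acid names into a one-letter code string."""
    return "".join(ONE_LETTER[name.strip().lower()] for name in names)


print(abbreviate(["Methionine", "Proline", "Isoleucine"]))  # MPI
```

A misspelled name raises a KeyError, which is a useful hint to recheck your amino acid sequence against the chart.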
Possible proteins
Presenilin 2
Synuclein
Laforin
Leptin
BRCA 2
Dystrophin
Apolipoprotein E