BIOL 495S/ CS 490B/ MATH 490B/ STAT 490B
Lab session of Feb. 1, 2002
Objectives: Become familiar with the NCBI databases. Retrieve sequences from Entrez, search for similar sequences with BLAST, and look up a structure in the PDB.
- Go to the NCBI database. We use the DNA sequence L04459 as an example. First set the option in the search field to specify that the search is for a DNA sequence, then search for L04459.
- Read the file retrieved for this sequence. What is the accession number(s)? What is the locus number? What organism is this sequence from? What other information do we get about this sequence?
- In the search field choose “Protein”, then type HBA_HUMAN in the second search field. What information do we obtain for this sequence? What is the accession number? What is the locus number? What organism is this sequence from? What is the primary sequence? In the “display” options, choose FASTA, and click “display”. The FASTA format of the sequence will be displayed.
- Go to the BLAST page at NCBI. We will align HBA_HUMAN against a protein sequence database. BLAST offers many specific tools. To align a protein against a protein database, we should choose BLASTp.
- The sequence HBA_HUMAN is called the query sequence. It is entered in the search field. We can either copy and paste the FASTA format of the sequence into the search field, or simply type the accession number of the sequence. Several options are available in the “Choose database” field. We will use the default “nr”, which is the non-redundant protein database.
- Go through all the other options on this BLAST search page. We may leave the default values unchanged. What substitution score matrix do we use? What are the gap penalties? When you are ready, click “BLAST!”. Your alignment submission is then sent to the NCBI web server, and a page describing your submission is shown. You have to click “Format” to display the alignments.
- Read the BLAST results. Check your query and database. What results are shown on this page? List several sequences similar to the query. What are their scores and E-values? Where are the matches and gaps in the alignments of these sequences to the query?
- Go to the PDB (Protein Data Bank). We use the TRP repressor of E. coli as an example to view the 3D structure of a protein. In the search field type “TRP repressor”; we obtain 10 items whose 3D structures are available. Going down the list, we find the item with PDB ID 2WRP. Click “Explore”, and the pages of structure information for this entry are displayed. View the structure picture for this entry. How many helices does it have?
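
The Entrez retrieval in the first steps can also be done from a script. The following is a minimal Python sketch, not part of the lab procedure, assuming the NCBI E-utilities `efetch` endpoint with its `db`, `id`, `rettype` and `retmode` parameters. The helper names `build_efetch_url` and `fetch_record` are hypothetical. The fetch itself is only defined, not called, since it needs a network connection.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Base address of the NCBI E-utilities efetch service.
EFETCH_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(accession, db="nuccore", rettype="gb"):
    """Build an efetch URL for one record (hypothetical helper).

    db="nuccore" selects the nucleotide database; rettype="gb" asks for
    the GenBank flat file and rettype="fasta" for FASTA format.
    """
    params = {"db": db, "id": accession, "rettype": rettype, "retmode": "text"}
    return EFETCH_BASE + "?" + urlencode(params)


def fetch_record(accession, db="nuccore", rettype="gb"):
    """Download the record as text. Requires network access."""
    with urlopen(build_efetch_url(accession, db, rettype)) as response:
        return response.read().decode("utf-8")


# The DNA sequence used as the example in this lab.
url = build_efetch_url("L04459")
print(url)
```

Calling `fetch_record("L04459")` should return the same kind of flat file that the web page displays, in which the LOCUS, ACCESSION and ORGANISM lines can be read off.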
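
The FASTA format displayed in the Entrez protein step is plain text: a header line beginning with ">" followed by one or more lines of sequence. The sketch below shows one way to read such text in Python. The function name `parse_fasta` is hypothetical, and the record is a made-up toy example, not the real HBA_HUMAN entry.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) pairs."""
    records = []
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:].strip()
            chunks = []
        elif header is not None:
            # Sequence lines are joined into one string per record.
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Toy record for illustration only (not a real database entry).
example = """>toy_protein hypothetical example record
MKTAYIAKQR
QISFVKSHFS
"""

records = parse_fasta(example)
for name, seq in records:
    print(name, len(seq))
```

The same function could be applied to the text copied from the FASTA display before pasting it into the BLAST search field.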
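
When reading the BLAST alignments, the matches and gaps can be counted column by column. The sketch below assumes the two rows of one pairwise alignment have been copied out as equal-length strings with "-" marking a gap. The function name `summarize_alignment` and the two rows are hypothetical toy data, not actual BLAST output.

```python
def summarize_alignment(query_row, subject_row):
    """Count identities, mismatches and gap columns in a pairwise alignment."""
    if len(query_row) != len(subject_row):
        raise ValueError("aligned rows must have the same length")
    identities = mismatches = gaps = 0
    for q, s in zip(query_row, subject_row):
        if q == "-" or s == "-":
            gaps += 1          # a gap in either row
        elif q == s:
            identities += 1    # same residue in both rows
        else:
            mismatches += 1    # different residues
    return {
        "length": len(query_row),
        "identities": identities,
        "mismatches": mismatches,
        "gaps": gaps,
    }


# Made-up alignment rows for illustration only.
query_row = "MVLSPADKT-NVKA"
subject_row = "MVLSGEDKSNNIKA"

summary = summarize_alignment(query_row, subject_row)
print(summary)
```

This only tallies columns. The score BLAST reports additionally depends on the substitution matrix and gap penalties chosen on the search page.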