CPN60 Gene Conservancy Via Protein-Sequence Comparison Across Extremophilic Microorganisms

Richard Barrera

Department of Biological Sciences

Saddleback College

Mission Viejo, CA 92692.

Microorganisms can survive and flourish in environments that are detrimental to the majority of life on the planet. This research focused on final polypeptide modification in extremophilic microorganisms, compared with protein production in more complex eukaryotic cells. I hypothesize that effective management of protein modification was an integral step in early cellular evolution and that the sample organisms will therefore demonstrate a high level of cpn60 gene conservation. Such conservation would provide evidence for the shared ancestry and interrelatedness of organisms separated by millions of years of evolution; my data and recently published literature also hint at how prion and neurodegenerative diseases may be propagated. Specifically, I searched for the presence of the groEL protein across twelve sub-classes of extremophiles. The groEL protein is a 60-kilodalton protein that makes up the large subunit of the groES/L complex and is the primary Type I chaperonin used for final polypeptide modification and maintenance in thousands of organisms. I searched for the groEL protein, and for the cpn60 gene that encodes it, via amino acid comparison using the National Center for Biotechnology Information’s (NCBI) Basic Local Alignment Search Tool (BLAST) server. The Homo sapiens groEL amino acid sequence was used as the query in order to demonstrate similarity between divergent species. N = 24 extremophiles were identified as candidates for protein comparison, with one to three members selected per sub-class; twenty are sequenced and obtainable via public record; seventeen organisms demonstrate ≥ 40% positive match, fourteen demonstrate ≥ 70% positive match, and three organisms use a separate Type I chaperonin. The data suggest that homologous toroid chambers for the facilitation of post-translational polypeptide modification are prevalent among the majority of organisms on the planet.

Introduction

It is now known that molecular chaperones participate in a large variety of cellular functions. They assist in de novo protein folding, stabilize proteins under duress, and maintain polypeptide chain components in a loosely folded state for translocation across organelle membranes (Kumarevel et al. 1998). This research focused on the presence, in extremophilic microorganisms, of the groEL protein, which is encoded by the cpn60 gene; this 60 kD protein is the large subunit of the groEL/S complex. The groEL/S complex is one of the primary Type I chaperonins used for final polypeptide modification and, among other responsibilities, handles the folding of monomeric mitochondrial rhodanese (Mendoza et al. 1991). The query was the 504-character amino acid sequence of the groEL protein in Homo sapiens. The cpn60 target region, about 555 bp long, is believed to be universal and has been found to be a robust target for species-level characterization of bacteria, archaea, and eukaryotes (Hill et al. 2012). The presence of the protein sequence among the examined microorganisms illustrates a shared requirement among divergent species for post-translational polypeptide modification in order to sustain basic cellular function. This is significant because in recent years the scientific community has discovered life across a multitude of environments that were previously believed to be uninhabitable. These chaperones, in conjunction with stress-induced shock proteins, act as an efficient protein management system, preventing both the aggregation of denatured proteins within the cell and programmed cell death. These discoveries have led researchers to reconsider the pervasiveness of life and its ability to adapt, colonize, and thrive in extraordinarily demanding environments.
Conversely, we are beginning to understand that prion and neurodegenerative diseases are often the result of malfunctioning chaperones within the cytoplasm or the mitochondrial matrix, respectively, which results in protein aggregation and cell death (Schon and Manfredi, 2003).

Materials and Methods

Research was conducted as follows: (1) Thirty microorganisms across twelve sub-classes of extremophiles were identified using publicly available online databases. (2) Identified microorganisms were vetted using the Kyoto Encyclopedia of Genes and Genomes (KEGG) and GenBank to determine whether the complete genome had been sequenced and published; organisms with incomplete sequences could not be compared and were eliminated from the population. (3) Genome identification continued until a population of N ≥ 20 was reached. (4) Amino acid examination was conducted via protein comparison using the National Center for Biotechnology Information’s (NCBI) Basic Local Alignment Search Tool (BLAST) server. (5) Percentage conservation of the amino acid sequence between species was calculated using the NCBI Graphic Representation Tool. (6) A cladogram was constructed using the NCBI Phylogenetic Tool in order to visualize the divergence between species.
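
The percentage calculation in step (5) can be illustrated with a minimal Python sketch, assuming two sequences that have already been aligned, with gaps written as "-". BLAST reports identities and positives as a share of the alignment length, and counts a residue pair as positive when its substitution score in the scoring matrix is greater than zero. The function name, the example strings, and the short list of conservative pairs below are illustrative assumptions; the list is an abbreviated stand-in for a full substitution matrix such as BLOSUM62, not the NCBI implementation.

```python
# Abbreviated, hand-picked set of conservative residue pairs (assumption).
# BLAST takes positives from its scoring matrix; this subset only illustrates the idea.
CONSERVATIVE_PAIRS = {
    frozenset(pair)
    for pair in ("IL", "IV", "LV", "LM", "DE", "KR", "EQ", "FY", "WY", "ST", "AS", "DN")
}


def percent_identity_and_positives(aligned_query, aligned_subject):
    """Return (identity %, positive %) over the full alignment length."""
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_query:
        raise ValueError("alignment is empty")

    identical = 0
    positive = 0
    for q, s in zip(aligned_query, aligned_subject):
        if q == "-" or s == "-":
            continue  # gap columns count toward the length but never match
        if q == s:
            identical += 1
            positive += 1  # identical residues are also counted as positives
        elif frozenset((q, s)) in CONSERVATIVE_PAIRS:
            positive += 1

    length = len(aligned_query)
    return 100.0 * identical / length, 100.0 * positive / length


# Hypothetical eight-column alignment, for illustration only.
identity_pct, positive_pct = percent_identity_and_positives("MLRLPTVF", "MIRKPT-F")
print(f"identity {identity_pct:.2f}%, positives {positive_pct:.2f}%")
```

Dividing by the full alignment length, gap columns included, mirrors how BLAST reports its percentages; dividing by the query length instead would give different figures.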

Results

Twenty-four species of extremophiles across twelve sub-classes were identified as candidates for comparison. Of these twenty-four species, twenty microorganisms representing ten groups are sequenced and available in the National Center for Biotechnology Information’s database. I could not locate sequenced organisms belonging to the piezophiles or xerophiles. The targeted amino acid sequence of the groEL protein was obtained from Homo sapiens, and contains 504 characters:

MLRLPTVFRQMRPVSRVLAPHLTRAYAKDVKFGADARALMLQGVDLLADAVAVTMGPKGRTVIIEQSWGSPKVTKDGVTVAKSIDLKDKYKNIGAKLVQDVANNTNEEAGDGTTTATVLARSIAKEGFEKISKGANPVEIRRGVMLAVDAVIAELKKQSKPVTTPEEIAQVATISANGDKEIGNIISDAMKKVGRKGVITVKDGKTLNDELEIIEGMKFDRGYISPYFINTSKGQKCEFQDAYVLLSEKKISSIQSIVPALEIANAHRKPLVIIAEDVDGEALSTLVLNRLKVGLQVVAVKAPGFGDNRKNQLKDMAIATGGAVFGEEGLTLNLEDVQPHDLGKVGEVIVTKDDAMLLKGKGDKAQIEKRIQEIIEQLDVTTSEYEKEKLNERLAKLSDGVAVLKVGGTSDVEVNEKKDRVTDALNATRAAVEEGIVLGGGCALLRCIPALDSLTPANEDQKIGIEIIKRTLKIPAMTIAKNAGVEGSLIVEKIMQSSSEVGYDAMAGDFVNMVEKGIIDPTKVVRTALLDAAGVASLLTTAEVVVTEIPKEEKDPGMGAMGGMGGGMGGGMF

The Homo sapiens amino acid sequence was selected for comparison in order to emphasize evolutionary conservation between divergent species. Each species was entered into the database, and seventeen organisms were found to contain a similar target sequence. The remaining three organisms, members of the domain Archaea, utilize the DnaJ Type I chaperonin to complete post-translational polypeptide modification. The seventeen species of extremophiles containing the cpn60 protein sequence demonstrated a ≥ 40% positive match, and fourteen demonstrated a ≥ 70% positive match, as shown in Figure 1. Figure 1A gives a graphical representation of the matching regions of the sequences, generated by the NCBI server. A visual representation of genetic divergence is depicted in Figure 2. The expect value (E-value) for every match was ≤ 3.00 × 10⁻¹¹.
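
To give a sense of scale for the E-values reported above, the following Python sketch uses the standard relation from BLAST statistics between a normalized bit score S′ and the expect value, E = m × n × 2^(−S′), where m and n are the effective lengths of the query and the database. The function names and the numbers in the usage lines are hypothetical; the bit scores and database size of the searches in this study are not reported here, so nothing below reproduces an actual result.

```python
import math


def expect_value(bit_score, query_length, database_length):
    """Expected number of chance alignments scoring at least `bit_score`.

    Uses E = m * n * 2**(-S'), with m and n taken as effective lengths.
    """
    return query_length * database_length * 2.0 ** (-bit_score)


def min_bit_score(target_e_value, query_length, database_length):
    """Bit score needed to reach `target_e_value` for a given search space."""
    return math.log2(query_length * database_length / target_e_value)


# Hypothetical search space: a 500-residue query against 1e9 database residues.
print(expect_value(80.0, 500, 1e9))
print(min_bit_score(3.0e-11, 500, 1e9))
```

Because the search space m × n appears in the formula, the same alignment yields a larger E-value against a larger database, which is why E-values are only comparable between searches run against the same database.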

Discussion

The E-value is the number of alignments with an equal or better score that would be expected to occur by chance in a database of the size searched; it is not a percentage, and values close to zero indicate that a false positive is very unlikely. The sub-classes included in the research were: acidophiles, alkaliphiles, cryptoendoliths, osmophiles, lithoautotrophs, metallophiles, oligotrophs, piezophiles, psychrophiles, radiophiles, thermophiles, and xerophiles. These data are significant because they support the view that life is predicated on the proper function, maintenance, and destruction of proteins. Cells cannot function without some form of intermediary chamber that allows polypeptide chains the chance to assume, resume, or degrade their tertiary structures. As such, the evolution of chaperonins was an integral and promethean step in the evolution of life on Earth. Additionally, any chaperonin mutation that alters the binding of hydrolysable ATP, or alters the protein-modification chamber in such a way as to produce a renegade protein, may cause significant havoc and ultimately cell death (Walters et al. 2002). For example, research has found that the malfunctioning of oxidative phosphorylation pathways in mitochondria leads to the excess generation of reactive oxygen species. These species damage the mitochondria, altering the structure of whatever they come in contact with, including chaperonins. The altered chaperonins can no longer fulfill their duties of protein maintenance, and as a result the mitochondrion self-destructs (Mukherjee and Chakrabarti, 2013).

Organism / Identical Match % / Positive Match % / Identical Match / Expect value
Sphingopyxis alaskensis / 56.16 / 78.36 / 233 / 0
Saccharomyces cerevisiae / 56.98 / 75.23 / 224 / 0
Wallemia ichthyophaga / 56.5 / 72.74 / 235 / 0
Debaryomyces hansenii / 56.91 / 76 / 231 / 0
Pelagibacter ubique / 54.7 / 76.69 / 238 / 0
Nitrosomonas sp. AL212 / 52.37 / 74.19 / 250 / 0
Cupriavidus metallidurans / 53.77 / 74.15 / 244 / 0
Acidithiobacillus ferrooxidans / 51.04 / 74.38 / 257 / 0
Thiobacillus denitrificans / 53.02 / 73.4 / 247 / 1.00E-180
Flavobacterium psychrophilum JIP02/86 / 51.7 / 73.3 / 251 / 1.00E-179
Bacillus subtilis / 51.33 / 73.19 / 253 / 1.00E-179
Amphibacillus xylanus / 50.57 / 72.62 / 257 / 9.00E-179
Deinococcus radiodurans / 48.7 / 70.26 / 264 / 1.00E-163
Pyrolobus fumarii 1A / 23.36 / 44.86 / 340 / 1.00E-017
Methanopyrus kandleri AV19 / 23.23 / 42.47 / 323 / 5.00E-017
Methanococcoides burtonii DSM 6242 / 23.03 / 43.07 / 340 / 2.00E-015
Sulfolobus solfataricus P2 / 22.02 / 41.74 / 352 / 3.00E-011

Figure 1. Hit table generated by BLAST data analysis. Percent match is relative to the Homo sapiens query sequence. Identical match is the number of correct chemical and spatial amino acid matches. The table is ordered by E-value (most certain match to least).
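
For readers who want to work with the hit table directly, the following Python sketch parses the slash-delimited rows of Figure 1 and checks the ordering stated in the caption. The rows are transcribed from the table above; the function names and dictionary keys are illustrative assumptions, and the sketch says nothing about how the NCBI server itself produces or sorts its output.

```python
# Rows transcribed from Figure 1 (organism / identical % / positive % / identical match / E-value).
HIT_TABLE = """\
Sphingopyxis alaskensis / 56.16 / 78.36 / 233 / 0
Saccharomyces cerevisiae / 56.98 / 75.23 / 224 / 0
Wallemia ichthyophaga / 56.5 / 72.74 / 235 / 0
Debaryomyces hansenii / 56.91 / 76 / 231 / 0
Pelagibacter ubique / 54.7 / 76.69 / 238 / 0
Nitrosomonas sp. AL212 / 52.37 / 74.19 / 250 / 0
Cupriavidus metallidurans / 53.77 / 74.15 / 244 / 0
Acidithiobacillus ferrooxidans / 51.04 / 74.38 / 257 / 0
Thiobacillus denitrificans / 53.02 / 73.4 / 247 / 1.00E-180
Flavobacterium psychrophilum JIP02/86 / 51.7 / 73.3 / 251 / 1.00E-179
Bacillus subtilis / 51.33 / 73.19 / 253 / 1.00E-179
Amphibacillus xylanus / 50.57 / 72.62 / 257 / 9.00E-179
Deinococcus radiodurans / 48.7 / 70.26 / 264 / 1.00E-163
Pyrolobus fumarii 1A / 23.36 / 44.86 / 340 / 1.00E-017
Methanopyrus kandleri AV19 / 23.23 / 42.47 / 323 / 5.00E-017
Methanococcoides burtonii DSM 6242 / 23.03 / 43.07 / 340 / 2.00E-015
Sulfolobus solfataricus P2 / 22.02 / 41.74 / 352 / 3.00E-011
"""


def parse_hit_table(text):
    """Turn the slash-delimited rows into a list of dictionaries."""
    rows = []
    for line in text.strip().splitlines():
        # Split from the right so a "/" inside a strain name (e.g. "JIP02/86") survives.
        organism, identical_pct, positive_pct, identical_match, e_value = line.rsplit(" / ", 4)
        rows.append({
            "organism": organism,
            "identical_pct": float(identical_pct),
            "positive_pct": float(positive_pct),
            "identical_match": int(identical_match),
            "e_value": float(e_value),
        })
    return rows


def is_sorted_by_e_value(rows):
    """True when E-values never decrease from one row to the next."""
    values = [row["e_value"] for row in rows]
    return all(a <= b for a, b in zip(values, values[1:]))


rows = parse_hit_table(HIT_TABLE)
print(len(rows), "rows; ordered by E-value:", is_sorted_by_e_value(rows))
```

Once parsed, the same rows can be filtered or re-sorted on any column, for example by positive match percentage.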


Figure 1A. Graphical representation of hit table from Figure 1. Red areas indicate identical matches, grey areas indicate positive matches. Organisms are listed in the same order as Figure 1.


Figure 2. A phylogenetic tree based on genetic divergence of the 504-character amino acid sequence, showing the seventeen sampled microorganisms that contain the cpn60 gene.