Materials and methods

Metabolic Network

Following the work in [1], we used a comprehensive collection of the metabolic reactions of the yeast S. cerevisiae as reported in [2]. Traditionally, metabolic networks are described using metabolites as vertices and enzymes as edges. To account for the connectivity of each enzyme, the traditional metabolic network was converted into a network in which enzymes are vertices and metabolites are the links between them [3]. Because a reaction in a pathway is either directional or reversible, a directed link from enzyme A to enzyme B was added if a product of enzyme A is a substrate of enzyme B. Loops and multiple links were removed; removing loops neither affects the calculation of betweenness nor influences the conclusions to be drawn. A reversible reaction was treated as two irreversible reactions with opposite directions. Additional data file 2 (available online) shows examples of metabolic connectivity.
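The conversion described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used to build the actual dataset: the record layout (enzyme, substrates, products, reversible), the function name build_enzyme_graph, and the toy reactions are all hypothetical.

```python
from collections import defaultdict


def build_enzyme_graph(reactions):
    """Return the set of directed enzyme-to-enzyme edges.

    `reactions` is an iterable of (enzyme, substrates, products, reversible)
    tuples; this record layout is an assumption made for the example.
    """
    producers = defaultdict(set)  # metabolite -> enzymes that produce it
    consumers = defaultdict(set)  # metabolite -> enzymes that consume it
    for enzyme, substrates, products, reversible in reactions:
        # A reversible reaction is treated as two irreversible reactions.
        directions = [(substrates, products)]
        if reversible:
            directions.append((products, substrates))
        for subs, prods in directions:
            for metabolite in subs:
                consumers[metabolite].add(enzyme)
            for metabolite in prods:
                producers[metabolite].add(enzyme)

    edges = set()  # a set collapses multiple links between the same pair
    for metabolite, sources in producers.items():
        for a in sources:
            for b in consumers.get(metabolite, ()):
                if a != b:  # drop loops
                    edges.add((a, b))
    return edges


# Hypothetical toy pathway: m1 -> m2 <-> m3 -> m4
toy_reactions = [
    ("E1", ["m1"], ["m2"], False),
    ("E2", ["m2"], ["m3"], True),
    ("E3", ["m3"], ["m4"], False),
]
toy_edges = build_enzyme_graph(toy_reactions)
```

In this toy pathway the reversible enzyme E2 would link to itself through both m2 and m3; those loops are discarded, leaving only the links E1 to E2 and E2 to E3.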

Having established the directed graph for enzymes, we used the Pajek tool to calculate various parameters of a node in metabolic networks, including indegree, outdegree, connectivity, and betweenness [4]. In order to include only functional relationships in the calculation of the enzyme connectivities, we excluded the 14 highly connected metabolites and co-factors (ATP, H, ADP, pyrophosphate, orthophosphate, CO2, NAD, glutamate, NADP, NADH, NADPH, AMP, NH3, and CoA). As a result, a small proportion of enzymes became disconnected from the network, that is, they have zero connectivity. These enzymes were excluded from further analysis. In addition, large macromolecular complexes (containing several ORFs) were represented by single enzymatic nodes in the calculation of network parameters for other metabolic enzymes and excluded from the analysis as in [1]. The betweenness of a node v is defined as the number of shortest paths going through that node as in [5].

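$$B(v) = \frac{1}{(|V|-1)(|V|-2)} \sum_{s \neq v \neq t \in V} \frac{\sigma_{st}(v)}{\sigma_{st}} \qquad (1)$$

where V is the set of nodes and |V| is the number of nodes in V; $\sigma_{st}$ is the number of shortest paths from node s to node t; and $\sigma_{st}(v)$ is the number of shortest paths from node s to node t that pass through node v.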
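The network parameters in this study were computed with Pajek. For readers who want to see how the shortest-path counts enter the betweenness definition, the following is a minimal Python sketch of the accumulation scheme of [5] for an unweighted directed graph. The function name and the toy graph are hypothetical, and the optional scaling by (|V|-1)(|V|-2) is an assumed normalization for directed graphs, not a documented detail of the original calculation.

```python
from collections import deque


def betweenness(nodes, edges, normalized=True):
    """Shortest-path betweenness for an unweighted directed graph."""
    adj = {v: [] for v in nodes}
    for a, b in edges:
        adj[a].append(b)

    score = {v: 0.0 for v in nodes}
    for s in nodes:
        # Breadth-first search from s: count shortest paths (sigma)
        # and record each node's predecessors on those paths.
        order = []
        preds = {v: [] for v in nodes}
        sigma = {v: 0 for v in nodes}
        dist = {v: -1 for v in nodes}
        sigma[s] = 1
        dist[s] = 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:  # first visit
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:  # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)

        # Back-propagate pair dependencies from the farthest nodes inward.
        delta = {v: 0.0 for v in nodes}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                score[w] += delta[w]

    n = len(nodes)
    if normalized and n > 2:
        # Assumed normalization: number of ordered (s, t) pairs excluding v.
        scale = 1.0 / ((n - 1) * (n - 2))
        score = {v: b * scale for v, b in score.items()}
    return score


# Hypothetical toy graph: two equally short routes from A to D.
toy_nodes = ["A", "B", "C", "D"]
toy_edges = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")]
raw = betweenness(toy_nodes, toy_edges, normalized=False)
scaled = betweenness(toy_nodes, toy_edges)
```

In the toy graph the two shortest paths from A to D split evenly between B and C, so each intermediate node receives an unnormalized score of 0.5 while A and D receive 0.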

In a directed graph, if the indegree or outdegree of a node is zero, its betweenness is uninformative and was therefore assigned a value of zero. Nodes with an indegree or outdegree of zero were thus not included in the main analysis. Furthermore, enzymes not assigned to known open reading frames (ORFs) were excluded, although they were used to calculate the network parameters. This yielded a final dataset of 512 enzymatic genes with valid network parameters.
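The degree-based filter can be illustrated with a short Python sketch. The edge-set representation, the function names, and the toy edges are assumptions made for this example; the ORF-assignment filter is not shown because it depends on annotation data.

```python
def degree_table(edges):
    """Return (indegree, outdegree) dictionaries for a set of directed edges."""
    indeg, outdeg = {}, {}
    for a, b in edges:
        for v in (a, b):
            indeg.setdefault(v, 0)
            outdeg.setdefault(v, 0)
        outdeg[a] += 1
        indeg[b] += 1
    return indeg, outdeg


def nodes_for_analysis(edges):
    """Keep only nodes whose indegree and outdegree are both non-zero."""
    indeg, outdeg = degree_table(edges)
    return {v for v in indeg if indeg[v] > 0 and outdeg[v] > 0}


# Hypothetical toy network: E1 has no incoming link and is filtered out.
toy_edges = {("E1", "E2"), ("E2", "E3"), ("E3", "E2")}
kept = nodes_for_analysis(toy_edges)
```

Note that a filtered node such as E1 still contributes its outgoing link to the degrees of the nodes that are kept, in the same way that excluded enzymes were still used when calculating the network parameters.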

We first investigated the significance of modularity of the yeast metabolic network by considering the network as a directed graph with enzymes as vertices and metabolites as edges, according to the method proposed in [6], which defines a modularity measure as $Q = \sum_i (e_{ii} - a_i^2)$, with $a_i = \sum_j e_{ij}$ and $e_{ij}$ being the proportion of edges in the network that connect vertices in group i to those in group j. Q thus represents the fraction of edges in the network that connect vertices of the same type minus the expected value of the same quantity if edges fell at random without regard for the community structure. If a particular division gives no more within-community edges than would be expected by chance, Q takes a value of 0. Values other than 0 indicate deviations from randomness, and in practice a Q value greater than 0.3 indicates significant modular structure of the network in question.
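As a worked illustration of the modularity measure of [6], which in its standard form is Q = sum_i (e_ii - a_i^2) with a_i = sum_j e_ij, the sketch below computes Q for a given partition. It makes two assumptions that are not taken from the analysis itself: edge direction is ignored, so each edge contributes half of its weight to e_ij and half to e_ji, and the partition is supplied by the caller rather than found by a community-detection step. The function name and toy network are hypothetical.

```python
from collections import defaultdict


def modularity(edges, community):
    """Compute Q = sum_i (e_ii - a_i^2) for a fixed partition.

    `community` maps each node to its group label. Edge direction is
    ignored here (an assumption), which keeps the matrix e symmetric.
    """
    edges = list(edges)
    m = len(edges)
    if m == 0:
        return 0.0

    e = defaultdict(float)  # (group_i, group_j) -> fraction of edges
    for u, v in edges:
        gu, gv = community[u], community[v]
        e[(gu, gv)] += 0.5 / m
        e[(gv, gu)] += 0.5 / m

    a = defaultdict(float)  # group_i -> fraction of edge ends in group_i
    for (gi, _gj), fraction in e.items():
        a[gi] += fraction

    return sum(e.get((g, g), 0.0) - a[g] ** 2 for g in a)


# Hypothetical toy network: two triangles joined by a single edge.
toy_edges = [(1, 2), (2, 3), (3, 1), (4, 5), (5, 6), (6, 4), (3, 4)]
two_groups = {1: "A", 2: "A", 3: "A", 4: "B", 5: "B", 6: "B"}
one_group = {node: "A" for node in two_groups}
q_two = modularity(toy_edges, two_groups)
q_one = modularity(toy_edges, one_group)
```

For the two-triangle partition, e_AA = e_BB = 3/7 and a_A = a_B = 1/2, so Q = 2(3/7 - 1/4) = 5/14, or about 0.36. Placing all nodes in a single group gives Q = 0, the value expected when a division shows no more within-community edges than chance.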

Molecular evolution

The Ka/Ks values between orthologous sequences of S. cerevisiae and S. paradoxus used in the analysis were obtained from the work of Kellis et al. [7]. The essential gene data were collected from the Saccharomyces Genome Database (SGD) [8] and are based on the large-scale gene deletion experiment by Giaever et al. [9]. Essential genes were defined as those whose elimination in one or more laboratory environments was effectively lethal. Duplicate genes were identified in the yeast genome using BLASTP [10]. Putative duplicate gene pairs were rejected when their amino acid sequences had a similarity of < 40% or fewer than 100 aligned amino acid residues. The mRNA expression level data were obtained from the work of Holstege et al. [11]. The estimated number of mRNA molecules per cell was used in the analysis.
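The rejection rule for putative duplicate pairs can be written as a small Python filter. This is a sketch only: the tuple layout (query, subject, percent similarity, aligned length), the function names, and the gene identifiers are hypothetical, and no particular BLASTP output format is assumed.

```python
def passes_duplicate_filter(percent_similarity, aligned_length,
                            min_similarity=40.0, min_aligned=100):
    """Keep a pair only if similarity >= 40% and >= 100 residues are aligned."""
    return percent_similarity >= min_similarity and aligned_length >= min_aligned


def duplicate_pairs(hits):
    """Return unordered gene pairs that survive the filter.

    `hits` is an iterable of (query, subject, percent_similarity,
    aligned_length) tuples; this layout is an assumption for the example.
    """
    pairs = set()
    for query, subject, similarity, length in hits:
        if query == subject:  # ignore self-hits
            continue
        if passes_duplicate_filter(similarity, length):
            pairs.add(tuple(sorted((query, subject))))
    return pairs


# Hypothetical hits, not real BLASTP results.
toy_hits = [
    ("geneA", "geneA", 100.0, 350),  # self-hit, ignored
    ("geneA", "geneB", 62.5, 240),   # kept
    ("geneB", "geneA", 62.5, 240),   # reciprocal hit, same pair
    ("geneA", "geneC", 35.0, 300),   # rejected: similarity below 40%
    ("geneC", "geneD", 80.0, 60),    # rejected: fewer than 100 aligned residues
]
kept_pairs = duplicate_pairs(toy_hits)
```

Sorting each pair before storing it means that reciprocal hits are counted as one duplicate pair.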

References

  1. Vitkup D, Kharchenko P, Wagner A: Influence of metabolic network structure and function on enzyme evolution. Genome Biology 2006, 7:R39.
  2. Forster J, Famili I, Fu P, Palsson BO, Nielsen J: Genome-scale reconstruction of the Saccharomyces cerevisiae metabolic network. Genome Res. 2003, 13:244-253.
  3. Wagner A, Fell DA: The small world inside large metabolic networks. Proc. Biol. Sci. 2001, 268:1803-1810.
  4. Batagelj V, Mrvar A: Pajek – Analysis and Visualization of Large Networks. Connections 1998, 21:47-57.
  5. Brandes U: A faster algorithm for betweenness centrality. J. Math. Sociol. 2001, 25:163-177.
  6. Newman MEJ, Girvan M: Finding and evaluating community structure in networks. Physical Review E 2003, 68:0308217.
  7. Kellis M, Patterson N, Endrizzi M, Birren B, Lander ES: Sequencing and comparison of yeast species to identify genes and regulatory elements. Nature 2003, 423:241-254.
  8. Dwight SS, Balakrishnan R, Christie KR, et al. (23 co-authors): Saccharomyces genome database: Underlying principles and organization. Briefings in Bioinformatics 2004, 5:9-22.
  9. Giaever G, Chu AM, Ni L, Connelly C, Riles L, et al.: Functional profiling of the Saccharomyces cerevisiae genome. Nature 2002, 418:387-391.
  10. Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ: Gapped BLAST and PSI-BLAST: A new generation of protein database search programs. Nucleic Acids Res. 1997, 25:3389-3402.
  11. Holstege FC, Jennings EG, Wyrick JJ, Lee TI, Hengartner CJ, Green MR, Golub TR, Lander ES, Young RA: Dissecting the regulatory circuitry of a eukaryotic genome. Cell 1998, 95:717-728.
