RA via Graph ColoringHM

Register Allocation via Graph Coloring(6/1/2010)

The register allocation (RA) method outlined here assumes object code has been generated, regardless of whether optimization has been applied or not. Object exists in some intermediate form, AKA intermediate language (IL) or intermediate representation (IR). RA does not improve object code. Instead, it attempts to map virtual registers into real registers. The section shows one register allocation method. RA may make object code worse, if spilling and subsequent reloading become necessary.

Synopsis

  • Introduction
  • Definitions
  • Background
  • Outline of Algorithm
  • Graph Coloring Samples
  • Register Allocation
  • References

Introduction

At conclusion of the code generation phase that uses virtual registers it is necessary to map the virtual into real ones. That phase is called the Register Allocation (RA) step. The problem is that sometimes the number of architectural registers provided is less than the number of registers needed to emit correct object code. When that situation arises, spill code is emitted.

The general method is outlined here, but with broad limitations. First, all registers are assumed to be of the same type. For example, no distinction is made between architectural registers that allow floating-point computations and those that perform integer arithmetic. Secondly, the size of all registers is assumed to be constant. For example, this outline does not cover that registers may hold 80-bit extended floating-point values, while others only hold 64 bits. Thirdly, register allocation is attempted only for a single basic block at a time, not for multiple BBs, in which registers would retain their values from one BB to a successor BB. These simplifications reduce the size of the interference graph: No added constraints between different register types have to be provided initially, and no separate graphs have to be handled for register allocation.

Definitions

Basic Block:

Sequence of instructions I with a single point of entry and a single point of exit.

Dead:

An object is dead across a range R of instructions, if it is no longer needed in R or any successor of R. In the context of RA, dead refers to register usage. After its last use (for example in a Basic Block) a register is dead. A dead register can be reused for other purposes. Antonym: live.

Chromatic Number C of Graph G:

The chromatic number C of a graph G is the smallest number of colors with which G can be colored. Computing C of G is NP-complete for general graphs G.

Degree D(N):

The degree D(N) of node N in a graph G isthe number of edges associated with N. Thus the smallest degree is 0, the largest is x, where x is the total number of nodes in G; the edges are undirected, and multiple edges between two identical nodes are subsumed into a single one without loss of information.

Graph:

In the context of register allocation a graph G is a data structure consisting of nodes and undirected edges. A Node represents a live register in use, while an edge from node N1 to N2 indicates that both registers, represented by N1 and N2, are live at the same time.

Graph Coloring:

Coloring a graph G means to associate a color with each node such that no connected nodes N1 and N2 (i.e. nodes that have an edge between N1 and N2) have the same color. Obviously, in a graph G with x nodes the coloring problem is trivially solvable with at least x colors. The challenge is to identify the smallest possible number that allows coloring.

The key idea is as follows. If the smallest degree of all nodes N in graph G is D, then G is not D-colorable. This is obvious, since all nodes have at least D neighbors. Otherwise, if G has at least one node N with < D edges, then G is D-colorable if and only if G’ is D-colorable, with G’ being the graph reduced by removing N and all its edges.

Interference Graph:

The interference graph G is a graph of nodes and edges, in which an edge between N1 and N2 indicates that the life times of the two objects associated with N1 and N2 are simultaneous i.e. they interfere with one another. A physical consequence of this is that any pair of nodes connected via an edge cannot be mapped onto the same register.

Live:

An object (specifically a register) is live after definition (after being set), if its value will be needed in the future. Across its live range a register cannot be used for other purpose. Antonym: dead.

Local RA:

Register Allocation restricted to a single Basic Block. Antonym: Global RA.

NP-Complete:

NP stands for Nondeterministic Polynomial. This is a measure of computational complexity. NP-complete characterizes a class of problems that is so complex that the time required to complete is too high to compute in reasonable time for problems with general-sized data. For a formal definition, see ref: Machtey and Young.

Spill Code:

When the RA algorithms needs another register r, but all machine registers R are currently live, then one of R’s registers needs to store its value into a temporary memory location, generally allocated by the compiler. Later that value can be retrieved via a load. The combination of storing and loading to re-use a register is called spill code.

Virtual Register:

A code generator frequently emits object code that uses registers in a way as if an unbounded supply of them were available. A later phase then must map these fictitious registers into real ones. The fictitious registers are called virtual registers. Antonym: physical register, or architectural register.

Background

At conclusion of the code generation phase that uses virtual registers it is necessary to map the virtual into real ones. That phase is called the Register Allocation (RA) step. The problem is that sometimes the number of architectural registers provided is less than the number of registers needed. In that case spill code is emitted, to temporarily free one more register. One of the more efficient methods of RA is the method via graph coloring. The observation that these 2 problems, the mathematical problem of graph coloring and register allocation are equivalent, had been recognized in the early 1970s by Cocke and Schwartz, but not implemented until the 1980s by Chapin.

Historical highlights of register allocation:

  • 1950searly RA by Freiburghouse
  • 1960sBliss compiler RA, by William Wulf
  • 1970sSethi & Ullman, and Cocke & Schwartz published RA algorithms
  • 1981Chaitin implemented RA via Graph Coloring for IBM
  • 1990Preston Briggs, Rice,improved RA via GC over Chaitin’s method

Outline of Algorithm

Register Allocation via Graph Coloring is expressed by the5 steps below, with much detail skipped. That detail is providedfurther down.

1.Initial state: Code generation or the optimizer created object in intermediate form that can be manipulated. Specifically, spill code can be added, and virtual register numbers can be replaced by real registers. This intermediate language (IL) object is input to the graph coloring RA algorithm. The scheme explained here applies to only one basic block (BB). All register are assumed to be dead at BB exit. There is a total of n virtual registers in use.

2.Build Interference Graph G: Constructing the Interference Graph G proceeds as follows. Nodes in G are the virtual registers vri, i=1..n. Edges in G between any pair of nodes Ni and Njand i /= j indicate that both associated virtual register vri and vrj are live simultaneously. They interfere with one another, which means, it is not possible to map both onto the same physical machine register. These edges are undirected. G can be represented via a square bit matrix or rank n, where n is the number of virtual registers used. Only one half of the matrix is actually needed, and the main diagonal can be omitted, as itis not informative that a register conflicts with itself. Any‘1’ bit in row i and column j indicates that nodes i and j have a connecting edge, i.e. the corresponding nodes are interfering. The duplicate information of the ‘1’ bit for column i and row j is not stored, if we just use a triangular half of the square.

3.R-Coloring: Perform R-coloring of G, with R being the number of real machine registers available. If all n nodes have degree >=R, then R-coloring is not possible. In that case, generate spill code for one of the virtual registers. Select a register with a long life time (high interference rate), effectively breaking the long range into 2 shorter ranges (more nodes with less interference). Construct a new graph and try R-coloring again. If necessary, repeat the spill method. Eventually we shall have an interference graph of degree R plus possibly generated spill code.Degree R means there are at least some nodes in G that have degree less than R; other nods may have a way higher degree.

Now there is at least one node N with D(N) < R. For each node N with degree D(N) R do the following: If there are multiple candidates, select the one with highest degree R; this restrictionreduces the number of edges swiftly. If there are multiplenodes of the same highest degree, select the one with the lowest node number; this restriction is arbitrary, butrenders the algorithm deterministic. Remove the selected node Nfrom G, remember N, thuscreating a simpler graph G’ with one less node and all edges emanating fromN removed as well.

The key mathematical property exploited, and not proven here, is: Graph G is R-colorable if and only if G’ is R-colorable as well. But coloring G’ is simpler. For a proof of this key, fundamental observation, please see the Chaitin paper from 1981.

Repeat node removal until a final G’ is produced that is empty, or until all nodes have D(N) >= R. If the latter happens, apply the spill code again and remove nodes then. Eventually G’ results in an empty graph.

4.Reconstruction of G: Reconstruct G by inserting the removed nodes in the reverse order of deletion. During that process, associate one of the R colors with each node being re-inserted. The first color r=1 picked is arbitrary. Every subsequent color r with 1 < r <= R, should be chosen such that colors selected earlier are re-used as soon as possible. If not possible, pick the next higher numbered color. The method guarantees that one does not exceed R colors.

5.Code Generation: Re-generate the code. But this time, instead of referencing virtual registers, use the registers associated with the colors. Since there are only R colors in each of the n nodes, only a total of R machine registers will be used. However, the spill code emitted early to have a starting graph G’ or later to reduce G’s degree to < R has to be added.

Graph Coloring Samples

Let us analyze a few sample graphs with 4, 5, and 6 nodes. With graphs of such a low complexity the NP-completeness of the coloring algorithm is not detrimental, since computation for optimality is possible and enumeration is feasible. We use the colors of the rainbow in the order ROYGBIV, for red, orange, yellow, green, blue, indigo, and violet.

Sample 4-node graphs: We analyze a completely connected graph with 4 nodes first, then a less complex version.

The graph to the right has 4 nodes, and all nodes are connected to all others. Since each node N has degree D(n)= 3, 3-coloring is not possible. However, coloring with 4 selected rainbow choices is possible, as shown inside the nodes: ROGY. If we had to generate machine code for a sequence of instructions, whose virtual registers are represented by the interference graph to the left and we had only 3 registers, we’d have to generate spill code for at least one of the 4 virtual registers.

The next graph is less connected. There is no connectivity edge between nodes 2 and 4. This simpler graph ends up being 3-colorable. We apply the graph coloring algorithm; there is no need to first generate spill code, since 2 candidates with degree < 3 edgesare present; these are nodes 2 and 4. We can remove either, but select the one with the lower index, thus leaving nodes 1, 3, and 4, shown below.

Next we can remove any node, as all 3 have the same degree; again we pick the one with lowest index, node 1. This leaves just the nodes 3 and 4. Both of these have just degree 1 and we select 3 then 4, and end up with an empty graph.

All nodes have been removed, and in the order 2, 1, 3, and 4. We re-insert them in order 4, 3, 1, and 2, giving them associated colors: 4 red, and 3 orange. We try to be greedy with node 3 and re-use the same color previously mapped to 4.

But, since nodes 3 and 4 are connected, they cannot have the same color; thus orange is picked for node 3. Similarly, node 1 isre-inserted next. Being greedy with the number of colors, we try to re-use red or orange. But since node 1 is connected with both of the others, we must pick a third color, yellow is the next candidate in the ROYseries. Finally node 2 is re-inserted. This one is connected to nodes 1 and 3, having colors yellow and orange, hence neither of these two can be used, but red can be used again. The result is shown on the left, displaying a 3-colored graph of 4 nodes. We selected ROY.

Sample 5-node graphs: If we imagine a completely connected graph with 5 nodes, we know the chromatic number will again be 5. But we shall apply the graph coloring method to a simpler version, a graph G having 5 nodesnot completely connected.Is this simpler graph, shown to the left below, 3-colorable?

Since one of the nodes has degree < 3,node 2, we have a starting chance. That means we don’t have to start generating spill code, but still may have to do so later, if after removal of the first nodes, only nodes of degree >= 3 would remain. As it turns you, that is not the case.Node 2 is the removal candidate; this eliminates also 2 edges, the edges to nodes 1 and 3. The simplified graph G’ is shown below the original graph G with the 5 nodes.

For the next removal we have 2 options, nodes 1 of degree 2 and node 3, also of degree 2. Thus we select 1. Nodes 4 and 5 are not candidates, as they have degree 3, equal to our targeted chromatic number R.

Then a new subgraph G’ of 3 completely connected nodes is left, the nodes being 3, 4, and 5. They are all equal candidates, thus we remove them in the order 3, 4, and 5, 3 being the one with lowest node number and of degree 2.

Now we re-insert all 5 nodes. The order is 5, 4, 3, 1, and 2. The first 3 colors must all be disjoint, as the first 3 re-inserted nodes are all connected. We choose the colors ROY, our complete spectrum. Then orange node 1 is inserted, being re-connected to nodes 4 and 5. Node 4 is yellow, 5 is red, thus we must pick orange. The only other node having orange is 3, not connected to node 1. Thus that color selection works. Finally node 2 is re-inserted. Its neighbors are node 1 orange, and node 3, also orange. Hence we have 2 color choices, and arbitrarily select red. Hence the original G is 3-colorable.

Sample 6-node graphs: The next Interference Graph G has 6 nodes, and we shall try first, whether G can be 3-colored without spill code. Then we explore 4-coloring.




/ We’ll be greedy again and see, whether the 6-node graph G is 3-colorable without spill. We start aiming for 3, since there are nodes N with degree D(N) < 3. Node 1 qualifies. The resulting simplified graph G’ after removal of node 1 is shown next below.
G’ has one candidate with D(1) < 3, which is node 6. Hence we remove that node as well, resulting in the simpler 4-node graph below.
But the resulting 4-node graph is not 3-colorable without spill. Hence we have to give up, or else generate spill code. Instead, we try to 4-color the original graph now.
To 4-color the original G above, we identify nodes N with degree D(N) < 4. Of those we select the ones with highest degree, of those the one with lowest index, and there is one such candidate, resulting in removal of node 3. This also eliminates 3 edges. The simpler G’ is shown left.
The next removal candidate with D < 4 and lowest node index is node 5. The simpler graph with 3 further edges eliminated is displayed next. Now we remove further nodes with degree 3 and lower. The next removal candidate is node 2, also eliminating 3 edges.
This leaves an unconnected graph with one edge. Subsequently we remove nodes 1, and then 4 and 6. The total removal sequence is: 3, 5, 2, 1, 4, and 6. To reconstruct G and color the nodes with ROYG, we insert nodes in the order 6, 4, 1, 2, 5, and 3.
/

Nodes 6 and 4 will be inserted first. As they are not connected, they can receive the same color. We’ll pick red, our spectrum being ROYG. Then comes node 1, connected to 6, so its color will be different from red; we pick orange. Node 2 is connected to 1, 4, and 6. We may color it yellow, not interfering with red (nodes 4 and 6) or orange, node 1. Node 5 is colored green, and finally 3 orange.

Register Allocation

Now we analyze a sample instruction stream that references virtual registers vri, i=1..5. We also view the corresponding source and construct the interference graph G for the virtual registers. Then we color the graph and finally map real registers into the instructions. The target architecture is assumed to be three-address and limited CISC, able to perform operations in memory for 1 operand. Also, common sub-expressions have been eliminated by the optimizer (CSE) and been mapped into proper register usage. All registers are dead after the code stream, which is one basic block.

L1:
c:= a + b
f := d * e
g := a + b – d * e
L2: next BB / ld vr1, [a]
add vr2, vr1, [b]
st vr2, [c]
ld vr3, [d]
mult vr4, vr3, [e]
st vr4, [f]
sub vr5, vr2, vr4
st vr5, [g]

/ Virtual register vr1 interferes only with vr2. But vr2 has a long live, spanning across vr3, vr4, reaching into vr5. Vr3 only interferes with vr4, and vr4 conflicts with vr5. The conflict of vr5 with vr2 has already been established when analyzing vr2. 2-coloring G fails, even though there is one candidate node to start with: node 1. Thereafter all nodes have degree 2 or greater. Hence without spill code there is no 2-coloring. But 3-coloring might work? We search for nodes with D < 3. The lowest so indexed is node 3. Reduced G’ with node 3 removed is shown left. Now we select node 2, also of D = 2.
The resulting G’ is a partly connected graph with nodes 1, and 4 connected to 5. Now we eliminate 4, 1, and 5. To reconstruct G and giving each node colors, we must select nodes 5, 1, 4, 2, and 3 in that order.
Node 5 receives red. The next node 1, is not connected with 5, hence is also be colored red. Then we insert 4, adjacent to 5. Hence it must have a different color than red, we pick orange. Node 2 is connected to 1, 4, and 5, and receives yellow, different from all 3 neighbors. Finally node 3 receives red again; thus we have successfully 3-colord G.

We now can map colors to physical registers: red -> r1, orange -> r2, and yellow -> r3. Each of the 5 virtual registers is associated with a node that has one of 3 colors, thus can be mapped into 3 physical registers. The resulting object code is shown next: