ECE 734 Final Project
On the Time Scheduling Problem of Uniform Recurrence Equations
Chin, Tai-Lin and Lin, Wei-Yang
May 14, 2004
Contents
1. Introduction
2. Fundamental Formulations
2.1. Uniform Recurrence Equations
2.2. Linear Schedule
2.3. Uniform Affine Schedule
2.4. Affine Schedule
3. Multidimensional Schedule
3.1. The Computability of UREs
3.2. Calculating Multidimensional Schedule
3.3. Multidimensional Affine Schedule
4. Scheduling Real Applications
4.1. Convolution
4.1.1. First Dependence Relationship
4.1.2. Another Dependence Relationship
4.2. Selection Sort
4.2.1. Uniform Affine Schedule
4.2.2. Affine Schedule
4.2.3. Multidimensional Schedule
4.3. 2D FIR Filter
4.3.1. Affine Schedule
4.4. A Complex System of UREs
4.4.1. Multidimensional Schedule of URE 1
4.4.2. Multidimensional Affine Schedule of URE 1
5. Conclusion
1. Introduction
Time scheduling is an important issue in parallel computing. It plays a central role in the design of parallel processor arrays and parallelizing compilers. A large number of applications, such as digital signal processing, computer graphics, computer vision, weather forecasting, and virtual reality, need high-speed computing capability to process their massive computational tasks. As a result, high performance computing will remain one of the most crucial issues in computer technology. In this report, we focus on the time scheduling problem of Uniform Recurrence Equations [1]. We first study the affine schedule [7] and the multidimensional schedule [8], and then, based on these two scheduling methods, formulate a more efficient scheduling method, named the multidimensional affine schedule, to solve more complex systems of uniform recurrence equations. All these scheduling problems are formulated as linear programs. Finally, a complex example of a system of uniform recurrence equations is used to explore the quality of the different scheduling methods.
2. Fundamental Formulations
In this section, we define Uniform Recurrence Equations and some terminology. Afterward, we study three hyperplane-based time scheduling methods: the linear schedule, the uniform affine schedule, and the affine schedule.
2.1. Uniform Recurrence Equations
Many algorithms are composed of a set of uniform-dependence computations that can be arranged on an n-dimensional lattice. The data dependencies can be represented by dependence vectors. If the vector between two lattice points equals one of the dependence vectors, the computation at one of the lattice points depends on the result of the other, regardless of where the two points lie in the iteration space. We call this relationship a uniform relation. Uniform Recurrence Equations are formally defined as follows.
Def 1. Uniform Recurrence Equation (URE) [1]: Let Z^n be the set of all lattice points in n-dimensional space, and let S be a subset of Z^n. Any lattice point p in S can be represented as an n-dimensional integer vector. In the space S, we can define a URE as follows:
V_i(p) = f_i(V_j1(p − d_1), ..., V_jk(p − d_k)), p ∈ S,
where
V_i is a data variable, 1 ≤ i ≤ m;
p is the iteration index;
f_i is a function whose parameters are V_j1(p − d_1), ..., V_jk(p − d_k);
each d_l is an n-dimensional vector called a dependence vector;
S is the iteration space and can be represented as {p | Ap ≤ b}.
In Z^n, for any point p ∉ S whose value V_i(p) is used by some computation in S, V_i(p) is defined as an input; for any point p ∈ S whose value V_i(p) is used by no other computation, V_i(p) is defined as an output.
From Def 1, we can illustrate all the dependence relations of a set of UREs on the iteration space. However, the iteration space is usually large and irregular. Thus, the dependence relations are illustrated in a reduced form.
Def 2. Reduced dependence graph (RDG) [1]: An RDG G is a directed graph (V, E), in which V is a set of vertices associated with the data variables and E is a set of edges representing the dependences. In a system of UREs, if V_i depends on V_j, then there is an arc from v_i to v_j labeled with the corresponding dependence vector.
The RDG concisely shows the dependences among the data variables.
2.2. Linear Schedule
The linear schedule is the basic scheduling scheme for UREs: the operation at lattice point p is executed at time X·p, so the operations are executed in the order given by the scheduling vector X. For instance, Fig 1 shows a set of UREs and its dependence graph; the scheduling vector can be [1,1].
Fig 1 The dependence graph of the UREs.
The dependence graph of this example is very simple, so we can see by inspection that the optimal scheduling vector is [1,1]. However, if the dependence graph is sophisticated or high dimensional, the optimal scheduling vector is not easy to recognize. An approach to calculating the optimal linear scheduling vector was proposed by A. Darte [6] using linear programming. The objective is to minimize the longest execution path, and the problem can be formulated as
min_X max_{p,q ∈ S} X·(p − q)   subject to XD ≥ 1,
where X is the scheduling vector and D is the dependence matrix whose columns are the dependence vectors. The inner maximization problem can be transformed into a minimization problem using the duality property of linear programming. Thus, the problem of minimizing the longest execution path can be transformed into a problem with only a minimization objective. Writing the iteration space as S = {p | Ap ≤ b} and including the constraint XD ≥ 1, the problem of calculating the optimal linear scheduling vector can be expressed as
min (y_1 + y_2)·b   subject to y_1 A = X, y_2 A = −X, XD ≥ 1, y_1 ≥ 0, y_2 ≥ 0.
The optimal execution time is the optimal objective value plus one.
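To make this concrete, here is a minimal sketch that solves the dual LP with SciPy for a hypothetical two-dimensional example: a square iteration space 0 ≤ p ≤ N and dependence vectors (1,0) and (0,1). The matrices and sizes are illustrative assumptions, not the system of Fig 1.

# Sketch: optimal linear schedule via the dual LP
#   min (y1 + y2) b   s.t.  y1 A = X,  y2 A = -X,  X D >= 1,  y1, y2 >= 0
# Assumes a square iteration space {p | A p <= b} with 0 <= p_i <= N and
# dependence matrix D with columns (1,0) and (0,1) (illustrative only).
import numpy as np
from scipy.optimize import linprog

N = 10
A = np.array([[1, 0], [-1, 0], [0, 1], [0, -1]], dtype=float)  # A p <= b
b = np.array([N, 0, N, 0], dtype=float)
D = np.array([[1, 0], [0, 1]], dtype=float)   # dependence vectors as columns

n = A.shape[1]   # space dimension
m = A.shape[0]   # number of faces of the iteration space
# decision vector: [X (n, free), y1 (m, >= 0), y2 (m, >= 0)]
c = np.concatenate([np.zeros(n), b, b])       # minimize (y1 + y2) b

# equalities: y1 A - X = 0  and  y2 A + X = 0
A_eq = np.zeros((2 * n, n + 2 * m))
A_eq[:n, :n] = -np.eye(n); A_eq[:n, n:n + m] = A.T
A_eq[n:, :n] = np.eye(n);  A_eq[n:, n + m:] = A.T
b_eq = np.zeros(2 * n)

# inequalities: X D >= 1  ->  -D^T X <= -1
A_ub = np.zeros((D.shape[1], n + 2 * m))
A_ub[:, :n] = -D.T
b_ub = -np.ones(D.shape[1])

bounds = [(None, None)] * n + [(0, None)] * (2 * m)
res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
X = res.x[:n]
print("scheduling vector X =", X, ", latency =", res.fun + 1)

For this toy system the solver returns X = [1, 1] with latency 2N + 1, which matches the hand inspection above.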
2.3. Uniform Affine Schedule
If there are multiple variables to be executed at each lattice point of the iteration space, a linear schedule cannot exploit the potential parallelism among the variables at the same lattice point. The uniform affine schedule separates the executions of different variables at the same lattice point by constant offsets. More specifically, the scheduling time for executing variable v_i at lattice point p is X·p + c_i, where c_i is a constant. The schedule must satisfy the following two constraints:
A data variable depends on the data produced by itself.
Suppose d is the dependence vector. According to the execution sequence, we can obtain X·d ≥ 1, where X is the scheduling vector.
A data variable depends on the data produced by other variables.
Suppose v_i →_d v_j represents that variable v_i depends on variable v_j at distance d. We can obtain X·p + c_i ≥ X·(p − d) + c_j + 1. Thus, X·d + c_i − c_j ≥ 1.
The longest execution path is max_{p,q ∈ S} X·(p − q) plus a constant term. We ignore the constant term because it does not depend on the size of the iteration space. Exploiting the duality property, we can obtain the following linear program for calculating the uniform affine scheduling vector:
min (y_1 + y_2)·b   subject to y_1 A = X, y_2 A = −X, y_1 ≥ 0, y_2 ≥ 0,
X·d ≥ 1 for every self-dependence d, and X·d + c_i − c_j ≥ 1 for every dependence v_i →_d v_j.
The optimal execution time is the optimal objective value plus one.
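As a small worked example (this RDG is hypothetical, not the system of Fig 1), suppose v_1 depends on itself at distance (1,0), v_2 depends on v_1 at distance (0,1), and v_1 depends on v_2 at distance (1,0). The constraints become:

\[
\begin{aligned}
v_1 \to v_1,\ d=(1,0): &\quad X\cdot(1,0) \ge 1 \\
v_2 \to v_1,\ d=(0,1): &\quad X\cdot(0,1) + c_2 - c_1 \ge 1 \\
v_1 \to v_2,\ d=(1,0): &\quad X\cdot(1,0) + c_1 - c_2 \ge 1
\end{aligned}
\]

These are satisfied, for example, by X = (1,1) with c_1 = c_2 = 0.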
2.4. Affine Schedule
The fundamental concept of the affine schedule is that every data variable has its own scheduling vector to conduct its execution sequence. For example, in Fig 1, if both variables are computed using the same scheduling vector, the optimal scheduling vector is [1,1] and the least execution time grows like 2N. If the variables are executed with different scheduling vectors, the execution time can be reduced to about N: for instance, one variable is executed in the [1,0] direction and the other in the [0,1] direction. Obviously, the affine schedule is more efficient than the schedules that use the same scheduling vector for all data variables. Calculating affine scheduling vectors for the variables of an algorithm is not easy, and the schedule must obey the two constraints below:
A data variable depends on the data produced by itself.
Suppose d is the dependence vector. According to the execution sequence, we can obtain X_i·d ≥ 1, where X_i is the scheduling vector of variable v_i.
A data variable depends on the data produced by other variables.
Suppose v_i →_d v_j represents that variable v_i depends on variable v_j at distance d. We can obtain X_i·p + c_i ≥ X_j·(p − d) + c_j + 1 for every p ∈ S.
With these two constraints, A. Darte and Y. Robert proved that the optimal affine schedule can be obtained by solving a linear programming problem [7]: the "for every p ∈ S" constraints are turned into finitely many linear constraints by Farkas' lemma, and the latency is again minimized through duality.
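As a sketch of that dualization step (assuming the iteration space S = {p | Ap ≤ b}), the affine form of Farkas' lemma rewrites one dependence constraint as an identity in p:

\[
X_i\cdot p + c_i - X_j\cdot(p - d) - c_j - 1 \;=\; \lambda_0 + \lambda\,(b - Ap)
\quad \text{for all } p,
\]

with \(\lambda_0 \ge 0\) and the row vector \(\lambda \ge 0\). Identifying the coefficients of p gives the finitely many linear constraints

\[
X_i - X_j = -\lambda A, \qquad X_j\cdot d + c_i - c_j - 1 = \lambda_0 + \lambda b .
\]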
3. Multidimensional Schedule
In this section, we define the multidimensional schedule [8]. We first describe how to test the computability of UREs [1], and then formulate the linear program that calculates the uniform form of the multidimensional schedule [8]. In section 3.3, we extend the formulation to the affine form of the multidimensional schedule.
Def 3. Multidimensional Schedule
A multidimensional schedule of depth k is a function σ that maps each operation (V_i, p) to an integer vector (σ_1(V_i, p), ..., σ_k(V_i, p)). The order of the function values is the lexicographic order, symbolized by ≻_lex. If V_i(p) depends on V_j(q), then σ must satisfy the following relation:
σ(V_i, p) ≻_lex σ(V_j, q).
We can expand this expression as: there exists a level l such that
σ_m(V_i, p) = σ_m(V_j, q) for all m < l, and σ_l(V_i, p) ≥ σ_l(V_j, q) + 1,
where 1 ≤ l ≤ k.
In other words, the multidimensional schedule can be formulated level by level: every level must weakly satisfy each dependence, and each dependence must be strictly satisfied at some level. Since we discuss the uniform form of the multidimensional schedule in this section, the level-l scheduling function can be written as:
σ_l(V_i, p) = X_l·p + c_{l,i},
where X_l is the level-l scheduling vector and c_{l,i} is a constant associated with variable V_i.
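For instance, with a depth-3 schedule,

\[
\sigma(V_i, p) = (2, 0, 5) \succ_{\text{lex}} (2, 0, 3) = \sigma(V_j, q),
\]

because the first two components agree and the third is strictly larger, so the operation V_j(q) is executed before V_i(p).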
3.1. The Computability of UREs
Karp, Miller, and Winograd first studied the computability of UREs in their 1967 paper [1]. The computability of a URE system can be determined by checking whether there is a null weight cycle in the RDG. If there is a null weight cycle, the system is not computable, because a computation in the UREs would then depend on itself. Karp, Miller, and Winograd proposed the following algorithm [1] to determine the computability of UREs.
Bool KMW(G)
{
- Find G', the subgraph of edges that belong to null weight multi-cycles of G.
- Suppose G'_1, ..., G'_s are the strongly connected components (SCCs) of G'.
- If s = 1 and G'_1 = G, then G is an SCC all of whose edges lie on null weight multi-cycles; the system of UREs is not computable: return TRUE.
- If s = 0, G' is empty: return FALSE.
- Else, return KMW(G'_1) OR ... OR KMW(G'_s).
}
This algorithm can be implemented with linear algebra and linear programming. First, denote C as the connection matrix: each row of the matrix is associated with a vertex of the RDG and each column with an edge. If edge e of the RDG goes from vertex i to vertex j, then C_{i,e} = 1 and C_{j,e} = −1. Meanwhile, let D be the dependence matrix whose columns are the dependence vectors of the edges, and let B = [C; D]; we can use Eq.1 to find the edges of null weight multi-cycles in the RDG:
Bq = 0, q ≥ 0. ……(Eq.1)
A nonzero solution q encodes the edge multiplicities of a null weight multi-cycle: Cq = 0 means the selected edges form a union of cycles, and Dq = 0 means their total weight is null. If Eq.1 has only the trivial solution q = 0, the RDG contains no null weight multi-cycle; otherwise, the edges associated with q_e > 0 are the edges of null weight multi-cycles in the RDG. In fact, Eq.1 can be modified into the following linear programming problem:
max Σ_e v_e   subject to 0 ≤ v ≤ 1, v ≤ q, Bq = 0, q ≥ 0. ……(Eq.2)
From Eq.2, we have the following property: at the optimum, v_e = 1 exactly for the edges that can appear in a null weight multi-cycle, so all the edges associated with v_e = 1 form the null weight subgraph G' of the RDG.
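The following is a minimal sketch of this test, assuming NumPy, SciPy, and networkx; the RDG encoding as a list of (src, dst, weight) edges and the example at the bottom are illustrative assumptions.

# Sketch of the KMW computability test built on Eq.1/Eq.2.
import numpy as np
import networkx as nx
from scipy.optimize import linprog

def null_weight_multicycle_edges(edges):
    """Edges of G': solve max sum(v) s.t. v <= q, 0 <= v <= 1, Cq = 0, Dq = 0."""
    verts = sorted({u for u, _, _ in edges} | {v for _, v, _ in edges})
    vidx = {v: i for i, v in enumerate(verts)}
    E, n = len(edges), len(edges[0][2])
    C = np.zeros((len(verts), E))           # vertex-edge connection matrix
    D = np.zeros((n, E))                    # dependence (weight) matrix
    for e, (u, v, w) in enumerate(edges):
        C[vidx[u], e] += 1.0
        C[vidx[v], e] -= 1.0
        D[:, e] = w
    B = np.vstack([C, D])
    # variables: [q (E), v (E)]
    c = np.concatenate([np.zeros(E), -np.ones(E)])    # maximize sum(v)
    A_ub = np.hstack([-np.eye(E), np.eye(E)])         # v - q <= 0
    b_ub = np.zeros(E)
    A_eq = np.hstack([B, np.zeros((B.shape[0], E))])  # B q = 0
    b_eq = np.zeros(B.shape[0])
    bounds = [(0, None)] * E + [(0, 1)] * E
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    v = res.x[E:]
    return [edges[e] for e in range(E) if v[e] > 0.5]

def kmw(edges):
    """TRUE iff the RDG has a null weight cycle, i.e. the UREs are not computable."""
    if not edges:
        return False
    gp = null_weight_multicycle_edges(edges)
    if not gp:
        return False                                  # s = 0: G' is empty
    g = nx.DiGraph()
    g.add_edges_from((u, v) for u, v, _ in gp)
    for comp in nx.strongly_connected_components(g):
        sub = [e for e in gp if e[0] in comp and e[1] in comp]
        if not sub:
            continue
        if len(sub) == len(edges):                    # G' = G and G is one SCC
            return True
        if kmw(sub):
            return True
    return False

# Hypothetical 2-vertex RDG: the cycle a -> b -> a has total weight (0, 0),
# so the system is not computable.
edges = [("a", "b", (1, 0)), ("b", "a", (-1, 0))]
print(kmw(edges))  # True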
3.2. Calculating Multidimensional Schedule
Exploiting this procedure for detecting the computability of UREs, A. Darte and F. Vivien formulated the multidimensional schedule [8] using the duality theorem of linear programming [9]. They rewrote Eq.2 in the standard form [8]
max {cx | Ax ≤ b, x ≥ 0}. ……(Eq.3)
According to duality, the dual problem is
min {yb | yA ≥ c, y ≥ 0}. ……(Eq.4)
In Eq.4, introduce one dual variable z_e ≥ 0 for each constraint v_e ≤ 1, one dual variable ρ_i for each flow-conservation constraint (row of C), and a dual vector X for the null weight constraints (rows of D). In the RDG, if there is an edge e from vertex v_i to vertex v_j and the dependence vector is d_e, the dual constraints can be expanded to z_e + X·d_e + ρ_i − ρ_j ≥ 1 together with X·d_e + ρ_i − ρ_j ≥ 0. Therefore, the dual of Eq.3 can be formulated as Eq.5:
min Σ_e z_e   subject to z_e + X·d_e + ρ_i − ρ_j ≥ 1, X·d_e + ρ_i − ρ_j ≥ 0, z_e ≥ 0, for every edge e: v_i → v_j. ……(Eq.5)
By the complementary slackness theorem [9], we can obtain the following important properties: at the optimum, the edges of G' satisfy X·d_e + ρ_i − ρ_j = 0, while every edge outside G' satisfies X·d_e + ρ_i − ρ_j ≥ 1. Hence σ(V_i, p) = X·p + ρ_i weakly satisfies all dependences and strictly satisfies all dependences outside G', so it can serve as one dimension of the schedule.
In a cubic iteration space of side N, the latency contributed by one dimension is approximately N·Σ_k |X_k|, so we use Σ_k |X_k| to approach the minimal latency. The scheduling vector of dimension k can be found by the following linear program:
first min Σ_e z_e, then, among the optimal solutions, min Σ_k u_k with −u_k ≤ X_k ≤ u_k, subject to the constraints of Eq.5. ……(Eq.6)
Applying Eq.6 to each SCC of G' recursively, the scheduling vector of each dimension can be found.
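Below is a minimal sketch of one level of this construction. The helper name level_schedule is hypothetical, and the two-stage objective of [8] (first min Σ z_e, then min Σ |X_k|) is approximated by a single weighted objective with a small eps, which is a simplification.

# Sketch of one level of the multidimensional schedule (Eq.5/Eq.6):
#   min  sum_e z_e + eps * sum_k u_k
#   s.t. z_e + X.d_e + rho_i - rho_j >= 1,  X.d_e + rho_i - rho_j >= 0,
#        -u_k <= X_k <= u_k,  z_e >= 0.
import numpy as np
from scipy.optimize import linprog

def level_schedule(edges, n_verts, dim, eps=1e-3):
    """edges: list of (i, j, d) meaning V_i depends on V_j with dependence d."""
    E = len(edges)
    # variables: [X (dim), rho (n_verts), z (E), u (dim)]
    nvar = dim + n_verts + E + dim
    c = np.zeros(nvar)
    c[dim + n_verts: dim + n_verts + E] = 1.0   # sum of z_e
    c[dim + n_verts + E:] = eps                 # eps * sum of u_k
    rows, rhs = [], []
    for e, (i, j, d) in enumerate(edges):
        # -(z_e + X.d + rho_i - rho_j) <= -1
        r = np.zeros(nvar)
        r[:dim] = -np.asarray(d, float)
        r[dim + i] -= 1.0
        r[dim + j] += 1.0
        r[dim + n_verts + e] = -1.0
        rows.append(r); rhs.append(-1.0)
        # -(X.d + rho_i - rho_j) <= 0
        r2 = r.copy(); r2[dim + n_verts + e] = 0.0
        rows.append(r2); rhs.append(0.0)
    for k in range(dim):                        # |X_k| <= u_k, linearized
        r = np.zeros(nvar); r[k] = 1.0;  r[dim + n_verts + E + k] = -1.0
        rows.append(r); rhs.append(0.0)
        r = np.zeros(nvar); r[k] = -1.0; r[dim + n_verts + E + k] = -1.0
        rows.append(r); rhs.append(0.0)
    bounds = ([(None, None)] * (dim + n_verts)
              + [(0, None)] * E + [(0, None)] * dim)
    res = linprog(c, A_ub=np.array(rows), b_ub=np.array(rhs), bounds=bounds)
    X, rho = res.x[:dim], res.x[dim:dim + n_verts]
    strict = [e for e, (i, j, d) in enumerate(edges)
              if np.dot(X, d) + rho[i] - rho[j] >= 1 - 1e-9]
    return X, rho, strict

# Hypothetical RDG: V_0 depends on V_1 at (0,1); V_1 depends on V_0 at (1,0).
X, rho, strict = level_schedule([(0, 1, (0, 1)), (1, 0, (1, 0))], 2, 2)
print(X, rho, strict)

The list of strictly satisfied edges tells which dependences are resolved at this level; the remaining edges form the subgraph on which the next level is computed.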
3.3. Multidimensional Affine Schedule
Suppose the RDG of a problem is G = (V, E) and G' is the subgraph of null weight multi-cycles of G. As in the multidimensional schedule, one level of a multidimensional affine schedule, σ(V_i, p) = X_i·p + c_i, must satisfy the following constraints:
(i) for every edge e: v_i → v_j not in G': X_i·p + c_i − X_j·(p − d_e) − c_j ≥ 1 for all p ∈ S;
(ii) for every edge e: v_i → v_j in G': X_i·p + c_i − X_j·(p − d_e) − c_j = 0 for all p ∈ S.
Suppose A is full rank and the iteration space is S = {p | Ap ≤ b}. The constraints (i), (ii) and the cost function can be separately discussed as follows.
(i) Edges not in G': the constraint X_i·p + c_i − X_j·(p − d_e) − c_j ≥ 1 must hold for all p ∈ S, so the following finite constraints can be derived from Farkas' lemma (in affine form) [9]: there exist λ_{e,0} ≥ 0 and a row vector λ_e ≥ 0 such that X_i − X_j = −λ_e A and X_j·d_e + c_i − c_j − 1 = λ_{e,0} + λ_e b.
(ii) Edges in G': according to the discussion in section 3.2, such an edge must be satisfied with equality at this level. Let φ_e(p) = X_i·p + c_i − X_j·(p − d_e) − c_j; we obtain Eq.7:
φ_e(p) = 0 for all p ∈ S. ……(Eq.7)
Because φ_e(p) = 0 is equivalent to φ_e(p) ≥ 0 and −φ_e(p) ≥ 0, using Farkas' lemma (in affine form) again, Eq.7 can be discussed separately as in (1) and (2):
(1) φ_e(p) ≥ 0 for all p ∈ S: there exist μ_0 ≥ 0 and μ ≥ 0 with φ_e(p) = μ_0 + μ(b − Ap);
(2) −φ_e(p) ≥ 0 for all p ∈ S: there exist ν_0 ≥ 0 and ν ≥ 0 with −φ_e(p) = ν_0 + ν(b − Ap).
Because A is full rank and the iteration space is bounded, from (1) and (2) we obtain X_i = X_j. According to φ_e(p) = 0, we can then obtain X_i·d_e + c_i − c_j = 0. Combining (1) with (2), the result for every edge e: v_i → v_j in G' is
X_i = X_j and X_i·d_e + c_i − c_j = 0.
(iii) Now, consider the cost function. For a variable v_i, if the iteration space is cubic of side N, N·Σ_k |X_{i,k}| is approximately equal to the minimal execution time of variable v_i, where X_{i,k} is the k-th component of the scheduling vector X_i. Suppose u_{i,k} is a new variable constrained by −u_{i,k} ≤ X_{i,k} ≤ u_{i,k}. Since u_{i,k} ≥ |X_{i,k}|, minimizing Σ_i Σ_k u_{i,k} would minimize every |X_{i,k}|. For this reason, we choose Σ_i Σ_k u_{i,k} as the cost function.
From the discussion in (i), (ii), and (iii), we can find one level of the multidimensional affine schedule by the linear program below; as in section 3.2, the construction is applied recursively to the SCCs of G' to obtain the remaining dimensions.
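Collecting (i), (ii), and (iii) gives the following compact restatement of the LP, written for the iteration space {p | Ap ≤ b}:

\[
\begin{aligned}
\min\; & \textstyle\sum_i \sum_k u_{i,k} \\
\text{s.t.}\; & -u_{i,k} \le X_{i,k} \le u_{i,k} && \forall\, i, k \\
& X_i = X_j,\;\; X_i \cdot d_e + c_i - c_j = 0 && \forall\, e\colon v_i \to v_j \in G' \\
& X_i - X_j = -\lambda_e A,\;\; X_j \cdot d_e + c_i - c_j - 1 = \lambda_{e,0} + \lambda_e b && \forall\, e\colon v_i \to v_j \notin G' \\
& \lambda_e \ge 0,\;\; \lambda_{e,0} \ge 0
\end{aligned}
\]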
4. Scheduling Real Applications
In this chapter, we use several real examples to compare the scheduling methods: convolution, selection sort, a 2D FIR filter, and a complex computation system. Different scheduling methods are applied to each example.
4.1. Convolution
Convolution is a fundamental transformation in digital signal processing. It is usually written as
y(i) = Σ_k w(k)·u(i − k).
We can rewrite the equation above as a recurrence over two indices i and k, accumulating one product per step:
y(i, k) = y(i, k − 1) + w(k)·u(i − k).
By observing the equation above, we find that u(k) is used in computing y(i, j) whenever i − j = k. Therefore, we can again rewrite the convolution in a uniform format in which the samples of u are propagated from point to point instead of being broadcast, with appropriate initial values on the boundary of the iteration space. From this uniform recurrence system, we can find its dependence matrix D, and according to the dependence relationships we can draw the dependence graph. After analyzing the dependence relationships, we can perform the different schedulings on the convolution.
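For reference, the following is a plain sequential evaluation of the convolution above; the array names and 0-based indexing are assumptions for illustration.

# Direct (non-parallel) convolution: y(i) = sum_k w(k) * u(i - k).
def convolve(u, w):
    n, m = len(u), len(w)
    y = [0.0] * n
    for i in range(n):          # output index
        for k in range(m):      # tap index
            if i - k >= 0:
                y[i] += w[k] * u[i - k]
    return y

print(convolve([1, 2, 3, 4], [1, 1]))  # [1, 3, 5, 7]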
4.1.1. First Dependence Relationship
In this section, we find the linear, uniform affine, and affine schedules for the first dependence relationship. In the next section, we modify the dependence relationship and find the corresponding scheduling vectors; then we compare the results.
Linear and uniform affine scheduling
In linear scheduling, the iteration space must be bounded, so we assume the iteration space can be written as {p | Ap ≤ b}. Let m = 7 and n = 19. We can then write the linear program for linear scheduling as in section 2.2. By solving this problem, we can find the optimal scheduling vector; the corresponding execution time is 27 time units.
Similarly, we can find the uniform affine schedule for the convolution. In this example, linear scheduling and uniform affine scheduling give the same result. The reason is that the dependence relationship is quite simple in this case; indeed, a feasible scheduling vector can be found by observation. For a complex dependence relationship, however, we need to rely on linear programming to find the scheduling vector.
Affine scheduling
We can find the affine schedule by solving the linear program of section 2.4; the total execution time is 27 time units. The affine schedule has a different execution direction from the linear and uniform affine schedules. We can also derive its execution time as a formula: the total execution time is n + m + 1 time units.
4.1.2. Another Dependence Relationship
We can rewrite the convolution into a different uniform recurrence equation, whose dependence graph is shown below. Then we can find the different schedules:
- Linear scheduling: execution time is 23 time units.
- Uniform affine scheduling: execution time is 23 time units.
- Affine scheduling: execution time is 23 time units.
Here, we only modify the dependence relationship. The output sequence of the convolution is still the same, but the execution time is improved. We can reduce the execution time because more operations can be executed at the same time under the second dependence relationship. The hyperplanes for the two dependence relationships are shown in the following figure.
4.2. Selection Sort
Given a sequence, the selection sort algorithm repeatedly looks through the remaining items to find the largest one and moves it to the front. The algorithm can be written as a system of uniform recurrence equations over two variables, x and m, and one column of its dependency matrix is [0 0]^T.
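For reference, the following is a plain sequential version of selection sort as described above; the variable names are assumptions for illustration.

# Selection sort: repeatedly pick the largest remaining item and move it
# to the front of the unsorted part.
def selection_sort(a):
    a = list(a)
    for i in range(len(a)):
        m = i                          # index of the largest item seen so far
        for j in range(i + 1, len(a)):
            if a[j] > a[m]:
                m = j
        a[i], a[m] = a[m], a[i]        # move the largest item to position i
    return a

print(selection_sort([3, 1, 4, 1, 5]))  # [5, 4, 3, 1, 1]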
4.2.1. Uniform Affine Schedule
Because one of the dependency vectors is [0 0]^T, we cannot find a scheduling vector using the linear schedule: the constraint X·[0 0]^T ≥ 1 cannot be satisfied. However, we can still find a uniform affine schedule. The reason is that each variable has its own constant offset in the uniform affine schedule, so it is possible to arrange the two operations at the same lattice point in different time slots. In this section, we use the method introduced in section 2.3 to find the uniform affine schedule for selection sort.
The selection sort has a two-dimensional iteration space {p | Ap ≤ b}. Let N = 18. We can then form the linear programming problem for the uniform affine schedule. By solving this problem, we find the optimal uniform affine scheduling vector and constants; the total execution time is 36 time units.
4.2.2. Affine Schedule
For the affine schedule, let the scheduling vector of x be X_x and the scheduling vector of m be X_m. After substituting them into the linear program of section 2.4, we find the optimal affine schedule; the total execution time is again 36 time units.
4.2.3. Multidimensional Schedule
From the reduced dependency graph, we can find the connection matrix C. First, we need to check the computability of this system. Let B = [C; D], where D is the dependency matrix. In order to find null weight multi-cycles, we need to solve Bq = 0 with q ≥ 0. From Bq = 0, we find that the only solution is q = 0, which means that G' is the empty set, so this system is computable. Then we can compute the scheduling vector by solving the linear program of section 3.2: let the scheduling vector be X = [x y]^T with one constant per variable, and minimize the execution time. The resulting execution order is shown below.
[Table: execution order, listing the operations executed at each time step from 3 through 8.]
4.3. 2D FIR Filter
A 2D FIR filter can be expressed as
y(i, j) = Σ_k Σ_l w(k, l)·x(i − k, j − l).
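For reference, the following is a direct nested-loop evaluation of the 2D FIR equation above; the array names, sizes, and boundary handling are assumptions for illustration.

# Direct 2-D FIR evaluation: y(i,j) = sum_{k,l} w(k,l) * x(i-k, j-l),
# with out-of-range input samples treated as zero.
import numpy as np

def fir2d(x, w):
    H, W = x.shape
    n, m = w.shape
    y = np.zeros_like(x, dtype=float)
    for i in range(H):
        for j in range(W):
            for k in range(n):
                for l in range(m):
                    if i - k >= 0 and j - l >= 0:
                        y[i, j] += w[k, l] * x[i - k, j - l]
    return y

x = np.arange(16.0).reshape(4, 4)
print(fir2d(x, np.ones((2, 2))))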
It can be written in a regular iterative algorithm form over a three-dimensional iteration space. In this example, the dependence relations are very sophisticated. Moreover, they contain the opposite dependences [1 0 0]^T and [-1 0 0]^T, which makes it hard to inspect the optimal scheduling vector. However, we can still use the affine schedule to reduce the execution time.
4.3.1. Affine Schedule
Assume n = 7 and m = 19. The iteration space can be described as {p | Ap ≤ b}.
Using the linear program mentioned in section 2.4, the affine schedule can be obtained. The last operations are scheduled at time 292, and the execution order is as follows:
Time | Operation
8    | u7,1,* , v7,1,*
9    | u6,1,* , v6,1,*
...  | ...
14   | u1,1,* , v1,1,*
15   | ─
16   | w1,1,*
17   | w2,1,*
...  | ...
22   | w7,1,*
23   | u7,2,* , v7,2,*
24   | u6,2,* , v6,2,*
...  | ...
29   | u1,2,* , v1,2,*
30   | ─
31   | w1,2,*
32   | w2,2,*
...  | ...
37   | w7,2,*
...  | ...
278  | u7,19,* , v7,19,*
279  | u6,19,* , v6,19,*
...  | ...
284  | u1,19,* , v1,19,*
285  | ─
286  | w1,19,*
287  | w2,19,*
...  | ...
292  | w7,19,*
4.4. A Complex System of UREs
Consider the following complex system of UREs:
URE 1
Fig 2 is the dependence graph and Fig 3 is the RDG of URE 1.
Fig 2 The dependence graph of URE 1
Fig 3 The RDG of URE 1
4.4.1. Multidimensional Schedule of URE 1
We can find the multidimensional schedule in two steps: 1. detect whether the system is computable; 2. find the multidimensional scheduling vectors. First, we produce the connection matrix C and the dependence matrix D.
- Detect whether URE 1 is computable.
Suppose B = [C; D]. According to the KMW algorithm, we can solve Bq = 0 with q ≥ 0.
We obtain the null weight subgraph G' shown in Fig 4. There are three SCCs in G'. Obviously, these three SCCs contain no null weight cycles. Consequently, this system is computable.
Fig 4 The null weight multi-cycles in Fig 3.
- Find the multidimensional schedule of URE 1.
(i) We can find the first-dimension scheduling vector from the constraints of Eq.5.
Suppose the scheduling vector is X = [x y]^T and the ρ_i are constants. The linear program can be formulated as in Eq.6.
Solving the linear program by GAMS, we can obtain the first-dimension scheduling vector and constants.
(ii) Find the second-dimension scheduling vectors:
(1) For the first SCC: the scheduling vector is obvious by inspection.
(2) For the second SCC: the scheduling vector is likewise obvious by inspection.
(3) For the third SCC: suppose the scheduling vector is X = [x y]^T. We obtain a linear programming problem of the same form as in (i).
Solving the linear program by GAMS, we obtain the solution.
Combining the solution of dimension one with that of dimension two, the scheduling vectors are shown below. Each dimension of each data variable's scheduling vector consists of the linear part obtained in (i) and (ii) together with a constant:

Data variable | 1st dimension (linear part / constant) | 2nd dimension (linear part / constant)
variable 1    |  / 0  |  / 0
variable 2    |  / 0  |  / 0
variable 3    |  / 0  |  / 1
variable 4    |  / -1 |  / 0
4.4.2. Multidimensional Affine Schedule of URE 1
From section 4.4.1, we know that the system is computable. In this section, we find the multidimensional affine scheduling vectors of URE 1. Suppose the iteration space is cubic; it can be represented as {p | Ap ≤ b}.