SHORT-CIRCUIT EVALUATION OF BOOLEAN EXPRESSIONS

Let's consider this in connection with generating code for the IF statement. A suitable grammar for this statement is:

IF_STATEMENT → SIMPLE_IF | SIMPLE_IF_ELSE_BLOCK

The production for the if statement without ELSE

SIMPLE_IF → IF_PREFIX TRUE_BLOCK ENDIF

Productions for the if statement including ELSE

SIMPLE_IF_ELSE_BLOCK → SIMPLE_IF_ELSE FALSE_BLOCK ENDIF

SIMPLE_IF_ELSE → IF_PREFIX TRUE_BLOCK ELSE

Productions employed for both cases

IF_PREFIX → IF BOOL_EXPRESSION

BOOL_EXPRESSION → BOOL_EXPRESSION OR BOOL_FACTOR

I BOOL_FACTOR

BOOL_FACTOR → BOOL_FACTOR AND BOOL_SECONDARY

| BOOL SECONDARY

BOOL_SECONDARY → NOT BOOL_PRIMARY I BOOL_PRIMARY

BOOL_PRIMARY → ARITH_EXPRESSION = ARITH_EXPRESSION

I ARITH_EXPRESSION ARITH_EXPRESSION

I ARITH_EXPRESSION > ARITH_EXPRESSION

I ARITH_EXPRESSION >= ARITH_EXPRESSION

I ARITH_EXPRESSION ARITH_EXPRESSION

I ARITH_EXPRESSION <= ARITH_EXPRESSION

I ( BOOL_EXPRESSION )

For the purpose of readibility let's employ the macros:

#define head_of_false_chain kind_of_Location

#define head_of_true_chain location

The code required for the various productions is described below

BOOL_PRIMARY → ARITH_EXPRESSION = ARITH_EXPRESSION

For productions such as these we generate code to compare the arithmetic expressions, followed by a conditional jump instruction which will be executed if the comparison involved is true. In the present case we would generate in code_s:

JE 0000

coded as

0f 84 0000

and we would set $$.head_of_true_chain = code_i - 2

and $$.head_of_false_chain = 0

As we carry out the present algorithm, for each boolean expression that we generate code for, we will produce an entry on symbol stack whose head_of_true chain points to a chain of (the offset fields within) conditional branch instructions that branch if the boolean expression as a whole is true, and whose head_of_false_chain points to a similar chain for when the boolean expression is false. In the code above, there are no branches for if the comparison is false, so $$.head_of_false_chain is 0. There is a single branch for if the condition is true. Since it's the only one in the chain, it is itself the end of the chain. So the offset field of the JE instruction is 0. $$.head_of_true chain here points to this offset field.

BOOL SECONDARY → NOT BOOL PRIMARY

Clearly all the code needs to do here is to set $$.head_of_false_chain to $2.head_of_true_chain, and $$.head_of_true_chain to $2.head_of_false_chain.

BOOL_FACTOR → BOOL_FACTOR AND BOOL_SECONDARY

See steps 3 and 5 in the example below

Consider the conditional branch right at the end of the code for $1 (ie. BOOL_FACTOR). If this were a branch for if $1 is true, then it would have to branch to the start of the code for evaluating $3. On the other hand if it were a branch for if $1 is false, then it would constitute a branch for if $$ was false, and the code would here avoid having to evaluate $3. So the first thing to do is to ensure that the conditional branch at the end of the code for $1 is a branch for if $1 is false. You can tell which it is by checking which of $1.head_of_false_chain or $1.head_of_true_chain is larger. If the conditional branch instruction is a branch for if $1 is true, then you need to change it to a branch for the opposite condition to make it a branch for if $1 is false. It so happens that this can be done in all cases by merely reversing the low_order bit of the opcode involved (i.e. of the byte following the 0f byte). This can be done by means of an exclusive-or with 1. E.g. if x is the value of $1.head_of_true_chain, one could use code such as:

code_s[x - 1] = code_s[x-1] ^ 0x'01'

You will now need to fixup the false chain, since it has been given a new head, and

amend the values of $1.head_of_true_chain and $1.head_of_false_chain accordingly. After all this is done, you can make all the members of the true chain for $1 (conditional) branches to the start of the code for $3. If y is the larger value of $1.head_of_false_chain and $1.head_of_true_chain, then the code for $3 starts at code_s[y+2]. Now join together the false chains for $3 and $1, since these all are branches for if $$ is false. Finally set $$.head_of_true_chain to $3.head_of_true_chain, and $$.head_of_false_chain to the head of the combined false chains referred to above. Be careful to allow for the case where the false chain for $3 is empty.

BOOL EXPRESSION → BOOL EXPRESSION OR BOOL FACTOR

See steps 7 and 9 in the example below.

The code for this production follows similar principles to that described above for AND. In this case, in order that the last conditional branch instruction in the code for $1 be a branch that, if taken, avoids having to evaluate $3, it needs to be a branch for if $1 is true. If it isn't then you should change it by the means described above, and adjust the true chain for $1, and the values of $1.head_of_false_chain and $1.head_of_true_chain accordingly. Make,the members of the false chain for $1 into branches to the start of the code for $3. Then join together the true chains for $3 and $1, since these all are branches for if $$ is true. Finally set $$.head_of_false_chain to $3.head_of_false_chain, and $$.head_of_true_chain to the head of the combined true chains referred to above. Again be careful to allow for the case where the true chain for $3 is empty.

IF_PREFIX → IF BOOLEAN_EXPRESSION

As in the code for AND, ensure that the last branch in the code for $2 is a branch if $2 is false, as this is the case where we do not evaluate the block of statements following. Adjust the false chain for $2 and $2.head_of_false_chain and $2.head _of_true_chain accordingly. Now make all the members of the true chain pointed to by $2.head_of_true_chain into branches to code_i. Set $$.head_of_false_chain to $2.head_of_false_chain.

Code for the IF statement without ELSE

SIMPLE_IF → IF_PREFIX TRUE_BLOCK ENDIF

Make all the members of the false chain pointed to by $1.head_of_false_chain into branches to code i.

Code for the IF statement including ELSE

SIMPLE_IF _ELSE → IF_PREFIX TRUE_BLOCK ELSE

Generate an unconditional branch with offset 0000, and set $$.head_of_true_chain to point to this offset field. Make all the member of the false chain pointed to by $1.head_of_false_chain into branches to code_i.

SIMPLE_IF _ELSE_BLOCK → SIMPLE_IF _ELSE FALSE_BLOCK ENDIF

Make the branch pointed to by $1.head_of_true_chain into a branch to code_i.

NOTE

Yacc reports a shift-reduce conflict for most grammars written for the IF statement.

For example, with the statement:

If A then if B then C else D

the shift-reduce conflict occurs in the state reached when the next input symbol

is “else”. If the shift action is chosen, the statement will be interpreted as:

if A then (if B then C else D)

whereas if the reduce action is taken, it will be interpreted as:

if A then (if B then C) else D

As you can see, if A is false in the first case, then D will not be executed (nor will C), whereas if A is false in the second case, then D will be executed.

Nearly all programming languages adopt the first interpretation (obtained taking the shift action). Since the default action that Yacc takes in shift-reduce conflicts is in fact the shift action, the above conflict can be ignored for such languages, and while it is possible to produce grammars for the If statement that avoid the conflict, this is not necessary.

EXAMPLE

Let’s consider the problem of evaluating the short-circuit code for:

if (a and b and c) or (d and e and f) or (g and h and i)

go to place1

else

go to place2

where the letters a – i are conditions such as X > Y+2.

Let the letters A – I denote the code for evaluating a – i respectively.

By e.g. Cjt300h we mean the code for evaluating condition c followed by a jump to offset 300h in the case where c is true. For example, if c is X > Y + 2, then Cjt300h could represent the following code:

add y, 2

mov ax, x

cmp ax, y

je 300h[1]

By Cjf300h we mean the code for the comparison as above, followed by a jump to offset 300h in the case where c is false, ie. the last instruction should be changed to jne 300h

Using our productions for the if statement, the following is a derivation of the example’s if statement.


DERIVATION OF THE EXAMPLE’S IF STATEMENT

The “bool_” prefixes in the nonterminals have been omitted for brevity.

The reductions are numbered in the order they occur in a parse using an LR(1) parsing machine.

The derivation of “go to true-place” from true-block, and of “go to false-place” from false-block

are not shown.

if_statement

46

simple_if_else_block

45

simple_if_else false-block endif

44

if_prefix true-block else

43

if expression

42

expression or factor

28 41

expression or factor secondary

14 27 40

factor secondary primary

13 26 39

secondary primary ( expression)

12 25 38

bool_primary ( expression ) factor

11 24 37

( expression ) factor factor and secondary

10 23 34 36

factor factor and secondary factor and secondary primary

9 20 22 31 33 35

factor and secondary factor and secondary primary secondary primary i

6 8 17 19 21 30 32

factor and secondary primary secondary primary f primary h

3 5 7 16 18 29

secondary primary c primary e g

2 4 15

primary b d

1

a


SNAPSHOTS OF THE PARSE OF THE EXAMPLE IF STATEMENT

Our snapshots show relavant parts of the content of the code segment and symbol stack after selected productions, identified by the numbering given in the derivation above. After a production such as

bool_factor → bool_factor and bool_secondary

with our original method, in which the entries in symbol_stack where grammar symbols, we replaced the top three members of the stack by bool_factor. Now we are employing a stack in which each entry is a pair of numbers, that we have named
head_of_true_chain and head_of_false_chain, and our algorithm is designed to make the head_of_true_chain point[2] to a back chain in the code segment of jumps that will occur if the code derived from bool_factor is true, and head_of_false_chain point to a back chain of such jumps that will occur if the code involved is false. For illustration purposes, our snapshots will also depict the source code corresponding to the code that has been generated

1. The situation after reduction 1: primary → a

primary

symbol stack (abbrev. ss) t=n1 f=0

code segment (abbrev. cs) AJt0

n1

source code involved (abbrev. sc) ( a

Here the code for comparison a, followed by a jump if true with a zero operand has been generated in the code segment and n1 is the offset of the operand field involved. In symbol stack the “t=n1 f=0” means the head of the true chain for $$ has been set to n1, and the head of its false chain has been set to zero

2.  After reduction 5: secondary → primary

factor and secondary

ss t =n1 f=0 t=n2 f=0

cs AJt0 BJt0

n1 n2

sc ( a and b

3. After reduction 6: factor → factor and secondary

factor

ss t=n2 f=n1

cs AJf0 BJt0

n1 n2

sc ( a and b

4. After reduction 8: secondary → primary

factor and secondary

ss t=n2 f=n1 t=n3 f=0

cs AJf0 BJt0 CJt0

n1 n2 n3

sc ( a and b and c

5. After reduction 9: factor → factor and secondary

Note that we have changed the BJt to BJf , since we want to avoid evaluated c in the case where b is false. If b is true,

we would need to evaluate c to determine whether a and b and c is true or not. So the code BJt

would have to be a condition branch to C , and C would thus be evaluated whether or not B were true.

factor

ss t=n3 f=n2

cs AJf0 BJfn1 CJt0

n1 n2 n3

sc ( a and b and c

6. After reduction 27: factor → secondary

expression or factor

ss t=n3 f=n2 t=n6 f=n5

cs AJf0 BJfn1 CJt0 DJf0 EJfn4 FJt0

n1 n2 n3 n4 n5 n6

n1 n2 n3

sc ( a and b and c ) or ( d and e and f )

7. After reduction 28: expresssion → expression or factor

expression

ss t=n6 f=n5

cs AJfn3+2 BJfn3+2 CJt0 DJf0 EJfn4 FJt0

n1 n2 n3 n4 n5 n6

sc ( a and b and c ) or ( d and e and f )

8. After reduction 41: factor → secondary

expression factor

ss t=n6 f=n5 t=n9 f=n8

cs AJfn3+2 BJfn3+2 CJt0 DJf0 EJfn4 FJt0 GJf0 HJfn1 IJt0

n1 n2 n3 n4 n5 n6 n7 n8 n9

sc ( a and b and c ) or ( d and e and f ) or ( g and h and i )

9. After reduction 42: expression → expression or factor

if expression

ss t=n9 f=n8