Two Fundamental Abstraction Facilities

Subprograms

Introduction

•Two fundamental abstraction facilities

–Process abstraction

•Emphasized from early days

–Data abstraction

•Emphasized in the1980s

Fundamentals of Subprograms

•Each subprogram has a single entry point

•The calling program is suspended during execution of the called subprogram (i.e. there is only one subprogram in execution of any given time).

•Control always returns to the caller when the called subprogram’s execution terminates

Basic Definitions

•A subprogram definition describes the interface to and the actions of the subprogram abstraction

•A subprogram call is an explicit request that the subprogram be executed. Active subprogram, program that starts execution but not yet completed.

•A subprogram header is the first part of the definition, including the name, the kind of subprogram (procedure, function), and the formal parameters (optional).

•Examples:

In Fortran: Subroutine Adder (parameters)

Ada: Procedure Adder (parameters)

C-based: void adder (parameters)

•The parameter profile (aka signature) of a subprogram is the number, order, and types of its parameters

•The protocol is a subprogram’s parameter profile and, if it is a function, its return type

•Function declarations in C and C++ are often called prototypes. While Java and C# do not need declarations of their methods, because there is no requirement in those languages that methods be defined before they are called.

•A subprogram declaration provides the protocol, but not the body, of the subprogram

•A formal parameter is a dummy variable listed in the subprogram header and used in the subprogram

•An actual parameter represents a value or address used in the subprogram call statement

Actual/Formal Parameter Correspondence

•Positional

–The binding of actual parameters to formal parameters is by position: the first actual parameter is bound to the first formal parameter and so forth

–Safe and effective method, as long as the parameter lists are relatively short.

•Keyword: in case of lists are long

–The name of the formal parameter to which an actual parameter is to be bound is specified with the actual parameter

–Parameters can appear in any order

•Ada procedures can be called using keyword parameters, as in :

Sumer (LengthMy_Length, List My_Array, SumMy_Sum)

–Where the definition of Sumer has the formal parameters Length, List, and Sum.

•Disadvantages of using keyword, is the user of the subprogram must know the named of formal parameters.

•Ada, Fortran95 allow a mixed of both (positional and keyword)

Ex: Sumer (My_Length, SumMy_Sum, ListMy_Array);

Restriction with this approach, is after a keyword parameter appears in the list, all remaining parameters must be keyworded. The reason is that, because a position may no longer be well defined after a keyword parameter has appeared.

Formal Parameter Default Values

•In certain languages (e.g., C++, Ada, Fortran95, and PHP), formal parameters can have default values (if not actual parameter is passed)

•Consider the following Ada function header

function Compute (Income: Float; Exemptions: Integer:=1; Tax_Rate:Float)return Float;

•The Exemptions formal parameter can be absent in a call to Compute;

Pay:=Compute (2000.0, Tax_Rate 0.15);

–In C++, which has no keyword, default parameters must appear last because parameters are positionally associated

–A C++ function header:

float Compute (float income, float tax_rate, int exemptions=1)

call

pay=Compute(2000.0, 0.15

•C# methods can accept a variable number of parameters as long as they are of the same type

Procedures and Functions

•There are two categories of subprograms

–Procedures are collection of statements that define parameterized computations.

–Procedures can produce results in the calling program unit by two methods.

If there are variables that are not formal parameters but are still visible in both the procedure and the calling program unit, the procedure can change them.
If the subprogram has formal parameters that allow the transfer of data to the caller, those parameters can be changed.

–Functions structurally resemble procedures but are semantically modeled on mathematical functions

•They are expected to produce no side effects (i.e. modify neither its parameter nor any variables defined out-side the function)

•In practice, program functions have side effects

–Functions are called by appearances of their names in expressions, along with the required actual parameters.

–The value produced by a function’s execution is returned to the calling code, replacing the call itself.

•Functions define new user-defined operators. Ex: if language does not have an exponentiation operator, a function can be written for it. In C++ could be

float power (float base, float exp)

which could be called with

result=3.4* power(10.0, x);

•Fortran and Ada, provide both function and procedures. C-based have only functions (and/ or methods).

Design Issues for Subprograms

•What parameter passing methods are provided?

•Are parameter types checked? (Actual against formal)

•Are local variables static or dynamic?

•Can subprogram definitions appear in other subprogram definitions (nested)?

•If subprograms can be passed as parameters and subprograms can be nested, what is the referencing environment of a passed subprogram?

•Can subprograms be overloaded? i.e. has the same name as another subprogram in the same referencing environment.

•Can subprogram be generic? Is one whose computation can be done on data of different types in different calls.

Local Referencing Environments

•Subprograms can define their own variables, thereby defining local referencing environments.

•Local variables can be either static or stack-dynamic.

•Local variables can be stack-dynamic (bound to storage where the subprogram designs execution and are unbound from storage when that execution terminates)

–Advantages

•Support for recursion

•Storage for locals in an active subprogram is shared among some inactive subprograms (good when memory is small)

–Disadvantages

•Allocation/de-allocation, initialization time cost

•Indirect addressing (because the place in the stack where a particular local variable will reside can be determined only during execution)

•Subprograms cannot be history sensitive

•OrLocal variables can be static

* Advantages: - More efficient (no indirection)

- No run-time overhead, allow subprograms to be history-sensitive

* Disadvantages: - Cannot support recursion

- Storage cannot be stored with the local variables of other inactive subprograms

•In most contemporary languages, local variables in a subprogram are by default stack-dynamic.

•In C and C++ functions, locals are stack-dynamic unless specifically declared to be static.

•Example:

•Ada subprograms and the methods of C++, Java, and C# have only stack-dynamic local variables.

Parameter Passing Methods

•Ways in which parameters are transmitted to and/or from called subprograms

•Semantics models of parameter passing

–Formal parameters are characterized by one of three distinct semantics models

They can receive data from the corresponding actual parameters (in mode).
They can transmit data to the actual parameters (out mode).
They can do both (inout mode).

•Implementation models of parameter passing

–A variety of models has been developed by language designers to guide the implementation of the three basic parameter transmission modes.

–Pass-by-value

–Pass-by-result

–Pass-by-value-result

–Pass-by-reference

–Pass-by-name

Models of Parameter Passing

Pass-by-Value (In Mode)

•The value of the actual parameter is used to initialize the corresponding formal parameter, which then acts as a local variable in the subprogram.

–Normally implemented by copying

–Can be implemented by transmitting an access path but not recommended (require that the value be in a write-protected call (one that can only be read) enforcing write protection is not easy)

* Disadvantages:

–When copies are used, additional storage is required

–Storage and copy operations can be costly (such as array with many elements)

Pass-by-Result (Out Mode)

•When a parameter is passed by result, no value is transmitted to the subprogram; the corresponding formal parameter acts as a local variable; its value is transmitted to caller’s actual parameter when control is returned to the caller

–Require extra storage location and copy operation

•Potential problem: there can be actual parameter collision, such as sub(p1, p1);

whichever formal parameter is copied back (last) will represent the current value of p1

•Thus the order in which the actual parameters are copied determines their value.

•Another problem: The implementer may be able to choose between two different times to evaluate the address of the actual parameters: at the time of the call or at the time of the return.

•Ex: suppose a subprogram is called with the parameter list[index]. If index is changed by the subprogram, either through global access or as a formal parameter, then the address of list[index]will change between the call and the return.

•The implementer must choose the time at which the address to which to return the value will be determined, at the time of the call or at the time of the return. Makes programs unportable between implementations.

Pass-by-Value-Result (inout Mode)

•A combination of pass-by-value and pass-by-result

•Sometimes called pass-by-copy (actual values are copied to formal parameters at subprogram entry and copied back at termination)

•Formal parameters have local storage and the value of the actual parameter is used to initialize the corresponding formal parameters.

•At subprogram termination, the value of the formal parameter is transmitted back to the actual parameter.

•Disadvantages:

–Those of pass-by-result (problems associated with the order in which actual parameters are assigned)

–Those of pass-by-value (requiring multiple storage for parameters and time for copying values)

Pass-by-Reference (Inout Mode)

•Pass an access path to the called subprogram

•Also called pass-by-sharing

•Advantage: Passing process is efficient in terms of time and space. (no copying and no duplicated storage)

•Disadvantages

–Slower accesses (compared to pass-by-value) to formal parameters (indirect addressing).

–Potentials for un-wanted side effects in case of one-way communication to the called subprogram.

–Un-wanted aliases (access broadened to nonlocal variables)

•There are several ways alias can be created when parameters are passed by reference.

•Collisions can occur between actual parameters.

•Ex: void fun(int &first, int &second)

•If the call to fun happens to pass the same variable twice, as in fun(total, total) the first and second in fun will be aliases.

•Fun(list[i], list[j])

•If these two parameters are passed by reference and i happens to be equal to j, then first and second are aliases.

•Problem with aliasing is harmful to readability and reliability.

Pass-by-Name (Inout Mode)

•Actual parameter is textually substituted for the corresponding formal parameter in all its occurrences in the subprogram.

•Formal parameters are bound to actual values or addresses at the time of the subprogram call.

•Formals are bound to an access method at the time of the subprogram call, but actual binding to a value or address takes place at the time of a reference or assignment

•Allows flexibility in late binding

Parameter Passing Methods of Major Languages

•Fortran

–Always used the inout semantics model (but does not specify whether pass-by-reference or pass-by-value-result should be used).

–Before Fortran 77: pass-by-reference

–Fortran 77 and later: scalar (simple) variables are often passed by value-result

•C

–Pass-by-value

–Pass-by-reference (input mode) is achieved by using pointers as parameters

•C++

–A special pointer type called reference type for pass-by-reference

–Void fun(const int &p1, int p2, int &p3) {…}

–Where p1 is pass-by-reference but cannot be changed in the function fun, parameter p2 is pass-by-value, and p3 is passed-by-reference.

•Java

–All parameters are passed by value

–Object parameters are passed by reference (because objects can be passed only through reference variables)

•Ada

–Three semantics modes of parameter transmission: in, out, in out; in is the default mode

–Formal parameters declared out can be assigned but not referenced; those declared in can be referenced but not assigned; in out parameters can be referenced and assigned.

–Consider the following Ada subprogram header:

Procedure Adder (A:in out Integer; B: in Integer; C: out Float)

•Fortran95, similar to Ada, but we use Intent attribute.

•C#

–Default method: pass-by-value

–Pass-by-reference is specified by preceding both a formal parameter and its actual parameter with ref

–void sumer(ref int oldsum, int newone) {…}

–Call sumer(ref sum, newvalue); //1st by reference, 2nd by value

Type Checking Parameters

•Considered very important for reliability

•FORTRAN 77 and original C: none (neither the number of parameters nor their types were checked)

•Pascal, FORTRAN 90, Java, and Ada: it is always required

•ANSI C and C++: choice is made by the user: either in original C, the names of the parameters are listed in parentheses and the type declarations for them follow, as in:

Alternative prototypes

-in which the formal parameter types are included in list, as in

-if this version of sin is called with the same call, i.e.

value=sin(count);

-it is also legal. The type of the actual parameter (int) is checked against that of the formal parameter (double)

-Although they do not mach, int is coerced to double, and it will work.

-If conversion is not possible (ex actual parameter is an array), or the number of parameters is wrong. Then a syntax error is detected.

-All functions in C++, C99, must have their formal parameters in prototype form.

-However, type checking can be avoided for some of the parameters by replacing the last part of the parameter list with an ellipsis, as in:

int printf(const char *format_string, …);

•Relatively new languages Perl, JavaScript, and PHP do not require type checking

Multidimensional Arrays as Parameters

•If a multidimensional array is passed to a subprogram and the subprogram is separately compiled, the compiler needs to know the declared size of that array to build the storage mapping function (index function)

Multidimensional Arrays as Parameters: C and C++

•Programmer is required to include the declared sizes of all but the first subscript in the actual parameter because it uses row-major order, so for z-dimensional array , the index function

so we need the number of columns (), but not ()

•Ex in C:

•The problem: Disallows writing flexible subprogramsi.e. does not allow writing a function that accepts matrixes with different numbers of columns. We need a new one for every matrix with different number of columns.

•Solution: pass a pointer to the array and the sizes of the dimensions as other parameters; the user must include the storage mapping function in terms of the size parameters. Consider the following prototype

void fun(float *mat_ptr, int num_rows, int num_cols);

Multidimensional Arrays as Parameters: Pascal and Ada

•Pascal

–Not a problem; declared size is part of the array’s type

Multidimensional Arrays as Parameters: Fortran

•Formal parameter that are arrays have a declaration after the header

–For single-dimension arrays, the subscript is irrelevant

–For multi-dimensional arrays, the subscripts allow the compiler to build the storage-mapping function

Multidimensional Arrays as Parameters: Java and C#

•Arrays are objects; they are all single-dimensioned, but the elements can be arrays

•Each array inherits a named constant (length in Java, Length in C#) that is set to the length of the array when the array object is created

Design Considerations for Parameter Passing

•Two important considerations

–Efficiency

–One-way or two-way data transfer

•But the above considerations are in conflict

–Good programming suggests limited access to variables, which means one-way whenever possible. Example, when a large array needs to be passed to a subprogram that does not modify it.

–However, pass-by-value would require that the entire array be moved to a local storage area of the subprogram, costly in time and space.

–But pass-by-reference is more efficient to pass structures of significant size (large arrays)

Parameters that are Subprogram Names

•It is sometimes convenient to pass subprogram names as parameters to other subprograms.

•Issues:

Are parameter types checked?
What is the correct referencing environment for a subprogram that was sent as a parameter?

Parameters that are Subprogram Names: Parameter Type Checking

•C and C++: functions cannot be passed as parameters but pointers to functions can be passed; parameters can be type checked

•FORTRAN 95 type checks mechanism

•Later versions of Pascal and

•Ada does not allow subprograms to be passed as parameters.

Examples of parameter passing

•Consider the C function

•C used pass-by-value

•The action of swap1 can be described by the following pseudocode

•Although a ends up with d’s value, and b ends up with c’s value, the value of c and d are unchanged because nothing is transmitted back to the caller.

•We can modify the c swap function to achieve the effect of pass-by-reference

•The actions of swap2 can be described with

•swap2 can be written in C++ using reference parameters as follows

•This simple swap operation is not possible in Java, because it has neither pointers nor C++’s kind of references. In Java, a reference variable can only point to an object, not a scalar value.

•Semantics of pass-by-value-result for input-mode scalar parameters.

•consider function swap3, which we assume uses pass-by-value-result

•The actions of swap3 with this call are

•Explore what happens when aliasing is involved with pass-by-value-result and pass-by-reference. Consider the following C-like

•In fun, if pass-by-reference is used, I and a are aliases.

•If call-by-value-result is used, I and a are not aliases.

•The action of fun, assuming pass-by-value-result are