Intro
- developed to develop theorem proving tactics in first-order calculus
- was applied to the LCF system that was also developed by Robin Milner at the University of Edinburgh
- LCF system is used as an interactive theorem prover where theorems are abstract types and implemented through appropriate inference rules
- ML can automatically infer data types based upon the circumstances in which they are used without explicit type annotations
- the drive behind this implementation is to develop programs that implicitly state the problem to be solved and then approach it through a rational and logical way
- the use of testing as a way of deciding a programs worth is never going to be exhaustive and thus the rational programming approach attacks the issue head on
- ML has spawned several other languages and dialects as well as influencing the design of others
- the most common implementations today are CAML and Standard ML (SML) as well as F# which is aimed at Microsoft’s .NET structure
- used for language design and manipulation through compilers, analyzers and theorem provers
Functional Programming
- there is no reassigning of variables
- through the use of functional programming the logical analysis of such a program is simple and straightforward
- using a functional language we can make assertions and prove them much easier than in a traditional language
- there is no such thing as an exhaustive test therefore we should program through the use of logic
- the use of a functional language in a parallel environment as yielded great benefits in a number of studies
- functional languages still have abstraction, input/output blocks and the option of packaging things in a given object
- there are some aspects of the language which may appear primitive because the focus lies on the logical build of the program
Functional vs. Imperative
- imperative languages are developed and applied through a series of test and fix approaches regardless of the paradigm used to establish this execution
- we can prove confidence with the imperative languages but not in a measurable way or with a sense of completeness
- functional languages develop their programs through an idea of logically plotting out how to solve the problem and thus are much better equipped to be published with certainty
- the problem cannot be explicitly defined through imperative languages but that is the exact aim of imperative languages: to define and directly address the issue
- imperative languages can reuse code and can partition the problem into more manageable chunks through the use of methods
- the same approach can be used in functional languages but there is also a higher level of abstraction in ML through the use of logic to build solutions to the problem
- parallel processing is not possible in imperative languages but the compiler can produce predictable and compact with the addition of GUIs
- parallel processing is possible but the memory requirements are large and unpredictable; adding a GUI is possible but difficult
- FORTRAN, Pascal and C are examples of imperative languages that remove the obstacles of the base level of computation
- ML, Hope and Lisp focus on developing programs through the use of logic
Overview
- the call-by-value strategy is the set of rules that determine how arguments and functions are evaluated
- the arguments are evaluated then bound to the variable before the function is executed or passed back
- garbage collection takes place through the deallocation of unused data and unnecessary space
- utilizes parametric polymorphism so that the type of the data does not matter when applying it to functions
- pattern matching is used to test whether things have a desired structure, to find relevant structure, to retrieve the aligning parts, and to substitute the matching part with something else
- this is useful in ML because it will enable the program to make sure that it sticks to the allowed syntax and operates correctly
- static typing is used which means that a variable can be used before it is declared or initialized
Development and Creation
- when Milner created ML it was the first language with polymorphic type inference and type-safe exception handling
- the goal of the creation of ML was to “define an extremely powerful language for programming which not only encourages people to write correctly, but whose meaning is defined completely and lucidly by mathematics, not by a users' manual”
- Milner believes that we don’t fully understand the science behind all of our programming languages and thus tried to create ML to supplement this
- Milner looks for a programming languages to switch their drive in development to something more based on the mathematical and rational approach
- ML is used to build systems and algorithms
- Milner received the Turing award in 1991 for his work with ML and LCF
- originally interpreted in a type of Lisp generator created by Milner and his team
- has imperative features that were built on a functional core, is not an entirely pure language otherwise there would be no mutation or assignment operators
Characteristics
- ML is safe, in that a program that passes the type-checker cannot dump core, access private fields of abstract data types, mistake integers for pointers, or otherwise "go wrong."
- The Standard ML module system supports modules (called structures) and interfaces (called signatures); the signature of a module determines what components and types from the module are visible from outside. There is a flexible system for matching structures to signatures: there can be several different views (signatures) of the same structure, and there can be several different implementations (structures) matching the same signature.
- ML has higher-order functions: functions can be passed as arguments, stored in data structures, and returned as results of function calls. Functions can be statically nested within other functions; this lexical scoping mechanism gives the ability to create "new" functions at run time.
- Function calls in ML, like those of C, Pascal, C++, Java, etc., evaluate their arguments before entering the body of the function. Such a language is called strict or call-by-value, in contrast to some functional programming languages that are lazy or call-by-need. Strict evaluation makes it easier for the programmer to reason about the execution of the program.
- ML supports polymorphic functions and data types. Data-type polymorphism allows a single type declaration (such as "list") to describe lists of integers, lists of strings, lists of lists of integers, and so on; but the programmer can be assured that, given an "int list", every element really is an "int". Function polymorphism allows a single function declaration (such as filter_list) to operate on lists of integers, lists of strings, lists of integer-lists, and so on, avoid needless duplication of code.
- Programmers in compile-time type-checked languages get the benefit not only of faster execution but also less debugging: many of the programmer's mistakes can be caught early in the development process; and the types may lead to clearer thinking about the program's specification.
- The ML programmer need not write down the type of every variable and function-parameter: the compiler can usually calculate the type from context. This makes programs more concise and easier to write.
- Automatic deallocation of unreachable data makes programs simpler, cleaner, and more reliable.
- ML's exception-handling mechanism -- similar to the ones in C++, Java, Ada, etc. -- provides dynamic nesting of handlers and eliminates the need for ad hoc, special exceptional return values from functions.
- In ML, most variables and data structures -- once created and initialized -- are immutable, meaning that they are never changed, updated, or stored into. This leads to some powerful guarantees of noninterference by different parts of the program operating on those data structures. In a functional language such as ML, one tends to build new data structures (and let the old ones be garbage collected) instead of modifying old ones.
- However, ML does have updatable (assignable) reference types, so that in those cases where destructive update is the most natural way to express an algorithm, one can express it directly.
- ML supports information hiding, so that one can implement a data type whose representation is hidden by an interface that just exports functions to construct and operate on the type.
- A functor is an ML program module takes the signature of another module as an argument. The functor can then be applied to any module matching that signature. This facility is like the template of C++ or the generic of Ada or Modula-3, but in ML the functor can be completely type-checked and compiled to machine code before it is applied to its argument(s); this leads to better program modularity.
- Features such as polymorphism, parametric modules, and a heavy reliance on garbage collection have meant that compiling ML to efficient machine code requires techniques not usually necessary in C compilers. Several Standard ML compilers generate high-quality machine code, including Standard ML of New Jersey and Harlequin ML Works.
- The ML language is clearly specified by The Definition of Standard ML (Revised) (Milner, Tofte, Harper, MacQueen, MIT Press, 1997), which defines the language in 93 pages of mathematical notation and English prose. This book is not meant for the casual reader, but it is accessible to the serious student of programming languages and its existence and accessibility provide an implementation-independent formulation of Standard ML.
- A consequence of having the language definition in a formal notation is that one can prove important properties of the language, such as deterministic evaluation or soundness of type-checking.
Standard ML
- first proposed in 1983 and then was developed between 1984-88 by scientists at Bell Laboratories and PrincetonUniversity
- a slight modification and revision of the original ML as created by Milner as well as the addition of the SML Basis Library
- further updated in versions of SML ’90 and finally in ‘97
- major changes consisted of the elimination of imperative type variables, elimination of structure sharing, and the addition of type definitions in signatures
- The elimination of imperative type variables. The role of imperative type variables in constraining polymorphism in the presence of effects is now played by the value restriction, which allows the type of a val binding to be generalized only when the right hand side is an expression in a restricted form called a value expression. A value expression is either a constant, a variable, a function expression (i.e. a lambda expression) or is built from these elements using products and constructions.
- The elimination of structure sharing. Type sharing is still available, and the structure sharing notation remains, but is interpreted in a weaker sense, as an indirect way of expressing type sharing. The role of type sharing is restricted somewhat so as not to conflict with type definitions.
- The addition of type definitions in signatures. There is also a new where type notation that allows one to modify an existing signature by adding definitions for its type components. New forms of datatype specifications and definitions allow a datatype to be defined to be the same as an existing datatype.
- SML/NJ is a compiler and programming environment that was designed to run Standard ML that was initially done by Bell and Princeton but is now taken care of by Bell, Princeton, Lucent Technologies, AT&T Technologies and Yale
- SML/NJ is one of the most common, if not the most common, implementations of ML these days and includes a number of features:
- The core of the SML/NJ system is an agressively optimizing compiler that produces native machine code for most commonly used architectures: x86 (IA32), Sparc, MIPS, IBM Power 1 (PowerPC), HPPA, and Alpha.
- SML/NJ runs under Windows 95 and NT as well as many flavors of Unix. Renewed support for MacOS is planned for the next release.
- SML/NJ provides an interactive top level based on incremental compilation, but it can produce stand-alone executables using the exportFn function.
- SML/NJ uses Matthias Blume's Compilation Manager, CM, to greatly simplify the development of large software projects.
- A variety of general-purpose data structures, algorithms and utilities (such as finite sets and maps, regular expressions, pretty-printing) are provided by the SML/NJ library.
- SML/NJ extends the SML '97 language with higher-order functors, OR-patterns, first-class continuations, and a number of other useful features.
- Support for manipulating "object languages" (e.g. logics, calculi, simple programming languages, specification languages) is provided by a simple quote/anitquote mechanism.
- Moscow ML – a lighter version of SML that was created by Sergei Romanenko at the Russian Academy of Sciences in Moscow with other contributors from Cambridge and Copenhagen; uses the entire runtime environment and ideas from the Caml Light implementation; also introduces extra features such as recursive modules and first-class structures
- ML Kit –implements the usual ML library and introduces the ability to avoid recompilation upon the changing of code; an open source compiler and utilizes bootstrapping to compile itself along with the program
- MLton – a large scale compiler for the standard ML library that focuses on optimizing the efficiency of code
- PolyML – another implementation of the ML basis which was originally written by David Matthews at the Computer Laboratory at CambridgeUniversity
Data Types/Operators
- the types for ML are very similar to that of C++ and Java including the basic types like int, bool, real and string
- one slight difference is that ML uses ~ for negation in numbers and statements
- when using the real number data type division uses /
- there are also additional Boolean data types including andalso for conjuction, orelse for disjunction and not for negation
- andalso and orelse are non-strict in the manner that they only evaluate the second operand in the appropriate situation
- strings are defined between quotation marks and use ^ for concatenation
- when checking for equality between two pieces of data they use = and < > for unequal
Syntax
- when entering operations into the implementation the machine will spit out a response in the correct format
- 3 + 4
- val it = 7 : int
- the response defines the value of the entered expression and refers to the most recent computation by the name it; it also specifies the data type following the expression
- there is an option to use an if…then…else statement but there must always exist a value or function to process if we arrive at the else portion
- there is a data type called unit that is specially designed for tracing the side effects of something in the program
- there are no associated operators and no values are returned
- we enter ( ) and the machine will output: val it = ( ) : unit
- the unit data type is common in many functional languages but can be taken as a sign that we are straying from the purely functional approach to programming
- Every function must take some arguments and return a result: specifying the unit type for these indicates that we're not really interested in them (e.g. a function that has return type unit is a little like a function with return type void in C).
Definitions
- we are allowed to define things for later use in ML through the use of definitions of both values and functions
- the use of global constants helps us to increase the readability of our programs
- defined using three pieces of information: the keyword val, the name of the constant and the “=” symbol that denotes the value associated with the constant
- there is no change in the interpreter’s reaction to the command except that it recognizes there is now a value associated with that name
- this can be used later on like a normal variable in any other programming language
- to set up a stored function we must include a few more pieces of data in the call: the keyword fun, the name of the function, the names of the parameters and optionally their types, the “=” symbol and the body of the function
- the interpreter reacts in a slightly different way in that it doesn’t respond with any value besides fn for the call because that has yet to be defined upon the function call later on
- it also prints out the parameter types that go into the function and then what they are translated into through the procedure of the function
- when calling a function we have previously defined we simply must call:
- succ 5;
- val it = 6 : int
- if we want to extend a function to have more options than just the simple input/output of the previous example we can extend them through the use of if…then…else operations or other function calls
- this is all written in a sentence like manner that include the all the necessary information following the original keyword “fun”