Meg Yahl, CSC 415 Term Paper, 11/20/2012

Megan Yahl

CSC 415: Programming Languages

Dr. Lyle

November 20, 2012

History of C#

Naming

The C# programming language is a powerful language built on the .NET framework. Microsoft’s Anders Hejlsberg, often referred to as the “father of C#,” led the team that was responsible for the creation of this COOL language, as it was known at the time. According to Hejlsberg, the team seriously considered keeping the name COOL, which stands for “C-like Object Oriented Language,” but couldn’t due to trademark conflicts (Hejlsberg, The A-Z of Programming Languages: C#). Instead, they settled on C#, which also had a bit of word play to it. In music, the sharp sign raises a note by a half step. This could be seen as being “a step above” C, so to speak. When asked about the new name, Hejlsberg stated, “We sort of liked the notion of having an inherent reference to C in there, and a little word play on C++, as you can sort of view the sharp sign as four pluses, so it’s C++++. And the musical aspect was interesting too” (Hejlsberg, The A-Z of Programming Languages: C#).

Design

In an interview with Naomi Hamilton, a writer for Computerworld.com, Hejlsberg explained some of the design goals in developing C#. The overall goal was to “create a first class modern language on [the Common Language Runtime] platform that would appeal to the curly braces crowd: the C++ programmers of the world at the time, and competitively, the Java programmers” (Hejlsberg, The A-Z of Programming Languages: C#). This design involved “support for the next level up from object oriented programming to component-based programming” (Hejlsberg, The A-Z of Programming Languages: C#). Versioning was also an important consideration to Hejlsberg and his team. C# was designed so that it would version well, without new features breaking old code.

Identifiers, Bindings, and Scopes

Identifiers

According to Ben and Joseph Albahari, authors of C# 4.0 in a Nutshell, “Identifiers are names that programmers choose for their classes, methods, variables, and so on” (Albahari 10). An identifier must begin with a letter or an underscore, and it cannot have the same name as a keyword, with one exception. If an identifier has the same name as a keyword, the identifier must be prefixed by the @ symbol. However, this should be avoided if at all possible due to its impact on readability. C# identifiers are case sensitive, and while there is not an enforced rule dictating proper case format, there is a common convention that should be followed. Typically, “parameters, local variables, and private fields should be in camel case (e.g., myVariable), and all other identifiers should be in Pascal case (e.g., MyMethod)” (Albahari 10).
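The naming rules above can be illustrated with a short sketch. The class, field, and method names here (IdentifierDemo, itemCount, AddItems, Describe) are invented for this example and do not come from the cited sources.

```csharp
using System;

public class IdentifierDemo
{
    // Private field: camel case by convention.
    private int itemCount;

    // Public method: Pascal case. Parameter: camel case.
    public int AddItems(int extraItems)
    {
        itemCount += extraItems;
        return itemCount;
    }

    // "class" is a keyword, so it needs the @ prefix to serve as an identifier.
    public static string Describe(string @class)
    {
        return "class = " + @class;
    }

    public static void Main()
    {
        IdentifierDemo demo = new IdentifierDemo();
        Console.WriteLine(demo.AddItems(3));
        Console.WriteLine(Describe("CSC 415"));
    }
}
```

The parameter named @class compiles, but a name such as courseName would be easier to read, which is why the prefix is best reserved for cases where it cannot be avoided.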

Bindings

C#, for the most part, is a statically typed language. With the release of C# 4.0, dynamic binding became available. Robert Sebesta, author of Concepts of Programming Languages, describes binding as “an association between an attribute and an entity, such as between a variable and its type or value, or between an operation and a symbol” (Sebesta 209). The difference between static and dynamic binding is the binding time, or the time at which the binding happens. Static binding happens at compile time, while dynamic binding happens at runtime. The Albahari brothers state that “Calling an object dynamically is useful in scenarios that would otherwise require complicated reflection code. Dynamic binding is also useful when interoperating with dynamic languages and COM components” (Albahari 5).
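As a minimal sketch of the difference in binding time, the following example resolves the same member once statically and once through the dynamic keyword. The class and method names are hypothetical and chosen only for illustration.

```csharp
using System;

public static class BindingDemo
{
    public static int StaticLength(string text)
    {
        // Static binding: the compiler resolves Length against the string type.
        return text.Length;
    }

    public static int DynamicLength(object value)
    {
        // Dynamic binding: the lookup of Length is deferred until run time.
        dynamic target = value;
        return target.Length;
    }

    public static void Main()
    {
        Console.WriteLine(StaticLength("hello"));
        Console.WriteLine(DynamicLength("hello"));
        Console.WriteLine(DynamicLength(new int[3]));
    }
}
```

DynamicLength accepts both a string and an array because the member is looked up on whatever object arrives at run time. The trade-off is that passing an object with no Length member compiles without complaint and fails only when the call executes.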

Scopes

According to Sebesta, scope is the range of statements in which a variable, class, package, or namespace can be referenced (Sebesta 218). There are three main ranges of scope: local, nonlocal, and global (a special case of nonlocal). To illustrate this, assume there exist variables varA, varB, and varC, a code block foo, and a code block bar. Block bar is nested within block foo. If varA is declared within block bar, then varA is local to block bar. If varB is declared within block foo, then varB is local to block foo and nonlocal to block bar. If varC is declared outside all methods in the program, then varC is said to be global, and is visible to foo, bar, and all other code blocks and methods. Strictly speaking, C# has no true global variables: every variable declared outside a method is a field of some class or struct, so the closest equivalent to a global is a static field.
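The varA, varB, and varC scenario can be sketched in C# as nested blocks inside one method. In this sketch the blocks foo and bar are anonymous brace-delimited blocks marked by comments, and varC is modeled as a static field, which is an assumption made here because C# requires every variable to live inside a type.

```csharp
using System;

public static class ScopeDemo
{
    // Declared outside all methods: visible to every method in this class.
    private static int varC = 3;

    public static int Sum()
    {
        int total = 0;
        {   // block "foo"
            int varB = 2;                     // local to foo
            {   // block "bar", nested within foo
                int varA = 1;                 // local to bar
                total = varA + varB + varC;   // varB is nonlocal to bar
            }
            // varA is out of scope here; referencing it would not compile.
        }
        return total;
    }

    public static void Main()
    {
        Console.WriteLine(Sum());
    }
}
```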

Data Types

C# is a strongly typed language. This means that “every variable and constant has a type, as does every expression that evaluates to a value” (Microsoft Corporation). Additionally, “every method signature specifies a type for each input parameter and for the return value” (Microsoft Corporation). When designing the language, Anders Hejlsberg wanted a “unified and extensible type system” (Hejlsberg, The A-Z of Programming Languages: C#). He claims that this system is great for teaching because everything is an object. This statement has sparked quite a heated debate in the C# developer community, as it is mostly true, but a bit misleading. The debate revolves around a slightly modified version of Hejlsberg’s statement. The modified version essentially states that “in C# every type derives from object” (Lippert). Eric Lippert is a principal developer on the C# compiler team, and he counters this statement on his blog. After a lengthy explanation, Lippert concludes that “every non-pointer type in C# is convertible to an object” (Lippert).

Value Types

C# includes value types, reference types, generic type parameters, and pointer types. As implied in the name, value types store actual data, or values. Value types include “all numeric types, the char type, and the bool type, as well as custom struct and enum types” (Albahari 17). Value types cannot contain the null value; however, “the nullable types feature does allow for value types to be assigned to null” (Microsoft Corporation). Value types are directly accessed without the need for a new operator. This is because “each value type has an implicit default constructor that initializes the default value of that type” (Microsoft Corporation). Value types can be represented both in boxed and unboxed form. Boxing is like taking the value type and wrapping it up in a little box, which can then be treated like an object. Unboxing simply reverses this process. This concept is important to the earlier point that “every non-pointer type in C# is convertible to an object” (Lippert).
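The following sketch shows the implicit default constructor, a nullable value type, and a round trip through boxing and unboxing. The class and method names are hypothetical.

```csharp
using System;

public static class ValueTypeDemo
{
    public static int BoxAndUnbox(int number)
    {
        object boxed = number;      // boxing: the int is wrapped in an object
        int unboxed = (int)boxed;   // unboxing: an explicit cast copies it back out
        return unboxed;
    }

    public static bool HasValue(int? maybe)
    {
        // int? is shorthand for Nullable<int>, which may hold null.
        return maybe.HasValue;
    }

    public static void Main()
    {
        int count = new int();   // implicit default constructor yields 0
        int? missing = null;     // legal only because the type is nullable
        Console.WriteLine(count);
        Console.WriteLine(BoxAndUnbox(42));
        Console.WriteLine(HasValue(missing));
    }
}
```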

As mentioned earlier, value types also include struct and enum types. A struct is “typically used to encapsulate small groups of related variables” (Microsoft Corporation). Structs may contain constructors, constants, fields, methods, properties, indexers, operators, events, and nested types (Microsoft Corporation). Finally, structs can “implement an interface, but they cannot inherit from another struct” (Microsoft Corporation).

An enumeration is a data type defined by the user. It is “a distinct type consisting of a set of named constants called the enumerator list” (Microsoft Corporation). Enumerations are useful for providing “an efficient way to define a set of named integral constants that may be assigned to a variable” (Microsoft Corporation). Both structs and enumerations help with readability since programmers are able to give clear names and structure to what may otherwise be difficult to understand.
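A small sketch may make the readability argument concrete. The Suit enumeration, the IDescribable interface, and the Card struct are all invented for this example; they show a struct that groups related values and implements an interface, along with an enumeration of named integral constants.

```csharp
using System;

public enum Suit { Clubs, Diamonds, Hearts, Spades }

public interface IDescribable
{
    string Describe();
}

// A struct may implement an interface but cannot inherit from another struct.
public struct Card : IDescribable
{
    public readonly Suit CardSuit;
    public readonly int Rank;

    public Card(Suit cardSuit, int rank)
    {
        CardSuit = cardSuit;
        Rank = rank;
    }

    public string Describe()
    {
        return Rank + " of " + CardSuit;
    }
}

public static class StructEnumDemo
{
    public static void Main()
    {
        Card card = new Card(Suit.Hearts, 7);
        Console.WriteLine(card.Describe());
        Console.WriteLine((int)Suit.Hearts);   // underlying integral value
    }
}
```

Writing Suit.Hearts instead of the bare integer 2 is what makes code that uses the enumeration easier to follow.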

Reference Types

Reference types are made of an object and a reference to that object. Unlike value types, a reference type stores references to the actual data instead of the data itself. This memory handling is the fundamental difference between the two types. Programmers must use care with reference types since “two or more reference type variables can refer to a single object in the heap, allowing operations on one variable to affect the object referenced by the other variable” (Afana). This aliasing can hurt the reliability of programs written in the language. Reference types include “all class, array, delegate, and interface types” (Albahari 17).
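The aliasing concern can be sketched by assigning one variable to another for both a class and a struct. Counter and Point are hypothetical types defined only for this comparison.

```csharp
using System;

public class Counter    // class: reference type
{
    public int Value;
}

public struct Point     // struct: value type
{
    public int X;
}

public static class ReferenceDemo
{
    public static int AliasedValue()
    {
        Counter first = new Counter();
        Counter second = first;   // copies the reference, not the object
        second.Value = 10;        // changes the one shared object
        return first.Value;
    }

    public static int CopiedValue()
    {
        Point first = new Point();
        Point second = first;     // copies the data itself
        second.X = 10;            // changes only the copy
        return first.X;
    }

    public static void Main()
    {
        Console.WriteLine(AliasedValue());   // 10
        Console.WriteLine(CopiedValue());    // 0
    }
}
```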

Arrays use square brackets for declaration and indexing. Array indexes start at 0 and end at one less than the array size. They are memory-efficient because all the elements are stored together in one contiguous block. C# supports multidimensional arrays, both rectangular and jagged. Rectangular arrays, declared with commas inside the brackets (for example, ‘[,]’ after the type for two dimensions), have rows that are all the same length. Jagged arrays, declared with ‘[][]’ after the type, are arrays of arrays, so each inner array can have a different length. Arrays know their own length and also have a multitude of useful properties and methods accessible to them via the System.Array class. According to Albahari, “all array indexing is bounds-checked by the runtime” (Albahari 35).
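The two multidimensional forms and the runtime bounds check can be sketched as follows. The class and method names are hypothetical, and the dimensions are arbitrary.

```csharp
using System;

public static class ArrayDemo
{
    public static int RectangularCells()
    {
        int[,] grid = new int[2, 3];   // 2 rows, every row has 3 columns
        return grid.GetLength(0) * grid.GetLength(1);
    }

    public static int JaggedCells()
    {
        int[][] rows = new int[2][];   // 2 inner arrays, created separately
        rows[0] = new int[1];
        rows[1] = new int[4];          // inner arrays may differ in length
        return rows[0].Length + rows[1].Length;
    }

    public static bool IndexIsChecked()
    {
        int[] values = new int[3];
        try
        {
            values[3] = 1;             // valid indexes are 0 through 2
            return false;
        }
        catch (IndexOutOfRangeException)
        {
            return true;               // the runtime rejected the bad index
        }
    }

    public static void Main()
    {
        Console.WriteLine(RectangularCells());
        Console.WriteLine(JaggedCells());
        Console.WriteLine(IndexIsChecked());
    }
}
```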

Generic Types

Generics are useful for writing reusable code that can accommodate different types. C# also achieves reusability via inheritance (which will be discussed later). According to Albahari, “inheritance expresses reusability with a base type, [whereas] generics express reusability with a ‘template’ that contains ‘placeholder’ types. Generics, when compared to inheritance, can increase type safety and reduce casting and boxing” (Albahari 101).
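A minimal sketch of a generic “template” follows. Holder<T> is a hypothetical class written for this illustration, with T as the placeholder type.

```csharp
using System;

// T is a placeholder that the caller fills in with a concrete type.
public class Holder<T>
{
    private readonly T item;

    public Holder(T item)
    {
        this.item = item;
    }

    public T Item
    {
        get { return item; }
    }
}

public static class GenericsDemo
{
    public static void Main()
    {
        Holder<int> number = new Holder<int>(5);
        Holder<string> text = new Holder<string>("five");

        int n = number.Item;     // no cast and no boxing required
        string s = text.Item;    // the compiler knows this is a string
        Console.WriteLine(n + " " + s);
    }
}
```

Had the field been declared as object instead of T, the int would have been boxed on the way in and every read would have required a cast that could fail at run time.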

Pointers

In C#, pointers are used to directly manipulate memory. They can only be used in code that is explicitly marked as unsafe. Albahari states that “pointer types are primarily useful for interoperability with C APIs, but may also be used for accessing memory outside the managed heap or for performance-critical hotspots” (Albahari 170).

Expressions and Assignment Statements

Expressions

In mathematics, there is a fundamental difference between an equation and an expression. An equation can be solved, whereas an expression only represents or evaluates to a value. An expression can be simple, made up of either a constant or a variable. It can also be complex by using operations to combine constants, variables, and other expressions. Expressions within expressions are sometimes denoted with grouping symbols, such as parentheses.

Expressions in C# work much like they do in mathematics. In C# there are primary expressions, void expressions, and expression statements. Primary expressions “include expressions composed of operators that are intrinsic to the basic plumbing of the language” (Albahari 45). As the name implies, void expressions do not have a value. Because of this, void expressions “cannot be used as an operand to build more complex expressions” (Albahari 45).

Operators

Operators are very important in C#. They “transform and combine expressions” (Albahari 12). There are many different operators in C# and each fits into a particular category. These categories (in descending order of operator precedence) include primary, unary, multiplicative, additive, shift, relational and type testing, equality, logical AND, logical XOR, logical OR, conditional AND, conditional OR, conditional, and assignment (Microsoft Corporation). Many of these operators can also be overloaded.
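The precedence ordering and operator overloading can be sketched briefly. Vector2 and the method names below are hypothetical; the arithmetic follows directly from the category order listed above.

```csharp
using System;

public struct Vector2
{
    public readonly int X;
    public readonly int Y;

    public Vector2(int x, int y)
    {
        X = x;
        Y = y;
    }

    // Overloads the additive operator + for two Vector2 operands.
    public static Vector2 operator +(Vector2 a, Vector2 b)
    {
        return new Vector2(a.X + b.X, a.Y + b.Y);
    }
}

public static class OperatorDemo
{
    // Multiplicative binds tighter than additive: 2 + (3 * 4).
    public static int MultiplicativeFirst() { return 2 + 3 * 4; }

    // Additive binds tighter than shift: 1 << (2 + 1).
    public static int AdditiveBeforeShift() { return 1 << 2 + 1; }

    public static void Main()
    {
        Console.WriteLine(MultiplicativeFirst());   // 14
        Console.WriteLine(AdditiveBeforeShift());   // 8
        Vector2 sum = new Vector2(1, 2) + new Vector2(3, 4);
        Console.WriteLine(sum.X + "," + sum.Y);     // 4,6
    }
}
```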

Expression Statements

Expressions that can stand alone as valid statements are called expression statements. They “must either change state or call something that might change state. Changing state essentially means changing a variable” (Albahari 49). Expression statements include “assignment expressions (including increment and decrement expressions), method call expressions (both void and nonvoid), and object instantiation expressions” (Albahari 49). In C#, the assignment operator is =. The assignment statement evaluates the expression on the right side of the operator and assigns it to the variable on the left side. There are also compound assignment operators, which are “syntactic shortcuts that combine assignment with another operator” (Albahari). While these shortcuts improve writability, they can potentially hurt readability.
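A short sketch of expression statements follows, using a hypothetical AssignmentDemo class. Each line in Run is an expression that changes state, which is what allows it to stand alone as a statement.

```csharp
using System;

public static class AssignmentDemo
{
    public static int Run()
    {
        int total = 10;      // declaration with an initializer
        total = total + 5;   // assignment expression as a statement: 15
        total += 5;          // compound assignment, same as total = total + 5: 20
        total *= 2;          // compound assignment with multiplication: 40
        total++;             // increment expression: 41
        // total + 1;        // not a valid statement: it changes nothing
        return total;
    }

    public static void Main()
    {
        Console.WriteLine(Run());   // 41
    }
}
```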

Statement-Level Control Structures

According to Robert Sebesta, author of Concepts of Programming Languages, “a control structure is a control statement and the collection of statements whose execution it controls” (Sebesta 349). Control structures direct the flow of the program. They are intended to make decisions and execute different statements depending on the result. There are three main types of control structures in C#: selection, iteration, and unconditional branching, or jump statements. In C#, most control expressions are specified in parentheses and must be of type Boolean.

Jump Statements

The C# jump statements include break, continue, goto, return, and throw. The break statement immediately “ends the execution of the body of an iteration or switch” (Albahari 55). The continue statement immediately jumps to the end of the loop body and starts the next iteration. The goto statement is powerful and can quickly complicate code if used irresponsibly. Albahari says that it allows code execution to jump to another label within the statement block. A label statement is just a placeholder in a code block, denoted with a colon suffix. The goto case form transfers execution to another case in a switch block (Albahari 55). Return statements can appear anywhere in a method except inside a finally block. Their job is to exit the method and, in a nonvoid method, return an expression of the method’s return type (Albahari 56). Throw statements throw an exception to signal that an error has occurred; the exception can then be caught and handled by the programmer.

A key property of jump statements is that they “obey the reliability rules of try statements” (Albahari 54). Specifically, “a jump out of a try block always executes the try’s finally block before reaching the target of the jump, [and] a jump cannot be made from the inside to the outside of a finally block” (Albahari 54). This is important in exception handling, which will be discussed in further detail later.
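The interaction between jump statements and finally blocks can be sketched with a loop that records each step in a string. The JumpDemo class and its Trace method are hypothetical and exist only to make the order of execution visible.

```csharp
using System;
using System.Text;

public static class JumpDemo
{
    public static string Trace()
    {
        StringBuilder log = new StringBuilder();
        for (int i = 0; i < 5; i++)
        {
            try
            {
                if (i == 1) continue;   // skip to the next iteration
                if (i == 3) break;      // leave the loop entirely
                log.Append(i);
            }
            finally
            {
                log.Append('f');        // runs on every exit from the try block
            }
        }
        return log.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Trace());     // 0ff2ff
    }
}
```

The letter f appears four times, once for each pass through the try block, including the passes that were cut short by continue and break.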

A note on goto: When goto is used irresponsibly, code can become nearly unreadable. It is for this reason that many programmers have reserved goto as an absolute last resort, or even shunned it altogether. The popular webcomic xkcd features a strip titled “GOTO,” in which the main character debates restructuring the flow of his entire program, or using a goto. He chooses the latter, and is subsequently attacked by a random velociraptor[1].

Selection Statements

Selection statements include the if-else construct and switch statement. The if-else construct is present in many mainstream languages, and works similarly in C#. The Boolean condition is specified in parentheses, and the body of the construct is contained within curly braces. Curly braces are optional when the body has only one statement to execute. The if statement can be standalone or paired with a single else statement. It can also be paired with one or more else if statements. The if-else construct can be nested as deeply as desired, though it can quickly become complicated and difficult to read. An else statement always belongs to the nearest preceding unpaired if statement. In this context, “unpaired” denotes an if that does not already have an else.

When the if-else construct has many else if statements, it may be better to utilize a switch statement instead. A switch statement is “a control statement that selects a switch section to execute from a list of candidates” (Microsoft Corporation). The first switch section is functionally equivalent to an if statement. Any switch sections that follow are functionally equivalent to an else if statement, with the exception of the default section. If present, the default section is conventionally placed last in the switch structure, although C# allows it to appear in any position. It is functionally equivalent to an else statement. At the end of each section, some sort of jump statement must be included to exit the section. Because “C# does not allow execution to continue from one switch section to the next” (Microsoft Corporation), the absence of a jump statement will cause a compile-time error. The switch section to be executed is chosen by the value of the switch expression. This process is explained by the Microsoft Corporation:

Each switch section contains one or more case labels and a list of one or more statements. Each case label specifies a constant value. Control is transferred to the switch section whose case label contains a constant value that matches the value of the switch expression. If no case label contains a matching value, control is transferred to the default section, if there is one. If there is no default section, no action is taken and control is transferred outside the switch statement.

While the syntax may sound complicated, it is actually much more readable in practice. Any simplification in code, provided it has the same functionality, helps readability. Replacing complicated if-else structures with switch statements is one way to enhance a program’s readability.
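To compare the two selection statements, the following sketch writes the same decision twice. The SelectionDemo class and the day-numbering scheme (1 through 7) are assumptions made for this example.

```csharp
using System;

public static class SelectionDemo
{
    public static string WithIfElse(int day)
    {
        if (day >= 1 && day <= 5)
            return "weekday";        // braces are optional for one statement
        else if (day == 6 || day == 7)
            return "weekend";
        else
            return "invalid";
    }

    public static string WithSwitch(int day)
    {
        string kind;
        switch (day)
        {
            case 1:
            case 2:
            case 3:
            case 4:
            case 5:                  // several case labels share one section
                kind = "weekday";
                break;               // a jump statement must end each section
            case 6:
            case 7:
                kind = "weekend";
                break;
            default:
                kind = "invalid";
                break;
        }
        return kind;
    }

    public static void Main()
    {
        Console.WriteLine(WithIfElse(6));
        Console.WriteLine(WithSwitch(6));
    }
}
```

Removing any of the break statements from WithSwitch would be rejected by the compiler, since control is not allowed to fall from one section into the next.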

Iteration

Loops allow repetition by executing a block of statements until a given condition is false. Loops can be exited by break, goto, return, or throw statements. A continue statement jumps straight to condition evaluation. There are four types of loops in C#. These include while, do-while, for, and foreach. The loop body syntax is like that of the if-else construct. The while and do-while loops have Boolean conditions in parentheses after the while. They differ in that while is pre-check and do-while is post-check. With a do-while loop, the body of the loop is guaranteed to execute at least once, whereas the body of the while loop may not be executed at all.
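The pre-check versus post-check distinction can be sketched by counting how many times each loop body runs. The LoopCheckDemo class and the limit of 3 are arbitrary choices for this illustration.

```csharp
using System;

public static class LoopCheckDemo
{
    public static int WhileRuns(int start)
    {
        int runs = 0;
        int n = start;
        while (n < 3)       // condition checked before the body
        {
            runs++;
            n++;
        }
        return runs;
    }

    public static int DoWhileRuns(int start)
    {
        int runs = 0;
        int n = start;
        do                  // body executes before the first check
        {
            runs++;
            n++;
        } while (n < 3);
        return runs;
    }

    public static void Main()
    {
        Console.WriteLine(WhileRuns(5));     // 0: condition is false at the start
        Console.WriteLine(DoWhileRuns(5));   // 1: body runs once regardless
    }
}
```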

The for loop is designed to loop a specific number of times, then exit. It is especially useful for iterating over arrays. There are three parts to a for loop: the initializer, the condition, and the iterator. Each part is separated by a semicolon. Any of these expressions can be omitted as long as both semicolons are present. However, omitting the condition, as happens when all three parts are left out, produces an infinite loop. C#’s for loops are interesting in that the initializer section can either declare and initialize a loop variable, or it can contain any number of assignment statements, method invocations, prefix or postfix increment or decrement expressions, object creations (using new), and await expressions (Microsoft Corporation). The iterator section can also contain any of the aforementioned statement expressions. Between the two is the condition section, which “contains a Boolean expression that’s evaluated to determine whether the loop should exit or should run again” (Microsoft Corporation). The body of the loop is like that of the selection statements.

The foreach loop is designed specifically for arrays and object collections. Instead of iterating a specified number of times, it iterates through the entire array or collection provided to it. Like any other loop, the foreach loop can be broken out of by using a jump statement. While the foreach loop can do a number of things to the array or collection, it “cannot be used to add or remove items from the source collection” (Microsoft Corporation).
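The for and foreach loops can be sketched side by side, along with what happens when a collection is modified during a foreach. The IterationDemo class is hypothetical, and List<T> is used here as one example of a collection that enforces the restriction at run time.

```csharp
using System;
using System.Collections.Generic;

public static class IterationDemo
{
    public static int SumWithFor(int[] values)
    {
        int sum = 0;
        // initializer; condition; iterator
        for (int i = 0; i < values.Length; i++)
        {
            sum += values[i];
        }
        return sum;
    }

    public static int SumWithForeach(IEnumerable<int> values)
    {
        int sum = 0;
        foreach (int value in values)   // visits every element in turn
        {
            sum += value;
        }
        return sum;
    }

    public static bool RemovalDuringForeachFails()
    {
        List<int> items = new List<int> { 1, 2, 3 };
        try
        {
            foreach (int item in items)
            {
                if (item == 2) items.Remove(item);   // modifies the source
            }
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;   // List<T> detected the change on the next step
        }
    }

    public static void Main()
    {
        int[] data = { 1, 2, 3, 4 };
        Console.WriteLine(SumWithFor(data));
        Console.WriteLine(SumWithForeach(data));
        Console.WriteLine(RemovalDuringForeachFails());
    }
}
```

When items must be removed while looping, a for loop that walks the indexes backward is one common alternative, since it does not rely on an enumerator.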

Subprograms (Methods)

In C#, subprograms in general are referred to as methods. Methods can have any return type, including user-defined types and void.