Computer Science 2210
February 7, 2012 / Lab Assignment 1 – String Handling / Due: ______

Purpose

The purpose of this assignment is to become familiar with C# and Visual Studio while developing a program that uses the string handling capabilities of C# and .NET to do some text analysis. The solution will also use regular expressions to validate certain user input. It will use the generic List collection as its primary data structure.

Specifications

A Utility Class

Build a utility class of static methods that you might find useful in future assignments. Following is an example of such a method that you might include in this class.

/// <summary>
/// Display a Press Any Key to ... message at the bottom of the screen
/// </summary>
/// <param name="strVerb">the term to include in the Press Any Key to ... message; defaults to "continue..."</param>
public static void PressAnyKey(string strVerb = "continue...")
{
    Console.ForegroundColor = ConsoleColor.Red;

    // If the cursor is above the last row of the window, jump to the last row;
    // otherwise leave a blank line below the current output.
    if (Console.CursorTop < Console.WindowHeight - 1)
        Console.SetCursorPosition(0, Console.WindowHeight - 1);
    else
        Console.SetCursorPosition(0, Console.CursorTop + 2);
    Console.Write("Press any key to " + strVerb);
    Console.ReadKey();
    Console.Clear();
    Console.ForegroundColor = ConsoleColor.Blue;
} // End PressAnyKey

Some suggestions for other methods follow. The list is not exhaustive, and you should come up with your own list.

/// <summary>
/// Skip n lines in the console window
/// </summary>
/// <param name="n">the number of lines to skip - defaults to 1</param>
public static void Skip(int n = 1)

/// <summary>
/// Display a specified welcome message in a MessageBox
/// </summary>
/// <param name="msg">The message to be displayed</param>
/// <param name="caption">the caption for the MessageBox - the author's name is appended</param>
/// <param name="author">the name of the author of the program</param>
public static void WelcomeMessage(String msg, String caption = "Computer Science 2210", String author = "Ima Jeanyus")

/// <summary>
/// Return a string formatted for display between the designated left and right margins
/// </summary>
/// <param name="txt">the string to be formatted</param>
/// <param name="leftMargin">the left margin</param>
/// <param name="rightMargin">the right margin</param>
/// <returns>the formatted string</returns>
public static string FormatText(string txt, int leftMargin = 0, int rightMargin = 80)

To implement and use the WelcomeMessage method above, a reference to System.Windows.Forms must be added to the project references, and a using directive for that namespace must be added to your Utility class. You may wish to add a GoodbyeMessage method as well.
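As an illustration only, FormatText might be approached along the following lines. This is a minimal sketch, not a required design: the class name UtilitySketch is hypothetical, and the sketch assumes that any run of whitespace separates words, so line breaks in the original text are not preserved.

```csharp
using System;
using System.Text;

public static class UtilitySketch
{
    // Word-wraps txt so that every line starts at column leftMargin and does
    // not extend past column rightMargin. A single word longer than the
    // available width is left intact on its own line (an assumption).
    public static string FormatText(string txt, int leftMargin = 0, int rightMargin = 80)
    {
        if (rightMargin <= leftMargin)
            throw new ArgumentException("rightMargin must be greater than leftMargin");

        int width = rightMargin - leftMargin;
        string indent = new string(' ', leftMargin);

        // A null separator array makes Split break on any whitespace.
        string[] words = txt.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);

        var result = new StringBuilder();
        var line = new StringBuilder();
        foreach (string word in words)
        {
            // Flush the current line when this word (plus a space) would overflow.
            if (line.Length > 0 && line.Length + 1 + word.Length > width)
            {
                result.Append(indent).Append(line.ToString()).Append(Environment.NewLine);
                line.Clear();
            }
            if (line.Length > 0) line.Append(' ');
            line.Append(word);
        }
        if (line.Length > 0) result.Append(indent).Append(line.ToString());
        return result.ToString();
    }
}
```

A call such as UtilitySketch.FormatText(someText, 5, 75) would indent each line by five spaces and keep it within 75 columns.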

Text Class

Build a Text class to input text from a text file, parse it into tokens, and serve the collection of those tokens to other classes that will analyze the text. It should have at least two public properties: one to hold the original string consisting of the entire text from the file; the second should represent a List<string> of all tokens found in the text.

For this assignment, a token is any word or punctuation mark found in the text. The term “word” should be interpreted broadly to mean any non-empty field between delimiters. In the sentence, “Hi Jack, why are 23 airport guards chasing me?”, the tokens are “Hi”, “Jack”, “,”, “why”, “are”, “23”, “airport”, “guards”, “chasing”, “me”, and “?”. There may be other tokens that are not readily apparent, such as a newline character. The delimiters should make sense for this type of assignment; for example, a space, common punctuation marks, and the newline (‘\n’) and carriage return (‘\r’) characters are good candidates for delimiters. Spaces and tab characters may be discarded and not kept in the list of tokens.
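One possible way to produce such a token list is with a regular expression. The sketch below is only one approach under stated assumptions: the class name TokenizerSketch and the pattern are illustrative choices, a word is taken to be a run of letters, digits, or underscores, and an apostrophe is treated as punctuation (so a contraction would split into three tokens).

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TokenizerSketch
{
    // Alternatives, tried left to right at each position:
    //   \w+      a word (letters, digits, underscore)
    //   [\r\n]   a single carriage return or newline, kept as its own token
    //   [^\s\w]  any single character that is neither whitespace nor a word
    //            character, i.e. a punctuation mark
    // Spaces and tabs match none of these, so they are skipped.
    private static readonly Regex TokenPattern = new Regex(@"\w+|[\r\n]|[^\s\w]");

    public static List<string> Tokenize(string text)
    {
        var tokens = new List<string>();
        foreach (Match m in TokenPattern.Matches(text))
            tokens.Add(m.Value);
        return tokens;
    }
}
```

Applied to the example sentence above, Tokenize yields the eleven tokens listed, with the comma and the question mark as separate entries.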

This class should have at least two constructors – a default constructor and one that takes a string parameter representing the path/file name of the text file to be processed. It should also have a method that can be invoked to display the original string and another that can be invoked to display the list of tokens in a reasonably formatted manner. Hint: the former method may take advantage of the Utility.FormatText method.

A Distinct Word Class

The DistinctWord class should represent a single word of text (a token above that contains no punctuation or escape characters). It should also include a counter that represents the number of times that word appears in the text file. Both the word and the counter should be represented with public properties.

The class should have a default constructor and a constructor that takes a string parameter representing a “word”. The parameterized constructor should convert all alphabetic characters to lower case. The class should have an overridden ToString method that formats this single word and its counter properties for possible display by another class.

The class should implement the generic IEquatable and IComparable interfaces. All comparisons should be based on the string property of the class only.
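The requirements above could be met in several ways. The following is a minimal sketch under some assumptions that are not part of the specification: the counter starts at 1 in the parameterized constructor and at 0 in the default constructor, comparisons are ordinal, and the column widths in ToString are arbitrary.

```csharp
using System;

public class DistinctWord : IEquatable<DistinctWord>, IComparable<DistinctWord>
{
    public string Word { get; set; }
    public int Count { get; set; }

    public DistinctWord()
    {
        Word = string.Empty;
        Count = 0;          // assumption: an empty word has not been seen yet
    }

    public DistinctWord(string word)
    {
        Word = (word ?? string.Empty).ToLowerInvariant();
        Count = 1;          // assumption: constructing a word counts as one occurrence
    }

    // Equality is based on the word only; the counter is ignored.
    public bool Equals(DistinctWord other)
    {
        return other != null && Word == other.Word;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as DistinctWord);
    }

    // Kept consistent with Equals so the class behaves in hashed collections.
    public override int GetHashCode()
    {
        return Word.GetHashCode();
    }

    // Ordering is based on the word only.
    public int CompareTo(DistinctWord other)
    {
        return other == null ? 1 : string.CompareOrdinal(Word, other.Word);
    }

    public override string ToString()
    {
        // Word left-aligned in 20 columns, count right-aligned in 5 (arbitrary widths).
        return string.Format("{0,-20}{1,5}", Word, Count);
    }
}
```

Overriding Equals(object) and GetHashCode alongside the generic Equals is not demanded by the assignment, but it avoids inconsistent behavior if the objects are ever used in a dictionary or hash set.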

A Words Class

The Words class is a container for a collection of DistinctWord objects. It should have at least two public properties: a generic List of DistinctWord objects with a private setter, and a read-only Count property representing the number of items in the List. The DistinctWord objects should be sorted into alphabetical order.
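To show how the generic interfaces pay off here, the sketch below builds a sorted list of counted words from a sequence of tokens. It is illustrative only: WordEntry is a hypothetical stand-in for DistinctWord so that the fragment compiles by itself, WordsSketch and Build are made-up names, and a "word" is assumed to be a token consisting entirely of letters and digits.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for DistinctWord (assumption, for self-containment).
public class WordEntry : IEquatable<WordEntry>, IComparable<WordEntry>
{
    public string Word { get; set; }
    public int Count { get; set; }

    public WordEntry(string word)
    {
        Word = word.ToLowerInvariant();
        Count = 1;
    }

    public bool Equals(WordEntry other) { return other != null && Word == other.Word; }
    public override bool Equals(object obj) { return Equals(obj as WordEntry); }
    public override int GetHashCode() { return Word.GetHashCode(); }

    public int CompareTo(WordEntry other)
    {
        return other == null ? 1 : string.CompareOrdinal(Word, other.Word);
    }
}

public static class WordsSketch
{
    public static List<WordEntry> Build(IEnumerable<string> tokens)
    {
        var list = new List<WordEntry>();
        foreach (string token in tokens)
        {
            // Skip punctuation and line-ending tokens.
            if (token.Length == 0 || !token.All(char.IsLetterOrDigit)) continue;

            var candidate = new WordEntry(token);
            int index = list.IndexOf(candidate);   // relies on IEquatable<WordEntry>
            if (index >= 0)
                list[index].Count++;
            else
                list.Add(candidate);
        }
        list.Sort();                               // relies on IComparable<WordEntry>
        return list;
    }
}
```

IndexOf performs a linear search, which is acceptable for short texts; a dictionary or a binary search over a list kept in order would scale better.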

The class should have a default constructor and a second constructor that takes one parameter representing a Text object. It should have a Display method that can display all of the DistinctWord objects in its collection in a formatted list. The Display method should use the DistinctWord class’ ToString method. An example of part of this report follows.

A Sentence Class

This class represents a single sentence. For the purposes of this assignment, a sentence is anything that ends with an end-of-sentence punctuation mark such as a period, question mark, or exclamation point. For simplicity, assume that each of these end-of-sentence punctuation marks always represents an end-of-sentence even if they occur in some other context. For example, the period following an abbreviation is still an end-of-sentence punctuation mark for this task.

This class should have public properties with private setters for the number of words in the sentence, the average length of the words in the sentence, the subscript for the first token in the Text class’ List of tokens where the sentence begins and the subscript for the final token in the sentence.

The class should have at least two constructors including a default constructor. The second constructor should take a parameter representing a Text object and an integer parameter representing the location of a token in the Text object’s List of tokens. The second parameter indicates where this constructor should begin extracting a sentence.
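The extraction step performed by the second constructor can be sketched as two small helpers. This is one possible approach, not a prescribed one: SentenceSketch, FindLastToken, and Measure are hypothetical names, and a word is assumed to be a token made up only of letters and digits.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class SentenceSketch
{
    private static readonly HashSet<string> Terminators =
        new HashSet<string> { ".", "?", "!" };

    // Returns the subscript of the last token of the sentence that begins at
    // subscript first. If no terminator is found, the sentence runs to the
    // end of the token list.
    public static int FindLastToken(IList<string> tokens, int first)
    {
        for (int i = first; i < tokens.Count; i++)
        {
            if (Terminators.Contains(tokens[i])) return i;
        }
        return tokens.Count - 1;
    }

    // Counts the words between first and last (inclusive) and computes their
    // average length; punctuation and line-ending tokens are not counted.
    public static void Measure(IList<string> tokens, int first, int last,
                               out int wordCount, out double averageLength)
    {
        wordCount = 0;
        int totalLength = 0;
        for (int i = first; i <= last; i++)
        {
            string token = tokens[i];
            if (token.Length > 0 && token.All(char.IsLetterOrDigit))
            {
                wordCount++;
                totalLength += token.Length;
            }
        }
        averageLength = wordCount == 0 ? 0.0 : (double)totalLength / wordCount;
    }
}
```

A caller can then start the next sentence at FindLastToken(tokens, first) + 1, which is how a list of sentences could be built from a single pass over the tokens.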

The class should have an overridden ToString method that formats a Sentence object for possible display by another class. This method may utilize Utility.FormatText. An example of the display of such an object follows.

A SentenceList Class

The SentenceList class represents a generic List of Sentence objects. This class should have at least three public properties with private setters. One represents the List itself, another represents the number of sentences in the collection, and the third represents the average length (in words) of all the sentences in the List.

This class should have a default constructor and a parameterized constructor that accepts a parameter representing a Text object. It should have a Display method that displays a nicely formatted list of all of the sentences in the object’s List of Sentence objects. The display should have headings, and at the end it should display statistics such as the number of sentences and the average number of words per sentence. A partial example of such a report follows.

A Paragraph Class

This class is similar to the Sentence class. It represents a paragraph extracted from the list of tokens in a Text object. For this assignment, a paragraph ends with two consecutive newline characters, two consecutive carriage return characters, or the end of the Text object’s list.
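The paragraph boundary rule can be sketched as a scan over the token list. The names ParagraphSketch and FindLastToken are hypothetical, and the sketch applies the rule literally. Note that a file with Windows line endings produces the token sequence \r, \n, \r, \n for a blank line, which contains no two identical consecutive tokens; one way to handle that (an assumption, not a requirement) is to normalize "\r\n" to "\n" before tokenizing.

```csharp
using System.Collections.Generic;

public static class ParagraphSketch
{
    // Returns the subscript of the last token of the paragraph that begins at
    // subscript first: the second of two consecutive "\n" tokens, the second
    // of two consecutive "\r" tokens, or the last token in the list.
    public static int FindLastToken(IList<string> tokens, int first)
    {
        for (int i = first; i + 1 < tokens.Count; i++)
        {
            bool twoNewlines = tokens[i] == "\n" && tokens[i + 1] == "\n";
            bool twoReturns = tokens[i] == "\r" && tokens[i + 1] == "\r";
            if (twoNewlines || twoReturns) return i + 1;
        }
        return tokens.Count - 1;
    }
}
```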

The class should have public properties with private setters for the number of sentences, the number of words, the average sentence length in words, and the subscripts of the first and last token in the paragraph.

It should have constructors and a ToString method similar to its counterparts in the Sentence class. An example of output produced using the ToString method follows.

A ParagraphList Class

This class is a container for a generic List of all Paragraph objects. It is analogous to the SentenceList class for Sentence objects. It should have similar methods and properties. A partial example of a report follows.

A Driver Class

The driver should be menu driven, and it should demonstrate the above classes. Except for managing the menu, most of the real work should be done in the other classes. The instructor will post a Menu class that you may use for managing the menu along with an example of its use.

The driver should display an appropriate welcome message including the names of the program’s authors, and it should prompt the user to input his/her name and email address. The format of the email address should be verified using regular expressions. When the user decides to exit the program, a goodbye message including the user’s name and email address should be displayed.
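For the email check, a deliberately simple pattern is usually enough for an assignment like this. The sketch below is an assumption about what "verified" means here: it checks only the general shape local-part@domain.tld and does not attempt to cover everything that is legal in a real address. EmailSketch and IsValidEmail are hypothetical names.

```csharp
using System.Text.RegularExpressions;

public static class EmailSketch
{
    // local part:  letters, digits, and . _ % + -
    // domain:      one or more dot-separated labels of letters, digits, hyphens
    // final label: at least two letters
    private static readonly Regex EmailPattern = new Regex(
        @"^[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)*\.[A-Za-z]{2,}$");

    public static bool IsValidEmail(string input)
    {
        // Trim first so that stray spaces or a trailing newline from
        // Console.ReadLine-style input do not affect the match.
        return input != null && EmailPattern.IsMatch(input.Trim());
    }
}
```

The driver could call IsValidEmail in a loop and re-prompt until it returns true.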

The driver should accept text input to be analyzed from the keyboard or from a user-specified text file.

Additional Specifications

Each of the classes in this assignment may have as many other properties and methods as you need.

For each method you create, make an intentional decision about whether it should be public or private. If it is for internal use by the class itself, it should be private.

Try to use good object-oriented principles in designing and implementing your solution. Make good use of the features of C# and .NET to avoid brute-force approaches where possible.

This project is large enough so that it should not be tackled as a whole initially. The following approach is suggested.

  1. Design, develop and test your Utility class first.
  2. Use a “throw away” test driver to debug and verify the utility functionality.
  3. Do not do any of the remaining steps until step 2 is completed. For each remaining step, do not move on to a subsequent step until all of the previous ones are completed.
  4. Using a UML-like diagram, design your other classes, but do not start on implementation.
  5. Write your driver program using the UML diagram. This should be relatively short and simple. You will not be able to fully test/debug the driver until the other classes have been built. Comment out those method calls and local variable definitions that refer to not-yet-implemented classes.
  6. Implement the Text class from your UML diagram. Uncomment the portions of the driver that refer to the Text class only. Test the functionality of this class thoroughly before moving to the next step. This is called unit testing.
  7. Repeat step 6 for each of the remaining classes, using a logical sequence (i.e., build the Paragraph class before the ParagraphList class).
  8. Fully test the final program.

Submit

Please submit your work as described in the Course Fact Sheet posted on the course website. For this assignment only, you may work in teams of 2-3 people as long as the work is shared approximately evenly among all group members and as long as each person learns all of the material. Future tests and quizzes will verify that all team members learned the material and did their share.

Each component (class, driver, etc.) should have a header block of comments that includes the name of the person(s) responsible for it. Each team member must have sole responsibility for his/her share of the classes. The entire project name (as submitted) should include the names of all contributors.
