Project: Probabilistic Text Generation

Collaboration: Complete this project by yourself. You may get help from Section Leaders and Rick.

You will implement a class named RandomWriter that uses Joe Zachary's algorithm to generate text based on a given input file. Input files may be any book from http://www.gutenberg.org/. However, you must include these two files in your project so we can grade more easily:

· http://www.cs.arizona.edu/~mercer/Projects/alice.txt

· http://www.cs.arizona.edu/~mercer/Projects/grim.txt

RandomWriter.java

Your job is to write a class that, when run as a program, generates text like this:


Print 500 letters, seed length 3, file 'alice.txt'

===================================================

n and, I shing the while thing me Alicertain that for the Knave

in and, why intraords.' The very userpill litterpill of a feel

ver unear! Oh, and go THIS FIT don't the passed very difficell,

peppeat the said not senter there trying 'Oh, the booking over

conce: 'livil their feeling to hasts all aloud. Alice down for

Alice. 'That doorward about and my done of that on their of

her for herse did yourtle shing talking to said yet, so said

two wondere acroquest to looke trial. 'Exactly that of the


Begin class RandomWriter with a main method and the constructor for this class.

// Programmer: MyFirstName MyLastName <- Change to your name

import java.io.File;

import java.io.FileNotFoundException;

import java.util.ArrayList;

import java.util.Random;

import java.util.Scanner;

public class RandomWriter {

public static void main(String[] args) {

// Change seedLength to see how it affects the probabilistic text.

// 1 is essentially random; 14 stays very close to the input (may reproduce original text)

int seedLength = 3;

String fileName = "alice.txt";

int n = 500;

System.out.println("Print " + n + " letters, seed length " + seedLength

+ ", file '" + fileName + "'");

System.out.println("===================================================");

System.out.println();

RandomWriter rw = new RandomWriter(fileName, seedLength);

// Implement this method using the algorithm in the pdfs.

// New lines are not included, so you also need to add line breaks

rw.printRandom(n);

}

// Add instance variables here . . .

// Constructor is responsible for the following

// 1) Remember the fileName and seedLength for other methods

// 2) Call makeTheText, which sets the one big string to the text of

// the input file (fileName) with new lines replaced by spaces

public RandomWriter(String fileName, int seedLength)

Suggested Methods

In addition to the main method and the constructor shown above, we recommend you implement the following methods in the order shown. Note: There are no JUnit tests to write, but you could print things to see your progress (just remove those printlns later).

The first method--makeTheText--should be called from the constructor because the big string is needed before anything else is done. The second method--setRandomSeed--should be called from printRandom so that a new seed is chosen first, before any random text is printed.


// Create one big string that is the input file with every new line replaced by

// a space. Hint: use Scanner's nextLine method to read each line of the input file

public void makeTheText()
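
The hint above can be sketched as follows. This is only an illustration, not the required solution: the class name BigStringSketch and the method readAsOneString are hypothetical, and a Scanner over a String stands in for a Scanner over the input file so the sketch runs without alice.txt.

```java
import java.util.Scanner;

// Hypothetical helper class; the name is an assumption, not part of the assignment.
class BigStringSketch {

    // Reads every line from the scanner and follows each line with one space,
    // so the returned string contains no new line characters.
    static String readAsOneString(Scanner in) {
        StringBuilder text = new StringBuilder();
        while (in.hasNextLine()) {
            text.append(in.nextLine()).append(' ');
        }
        return text.toString();
    }

    public static void main(String[] args) {
        // In the project this would be new Scanner(new File(fileName)),
        // which can throw FileNotFoundException.
        Scanner in = new Scanner("Alice was beginning\nto get very tired");
        System.out.println(readAsOneString(in));
    }
}
```

A StringBuilder is used here because repeated String concatenation over an entire book copies the growing text on every line.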

// Grab an nGram of the desired length (set in the constructor) from

// the big string that represents the entire book.

public void setRandomSeed()
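
One possible way to grab a random nGram is sketched below. The class name SeedSketch, the method pickSeed, and its parameters are hypothetical; the sketch assumes the seed length is at least 1 and no longer than the text.

```java
import java.util.Random;

// Hypothetical helper class; the name is an assumption, not part of the assignment.
class SeedSketch {

    // Returns a random substring of exactly seedLength characters.
    // Assumes 0 < seedLength <= text.length().
    static String pickSeed(String text, int seedLength, Random generator) {
        // Valid start indexes run from 0 to text.length() - seedLength, inclusive.
        int start = generator.nextInt(text.length() - seedLength + 1);
        return text.substring(start, start + seedLength);
    }

    public static void main(String[] args) {
        String text = "Alice was beginning to get very tired ";
        System.out.println("[" + pickSeed(text, 3, new Random()) + "]");
    }
}
```

Note that a seed taken from the very end of the text may have no followers. The handout does not say how to handle that case, so any handling is a design choice.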

// Print charsToPrint randomly generated characters with

// newlines every 60 or so characters (no word breaks)

//

// This method uses Joe Zachary's algorithm, which creates

// a new array of followers for each character printed.

//

// Precondition: 'theSeed' has all characters in the text with no new lines.

public void printRandom(int charsToPrint)
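
The line-break requirement can be met in several ways. The sketch below shows one of them, under the assumption that "no word breaks" means a line may only end at a space. The class name LineBreakSketch, the constant LINE_WIDTH, and the method wrap are hypothetical; printRandom could apply the same counting while it prints one character at a time.

```java
// Hypothetical helper class; the name is an assumption, not part of the assignment.
class LineBreakSketch {

    static final int LINE_WIDTH = 60; // "60 or so" characters per line

    // Replaces the first space found at or after LINE_WIDTH characters
    // on the current line with a new line, so no word is split.
    static String wrap(String text) {
        StringBuilder out = new StringBuilder();
        int onThisLine = 0;
        for (int i = 0; i < text.length(); i++) {
            char ch = text.charAt(i);
            if (ch == ' ' && onThisLine >= LINE_WIDTH) {
                out.append('\n');
                onThisLine = 0;
            } else {
                out.append(ch);
                onThisLine++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        StringBuilder sample = new StringBuilder();
        for (int i = 0; i < 30; i++) {
            sample.append("word ");
        }
        System.out.println(wrap(sample.toString()));
    }
}
```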

For printRandom you must use Joe Zachary's algorithm from the lecture presentation where each character printed requires the creation of a new array that has all the followers for the current nGram. That algorithm is repeated here:

• Read all text from the input file into one big String (use class Scanner)

• Pick a random n-letter nGram from the text (use class Random and its nextInt(int) method)

• For each character to be printed:

• Make an array of every character that follows the nGram in that one big String object

- Recommended: set the array capacity to the one big String's length() / 7

- Remember: You also need to keep track of the number of followers (n++)

• Randomly pick a character ch from the array

• Print ch

• Create a new nGram by removing the first character from the nGram and appending ch
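
The steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the reference solution: the class name FollowersSketch and the method generate are hypothetical, the result is returned as a String instead of being printed, and line breaks are left out. Two behaviors are assumptions the handout does not specify: collection stops once the array is full, and generation stops early if the current nGram has no followers.

```java
import java.util.Random;

// Hypothetical helper class; the name is an assumption, not part of the assignment.
class FollowersSketch {

    // Builds up to charsToPrint characters, creating a new array of
    // followers for every character. Assumes seed has at least 1 character.
    static String generate(String text, String seed, int charsToPrint, Random generator) {
        StringBuilder out = new StringBuilder();
        String nGram = seed;
        for (int i = 0; i < charsToPrint; i++) {
            // Recommended capacity is length() / 7; at least 1 so tiny texts work.
            char[] followers = new char[Math.max(1, text.length() / 7)];
            int n = 0; // number of followers stored so far
            int from = text.indexOf(nGram);
            // Assumption: stop collecting once the array is full.
            while (from >= 0 && n < followers.length) {
                int next = from + nGram.length();
                if (next < text.length()) {
                    followers[n] = text.charAt(next);
                    n++;
                }
                from = text.indexOf(nGram, from + 1);
            }
            if (n == 0) {
                break; // Assumption: stop when the nGram has no followers.
            }
            char ch = followers[generator.nextInt(n)];
            out.append(ch);
            // Drop the first character of the nGram and append ch.
            nGram = nGram.substring(1) + ch;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String text = "the cat sat on the mat and the hat sat on the cat ";
        System.out.println(generate(text, "the", 40, new Random()));
    }
}
```

Because a follower that occurs more often appears more often in the array, picking a uniformly random index selects each follower in proportion to its frequency after the nGram.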

Grading Criteria 100pts (subject to change)

Turn in one file--RandomWriter.java--to the D2L DropBox named RandomWriter

____/ +100 Generates text that gets closer to the input text as the nGram length increases (subjective).

For example, when the nGram length is 2, a few correctly spelled words may appear; when it is 15,
some entire sentences match the original text.

-10 If you do not include your first and last name at the top of the file RandomWriter.java

-10 If we can't run the program by running RandomWriter.java as a Java Application

Include a main method inside the class with your constructor and instance variables

-10 If we have to edit code to run the file RandomWriter.java

-10 If you do not have alice.txt and grim.txt input files included in your Eclipse project

-10 If line breaks do not occur approximately every 60 characters with no word breaks

-10 If you do not use Zachary's algorithm to create a new array of followers for each character printed.