Lab 12: Constructing a Concordance (Final Graded Lab)
Solution: MakeConcordance.java
Solution: CEntry.java

Objectives

The objectives of this lab are:

The Problem

A concordance is an alphabetical list of the words in a book. For this exercise, each concordance entry will consist of the word, the number of times it occurs in the book, and, for each occurrence, its line number and its position among the words on that line. For example, the concordance entry for the word "to" in the test document is:
    to(7) locations: [1,5] [10,6] [18,7] [22,7] [31,12] [33,5] [34,1]

Thus, there were a total of 7 occurrences of "to" in this document: the first was on line 1, word 5, the second on line 10, word 6, and so on. Note that the occurrences are stored in order from first to last.
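To make the [line,word] notation concrete, here is a minimal sketch of how the locations on one line could be computed. The class name LocationSketch and the method locate are hypothetical, and the rule for what counts as a word (a run of letters or apostrophes, lowercased) is an assumption, not a requirement of the lab. Both line numbers and word positions start at 1, as in the example entry above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of the lab's provided code.
class LocationSketch {

    // Returns one "word[line,position]" string for each word on the given line.
    // Assumption: a word is a maximal run of letters or apostrophes, lowercased.
    static List<String> locate(String line, int lineNumber) {
        List<String> found = new ArrayList<>();
        int position = 0;  // word positions are 1-based, so increment before use
        for (String token : line.toLowerCase().split("[^a-z']+")) {
            if (token.isEmpty()) {
                continue;  // split() yields "" for a blank line or a leading separator
            }
            position++;
            found.add(token + "[" + lineNumber + "," + position + "]");
        }
        return found;
    }

    public static void main(String[] args) {
        // Prints [it[1,1], was[1,2], the[1,3], best[1,4], of[1,5], times[1,6]]
        System.out.println(locate("It was the best of times,", 1));
    }
}
```

Lowercasing here means "To" and "to" would share one entry; whether that is the desired behavior is a design decision for your own program.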

Download the file taleshort.txt, which contains a few paragraphs from Dickens's well-known novel A Tale of Two Cities (or the entire book, Tale of Two Cities). Write a Java program that will construct a concordance for any book (stored as an ASCII text file) named on the command line. After the program constructs the concordance, it should repeatedly prompt the user to enter a word and display the entry for that word. Here is some sample output:

$ java MakeConcordance taleshort.txt 
Input a word to look up (or just hit <RET> to quit): to
to(7) locations: [1,5]  [10,6]  [18,7]  [22,7]  [31,12]  [33,5]  [34,1]  
Input a word to look up (or just hit <RET> to quit): westminster
westminster(1) locations: [27,6]  
Input a word to look up (or just hit <RET> to quit): impeach
That word does not occur in this book
Input a word to look up (or just hit <RET> to quit): 
$ 
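The lookup behavior in the sample output can be sketched with a sorted map keyed by word. This is only one possible design, not necessarily the one used in the solution: the class name LookupSketch and the choice of java.util.TreeMap are assumptions, and the entries are stored as preformatted strings purely to keep the example short.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; a real program would map each word to an entry object.
class LookupSketch {

    // Returns the stored entry, or the "not found" message from the sample output.
    static String lookup(Map<String, String> concordance, String word) {
        String entry = concordance.get(word.toLowerCase());
        return entry != null ? entry : "That word does not occur in this book";
    }

    public static void main(String[] args) {
        // A TreeMap keeps its keys in sorted order, which matches the
        // "alphabetical list" definition of a concordance.
        Map<String, String> concordance = new TreeMap<>();
        concordance.put("westminster", "westminster(1) locations: [27,6]");
        concordance.put("to", "to(7) locations: [1,5] [10,6] [18,7] [22,7] [31,12] [33,5] [34,1]");

        System.out.println(lookup(concordance, "westminster"));
        System.out.println(lookup(concordance, "impeach"));
        System.out.println(concordance.keySet());  // prints [to, westminster]
    }
}
```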

Strategy/Approach

Here are some questions to think about:

Sub Task: Reading a Text File

You should be able to adapt the code you developed last week for this task. In this case, however, you need to be able to read each line separately. Here's sample code that reads a text file line by line:
// Read a file and print its lines
import java.io.*; // Import Java IO classes
...
try {
    File f = new File(args[0]);
    InputStreamReader iStream = new InputStreamReader( new FileInputStream(f));
    BufferedReader reader = new BufferedReader(iStream);
    String inString = reader.readLine();
    while (inString != null) {
        System.out.println("LINE:" + inString);
        inString = reader.readLine();
    }
} catch (FileNotFoundException e) {
    System.err.println("Error: File " + args[0] + " not found");
    e.printStackTrace();
} catch (IOException e) {
    System.err.println("Error: I/O exception");
    e.printStackTrace();
}
...

Sub Task: Command-line Input

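For command-line input, you can use a java.io.BufferedReader or java.util.Scanner object. Here are the basic statements you need for a BufferedReader (note that readLine() can throw an IOException, so the call must appear in a try block or in a method that declares the exception):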
BufferedReader reader = new BufferedReader(new InputStreamReader(System.in));
String inString = reader.readLine();

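To repeatedly read the user's input, put the call to readLine() in a loop that exits when it returns an empty line. Note that readLine() returns null, not an empty string, at end of input, so the loop should check for that case as well.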
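A minimal sketch of such a loop follows. The class name PromptLoopSketch and the method run are hypothetical, and the "You entered" line is a placeholder for the real concordance lookup. Wrapping the IOException in an UncheckedIOException is just one way to handle it; a try/catch that prints an error message, as in the file-reading sample, would work equally well.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;

// Hypothetical sketch of the prompt loop; not the lab's solution.
class PromptLoopSketch {

    // Prompts until the user enters an empty line or input ends.
    // Returns the number of words entered.
    static int run(BufferedReader reader) {
        int count = 0;
        try {
            while (true) {
                System.out.print("Input a word to look up (or just hit <RET> to quit): ");
                String inString = reader.readLine();
                // null means end of input; an empty line means the user just hit <RET>
                if (inString == null || inString.trim().isEmpty()) {
                    break;
                }
                count++;
                // Placeholder: a real program would look the word up here.
                System.out.println("You entered: " + inString.trim());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) {
        run(new BufferedReader(new InputStreamReader(System.in)));
    }
}
```

Passing the reader in as a parameter, rather than creating it inside the method, makes the loop easy to try out with a java.io.StringReader in place of System.in.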

Sub Task: Concordance Entries

Design an appropriate Java class to store each concordance entry. This class should be a revision of last week's WordFreq class (or an extension of it). It needs to store the word, its count, and a list of the locations of its occurrences. Here's how my solution displays an entry:
    to(7) locations: [1,5] [10,6] [18,7] [22,7] [31,12] [33,5] [34,1]
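As a rough illustration of the kind of class described here, the sketch below stores the word privately, keeps its locations in insertion order, and formats itself through toString(). The class name EntrySketch, its method names, and the use of an int[] pair for each location are all assumptions made for this example; it is not the solution class and does not build on WordFreq.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical entry class for illustration only.
class EntrySketch {
    private final String word;
    // Each element is a {line, wordPosition} pair, kept in order of occurrence.
    private final List<int[]> locations = new ArrayList<>();

    EntrySketch(String word) {
        this.word = word;
    }

    void addLocation(int line, int position) {
        locations.add(new int[] {line, position});
    }

    // The count is derived from the list, so the two can never disagree.
    int getCount() {
        return locations.size();
    }

    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder(word + "(" + getCount() + ") locations:");
        for (int[] loc : locations) {
            sb.append(" [").append(loc[0]).append(",").append(loc[1]).append("]");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        EntrySketch entry = new EntrySketch("to");
        entry.addLocation(1, 5);
        entry.addLocation(10, 6);
        System.out.println(entry);  // prints to(2) locations: [1,5] [10,6]
    }
}
```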

Grading

You will be graded on whether your program works correctly and efficiently, is well designed, uses appropriate data structures and algorithms, is well documented, and is completed within the lab period. Among the design considerations that I will be looking for are whether you make proper use of various object-oriented concepts and principles, such as the toString() method, the distinction between public and private, and so on. Because there is always a chance that you may not completely finish the project, you should document your code as you go. That way you can receive partial credit for documentation.

You're done. Great work!