CPS 100 FALL 1996: Assignment #4

Due: Monday, Oct. 28 by 8am

Last Date to Turn in: Monday, Nov. 4 by 8am

40 points

In your cps100 directory, create a directory called assign4 using the mkdir command. Change into the assign4 directory.

In order to do this assignment, you need to copy some files using the following cp command (don't forget the trailing period, or dot):

       cp  ~rodger/cps100/assign4/*  .

This command will copy files into your directory for you to use. If you type ls you should see the following files: Makefile, ladder.cc, ladder.h, ladderq.cc, QueueAr.h, QueueAr.cc, and template.cc.

For this assignment, you'll be using a database of English five-letter words from the Stanford GraphBase (knuth.dat, a list compiled by Don Knuth). This list has about 6,000 words in it. There is a smaller file of data to work with also called words5.dat.

In your ladder directory, create a link to these data files by typing:

   ln -s ~rodger/data/knuth.dat  knuth.dat
   ln -s ~rodger/data/words5.dat  words5.dat

Then you can access these files without specifying a long path-name for the files.

For the programming problem that follows, you should use the style rules discussed in class, which includes meaningful variable names, indentation, and comments (pre and postconditions) at the top of the file and for each function. Also include your name, date, course, and purpose in a comment at the top of the program.


Problem: Word Ladder: Turning Stone into Money

The input to the program is the name of a word file. The user should then be prompted for two words of the same length (5 characters). The output is a sequence of words in which consecutive words share all but one letter, starting with the first word and ending with the last. One letter can be changed to another letter only if the resulting symbols form a valid word. For example, to turn stone into money, one possible ladder is (replace 't' by 'h', replace 'o' by 'i', etc.):

stone shone shine chine chins coins corns cores cones coney money

All of these words can be found in the Knuth file and in a dictionary. The user will continue to enter words, and word ladders searched for, until either of the words entered is NOT 5 letters in length.


Assignment: Part I

You are to write a program ladderq.cc that uses a file of 5-letter words to find the shortest ladder from one word to another using a process outlined below. You must develop a class to do this, the class Ladder has been started for you, but you will need to add more member functions (both public and private).

Your program should:

A sample run: > ladderq Enter filename: words5.dat Enter two 5-letter words (length != 5 to end): smart brain Here is the ladder: smart start stark stack slack black blank bland brand braid brain Enter two 5-letter words (length != 5 to end): angel devil There is no path from angel to devil Enter two 5-letter words (length != 5 to end): no more >

The file knuth.dat has extraneous information in it. Ignore lines that begin with *, and only process the first 5 characters on other lines. Knuth asks that the file not be altered, hence these restrictions. Code to read this file is included as the member function LoadWords, already written for you to use.


Algorithm

To find the shortest ladder, you should use the templated Queue class provided (QueueAr, the modified Queue class from the Weiss book) First, store all of the words from the file in a vector of type Wnode * (this is done for you in LoadWords).

struct Wnode { string word; Wnode * prev; }; (this assumes the use of the class string from CPstring.h. You can use some other kind of string if you want).

A ladder is found by putting the starting word (or rather a pointer to it) on the queue, then putting all words 1 letter away from this word on the queue, then putting all words 2 letters away on the queue, then all words 3 letters away, etc. As each word is taken off the queue, if the last (target) word is found the process can stop (there may be other words on the queue, but they'll be ignored).

A Word w isn't actually stored on the queue, a pointer to a struct containing w is stored. The other field of the struct is a pointer to the word that is one letter away from w and that caused w to be put on the queue (the word's predecessor). For example, if w is house, then pointers to structs containing mouse, louse, douse, horse (and so on) are enqueued with each struct pointing to house since this word preceeded the others and caused them to be enqueued. The first word doesn't have a predecessor. It's field cannot be 0/NULL since this is used for another purpose. An easy fix is to make the pointer self-referential, it points to the struct itself (and this will need to be checked when printing ladders).

More Details

The first word (entered by the user) is looked up in the list of words, and a pointer to the struct containing the word is enqueued. For extra credit your program should be able to handle a first word even if the word is NOT in the list of words (all other words in the ladder, except perhaps for the last, must be in the list of words).

Put a pointer to the struct containing the word onto the queue (it's a queue of Wnode pointers). Then repeat the dequeue/enqueue process below.

Dequeue an element (it's a pointer). Find all words one letter apart from the dequeued word. If one of these is the target word, you're done (or if one of the words is one apart from the target word you're done, you can stop early). Otherwise enqueue each of the words found if it hasn't been queued up before (you can use the prev pointer fields in a Wnode to determine if a word has been enqueued before --- initially all prev fields should be set to 0/NULL, this helps determine if a word has been enqueued before). This means each word is enqueued at most once.

When the target word is derived, you'll need to print out the ladder from the first word to the target word. The prev pointer in the Wnode stores information that will allow the ladder to be recreated, you may need to use recursion or a vector since the ladder will be backwards (but should be printed properly). Alternatively you can store the words in an array/vector and print them out in reverse without using recursion, but using a loop over the vector.


Ladder Member Functions

You must implement the functions described below. You'll find it useful to implement other member functions. Sometimes the functions should be private. This is the case when a member function is a helper function for other member functions, but shouldn't be called by the user. Making a helper function private ensures that only other member functions can access the helper function, but client programs cannot.

You'll probably find it useful to write a function IsOneApart() that is used to determine if two strings are one letter apart. To do this, count the letters that are equal. If this is one less than the total number of letters in the words, the words are one apart. This function does NOT need to be a member function, it has two strings as parameters (const reference) and returns true if the strings are one letter apart. You can just define this function in ladder.cc and use it there.

You'll probably want debugging code/member functions to verify what's going on. If you build helping/debugging member functions into your class you'll save time in the long run since the member functions can be used to help debug code.

You may want to write a separate function to find a word in the vector of words (pointers to words) read in. You can write this code inline (rather than making a function out of it), but the function can be useful in debugging and developing.

It is advisable that you test out each function you write to make sure it works correctly. In order to do so, it would be very helpful to create a small data file with about 8 words containing a small ladder of size 4 in it for testing.

Using Templates

To use the templated Queue class you'll need to do use a file called template.cc. Template code needs to be seen by the compiler. To this end, all .h and .cc files are #included in a separate file template.cc. This file is illustrated below. #include "QueueAr.h" #include "QueueAr.cc" #include "ladder.h" template class Queue<Wnode *>;

If you want several kinds of queues, just put another definition in the template.cc file. Once the template.cc file is compiled to template.o, you only need to relink, not recompile, every time you make a change in ladder.cc. This will make your recompiles much faster.


Extra Credit (5 pts)

Write a new version of ladderq.cc called ladXtra.cc. This program should process the words so that only "good" matches are tried when ladders are found. The preprocessing step will take a long time, but word ladders will be found very quickly.

The idea is that for each word, all words one-letter away are determined (and stored somehow) when the words are loaded. When looking for candidate words to enqueue, only words that are one-letter away (these are already known) are checked for previous use. This saves searching through the entire list of words and checking whether each is one letter away.


Submitting Program

When your programs compile and produce the correct output, create a "README" file (please use all capital letters). Include your name, the date, and an estimate of how long you worked on the assignment in the "README" file. You must also include a list of names of all those people (students, prof, tas, tutor) with whom you consulted on the assignment. See the rules for collaboration in the CPS 100 syllabus.

To submit your programs electronically type (leave off ladXtra.cc if you didn't do the extra credit):

   submit100 assign4 README ladderq.cc ladder.h ladder.cc template.cc ladXtra.cc

You should receive a message telling you that the program was submitted correctly. If it doesn't work try typing ~rodger/bin/submit100 in place of submit100 above.

You can submit by typing make submit (or make submitX if you did the extra credit program) if the correct README file is in the directory from which you submit. You can always edit the Makefile command submit or submitX to add or change filenames.