In addition to following the above specification, your program should be robust, i.e. deal with errors in a calm and reasonable manner. For example, if a file that the user specifies doesn't exist, acceptable behavior does not include dumping garbage to the screen, segfaulting, or berating the user mercilessly. Gentle admonishment of the form "jabberwock.txt does not exist" is sufficient.
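As a minimal sketch of the "gentle admonishment" described above, the following Python helper turns a failed open into a short message instead of a crash. The name try_open and the exact wording of the second message are assumptions made for illustration; only the "does not exist" form comes from the specification.

```python
def try_open(path):
    """Return (file, None) on success or (None, message) on failure.

    Hypothetical helper: the caller prints the message and moves on to
    the next file rather than aborting.
    """
    try:
        return open(path, "r", encoding="utf-8", errors="replace"), None
    except FileNotFoundError:
        return None, f"{path} does not exist"
    except OSError as exc:
        # Covers directories, permission problems, and similar failures.
        return None, f"{path} cannot be read: {exc.strerror}"
```

Attempting the open and handling the failure avoids the race that a separate "does it exist?" check would leave between the check and the open.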
grepple - interactive search of file(s) for words
grepple [ -v ] [ -r ] file1 [ file2 ... ]
- grepple reads the given files/directories and allows the user to interactively query the resulting database using any of the commands listed in USAGE below. For example (grepple! is the grepple prompt):
> grepple jabberwocky.txt alice.txt
grepple! find brillig
-v
- (verbose) Indicates that the program should also print the line where the word occurred. In the above example:
> grepple -v jabberwocky.txt alice.txt
grepple! find brillig
jabberwocky.txt,5:'Twas brillig and the slithy toves
jabberwocky.txt,35:'Twas brillig and the slithy toves
-r
- (recursive) Indicates that any directory named on the command line should be searched recursively, including its subdirectories, subsubdirectories, and so on. The default behavior is for directories to be searched for files, but not recursively searched.
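The difference between the flat default and the recursive search might be sketched as follows. The function name collect_files and its recursive parameter are hypothetical, and this is one possible reading of the behavior, not a required design.

```python
import os

def collect_files(path, recursive=False):
    """Return a sorted list of the files named by path.

    A plain file yields itself. A directory yields the files directly
    inside it, plus files in all nested subdirectories when recursive
    is true. Anything else yields an empty list.
    """
    if os.path.isfile(path):
        return [path]
    if not os.path.isdir(path):
        return []
    found = []
    if recursive:
        # os.walk visits the directory and every directory below it.
        for dirpath, _dirnames, filenames in os.walk(path):
            found.extend(os.path.join(dirpath, name) for name in filenames)
    else:
        # Flat search: look only at the directory's immediate entries.
        for name in os.listdir(path):
            full = os.path.join(path, name)
            if os.path.isfile(full):
                found.append(full)
    return sorted(found)
```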
- Here are the commands grepple accepts:
find word
- Find word in the database and print the results. A word is defined as any whitespace-delimited sequence of characters, and there is no substring matching. Therefore, "find they" will not match the word "they'll".
Multiple occurrences of a word in a line will be printed only once.
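The lookup rules above (whitespace-delimited words, exact matches only, one result per line) could be sketched like this in Python. The names index_lines and find are hypothetical, storing the line text in memory is only one of the possible designs, and the non-verbose output format (file name and line number only) is an assumption, since only the verbose format is shown in this specification.

```python
from collections import defaultdict

def index_lines(filename, lines, index):
    """Record every whitespace-delimited word of lines in index."""
    for lineno, line in enumerate(lines, start=1):
        # set() collapses repeats, so a line is recorded once per word.
        for word in set(line.split()):
            index[word].append((filename, lineno, line.rstrip("\n")))

def find(index, word, verbose=False):
    """Return the output lines for an exact-word lookup."""
    hits = index.get(word, [])
    if verbose:
        return [f"{name},{lineno}:{text}" for name, lineno, text in hits]
    # Assumed non-verbose format: file name and line number only.
    return [f"{name},{lineno}" for name, lineno, _text in hits]
```

Because the table is keyed on whole words, "they" and "they'll" are simply different keys, so no substring matching can occur.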
read file1 [ file2 ... ]
- Read the given files, and insert them into the database. Directories are flat-searched, i.e., only the files within the directory are inserted into the database.
recread file1 [ file2 ... ]
- Read the given files, and insert them into the database. Directories are recursively searched (as with the -r option).
unread file1 [ file2 ... ]
- Delete the given files' info from the database. Directories are flat-unread: only files in the directory are deleted, and no subdirectories are read or deleted.
recunread file1 [ file2 ... ]
- Delete the given files' info from the database. Directories are recursively unread.
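To illustrate what deleting a file's info can involve, here is a rough sketch that assumes a single table mapping each word to a list of (filename, line number) pairs. Both the layout and the name unread_file are assumptions for illustration; other layouts make this operation cheaper.

```python
def unread_file(index, filename):
    """Remove every entry for filename; drop words left with no entries."""
    # list(index) copies the keys so entries can be deleted while looping.
    for word in list(index):
        remaining = [hit for hit in index[word] if hit[0] != filename]
        if remaining:
            index[word] = remaining
        else:
            del index[word]
```

With this layout, unreading one file touches every word in the database, which is worth weighing when choosing a data structure.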
- Set the given option. The only currently supported option is
- Unset the given option. See the set command for supported options.
- Quit grepple.
You will want to think carefully about what types of data structures you will use to hold the information that you need.
For example, will you use one hash table to hold all the word information? Will each word have one entry in the hash table, or will each occurrence of a word have an entry? Or will there be one entry per word per file? Or do you want a separate hash table per file? Or ...
More questions: Do you store the text of the lines in your database, or do you go search the files on disk on the occasions that the user asks for verbose output? There are a number of tradeoffs that may influence your decisions. First of all, certain schemes will use more memory than others, and this may affect the maximum number of words/files you can handle. On the other hand, certain schemes will provide faster lookups at the expense of higher memory usage or bigger startup costs (i.e. when reading in a file). One method may make it easy to perform the "unread" operation, but make it harder to do word lookups.
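One of the tradeoffs mentioned above can be made concrete with a sketch of the "separate hash table per file" option. The class name PerFileIndex and its methods are hypothetical, and this is only one point in the design space, not a recommended answer.

```python
class PerFileIndex:
    """One word table per file: unread is cheap, find is slower."""

    def __init__(self):
        self.files = {}  # filename -> {word -> [line numbers]}

    def read(self, filename, lines):
        table = {}
        for lineno, line in enumerate(lines, start=1):
            # set() ensures each line is listed once per word.
            for word in set(line.split()):
                table.setdefault(word, []).append(lineno)
        self.files[filename] = table

    def unread(self, filename):
        # Dropping one dictionary entry removes the whole file.
        self.files.pop(filename, None)

    def find(self, word):
        # Every file's table is probed, so cost grows with the file count.
        return [(name, lineno)
                for name, table in self.files.items()
                for lineno in table.get(word, [])]
```

Here line text is not stored at all, so verbose output would have to re-read the file from disk, trading memory for slower verbose queries.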
To further influence your decision, here are some example extensions that you may be asked to implement in the future:
- Save the current database to file indexfile.
- Load a grepple database from the file indexfile.
- List the names of files in the current database.
- Give some statistics about the current database.
- Find all words that match pattern (where pattern is a regular expression).
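As a hint at how the pattern-matching extension interacts with the database design, the sketch below assumes the database keys are whole words and that "match" means the pattern must match the entire word. Both points are assumptions, as is the name find_pattern.

```python
import re

def find_pattern(index, pattern):
    """Return the sorted words in index that pattern matches in full.

    Returns None when the pattern is not a valid regular expression,
    so the caller can report the problem politely.
    """
    try:
        compiled = re.compile(pattern)
    except re.error:
        return None
    # A hash table cannot be probed by pattern, so every key is tested.
    return sorted(word for word in index if compiled.fullmatch(word))
```

Note that a design keyed on words makes this a scan over all distinct words, while a design keyed on occurrences would make it a scan over every occurrence.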
There is no one right answer. Obviously, it is impossible to address all of the above concerns in the most satisfying manner. The idea is to design your code in such a way that it is easily maintainable, easily extensible, and readable.
Grading criteria for this assignment will be based on:
An explanation of what to turn in for the final deliverables is available.