Elementary Patterns: ChiliPlop 98 Hot Topic Proposal

Owen Astrachan, ola@cs.duke.edu

This proposal for participation in the ChiliPlop '98 hot topic Elementary Patterns and their Role in Instruction is in two parts. In the section on current practice I describe how we currently use patterns in our three course sequence introduction to programming and computer science at Duke University. In the section on proposed pattern use in CS1 I outline a proposal for introducing patterns differently in a a first course in Computer Science. The intent of this proposal is both to make the first course a better introduction to programming and design for students and to set the stage for a more in-depth study of patterns in subsequent courses.

Current Practice in Using Patterns at Duke

In brief, we currently use a spiral introduction to patterns. We show students design patterns as part of a library of classes we provide and that students use in the first two courses. We explicitly study design patterns in our third course; all these courses are described below. The other feature of our introduction is to show patterns in context: solving problems in programs that students understand at their current level of program-understanding. We have had to develop several program frameworks to support this context, we expand on this below and in the future directions section.

We have a three course sequence introducing students to programming and computer science. Each course and its core is summarized below:

Course Purpose
CPS 06 Introduces students to programming in C++ and to computer science. Students use a library of classes on which they build their programs, and generally implement a design provided to them, or implement a piece of a larger program, rather than designing and implementing programs from scratch. A string and vector class are used rather than their "raw" C++/C counterparts.
CPS 100 Continues the study of programming, again in C++. Data structures and analysis of algorithms are covered as are more object-oriented programming concepts such as inheritance. Again, most of the programs rely on provided infrastructure in terms of classes, design, or scaffolding on which students build.
CPS 108 This is a course in software design and implementation. Roughly the first half uses C++, then Java is used. This is a course organized around team projects with much larger programs than in the first two courses. Students typically implement two to three 1,000-2,000 line programs with 10-15 classes, and one 5,000 line program, often with 100 or more classes. The emphasis in the course is on design, but this is also the first course in which students see some low-level details of C++ (e.g., arrays allocated with new rather than using a vector class).

A Gentle, Spiral Introduction to Design Patterns

In these courses we introduce several of the GOF [1] design patterns. However, we introduce them using a spiral approach in which a pattern is used in the first two courses, but might not be identified as a pattern although it is still named. Currently, CPS 108, the course in software design, is the first place we explicitly talk about patterns using GOF nomenclature and the Alexandrian style of using forces and consequences of pattern use.

In the spiral approach, a design pattern is introduced and used in programs and classes in the first course. A name is associated with the pattern, but to date we have not discussed why the pattern/solution is appropriate (see future directions). The pattern is used again in the second course, often in a new context, but to solve the same problem for which the the pattern was originally introduced. Our intent is that students see the pattern, and subsequently patterns in general, as a normal part of their programming and design toolkit; not as a special topic discussed as advanced, extra, or somehow different than core programming. In our software design course, we use a more rigorous introduction, using either GOF [1] or Buschmann et al [3] as a resource. Since we have built a concrete context on which to discuss specific patterns in our first two courses, the more abstract and general concept of design patterns is more accessible to students. In summary, the approach we use is

A Specific Example of Design Pattern Use

For example, the text for our CS1 course, A Computer Science Tapestry[2], uses a list class, a class for reading one word at a time from a file, and a class for simulating a random walk. Each of these classes uses an iterator to process the elements in the respective container, e.g., list elements, words, and steps in a random walk. For example, an extremely inefficient method for finding the most frequently occurring word in a text file is shown in the code that follows:
int main() { int max = 0; string word,maxWord; string filename = promptString("enter file name: "); WordStreamIterator outer,inner; outer.open(filename); // open two iterators inner.open(filename); for(outer.first(); ! outer.isDone(); outer.next()) { word = outer.current(); // current word for comparison int count = 0; // count # occurrences for(inner.first(); ! inner.isDone(); inner.next()) { if (inner.current() == word) // found another occurrence { count++; } } if (count > max) // maximal so far { max = count; maxWord = word; } } cout << "word \"" << maxWord << "\" occurs " << max << " times" << endl; return 0; }

Once students understand iteration, we can discuss the code above. We do not discuss the implementation of the WordStreamIterator class since this requires somewhat esoteric knowledge of resetting and rereading streams. When we get to arrays, we can study how to make this code more efficient. However, we have designed the course to emphasize conceptual understanding before concerns of efficiency. The key point here is that students understand the first, isDone, next pattern for iterating without knowing how the code supports the iteration.

In our second course, we use a map class with an external iterator that uses the same naming conventions. The code below uses the same class WordStreamIterator, in conjunction with a hash-table based map class, to count occurrences of all words in a file.


int main() { string filename = promptString("enter a filename: "); WordStreamIterator wstream; wstream.open(filename); string word; HMap<string,int> hmap(Hash,7001); for(wstream.first(); ! wstream.isDone(); wstream.next()) { word = wstream.current(); if (hmap.includes(word)) hmap.getValue(word)++; else hmap.insert(word,1); } Iterator<Pair<string,int> >* iter = hmap.makeIterator(); for(iter->first(); ! iter->isDone(); iter->next()) { cout << iter->current().first << "\t" << iter->current().second << endl; } delete iter; }

The essential point here is that we use the same iterator pattern, but in a new way: as part of an inheritance hierarchy of abstract and concrete iterator classes. Note that the makeIterator function above is an example of the GOF factory pattern. When we initially show this example to students we do not talk about the factory pattern, but we highlight the creation of a concrete HashIterator internally within the hash map class that appears to clients as an abstract Iterator.

Finally, in our third course, we discuss how iterators are implemented, the trade-offs between internal and external iterators, and the memory-responsibility problem of the makeIterator factory method (solved by using a stack-based pointer Proxy class, where Proxy is another GOF pattern). We expect students to develop iterators on their own to solve problems that arise in conjunction with programming projects they design and implement.

Software Cadavers

The second important aspect of our methodology is a suite of programs and program frameworks we call software cadavers [4]. Fist year medical students use human cadavers to study and explore anatomy. The same cadaver is used for a large part of the first year facilitating exploration of new areas in a familiar setting. Students use the cadaver to study areas beyond pure anatomy, but the anatomy is a springboard to related topics and provides a framework for other study.

We use software cadavers in the same way: we introduce programs and program frameworks to provide a setting in which students explore programming and design patterns. In our current use of these program frameworks (aka software cadavers) we have emphasized building a structure that facilitates the study of design patterns (but see the future directions section for some ideas on how other patterns can be incorporated).

Conceptual Cadavers We use conceptual cadaver to mean a concept for a program that we implement in several different ways, or a basic idea that we use in different ways. The idea is repeated in one course and across courses. For example, in the word-reading code above we read words from a text file. In the next course we read words from a text file but use a map to store them. Processing textfiles (e.g., as accessible from Project Gutenberg [5] as large sources of data facilitates discussions of trade-offs in algorithms, data structures, and program design. For example, in the 1998 offering of our CPS 108 course one of the student projects develops file-based and web-based search engines (e.g., see Glimpse [6]). This requires iterating over a source of words. The source can be a file or a web-page specified by a URL, with the right interface the differences should be minor or non-existent. Our students use a factory to return the word source, an iterator to access the words, an adaptor to ensure the URL source and the file source have the same interface, and so on.

Concrete Cadavers Our conceptual cadavers are designed to provide a familiar setting in which to study GOF (and other) design patterns. Students, however, need concrete examples rather than abstract explanations. To this end we have developed pattern software cadavers. These are working programs that provide a framework for seeing patterns in action.

We are working on developing several software cadavers, at this point we have three developed, two in active use. We use a C++ program that models the game of roulette to show OO concepts and idioms in practice, we omit a detailed coverage of this software framework here. Our most useful program is both a conceptual and concrete cadaver. It builds on a program for manipulating bitmap images that we use in our introductory courses and described in [7] written in C++. As we use currently use the program it is object-based rather than object-oriented. However, in our CPS 108 course we introduce an object-oriented, pattern-based version of the program in C++. The idea of using the same program before and after [8] patterns have been used has worked very well. This version uses several GOF patterns including Command, Adapter, and Model/View/Controller. A class diagram is shown below

We use a Java version of the pixmap program in our CPS 108 course to introduce the language, supplying another conceptual connection to a familiar setting. A screen dump from the Java program is shown below. In addition to the patterns employed in the text version, the Java version uses the Observer/Observable pattern.

We are in the process of developing a C++ version very close to the Java version, but which uses the Qt widget set.


Future Directions: Proposed Use of Patterns in CS1

To date, our use of patterns has focussed on design patterns. Since our introductory courses are built on an apprentice programming model [9], [10] we do not expect our students to use the design patterns we introduce in building new programs. Rather, we expect students to extend and modify existing programs before building and designing their own programs. However, in this process students work at becoming better programmers. To this end we propose to use programming patterns in a manner similar to what is outlined in [11].

The main idea in our proposed direction for the first course is for students to recognize and choose appropriate patterns for solving programming problems. We want students to become accustomed to thinking about the forces and consequences that exist in a problem, and to choose a pattern to help solve the problem. When confronted with a programming problem students often have trouble starting to derive a solution. We propose that an appropriate pattern tool-kit will assist in starting the programming process and direct students towards appropriate solutions. However, there cannot be too many patterns or students will think that programming is about looking through a book of solutions to find the right one. The hard work here is in developing ten to fifteen programming patterns (this number may be too high, it may be that five to ten is more appropriate). These core patterns will help students develop low-level programming expertise rather than higher level design expertise which we believe will come in later courses. In some ways, very basic code-design patterns such as Simply Understood Code are about designing code rather than designing programs. These simple patterns are as important as the programming patterns we have in mind. In [11] Wallingford provides some example of the kinds of patterns we plan on developing and using.

We propose the following pattern, outlined in [4] as the kind of programming pattern we hope to build a course around.

Consider the problem of processing sequential data, e.g., all the values in an array, or the values in a file. When the processing is complex, beginning students often try to use two loops where one loop leads to code that is easier to understand and easier to develop correctly. To see this problem in context, ask students to write code to remove all zeroes from an array, leaving the order of the other elements unchanged, i.e., to change the array on the left below to the array on the right.

In our experience, most students write one loop to process all the array entries, and one loop to try to find and remove zeros. This invariably leads to off-by-one bugs, problems running off the end of the array, and problems with boundary cases (e.g., first zero, last zero). A simple loop with an if statement tracking the last non-zero index leads to short, concise code; see [13]. A similar example is the partition part of quicksort. Most approaches use an outer loop with two inner loops: one moving from left-to-right and the other from right-to-left. It is easy to get this loop wrong, and it is not as amenable to optimizations needed in production quicksorts as the simple one-loop with an in if statement outlined in [15]. As another example, consider processing run-length encoded representations of black and white pixel data for rectangular images, e.g., 4 1 2 3 represents 0000100111. With no guidance, students often try to write nested loops to fill a matrix, rather than a single loop to read the file and populate the matrix. Captured as a pattern we might call this problem One Loop for Linear Structures and categorize it briefly as

Algorithmically, a problem may seem to call for multiple loops to match intuition on how control structures are used to program a solution to the problem, but the data is stored sequentially, e.g., in an array or a file (or even a matrix). Programming based on control leads to more problems than programming based on structure.

Therefore, use the structure of the data to guide the programmed solution: one loop for sequential data with appropriately guarded conditionals to implement the control.

A Pattern-based Course

Rather than building a course around syntactic properties of a language, we propose to build a course around problems whose solutions are cast as patterns. At first these patterns may be as simple as Simply Understood Code[12] --- the use of patterns is, we feel, as important as the patterns used. We are certainly not advocating casting everything into a pattern; we do propose that providing a name, forces, consequences,and examples of each pattern is important. Since we propose to introduce language issues, including syntax, as part of the solution to particular programming problems, we may introduce a pattern before it has been used. In this case we propose to not name the pattern initially, but offer it as part of the solution to a problem. The second time the pattern is used, it will be named and catalogued (here we're not sure how to catalog all the patterns used). This is the same spiral approach we outlined in the first section, but taken to a micro programming level rather than a more macro design level.

The key issues here are:

Again we propose to use both a spiral approach and software cadavers that are part of our apprentice style of learning in building our introductory courses around programming and design patterns.


Bibliography

[1] Design Patterns: Elements of Reusable Object-Oriented Software. Erich Gamma and Richard Helm and Ralph Johnson and John Vlissides. Addison-Wesley. 1995.

[2] A Computer Science Tapestry. Owen Astrachan. McGraw-Hill. 1997.

[3] A System of Patterns: Pattern-Oriented Software Architecture Frank Buschmann and Regine Meunier and Hans Rohnert and Peter Sommerlad and Michael Stal. John Wiley. 1996.

[4] Design Patterns: An Essential Component of CS Curricula. Owen Astrachan, Geoffrey Berry, Landon Cox and Garrett Mitchener. SIGCSE Technical Symposium on Computer Science Education. Atlanta, GA, February 1998. (to appear)

[5] Project Gutenberg

[6] Glimpse: a tool to search file systems.

[7] Animation, Visualization, and Interaction in CS 1 Assignments. Owen Astrachan and Susan Rodger. SIGCSE Technical Symposium on Computer Science Education. Atlanta, GA, February 1998. (to appear)

[8] Before and After. Technique discussed at the first Using Patterns Conference: (UP97).

[9] AAA and CS1 : The Applied Apprenticeship Approach to CS 1. Owen Astrachan and David Reed. SIGCSE Technical Symposium on Computer Science Education. Nashville, TN, March 1995.

[10] Application-based Modules using Apprentice Learning for CS 2. Owen Astrachan and Robert Smith and James Wilkes. SIGCSE Technical Symposium on Computer Science Education. San Jose, CA, February 1997, pp 233--237.

[11] Toward a First Course Based on Object-Oriented Patterns. Eugene Wallingford. The Papers of the Twenty-Seventh SIGCSE Technical Symposium on Computer Science Education. 1996. pages 27--31.

[12] Simply Understood Code. Richard Gabriel. Located at the wiki-wiki web.

[13]Pictures as Invariants. Owen Astrachan. The Papers of the Twenty-Second SIGCSE Technical Symposium on Computer Science Education. 1991. 112-118.

[14]Programming Pearls. Jon Bentley. Addison-Wesley. 1986.


ola@cs.duke.edu
Last modified: Tue Feb 24 22:32:22 EST 1998