Robert Stroud and Chris Phillips, statement, CS2 workshop

Robert Stroud and Chris Phillips, University of Newcastle upon Tyne, UK

Background - we were one of the first departments to adopt C++ as a first teaching language and have been teaching C++ now for 6 years on our equivalent of CS1/CS2. We don't teach inheritance in the first year but we emphasise data abstraction and show students how to use classes (and later implement their own classes) right from the start of our CS1 course. We use safe array and string classes to protect students from the usual confusions associated with C++ pointers/arrays and these have been very successful.

Java vs C++ - although Java wasn't an option when we chose C++, it would have come out badly in our evaluation criteria 6 years ago which focused on ease of using and defining ADTs. Things we liked about C++ that Java doesn't provide include parameterised types, operator overloading and value semantics. In retrospect, operator overloading is perhaps too much for CS1 students who are struggling with basic programming concepts, but it does allow an elegant unification of abstract types and built-in types. Nevertheless, it seems clear that Java would be attractive to students and would offer a welcome reduction in complexity, even if we have misgivings about some aspects of the language as computer scientists.

Values vs References - C++ is unusual in being a hybrid language that allows ADTs to have value semantics but it is different from other object-oriented programming languages in this respect and this feature of the language introduces its own complexities (e.g. member initialisation syntax which we loath and try to avoid). A CS1 course based on Java can probably avoid making an issue of this but the distinction between a value and a reference probably has to be dealt with in CS2 and we are unsure about how best to do this. Furthermore, some features of C++ have no direct analogy in other programming languages (e.g. the use of reference return types to allow an operator expression to be used as an lvalue rather than an rvalue). We find this useful to draw parallels between lists, arrays and associative arrays but perhaps this is too subtle for our students. (For example, we don't provide separate getItem and putItem operations for our List class but allow the current item to be updated in place just as the ith item in an array can be updated in place.) We would like to know how other people deal with this teaching issue.

Algorithms and Data Structures - the traditional title of a CS2 course implies a separation of topics that a truly object-oriented approach would avoid. But what is the right way to present these topics in a unified fashion? For example, it is not clear how traditional material on sorting fits into an object-oriented framework although insertion into an ordered list (not quite insertion sort) and binary search can be fitted into a treatment of ordered collections as a useful object-oriented abstraction. Providing array classes with sort methods seems completely artificial to us - but writing standalone functions to sort arrays isn't object-oriented.

Trees and recursion - a recursive treatment of trees should lend itself to a recursive implementation of trees but we have found it very difficult to get this right (the C++ template system gets in the way!). Interestingly, we note that other authors have apparently had the same problem and have had to revert to using static member functions to get the recursion to work without repeating code.

Pointers - if you provide students with classes that implement linked list abstractions, then how do they get the opportunity to learn about pointers for themselves the hard way? We found that providing a very general List class meant that students didn't need to implement their own data structures which seemed undesirable. In the end, we decided to show students three different implementations of stacks and queues: an array based implementation, a list class based implementation, and a pointer based implementation (i.e. implementing the list operations explicitly). In this way, we were able to show some simple examples of pointer manipulations without hiding them inside classes. We also used the idea of lists of pointers to objects to teach students about the difference between a value and a reference (e.g. keeping two different indexes to the same set of records, sorted in different ways). This seems to be an important teaching point which is particularly relevant to an object-oriented approach but perhaps isn't as well covered by text books as it should be.

ADT vs data structure - too many books don't make a clear distinction between an abstraction and its physical representation. For example, we would see arrays, linked lists, trees and hash tables as being physical data structures that can be used to implement data abstractions such as ordered collections and keyed collections. We think it is important to teach students about this distinction. Authors who provide linked list abstractions with array subscript operators are not helping matters!

Iterators and iteration - what is the right way to deal with iteration over a data structure? Surely enumerating all the items in a collection should be a fundamental operation. Unfortunately, it's not easy to define such operations without using coroutines (c.f. Clu). After much soul-searching, we decided that our List class should have a notion of the current item (c.f. Bertrand Meyer's concept of ADTs as machines), even though we know that this is the "wrong" way to do it. We have yet to find a satisfactory definition of a List class with a separate iterator class that is both easy to use and supports updating the list via the iterator in a safe fashion (e.g. suppose you delete the last item in the list whilst a separate iterator is pointing to it). Even if we could find such an implementation (the C++ STL classes don't meet our criteria!), we would want to be able to explain how the class worked to students which is a different issue altogether.

Design notation - we use a BSI structured flowcharting notation to explain algorithms to students but we don't have a good notation for explaining data representation which is the crucial design decision for a typical class in a CS2 course. The approach we have adopted is to use pictures to relate the abstract state to the concrete representation and then explain how each operation on the abstract state has a corresponding effect on the concrete representation. Explaining the representation is the crucial issue - implementing the class is trivial if you understand the mapping from the abstraction onto the representation. How do other people teach data representation? (Showing students the C++ code directly is not an acceptable solution!)

Use of standard libraries - we have no objection in principle to using standard libraries to show students how to use standard data abstractions but the abstractions have to be usable in a comprehensible fashion (does the C++ STL pass this test?). The big problem comes when you need to explain to students how the abstraction is implemented - "industrial strength" libraries were not designed with pedagogy in mind. Whilst it might be possible to deal with implementation issues on a follow-up course, we would be concerned about "dumbing down" our CS2 course if we didn't discuss implementation at all. We don't want CS2 to become a guide to using the standard library classes. However, we have no objection to teaching use before implementation (e.g. Mark Weiss's book) and have experimented with different orders in which we teach the standard material. (Note - the implementation of our Array and String class is already beyond the understanding of CS2 students in our opinion. We'd much rather we were using a language with sensible array and string semantics, i.e. bounds checking, but that's another issue!)

Owen L. Astrachan
Last modified: Wed Oct 14 16:04:08 EDT 1998