Visual Data Structures using Java

Donald Yessick
Coastal Carolina University
Conway, SC 29528-6054
(843) 349-2834

[email protected]

ABSTRACT

This paper describes a classroom experience in which the instructor created visually interesting programs for students in a data structures course (CS-3) without adding the complexity and confusion that often accompany graphics in programming. With this teaching approach, the overall complexity of the student projects was unaffected; the students were not required to program any graphics and instead concentrated entirely on the development of the data structures. The programs created simple visual representations of data structures, which improved student satisfaction and also gave students immediate feedback during code testing.

Categories and Subject Descriptors
E.1 [Data]: Data Structures – arrays, lists, stacks, and queues, trees; K.3.2 [Computers and Education]: Computer and Information Science Education – computer science education, curriculum; D.3.3 [Language Constructs and Features]: Data Types and Structures – abstract data types, classes and objects, data types and structures, frameworks.

General Terms
Algorithms, Performance, Human Factors.

Keywords
Java, Visualization, Interface, Graphics, Testing, Project

1. INTRODUCTION AND MOTIVATION

Why can’t Johnny code? The simple answer: He doesn’t believe he needs to. Many of us have heard students say they don’t care for coding, and it is true that a student sufficiently skilled in theory can slip through as a computer science major without having to write much code[3]. A strong theoretical ability can indeed reduce coding responsibilities; one does not have to know how to code a binary tree traversal to trace one.

Teaching programming to students with diverse skill levels and little interest in coding is challenging. The Internet generation has been raised to expect immediate gratification and visual stimuli[1]. Easily bored with plain text, they know searching for preexisting solutions is faster and easier than solving any problem themselves[2].

But computer science is about code. Whatever students are using instead of their own original code (plagiarized code, software tools that do much of the work for them, or other resources), coding remains essential to understanding the essence of computational theory. And in a CS-3 data structures course, the students must understand more than just how to code; they must understand the time and space costs of code. Moore’s law fools many students into believing we live in an age of limitless abundance, which makes it harder to show that even elementary problems deserve efficient solutions. Polynomial growth algorithms seem to run as fast as linear algorithms when the data sets are traceable and the machines are screaming multi-core number crunchers. The accomplished engineer may scoff at the idea that computational time and space will ever be abundant, but students taking a CS-3 level course still have little respect for the limits of computation. Students at this level of study are often facing complexity of logic, rather than mere syntax and grammar, for the first time.

This paper presents a testing framework for student projects in a data structures for computer science (CS-3) course. The aim of this teaching approach was to assign classic programming projects in a way that challenged students to exercise their logic muscles. These classic projects were intended to build skill in logic, coding, and analysis. The new approach of the testing framework helped students to concentrate on the complex structures of data and the design patterns that flow through them.

2. THE DATA STRUCTURES COURSE

Data structures courses typically begin with linear structures and sorting algorithms, then move to more complex structures of trees and heaps, and finally incorporate a treatment of graphs and associated algorithms. Throughout, asymptotic bounds are studied and students are given their first informal immersion in big-O notation. Data structures is a class in which students are supposed to learn how to write efficient code, but all too often the students are satisfied with inefficient code that returns correct output.

Figure 1 Screen shot of five structures being tested. The top window shows a list structure and a selection of its contents. The second row shows a stack displaying the top value and a blue box representing the size of the stack. The middle shows a queue displaying the front item and a box representing the remainder of the queue. On the right there is a binary search tree; its elements are displayed as a heap using an inorder traversal rather than showing the actual tree structure. On the bottom there is a vector. Notice that the right and left halves of each window are identical; one half of the window is drawn using the instructor’s code, the other half using the student’s.

For the most part, this classroom experience followed the path of a traditional data structures course. The course lectures covered all of the topics traditionally taught. For projects, however, the instructor provided students with an application they could use to test their solutions. The test framework ran dual simulations of the various algorithms and data structures and generated a visual representation of the processes using simple graphics. It showed the student’s structure or algorithm in one half of the window and a working solution in the other half. When the two halves matched, the student knew his or her work might be done. When the two sides did not match, the student knew that the code needed to be reworked. The student was not told directly which methods were tested, and the test programs did not run a complete suite of tests. Some responsibility for testing remained with the student.
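The dual simulation described above can be illustrated without any graphics. The following is only a minimal sketch under stated assumptions: the interface `SimpleStack`, the classes `ReferenceStack`, `StudentStack`, and `ParallelCheck`, and the two-to-one mix of inserts and removes are all hypothetical, since the paper does not show the framework's actual interfaces or driver code. It applies the same random operations to two implementations and checks that the state a viewer would see (top item and size) stays identical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Objects;
import java.util.Random;

// Hypothetical interface; the paper does not list the real method names.
interface SimpleStack<T> {
    void push(T item);
    T pop();    // returns null when the stack is empty
    T peek();   // returns null when the stack is empty
    int size();
}

// Stand-in for the instructor's reference solution.
class ReferenceStack<T> implements SimpleStack<T> {
    private final Deque<T> items = new ArrayDeque<>();
    public void push(T item) { items.push(item); }
    public T pop() { return items.poll(); }
    public T peek() { return items.peek(); }
    public int size() { return items.size(); }
}

// Stand-in for a student's solution, built on a list instead.
class StudentStack<T> implements SimpleStack<T> {
    private final List<T> items = new ArrayList<>();
    public void push(T item) { items.add(item); }
    public T pop() { return items.isEmpty() ? null : items.remove(items.size() - 1); }
    public T peek() { return items.isEmpty() ? null : items.get(items.size() - 1); }
    public int size() { return items.size(); }
}

class ParallelCheck {
    // Applies the same random operations to both stacks and reports
    // whether their observable state (top item and size) stayed identical.
    static boolean sameBehavior(SimpleStack<Integer> a, SimpleStack<Integer> b,
                                int steps, long seed) {
        Random rng = new Random(seed);
        for (int i = 0; i < steps; i++) {
            if (rng.nextInt(3) == 0) {            // remove one item
                if (!Objects.equals(a.pop(), b.pop())) return false;
            } else {                              // insert one item
                int value = rng.nextInt(100);
                a.push(value);
                b.push(value);
            }
            if (a.size() != b.size()) return false;
            if (!Objects.equals(a.peek(), b.peek())) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        boolean match = sameBehavior(new ReferenceStack<Integer>(),
                                     new StudentStack<Integer>(), 1000, 42L);
        System.out.println(match ? "halves match" : "halves differ");
    }
}
```

As in the paper, a match here only means the two sides agreed on the operations that happened to run; it is not a complete test suite.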

3. STRUCTURE OF THE TESTING FRAMEWORK

Some of the test code was provided at the source level, while the parallel portion that students were required to replicate was provided only in compiled form. In this way the students were exposed to code of higher complexity than they might yet have been able to produce, but they were not required to understand or study this code, merely to integrate it with their own code base. The compiled code provided a target for them to reach and enabled the distributed code to be tested immediately by using the compiled code on both sides of the parallel execution, as seen in Figure 1. The test framework provided three major components: a graphical object, a test driver, and the instructor’s own solution to the data structure problem.
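The paper describes the graphical object as a randomly generated square, circle, or triangle with a color attribute, comparable first by shape and second by color. A minimal sketch of such an object might look like the following. All names here (`Shape`, `Tint`, `GraphicObject`) are hypothetical, and the ordering of the shapes and the set of colors are assumptions; the paper does not show the real class.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical names throughout; the paper does not show the real class.
enum Shape { SQUARE, CIRCLE, TRIANGLE }   // ordering of shapes is assumed
enum Tint { RED, GREEN, BLUE }            // color set and ordering are assumed

class GraphicObject implements Comparable<GraphicObject> {
    final Shape shape;
    final Tint tint;

    GraphicObject(Shape shape, Tint tint) {
        this.shape = shape;
        this.tint = tint;
    }

    // Builds a random object, as a test driver might when inserting.
    static GraphicObject random(Random rng) {
        Shape[] shapes = Shape.values();
        Tint[] tints = Tint.values();
        return new GraphicObject(shapes[rng.nextInt(shapes.length)],
                                 tints[rng.nextInt(tints.length)]);
    }

    // Compare by shape first; break ties by color.
    @Override
    public int compareTo(GraphicObject other) {
        int byShape = shape.compareTo(other.shape);
        return byShape != 0 ? byShape : tint.compareTo(other.tint);
    }

    @Override
    public String toString() { return tint + " " + shape; }
}

class GraphicObjectDemo {
    public static void main(String[] args) {
        Random rng = new Random(1);
        List<GraphicObject> objects = new ArrayList<>();
        for (int i = 0; i < 6; i++) objects.add(GraphicObject.random(rng));
        Collections.sort(objects);   // uses compareTo: shape, then color
        System.out.println(objects);
    }
}
```

Because the object implements `Comparable`, the same instances can be fed to sorting algorithms and search trees without any change to the student's structure code.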

Figure 2 Binary Tree. On the left a tree is drawn using a level order traversal of its nodes. On the right the actual shape and height of the tree are preserved, demonstrating how the binary tree behaves in the presence of random data. The level order tree represents the minimum height while the tree on the ...

The first component, a graphical object, was reused in every project. The graphic object was a simple, randomly generated square, circle, or triangle with a color attribute. When the class progressed to sorting algorithms and search trees, students learned that the objects were comparable first by shape and second by color.

Students called the test driver, the second component, in every project, passing the data structure currently under development as an argument. The test driver looped forever; in each iteration it randomly either added several objects to the structure or removed a single object. The students could supply a parameter to adjust the animation speed. As the program favored insertions over deletions, the students’ code could be stress-tested with long-running executions. Because the choice to insert many or remove one was random, the empty structure was occasionally tested as well.

Lastly, the instructor provided as a third component his own version of the data structure, in byte code rather than source code so that students could learn to integrate code without source. The instructor’s solution worked as a black box but gave the students something to compare their work against. An appropriate graphic was created to represent each data structure. For a stack, only the top element was shown, peeking above a rising blue block. The queue was similar but ran left to right, again with only the head visible. Figure 1 shows a run with five structures: a binary tree, a vector, a linked list, a stack, and a queue.

The students were given an interface for each structure they were required to implement, and the test program used the interface model to run the student code in parallel with the instructor’s solution. The test performed inserts and deletes and then drew the screen image according to data stored in the structures. Even the background color depended on the count of items in the structure. The students were able to detect and debug off-by-one errors immediately.

Figure 3 Bubble sort. The rectangle has been copied to temp space and is about to be swapped with the oval. Shapes are weighted by area.

To guarantee that the instructor’s binary tree and the students’ binary trees remained identical in the image, the tree graphic was drawn as a min heap using an inorder traversal. This was necessary because deleting a node with two children from a binary search tree could make the student model and the instructor model diverge, depending on whether the replacement node was taken from the left or the right subtree. As a result, the instructor-provided tree and the student-created tree could have different shapes and heights. However, because they had processed the same inserts and deletes, the count and the inorder traversal remained the same for both trees, although the preorder, postorder, and level order traversals could easily differ. The binary tree was thus displayed as a sorted min heap using the inorder traversal provided by the trees. The leftmost tree node value was painted as the root. Another drawing program, shown in Figure 2, was capable of painting the actual shape of the tree and could be used to illustrate the balancing algorithms of AVL and red-black trees.

Figure 4 Insertion sort. The circle at eleven o’clock is in temp space as it is in the middle of being placed. A rectangle has been copied into the circle’s former location, and items will be shifted until the circle’s home is found. Note the sorted portion of the ...

Algorithms were also demonstrated visually. Figures 3 through 8 show screen shots of various sorting algorithms in progress. Because the framework attached a delay to swaps and compares and provided visual clues during them, students were able to develop a greater feel for each algorithm’s efficiency and to compare algorithms that solve the same problem. Providing a visual demonstration of these algorithms is not unique, but unlike earlier demonstrations, the approach used here is not algorithm-specific and scales easily to a larger number of elements.

Figure 5 Select sort. Select sort has fewer swaps and the demonstration can bear a larger dataset of 500. Again, note the sorted region sweeping in. The raised pieces reflect the search for a minimum value.

4. RESULTS

This classroom experience was presented during a four-week data structures course. The students reported satisfaction in seeing the visual demonstration of their efforts. During debugging, the instructor was often able to provide insight into a student’s error by watching the program run. Students quickly learned that off-by-one errors could be easily detected by watching the background shading, and that infinite loops invariably led to blank screens. The calls to the student code were wrapped in try-catch blocks. Trapping the errors allowed a stack trace to be printed to the console, so null pointer exceptions and other routine errors could be nonfatal yet visually evident. Given the short four-week time frame, providing the testing framework allowed the students more opportunities to write the code representing the structures. In the final simulation project, students were required to put together all the data structures they developed. The instructor did not require a visual aspect in the final project for the four-week summer course but plans to repeat the simulation project with a visual aspect during a full-length semester.
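The try-catch wrapping mentioned in the results can be sketched as follows. This is an illustration only, not the framework's code: the class `SafeCalls` and the method `runSafely` are hypothetical names, and catching `RuntimeException` (rather than some broader or narrower type) is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

class SafeCalls {
    // Runs one call into student code. Any runtime failure is reported
    // on the console and the caller is told the step failed, so the
    // surrounding test loop can keep running.
    static boolean runSafely(Runnable studentCall) {
        try {
            studentCall.run();
            return true;
        } catch (RuntimeException e) {
            e.printStackTrace();   // stack trace goes to the console
            return false;
        }
    }

    public static void main(String[] args) {
        List<String> broken = null;               // simulates a student bug
        List<String> working = new ArrayList<>();

        boolean first = runSafely(() -> broken.add("x"));    // NullPointerException, trapped
        boolean second = runSafely(() -> working.add("x"));  // succeeds
        System.out.println("first ok: " + first + ", second ok: " + second);
    }
}
```

A wrapper of this kind cannot rescue an infinite loop in student code, which is consistent with the paper's observation that infinite loops showed up as blank screens rather than stack traces.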

Figure 6 QuickSort during a partition step. The boxes highlight the low and high indexes as they move toward the center. Note the sorted region on the outside and the partitioned but unsorted region on the interior end.

5. FUTURE DIRECTIONS

In a data structures course taught during the fall 2007 semester, students will be required to implement their own drawable objects. The simulation program used as the final project will require a visual representation of structures during the simulation.

Additional design patterns will be emphasized and incorporated into student projects. Currently the student code and instructor code run in tandem, but in a future version of this framework each of the two solutions will run on its own thread, allowing the two implementations to in effect race.
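One way the planned race between the two implementations might be arranged is sketched below. This is speculative: the paper describes the threaded version only as future work, and every name here (`RaceSketch`, `race`, `busyWork`) is hypothetical. The sketch releases both threads from a shared latch and records which task finishes first.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicReference;

class RaceSketch {
    // Starts both tasks together on separate threads and returns the
    // label of whichever finishes first.
    static String race(Runnable instructorTask, Runnable studentTask) {
        CountDownLatch start = new CountDownLatch(1);
        AtomicReference<String> winner = new AtomicReference<>();

        Thread a = runner("instructor", instructorTask, start, winner);
        Thread b = runner("student", studentTask, start, winner);
        a.start();
        b.start();
        start.countDown();   // release both threads together
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return winner.get();
    }

    private static Thread runner(String label, Runnable task,
                                 CountDownLatch start,
                                 AtomicReference<String> winner) {
        return new Thread(() -> {
            try {
                start.await();
                task.run();
                winner.compareAndSet(null, label);   // only the first finisher is recorded
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
    }

    // Placeholder workload standing in for a run of data structure operations.
    static long busyWork(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) sum += i % 7;
        return sum;
    }

    public static void main(String[] args) {
        Runnable small = () -> busyWork(1_000);
        Runnable large = () -> busyWork(50_000_000);
        System.out.println("winner: " + race(small, large));
    }
}
```

Thread scheduling makes close races nondeterministic, so a result like this would indicate relative speed only when the two workloads differ substantially.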

6. CONCLUSION

Visual representations of data structures have been around for years. Typically the techniques are applied to just one data structure at a time, or the graphic is compiled statically and separately from the project itself. The novelty of this approach is to put the graphics, albeit simple, into the hands of the students. The animations have the interesting side effect of slowing down the code, particularly in the sorting operations; this increases student awareness of running times and makes the relationship to asymptotic analysis concrete. Providing the students with working test code gives immediate feedback on their faulty programs. And maybe programming is going to be fun again.

7. ACKNOWLEDGMENTS

Our thanks to Lisa and the summer students who encouraged this project.

Figure 7 Merge sort. This screen shot shows merge sort just before the final merges. The list contains three sorted sublists: the outside half and two inside quarters. Two merge steps remain.

8. REFERENCES

[1] Kelleher, C. and Pausch, R. 2007. Using Storytelling to Motivate Programming. Communications of the ACM 50, 7 (July 2007).
[2] McCabe, D. L. 2005. Cheating among college and university students: A North American perspective. International Journal for Educational Integrity 1, 1. Retrieved Sept. 2, 2007, from http://www.ojs.unisa.edu.au/index.php/IJEI/article/viewFile/14/9
[3] Spolsky, J. 2006. The Guerrilla Guide to Interviewing (version 3.0). Joel on Software, Wednesday, October 25, 2006. http://www.joelonsoftware.com/articles/GuerrillaInterviewing3.html

Figure 8 Heap sort. Heap sort has behavior not unlike selection sort. The sorted region in the interior and the max heap order on the exterior of the spiral are visible. The highlighted regions are reheapifying down after an extract max operation