# C S 3C: ADVANCED DATA STRUCTURES & ALGORITHMS IN PYTHON

## Foothill College Course Outline of Record

Heading | Value |
---|---|

Units: |
4.5 |

Hours: |
4 lecture, 2 laboratory per week (72 total per quarter) |

Prerequisite: |
C S 3B. |

Advisory: |
Demonstrated proficiency in English by placement via multiple measures OR through an equivalent placement process OR completion of ESLL 125 & ESLL 249. |

Degree & Credit Status: |
Degree-Applicable Credit Course |

Foothill GE: |
Non-GE |

Transferable: |
CSU/UC |

Grade Type: |
Letter Grade (Request for Pass/No Pass) |

Repeatability: |
Not Repeatable |

## Student Learning Outcomes

- The successful student will be able to write and incorporate balanced trees, hash tables, directed graphs and priority queues in his or her software.
- The successful student will be able to analyze the time complexity of a variety of algorithms and data structure access techniques and choose the best algorithm and/or data structure for the project at hand.

## Description

## Course Objectives

The student will be able to:

A. Demonstrate a working knowledge of data abstraction through various data types, data structures and built-in Python classes, and choose the appropriate data structure to model a given problem.

B. Design, implement, test, and debug intermediate-level Python programs that use each of the following fundamental programming constructs: user interaction and communication, string processing, numeric computation, simple I/O, arrays and the Python built-in classes.

C. Analyze the basic algorithms of a general tree ADT.

D. Use object-oriented programming (OOP) to create alternative implementations of binary search trees in Python, and verify or compare the logN behavior of each.

E. Compute the big-O, little-o, omega and theta time complexity of selected algorithms.

F. Define asymptotic behavior and perform empirical benchmarks to compare brute-force techniques with divide-and-conquer strategies.

G. Analyze the basic algorithms of a general tree ADT.

H. Use object-oriented programming (OOP) to implement a binary search tree in Python and verify its logN behavior.

I. Describe the advantages of balanced trees and analyze the performance of AVL trees.

J. Write code that realizes a Splay Tree using either top-down or bottom-up splaying.

K. Define linear probing, quadratic probing and open addressing as used in the hash tables ADT.

L. Design a priority queue using heaps in Python.

M. Implement a Quicksort and at least one other NlogN sort and compare the results as the number of data items approaches infinity.

N. Define indirect sorting and explain when it is needed.

O. Create a graph data and implement shortest path, minimum spanning tree and maximum flow problems.

## Course Content

A. The notion of run-time complexity

1. Exhaustive search versus Heuristic search

2. Search space pruning

3. Analysis of recursive solutions which are not appropriate due to exponential time complexity vs. iterative polynomial complexity

B. Formal treatment of time complexity

1. O(n) order of magnitude

2. o(n) order of magnitude

3. theta(n) order of magnitude

4. omega(n) order of magnitude

5. Constant, polynomial, logarithmic, and exponential time complexity

6. NlogN time complexity

7. Improper use of recursion leading to exponential time complexity

C. Measuring asymptotic behavior

1. Empirical methods of measurement (benchmarking)

2. Brute-force vs. strategies that use branching or divide-and-conquer algorithms

3. Timing greedy and depth-first algorithms in graphs

D. General trees

1. Tree nodes, roots, leaves, children and siblings

2. Binary node implementation of a general tree

3. Insertion and deletion in general trees

4. Traversal with recursion

E. Searching and Binary Search Trees (BSTs)

1. Ordering condition and structure condition

2. Object-oriented programming (OOP) implementation

3. Time complexity consequence of the divide-and-conquer algorithm in of BSTs

F. Searching and BSTs

1. Ordering condition and structure condition

2. OOP implementation

3. Time complexity of BSTs

4. Lazy deletion of tree nodes

5. Threaded trees

6. Bin heaps

G. Balanced BSTs 1: AVL trees

1. Tree height and rebalancing

2. Single and double rotations as the fundamental rebalancing tools

3. Implementing AVL trees by inheriting from a generic BST

H. Balanced BSTs 2: Splay trees

1. Splaying

2. Top down vs. bottom-up splaying

3. Implementing splay trees using Python

I. Hashing

1. Hashing functions

2. Separate chaining

3. Linear and quadratic probing

J. Priority queues

1. The percolate down operation

2. Bin heap implementation of priority queues

3. Heap sort

K. Non-NlogN sorts

1. Insertion sort

2. Shellsort

3. Benchmarking non-NlogN sorts compared with standard library sort

L. Indirect sorting

1. What it is

2. When it is needed

M. Graph theory

1. Structures of a graph

2. Nodes, edges and adjacency tables

3. Shortest path algorithms and Dijkstra

4. Minimal spanning tree algorithms and Kruskal

5. Maximum flow graphs and their algorithms

## Lab Content

A. Implementation of time-intensive algorithms on various data types.

1. Design and implement an algorithm whose execution time and/or memory requirements grow significantly when data size increases.

2. Use generics (a.k.a. templates), which are a universal tool in advanced data structures, in some aspect of the algorithm.

3. Demonstrate that the algorithm adapts correctly when the generic that you use is applied to at least two distinct underlying data types.

4. Document the results of the algorithm when the program is applied to different sized data and different underlying data types.

B. Using linked-list ADTs to optimize for size-varying or space-sensitive data types.

1. Demonstrate the ability to use programming language-supplied linked-list structures in a problem that is not easily solved using fixed-size ADTs, such as arrays.

2. Incorporate generics so as to allow the algorithm to work on various underlying data types.

3. Try different sized data for the linked-list and demonstrate that it handles growth properly.

4. Summarize the results along with sample program runs.

C. Analyzing time complexity in the lab.

1. Implement an assigned algorithm after first predicting its time complexity (linear, quadratic, NlogN, etc.).

2. Run the algorithm on various sized data sets, recording times.

3. Describe the largest size data set that the computer can handle without running out of memory or taking an unreasonable amount of time.

4. Compare the expected growth rate with the observed growth rate.

D. Demonstrating competence with binary search trees.

1. Implement a binary search tree (BST) from scratch, or make significant assigned adjustments to an existing BST data structure supplied by your instructor.

2. Use recursion as appropriate for some of the BST methods.

3. Demonstrate that the class works on various underlying base type by use of generic specialization.

4. Supply runs and report on expected vs. observed time complexity.

E. Demonstrating competence with balanced trees.

1. Implement an assigned balanced tree algorithm (such as AVL, splay or red-black) from scratch, or make significant adjustments to an existing balanced tree algorithm supplied by your instructor.

2. Use recursion as appropriate for some of the balanced tree methods.

3. Demonstrate that the class works on various underlying base type by use of generic specialization.

4. Supply runs and report difference between balanced tree times and simple BST times.

F. Incorporating hash tables into programs.

1. Produce a lab that creates or modifies a hash table and hashing function.

2. Write a client that tests out the hash table on various data.

3. Using a large data set, demonstrate that near-constant time access is produced by the hashing function and hash table.

4. Supply runs and report results with varying sized data sets.

G. Analysis of a single sort algorithm.

1. Implement a single sort algorithm as directed by the instructor.

2. Experiment with coding adjustments to try to improve the performance.

3. Compare the known time complexity of that algorithm with what you observe using increasingly larger data sets.

4. Attempt to explain any discrepancies in the expected vs. observed growth rate of the sort algorithm.

H. Analysis of multiple sort algorithms.

1. Implement multiple sort algorithms, at least two of which involve Shell sort and quicksort.

2. Experiment with coding adjustments to try to improve the performance on any one of them to see if you can beat the fastest of the algorithms.

3. Time the algorithms on very small to very large data sets.

4. Report on which algorithms work better on small sets, and which on large sets.

I. Writing projects that use graph theory.

1. Implement a generic (a.k.a. template) that will realize/specialize a graph of any underlying data type, either from scratch or using code provided by your instructor.

2. Write one of the common algorithms for graphs: shortest path, maximum flow or minimum spanning tree.

3. Discuss the problems that arise when debugging labs which involve data structures as complex as graph theoretic algorithms.

4. Devise a reasonable output for displaying graphs and supply samples with your program runs.

## Special Facilities and/or Equipment

B. The college will provide a website or course management system with an assignment posting component (through which all lab assignments are to be submitted) and a forum component (where students can discuss course material and receive help from the instructor). This applies to all sections, including on-campus (i.e., face-to-face) offerings.

C. When taught via Foothill Global Access on the internet, the college will provide a fully functional and maintained course management system through which the instructor and students can interact.

D. When taught via Foothill Global Access on the internet, students must have currently existing email accounts and ongoing access to computers with internet capabilities.

## Method(s) of Evaluation

A. Tests and quizzes

B. Written laboratory assignments which include source code, sample runs and documentation

C. Final examination

## Method(s) of Instruction

A. Lectures which include motivation for syntax and use of the Python language and OOP concepts, example programs, and analysis of these programs.

B. Online labs (for all sections, including those meeting face-to-face/on campus), consisting of:

1. A programming assignment webpage located on a college-hosted course management system or other department-approved internet environment. Here, the students will review the specification of each programming assignment and submit their completed lab work.

2. A discussion webpage located on a college-hosted course management system or other department-approved internet environment. Here, students can request assistance from the instructor and interact publicly with other class members.

C. Detailed review of programming assignments which includes model solutions and specific comments on the student submissions.

D. In person or online discussion which engages students and instructor in an ongoing dialog pertaining to all aspects of designing, implementing and analyzing programs.

E. When course is taught fully online:

1. Instructor-authored lecture materials, handouts, syllabus, assignments, tests, and other relevant course material will be delivered through a college hosted course management system or other department-approved internet environment.

2. Additional instructional guidelines for this course are listed in the attached addendum of CS department online practices.

## Representative Text(s) and Other Materials

Karumanchi, Narasimha. Data Structure and Algorithmic Thinking with Python. CareerMonk, 2016.

## Types and/or Examples of Required Reading, Writing, and Outside of Class Assignments

A. Reading

1. Textbook assigned reading averaging 30 pages per week.

2. Reading the supplied handouts and modules averaging 10 pages per week.

3. Reading online resources as directed by instructor though links pertinent to programming.

4. Reading library and reference material directed by instructor through course handouts.

B. Writing

1. Writing technical prose documentation that supports and describes the programs that are submitted for grades.

2. Writing specifications using prose to connect natural English language to the formulaic programming languages.