General Information
Course Coordinator(s): | Dr. Colin S. Gordon |
---|---|
Instructor Contact Information (phone, email, website): | csgordon@drexel.edu, https://www.cs.drexel.edu/~csgordon/ |
Office Hours, Location, Mailbox: | Office hours by appointment, in room 1174 for those near campus, or online for remote students. If you have anything you'd like to discuss, please do not hesitate to schedule an appointment! It really is not an imposition. There are no recurring hours because, historically, office hours associated with evening classes have been poorly attended due to student scheduling conflicts, and the presence of recurring hours tends to discourage students who can't make them from reaching out. |
Student Learning Information
Course Description
This course provides in-depth coverage of testing and other software validation techniques intended to produce reliable, correct software. Topics include formal and informal specification, core testing techniques, approaches to comparing quality of test suites, principles of reasoning about software correctness, guided automatic test generation, designing software to support validation, and using current and future tools to automatically validate or find bugs in software. The course strikes a balance between teaching principles of reasoning about programs, techniques in current use, and providing a basis for understanding cutting-edge techniques still in early stages of adoption.
Prerequisites: CS 571 [Min Grade: C] (Can be taken Concurrently), CS 520 [Min Grade: C] and CS 570 [Min Grade: C]
Course Purpose within a Program of Study
Within the revised MSSE program, this will serve as one of 6 required courses to provide a foundation for deep knowledge of software engineering, directly addressing one of the discipline's core goals: to help developers produce correct software.
Within the CS PhD program, the course will serve as an elective suitable for any graduate student, but particularly for those with research interests in software engineering, programming languages, and formal verification.
Statement of Expected Learning
The course objectives are to:
- Teach students about the range of notions of what makes software correct, and how to clearly state those requirements for software.
- Teach students the essential ideas involved in reasoning (formally and informally) about software correctness.
- Teach students how to use automated and semi-automatic techniques to help validate that software meets its specification.
- Teach students how software validation changes (and remains the same) when applied to sophisticated systems involving concurrency and non-determinism.
- Prepare students to understand, apply, and extend software validation methods currently in use, and to adopt new verification techniques as they become practical.
As learning outcomes, students completing this course should be able to:
- Clearly specify what correctness means for a given software system, and modularly for its components.
- Design software in ways that make fine-grained testing and informal reasoning about correctness easier than alternative designs.
- Write (manually) suites of tests that validate specific portions of specifications.
- Understand the strengths and weaknesses of various automated validation techniques, such as test generation and static analysis.
- Select appropriate validation methods for a given project.
- Clearly and persuasively argue that some validation technique (e.g., test suite) adequately exercises both the relevant program component and its specification.
- Understand the core principles of reasoning about whether software is correct.
- Effectively apply advanced analysis tools that semi-automatically generate tests to validate a specification, or systematically search for a violation of a specification.
Course Materials
Required and Recommended Texts, Readings, and Resources
Required: No textbook; readings will be instructor-selected research papers.
Recommended: Hillel Wayne's Practical TLA+.
- Drexel University Libraries provides an electronic version of this book free to all Drexel students. To access it:
- Click here for a permalink to the library catalog listing, and click the "Full Text Online" link.
- On the resulting page, click the first "Click to access" link.
- The other link takes you to a Skillport version of the book, where much of the formatting is broken.
- You'll be taken to O'Reilly's Safari site. It will probably ask you to register an account. If you do, you'll be able to log directly into O'Reilly's Safari without going through the library, and you'll be able to use the phone/tablet app to read away from your computer, including offline access.
- Alternatively, you may purchase a hardcopy.
- Alternatively, you can purchase a DRM-free PDF and EPUB directly from the publisher here.
Required and Supplemental Materials and Technologies
For the first two homeworks, you may choose your target language from the following language/platform/tool combinations that support basic unit tests, code coverage, and property-based testing:
- For Python, pytest, pytest-cov, and hypothesis.
- For Scala, ScalaTest, Scoverage, and ScalaCheck.
- This is easiest to arrange by using the sbt build tool with its scoverage plugin.
- For C# or F# on .NET Core, xUnit, Coverlet, and FsCheck.
- Note that you must use .NET Core, which is cross-platform and is not the same as the .NET Framework or Mono. If you are using the dotnet command in the terminal, you're using .NET Core.
- For Java, JUnit, JaCoCo, and jqwik.
- This is probably easiest to arrange using the Gradle build tool with its jacoco plugin.
You can use this flexibility as an opportunity to explore a new language ecosystem, or as a way to stick with what's already familiar. You may request to use another language setup, but I'll only approve the request if that ecosystem has a mature code coverage tool and a reasonably mature property-based testing library. Whichever approach you choose, I must be able to execute your tests and gather code coverage information from the command line, so I can automate running your tests under Linux using your platform's standard cross-platform dependency management. If running your code requires manually downloading dependencies and configuring search paths, your grade will be penalized; this is why I have recommended build tools that also automate dependency management for the JVM and CLR languages (Python's setup.py approach makes this feasible as well).
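To make the expectations concrete, here is a minimal sketch of what a property-based test looks like with pytest and Hypothesis; the function under test (my_sort) is a hypothetical stand-in for your own code:

```python
# Minimal property-based test sketch using pytest + Hypothesis.
# my_sort is a hypothetical stand-in for the code under test.
from hypothesis import given, strategies as st

def my_sort(xs):
    return sorted(xs)

@given(st.lists(st.integers()))
def test_sort_properties(xs):
    result = my_sort(xs)
    # Property 1: the output is in non-decreasing order.
    assert all(a <= b for a, b in zip(result, result[1:]))
    # Property 2: the output is a permutation of the input.
    assert sorted(xs) == sorted(result)
```

Running `pytest` executes this test against many generated inputs, and (with pytest-cov installed) `pytest --cov=yourpackage` gathers the coverage information I will ask for; the other platforms above have analogous command-line workflows.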
We will use TLA+ for an assignment involving formal specification and model checking. The toolkit itself is fairly mature, has a freely available book (Specifying Systems: The TLA+ Language and Tools for Hardware and Software Engineers), and has a nice tutorial site (Learn TLA+) written by the author of another nice book (not free, but available via Drexel's library subscriptions). While new to most students, these tools are genuinely and increasingly used in industry: we will read about how TLA+ has been used prominently at Amazon Web Services, from where it has spread widely among distributed systems developers.
I strongly suggest trying to install the TLA+ Toolbox early; historically there have been issues with JAR signing, and getting those issues fixed before you need it for an assignment will help you.
Assignments, Assessments, and Evaluations
Graded Assignments and Learning Activities
The course grading is focused on responses to readings, as well as short homework assignments.
Readings
Each week (except the first) you will need to respond to two research papers, no later than 6pm the day before class. Late responses are not accepted, but below there is a policy allowing you to skip a few of these during the term without penalty.
For each paper you will submit a commentary that covers your understanding of the major points and contributions of the paper, the paper’s limitations, things you might not have understood (which we can explain in class!), ways to extend or improve the paper’s work, or other problems this paper’s solutions might provide insights to. Note that not all of these questions make sense for every paper we read. I mainly want to see (a) how well you understood the main ideas from the paper, and (b) that you’ve thought about how well or poorly it might really work.
You should consider:
- A short (3-4 sentence) summary
- List the paper’s major contributions (these may or may not match what the authors claim)
- What’s the greatest technical strength in the paper?
- What are some limitations of the techniques described?
- What are some ways the paper’s results could be extended, improved, or better evaluated?
- Does this paper suggest solutions (or approaches) to other problems?
- Are there parts you didn’t understand?
Not all of these will make sense for every paper, but most of them are sensible to cover for most papers.
Note that different people will often have different takes on the same paper, disagree on whether a choice made by the authors is a strength or weakness, or find different things clarifying or confusing. This is all okay! Everyone has different backgrounds. If you are confused about some part of the paper, don’t be shy: almost certainly someone else found it confusing, too, and maybe they’re too timid to mention it. By pointing out what was confusing or difficult, we raise the opportunity to discuss it in class and help everyone understand better. I found some parts of these papers difficult or confusing the first time (or two) I read them - this is normal.
You may skip up to 4 reading responses during the term by submitting, by the deadline, a one-sentence note saying you are using a skip instead of a response. You may do this for any 4 responses, distributed however you like across the term: one response in each of four weeks, both responses in each of two weeks, or the in-between option of one full week plus one response in each of two additional weeks. Skipped responses are simply omitted when calculating your reading response grade for the term. One thing to consider, though: the skip only means you do not need to write a response. If the paper is useful for your homework, you may still find it beneficial to read it, even if you skip the response. You may apply skips retroactively: if at the end of the term you have a 0 or 1 you'd like to remove, and you have skips remaining, you can apply a skip to that response. There is no bonus for having unused skips at the end of the term.
Homeworks
The homeworks are on the following:
- Blackbox and whitebox testing review
- Property-based testing and Design by Contract
- Formal specification and model checking
- Static Analysis and Symbolic Execution
- Refactoring for Testing and Test Case Prioritization
The late policy for homeworks is as follows: for the term, you have 5 late days to distribute among homework assignments 1-4 at your discretion. The final homework may not be submitted late; otherwise I would be unable to grade it in time to submit your final grade. (The final homework is also intended to be less work than earlier assignments, so this should be feasible.)
Grading Matrix
- 50% Reading Assignment Assessments
- 50% Homework (even split of 5 assignments at 10% each)
Responses are graded on a scale of 0 to 2, with the possibility of earning a 3/2 on any summary:
- 0 - No summary, or your summary looks like you did not read the material.
- 1 - You read the material, but essentially summarized it without reflecting.
- 2 - You wrote a solid, thoughtful response.
- 3 - You wrote a particularly insightful response.
The late/skip policies were described above.
In addition to the late and skip policies, extensions are possible for good reason with reasonable notice. I am aware that students have jobs, family matters, paper deadlines for their PhD, etc., which can interfere with completing assignments. I want your grade to reflect your mastery of the material and the quality of the work you hand in, not whether you were fortunate enough to avoid major life events during the term. If something comes up during the term, let me know. If it's unexpected (e.g., you end up in the ER when you were planning to work on coursework), let me know when you can and we'll figure it out. If it's something you know about in advance (e.g., you must travel for work), let me know as soon as you know, and we can discuss an extension on an assignment. I reserve the right to request supporting evidence for your stated need for an extension.
Attendance
If you're in the in-person section: Drexel's stated policy is that course attendance is mandatory for students in the in-person section. I will not take attendance explicitly every class, because it's tedious and takes away time from actual material. But I do expect you to come, and if you are absent on a regular basis it will negatively affect your term grade, beyond the grading percentages above. That said, I understand things happen (you might get sick, your car might break down). If you're in the in-person section and will miss class, please let me know.
If you're in the online section: I will work to make it possible for you to ask questions live during class, for those interested in watching live. Otherwise, enjoy the flexibility of online learning.
Grade Scale
The following scale will be used to convert points to letter grades:
Points | Grade | Points | Grade | Points | Grade |
---|---|---|---|---|---|
97-100 | A+ | 82-86.99 | B | 70-71.99 | C- |
92-96.99 | A | 80-81.99 | B- | 67-69.99 | D+ |
90-91.99 | A- | 77-79.99 | C+ | 60-66.99 | D |
87-89.99 | B+ | 72-76.99 | C | 0-59.99 | F |
Note that the instructor may revise this conversion if/when necessary.
Course Schedule
[This schedule is tentative and may change during the course.]
Week by week:
- Overview of software correctness, specification, and process. The gold standard (formal verification) and its relationship to techniques in widespread use. Review of baseline testing techniques: unit testing, test case design, measuring quality with coverage (path/statement/value/etc.), assertions, and design by contract (a short design-by-contract sketch appears after this schedule).
- No reading due
- Automated testing: Property-based Testing
- K. Claessen, J. Hughes, “QuickCheck: A Lightweight Tool for Random Testing of Haskell Programs,” ACM International Conference on Functional Programming (ICFP), 2000.
- J. Hughs. "How to Specify It! A Guide to Writing Properties of Pure Functions](https://www.tfp2019.org/resources/tfp2019-how-to-specify-it.pdf)." Trends in Functional Programming (TFP), 2019.
- If you're curious about other types of guided random testing (you are NOT required to look at these):
- J. Hughes. "Experiences with QuickCheck: Testing the Hard Stuff and Staying Sane." Chapter from A List of Successes That Can Change the World. 2016.
- J. Hughes, B. C. Pierce, T. Arts, U. Norell. "Mysteries of Dropbox: Property-Based Testing of a Distributed Synchronization Service," IEEE International Conference on Software Testing (ICST), 2016.
- C. Pacheco, S. K. Lahiri, M. D. Ernst, T. Ball, “Feedback-directed Random Test Generation,” ACM International Conference on Software Engineering (ICSE), 2007.
- Symbolic Execution and Fuzzing
- C. Cadar, D. Dunbar, D. Engler, “KLEE: Unassisted and Automatic Generation of High-Coverage Tests for Complex Systems Programs,” USENIX Symposium on Operating Systems Design and Implementation (OSDI), 2008.
- Bounimova, Godefroid, and Molnar. "Billions and Billions of Constraints: Whitebox Fuzz Testing in Production", International Conference on Software Engineering (ICSE), 2013.
- Formal Specifications
- J.M. Wing. “A Specifier’s Introduction to Formal Methods”, IEEE Computer, Sep. 1990, pp. 8-24.
- R.W. Butler, S. C. Johnson. “Formal Methods for Life-Critical Software”, AIAA Computing in Aerospace 9 Conference, San Diego, CA., Oct. 19-21, 1993, pp. 319-329.
- Hands-On Formal Specification
- C. Newcombe, T. Rath, F. Zhang, B. Munteanu, M. Brooker, M. Deardeuff, “How Amazon Web Services Uses Formal Methods,” Communications of the ACM, 58(4), 2015.
- This link requires a campus IP address, possibly via VPN (https://vpn.drexel.edu)
- Your choice of one of the following; I recommend Practical TLA+ unless you're interested to see a bit more (just a bit) of temporal logic:
- Chapters 2 & 3 of Lamport's Specifying Systems (20 pages, direct PDF link)
- Chapters 1 & 2 of Wayne's Practical TLA+ (39 pages, but many are full-page figures, with larger font) (Requires Drexel sign-in)
- Static Analysis and Tool Adoption
- N. Ayewah, D. Hovemeyer, J.D. Morgenthaler, J. Penix, W. Pugh, "Using Static Analysis to Find Bugs." IEEE Software, 25 (5): 22-29, 2008.
- A. Bessey et al., "A Few Billion Lines of Code Later: Using Static Analysis to Find Bugs in the Real World." Communications of the ACM, 53(2): 66-75, 2010.
- Mix: Validating Concurrent Software, and Validating Machine Learning Systems
- Savage, Burrows, Nelson, Sobalvarro, and Anderson. "Eraser: A Dynamic Data Race Detector for Multithreaded Programs." ACM Transactions on Computer Systems, 15(4), November 1997.
- C. Murphy, K. Shen, and G. Kaiser. "Automatic System Testing of Programs without Test Oracles." International Symposium on Software Testing and Analysis (ISSTA), 2009.
- Refactoring
- Kataoka, Ernst, Griswold, and Notkin. "Automated Support for Program Refactoring Using Invariants." International Conference on Software Maintenance (ICSM), 2001.
- Kim, Zimmermann, Nagappan. "A Field Study of Refactoring Challenges and Benefits". Foundations of Software Engineering (FSE), 2012.
- Test Case Prioritization & Diagnosis
- Luo, Moran, Poshyvanyk. "A Large-Scale Empirical Comparison of Static and Dynamic Test Case Prioritization Techniques". Foundations of Software Engineering (FSE), 2016.
- Glerum et al. "Debugging in the (Very) Large: Ten Years of Implementation and Experience", ACM Symposium on Operating Systems Principles (SOSP), 2009.
- Process
- C. Fishman, "They Write the Right Stuff", Fast Company, December 1996, http://www.fastcompany.com/28121/they-write-right-stuff
- TBD: Reading on Cleanroom Process
- Part of Leveson's "A Systems-Theoretic Approach to Safety in Software-Intensive Systems," IEEE Transactions on Dependable and Secure Computing, 2005.
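Since week 1 reviews assertions and design by contract, here is a minimal Python sketch of that style (the withdraw function is hypothetical): preconditions and postconditions are written as executable assertions that both document and check the contract.

```python
# Minimal design-by-contract sketch: the contract for a hypothetical
# withdraw operation is stated as executable assertions.
def withdraw(balance: int, amount: int) -> int:
    """Return the new balance after withdrawing `amount`."""
    # Preconditions: the caller must make a sensible request.
    assert amount > 0, "precondition violated: amount must be positive"
    assert amount <= balance, "precondition violated: cannot overdraw"
    new_balance = balance - amount
    # Postcondition: the implementation kept its promise.
    assert new_balance == balance - amount, "postcondition violated"
    return new_balance
```

Tests (including property-based tests like the earlier sketch) can then target the contract directly: any input satisfying the preconditions must produce a result satisfying the postcondition.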
Academic Policies
This course follows university, college, and department policies, including but not limited to:
- Academic Integrity, Plagiarism, Dishonesty and Cheating Policy: http://www.drexel.edu/provost/policies/academic_dishonesty.asp
- Student Life Honesty Policy from Judicial Affairs: http://www.drexel.edu/provost/policies/academic-integrity
- Students with Disability Statement: http://drexel.edu/oed/disabilityResources/students/
- Course Add/Drop Policy: http://www.drexel.edu/provost/policies/course-add-drop
- Course Withdrawal Policy: http://drexel.edu/provost/policies/course-withdrawal
- Department Academic Integrity Policy: http://drexel.edu/cci/resources/current-students/undergraduate/policies/cs-academic-integrity/
- Drexel Student Learning Priorities: http://drexel.edu/provost/assessment/outcomes/dslp/
- Office of Disability Resources: http://www.drexel.edu/ods/student_reg.html
Students requesting accommodations due to a disability at Drexel University need to request a current Accommodations Verification Letter (AVL) in the ClockWork database before accommodations can be made. These requests are received by Disability Resources (DR), who then issues the AVL to the appropriate contacts. For additional information, visit the DR website at drexel.edu/oed/disabilityResources/overview/, or contact DR for more information by phone at 215.895.1401, or by email at disability@drexel.edu.