CS441: Compiler Implementation
Course Syllabus
Credits: 3 hours lecture (3 credits)
Winter 2024
General Information
- Course Coordinator: Colin Gordon
- Instructor Contact Information: csgordon@drexel.edu, https://www.cs.drexel.edu/~csgordon/
- Office hours:
- Live in office: Day and time TBD but will be in 1174
- via Zoom: By request, no standing link
- By appointment: If for whatever reason you cannot make it to my normal office hours, please don't hesitate to set up a one-off time
 
Student Learning Information
Course Description
Covers the fundamentals of optimizing compilers and code generation, including compiler intermediate representations, basic compiler optimizations, program analyses to enable optimizations (such as value numbering), and generation of efficient code (register allocation and instruction selection).
College/Department: College of Computing & Informatics
Repeat Status: Not repeatable for credit
Restrictions: None
Prerequisites: CS 360 [Min Grade: C] and (CS 281 [Min Grade: C] or ECE 350 [Min Grade: D])
Course Purpose within a Program of Study
Serves as an elective for BS or BA in Computer Science, or BS in Software Engineering. Can also serve as an elective for ECEC majors.
Statement of Expected Learning
Students will learn the core concepts in how compilers generate optimized code for modern processors. Students will be able to:
- Describe and implement core aspects of common compiler designs
- Explain common classes of optimizations
- Implement compiler optimizations on a compiler intermediate representation
- Analyze performance impact to determine whether the addition of an optimization was effective
Course Materials
Required and Recommended Texts, Readings, and Resources
Engineering a Compiler, 3rd Edition, by Keith Cooper and Linda Torczon (https://shop.elsevier.com/books/engineering-a-compiler/cooper/978-0-12-815412-0).
The 3rd edition fixes a number of issues with the 2nd edition, and adds some new relevant material we will likely cover this term. However, if for some reason you'd like to try to use the 2nd edition, this should be possible with some work. I'll try to post section numbers for both versions of the text, but be aware that there are many subtle corrections to things that were not well-explained, or where important subtleties were placed in odd parts of the book far removed from where they mattered. The second edition is available used or electronically via Drexel Libraries.
Required and Supplemental Materials and Technologies
The course will ask you to submit code in the language of your choosing (almost; not an esoteric language, but any language actually meant for daily programming is fine. You'll be required to write small scripts to run your compiler; details will come with assignments.
In addition, you'll be asked to use a reference interpreter for the class intermediate language which is implemented in Rust. Depending on which OS you are using to develop your compiler, you may be asked to compile the reference interpreter from source, which will require installing Rust.
Assignments, Assessments, and Evaluations
Graded Assignments and Learning Activities
The course is a cumulative project-based course. Students will complete one term-long project in 4 individually-graded checkpoints, each adding new capabilities to a small compiler. The final result is an optimizing compiler backend for a simple functional language with imperative updates.
Because each course assignment builds upon the student's own implementation of the prior assignment, students are strongly urged not to fall behind, as there is no way to "cut your losses" by dropping an assignment and jumping into the next one.
Students will be expected to actively participate in class discussions and activities. These will include whole-class discussions about trade-offs between different approaches, as well as small-group activities working out some details of lecture material.
Students are permitted 6 late days to use during the term on Checkpoints 1-3, in whatever distribution students see fit, whether this is a single late day on one checkpoint, 2 on another, and 3 on another, or all 6 on one, or anything in between. No justification is required, you must simply inform the instructor of your intended number of late days to take on each assignment. Late work submitted in a period covered by late days receives not penalty. Late submissions that are not covered by late days, but are still submitted before the next checkpoint duedate will receive a 20% grade penalty. You may not submit a checkpoint after the subsequent checkpoint is due. The final checkpoint may not be submitted late (since I need to grade it and assign final grades for the term).
The late policy above is not meant to rule out extensions. I tend to be generous with extensions: if you have an important reason that makes it difficult to complete an assignment on time (e.g., ER visit, family emergency, work obligations, presenting research, etc.) and let me know as soon as you know (or, when you can in the case of things like true emergencies) I will generally grant extensions. Informing me at the last minute about something you've known about for some time (e.g., two days before a deadline about a hackathon you've been planning to attend for two months) will probably not get you an extension, but I'd rather you ask an be turned down than risk someone with a valid reason to request an extension going without. Note that the university permits me to ask for supporting documentation to corroborate your claims.
Grading Matrix
- 20% Class Participation.
- The primary criterion for this grade is observable engagement in lecture, rather than expectations of specific actions. Participation includes, but is not limited to, attendance, asking questions in lecture, answering questions posed by the instructor in lecture, participating in small-group activities during class, asking good questions at office hours or over email, and any other activities that demonstrate engagement with the class beyond simply doing the assignments.
 
- 80% Project Checkpoints (20% each)
Grade Scale
+----------+-------+ | Points | Grade | +----------+-------+ | 97-100 | A+ | +----------+-------+ | 92-96.99 | A | +----------+-------+ | 90-91.99 | A- | +----------+-------+ | 87-89.99 | B+ | +----------+-------+ | 82-86.99 | B | +----------+-------+ | 80-81.99 | B- | +----------+-------+ | 77-79.99 | C+ | +----------+-------+ | 72-76.99 | C | +----------+-------+ | 70-71.99 | C- | +----------+-------+ | 67-69.99 | D+ | +----------+-------+ | 60-66.99 | D | +----------+-------+ | 0-59.99 | F | +----------+-------+
Course Schedule
(This schedule is tentative and may change during the course.)
- Week 1 (1/8 & 1/10): Course Overview, Computer Architecture Review, Program Representations
- Week 2 (1/15 MLK Jr Day & 1/17): Basic Code Generation
- No class 1/15 for MLK Jr. Day
- Out: Checkpoint 1: Parsing & Basic Codegen
 
- Week 3 (1/22 & 1/24): Dominators, SSA
- Week 4 (1/29 & 1/31): Code Motion, Constant Propagation, Value Numbering
- In: Checkpoint 1
- Out: Checkpoint 2: Basic data optimizations
 
- Week 5 (2/5 & 2/7): Pointer Analysis, Devirtualization, & Inlining
- Week 6 (2/12 & 2/14): Type-based optimization
- In: Checkpoint 2
- Out: Checkpoint 3: Type checking and respresentation optimizations
 
- Week 7 (2/19 & 2/21): Instruction Selection, Register Allocation, Instruction Scheduling
- Week 8 (2/26 & 2/28): Memory Management
- In: Checkpoint 3
- Out: Checkpoint 4: GC & Concurrency
 
- Week 9 (3/4 & 3/6): Compiling Parallel Programs
- Week 10 (3/11 & 3/13): Special Topics
- Finals week: Final project checkpoint 4 due.
The last week of instruction will feature coverage of an advanced topic to be voted on by the class. Likely topics to be put to a vote include:
- Just-in-Time Compilation, as used in the JVM, CLR, and modern Javascript engines
- Dynamic binary translation, as used for on-the-fly execution of a previous architecture's code (e.g. Apple's
- Rosetta to run PowerPC executables on x86-64, and Rosetta 2 to run x86-64 executables on ARM) and in early x86-64 virtual machine monitors like VMWare
- Dynamic binary instrumentation, as used for security monitors, dynamic analysis tools, and more