UCB CS294-113: Virtual Machines and Managed Runtimes

In early 2015 I was honored to be invited to develop and present a graduate course on Virtual Machines at UC Berkeley. The result is CS294-113: Virtual Machines and Managed Runtimes, which was presented in the Fall of 2015.

This page contains the materials from that course. All materials are Copyright © Oracle and Mario Wolczko, 2015-6, except as noted. The materials can be used non-commercially under the following Creative Commons license:

CS294-113: Virtual Machines and Managed Runtimes by Mario Wolczko is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Caveat: a few items are still to be completed.

Update, July 2016

I gave a talk "A Concise and Opinionated History of Virtual Machines" at the VM Summer School in London. This gave me the opportunity to cover some of the history of System VMs that I did not have time to cover in this course. The material on Language VMs will mostly be familiar. Slides and video are available here.

Overview

The original web page advertising the course can be found here (of historical interest only, as it was written before the course was prepared and hence is not totally accurate).

Prerequisites

Students are required to have a strong background in systems programming in C and machine-level operation (assembly and machine code), and a working knowledge of Java. Basic knowledge of compiler internals is recommended but not required.

Description

The widespread adoption of FORTRAN in the 1950s and 1960s resulted in a plethora of high-level programming languages directly compiled to machine code, some of which still thrive (e.g. FORTRAN itself, as well as C and C++). However, in the 1970s and 1980s a different approach to execution gained in popularity, in which a layer of software continually intermediates between the high-level program and the machine. Most often called Virtual Machines, this approach initially gained popularity with the Pascal P-machines, Smalltalk-80's bytecode machine, and gained a huge boost with the emergence of Java and the JVM in the mid-1990s. Though Virtual Machines now dominate high-level language implementation, they have a reputation for being many orders of magnitude slower than traditionally compiled languages.

However, when coupled with dynamic compilation techniques, Virtual Machines can provide performance comparable to direct compilation while offering machine independent binary distribution, advanced memory management, better security, interactive program development and many other advantages. The objective of this seminar is to explore the design and construction of virtual machines by studying the history of the field, analyzing landmark systems and by hands-on construction and modification.

The presentation will take a mostly chronological approach, starting with early techniques and progressing through to the state of the art. Each week we will learn about the preeminent problems of a given era and how those problems were overcome. In the labs we will reprise some of these accomplishments through a graded series of exercises in which we build components of a virtual machine for an invented language. The initial exercises will implement basic techniques in C; later we will switch to a virtual machine framework known as Truffle which will provide sophisticated components that we can assemble and customize into a larger system.

Lectures

Topics covered include:

Virtual machines and managed runtimes: taxonomy, characteristics, history
Interpretation of abstract syntax trees and bytecodes
Threaded code and other performance techniques
Automatic memory management (reference counting, tracing collection)
Basic just-in-time compilation
Advanced garbage collection (generational and concurrent techniques)
Dynamic optimizing compilation
Adaptive feedback-driven techniques
Dynamic deoptimization
Trace compilation; metatracing
Metacircular VMs

Each lecture will be on Friday from 2pm to 5pm, with breaks.

Readings

Weekly reading will be assigned. Some papers will be key for continued understanding of the lectures; for these there will be class discussions, and possibly written or verbal summaries required. All assigned reading and supporting material is listed in the course bibliography.

Labs

The labs will consist of a series of construction/modification exercises in which components of a VM are implemented or enhanced, to reinforce the corresponding lecture material. Most exercises will be in C, and some in Java. Labs will be done either solo, or in pairs.

Source code -- TBD

TBD: make available source code to model answers, test harnesses, etc.

Videos

The Berkeley lectures were recorded and the videos uploaded to YouTube. Note that it is raw, unedited amateur video -- don't expect professional production standards.

The neat boundaries between topics are not reflected in the videos: sometimes a topic ran over the allotted time and was finished the following week, so you may have to hunt in the videos to find a specific topic.

In the following table I have listed the lectures with links to the videos, the accompanying slides and the lab exercises. The exercises are placed next to the relevant material; in the actual course, they sometimes lagged by a week or two (typically, exercises were dispensed weekly; students were given two weeks for #5 and #9, and 3 weeks for #7).

Week Video Slides Description Lab exercise Reading

1 #1 1 Intro
1A Feeny
2 AST interpretation Course overview; Introduction to VMs: terminology, taxonomy; some examples to illustrate the variety and architectures of VMs: JVM, VirtualBox, System/370 VM, Virtual PC, Rosetta, Dynamo, Crusoe; applications and advantages of VMs. Introduction to the Feeny teaching language. Introduction to interpretation using ASTs. 1 Feeny Introduction support files (review)
2 Feeny AST intepreter support files (review) Smith & Nair (2005), Chapter 1

2 #2 3 VM anatomy
4 Bytecodes Anatomy of a VM, using the JVM as an examplar: state (threads, heap, classes); class files; bytecodes; primitives and intrinsics; native libraries and JNI. Bytecode and bytecode interpretation: abstract machines; encodings; compiling to bytecode; interpreting bytecode; control flow: branches, call and return, method dispatch; serialization; performance. 3 Feeny bytecode interpreter
support files
4 Feeny bytecode compiler
support files (reviews) -

3 #3 5 JVM bytecode ISA A tour of the JVM bytecode ISA: arithmetic, loads and stores, object creation and access, type checks, stack management, type conversion, control transfer, method invocation and return, detour: vtable dispatch, synchronization, exceptions, verification, types. Abstract interpretation; stack type maps. See last slide. Gosling (1995);
JVM Spec chapter 3

4 #4 (Caution: last few minutes NSFW) Introduction This session was a discussion of the Deutsch-Schiffman Smalltalk implementation and associated paper, with L Peter Deutsch and Allan Schiffman present to answer questions. - Deutsch & Schiffman (1984)

5 #5 6 Dynamic languages
7 Memory management part 1 Dynamic languages: dispatch and storage challenges; boxing; object tables; primitives and tagging; call target caches.
Memory management: allocation (static, stack, heap); list-based allocation; fragmentation; performance considerations; bump allocation; deallocation; GC; liveness; reachability; reference counting; (presentation of remaining slides left to next week). 5 Write a copying collector
Slides -

6 #6 7B Mem.mgmt part 2
7C Debugging hints This week our guest via Skype was David Ungar, to discuss his many contributions to VM technology.
After that we concluded the first part of memory management (not on video, alas -- we forgot to turn the camera on!): tracing collection; marking and mark stacks; sweeping; heap parsing; compaction; sliding compaction; forwarding pointer compaction; using a temporary object table; threaded compaction; copying collection; tri-color abstraction; incremental collection; Baker's algorithm; barriers; generational collection; scavenging; remembered sets and card marking; inexact collection.
Debugging hints for the collector exercise. - Ungar (1984);
Ungar & Smith (1987)

7 #7 8 Self internals A tour of Self VM internals: language recap; bytecodes; memory organization (heap, tags, maps, object layouts; use of C++ objects; oops); allocation; scavenging; root-finding; programming primitives and _Define; heap scanning. 6 Tagging -

8 #8 9 Advanced interpretation
10 Dynamic compilation Advanced interpretation: threading; indirect threading; stack caching; multiple dispatch tables; superinstructions; selective inlining; interpreter generation; the HotSpot interpreter(s).
Dynamic compilation: design choices (compilation unit, when, optimization level); Just-In-Time compilation (including rant); simple speed metrics; example using expression language; managing compiled code: code cache; flushing; finding active methods; stack scanning; finding roots; safe points. 7 Write a jit Ertl & Gregg (2004)

9 #9 11 Dynamic optimization Our guest this week was Cliff Click, with Bits of advice for the VM writer followed by audience questions.

Dynamic optimization: inline caches; single and PICs; optimizing primitives; customization; uncommon branches; splitting.
- Any of:
Paleczny, Vick & Click (2001);
Click, Tene & Wolf (2005);
Click & Rose (2002);
Click & Paleczny (1995).

10 #10 12 Advanced optimization Inlining; extended example putting it all together: customization, inlining, splitting, primitive handling, uncommon branches; tiered compilation; prediction; counters; decay; PICs and counters; recompilation (deciding on the top level; gathering a call tree; inlining); class hierarchy analysis; escape analysis; dynamic deoptimization; deopt points; deoptimizing the active frame and other frames; on-stack replacement; dependencies. 8 Inline caches Hölzle, Chambers & Ungar (1992)

11 #11.1 #11.2 13 Mem.mgmt. part 3 Our guest this week was Lars Bak, who spoke about How Dart Learned From Past Object-Oriented Systems and then engaged in an extended Q&A.

Memory management part 3: conservative (aka inexact) collection; GC metrics (throughput, overhead, footprint, pause times, MMU and BMU, energy).
- Inside V8
Inside VMs

12 #12.1 #12.2 14 Metacircularity This week's guest via Skype was Carl Friedrich Bolz, talking about The meta-tracing approach to virtual machine construction.

Metacircular VMs: motivation; metacircular interpreters; compiling to C (Squeak); metacircularity -- with performance; Jikes; Klein; Maxine (template JIT compiler; snippets; inspector).
- Gal, Probst & Franz (2006);
Bolz et al. (2009)

13 #13 15 Truffle and Graal A new approach: motivation; partial evaluation of specialized AST interpreters; example and pseudocode; node replacement; compiling ASTs; choice of implementation language; annotations and processors; Truffle: types; children; specializations; frames; root nodes; control flow (if, call and return); AST inlining; profiling; assumptions; objects. 9 Truffle implementation of Feeny Würthinger et al. (2013)

14 #14 16 Truffle part 2 This week our guest was Thomas Würthinger who gave an extended demo of Truffle.

Truffle part 2: objects; shapes and transitions; inline caches in Truffle.
- Humer et al. (2014);
Würthinger et al. (2012)

15 #15 17 Concurrency
18 Concurrent GC
19 Competition
20 Wrapup
Our guest this week was Michael Van De Vanter, talking about Solving the "Developer Tool Problem" for High-Performance Managed Runtimes.
Also (in a packed final session): Concurrency: motivation; scheduling; synchronization; HotSpot fast locks; biased locking; code management;
Concurrent GC: stacks; allocation; terminology; parallel collection; marking; work stealing; copying; sweeping; concurrent collection; challenges; tricolor invariants; color of mutator and allocator; SATB; Baker's algorithm.
The Great Feeny Competition of 2015 -- results and awards;
Wrap up: orphaned topics; next steps; thanks. - Van De Vanter (2015);
Seaton, Van De Vanter & Haupt (2014)

Week	Video	Slides	Description	Lab exercise	Reading
1	#1	1 Intro 1A Feeny 2 AST interpretation	Course overview; Introduction to VMs: terminology, taxonomy; some examples to illustrate the variety and architectures of VMs: JVM, VirtualBox, System/370 VM, Virtual PC, Rosetta, Dynamo, Crusoe; applications and advantages of VMs. Introduction to the Feeny teaching language. Introduction to interpretation using ASTs.	1 Feeny Introduction support files (review) 2 Feeny AST intepreter support files (review)	Smith & Nair (2005), Chapter 1
2	#2	3 VM anatomy 4 Bytecodes	Anatomy of a VM, using the JVM as an examplar: state (threads, heap, classes); class files; bytecodes; primitives and intrinsics; native libraries and JNI. Bytecode and bytecode interpretation: abstract machines; encodings; compiling to bytecode; interpreting bytecode; control flow: branches, call and return, method dispatch; serialization; performance.	3 Feeny bytecode interpreter support files 4 Feeny bytecode compiler support files (reviews)	-
3	#3	5 JVM bytecode ISA	A tour of the JVM bytecode ISA: arithmetic, loads and stores, object creation and access, type checks, stack management, type conversion, control transfer, method invocation and return, detour: vtable dispatch, synchronization, exceptions, verification, types. Abstract interpretation; stack type maps.	See last slide.	Gosling (1995); JVM Spec chapter 3
4	#4 (Caution: last few minutes NSFW)	Introduction	This session was a discussion of the Deutsch-Schiffman Smalltalk implementation and associated paper, with L Peter Deutsch and Allan Schiffman present to answer questions.	-	Deutsch & Schiffman (1984)
5	#5	6 Dynamic languages 7 Memory management part 1	Dynamic languages: dispatch and storage challenges; boxing; object tables; primitives and tagging; call target caches. Memory management: allocation (static, stack, heap); list-based allocation; fragmentation; performance considerations; bump allocation; deallocation; GC; liveness; reachability; reference counting; (presentation of remaining slides left to next week).	5 Write a copying collector Slides	-
6	#6	7B Mem.mgmt part 2 7C Debugging hints	This week our guest via Skype was David Ungar, to discuss his many contributions to VM technology. After that we concluded the first part of memory management (not on video, alas -- we forgot to turn the camera on!): tracing collection; marking and mark stacks; sweeping; heap parsing; compaction; sliding compaction; forwarding pointer compaction; using a temporary object table; threaded compaction; copying collection; tri-color abstraction; incremental collection; Baker's algorithm; barriers; generational collection; scavenging; remembered sets and card marking; inexact collection. Debugging hints for the collector exercise.	-	Ungar (1984); Ungar & Smith (1987)
7	#7	8 Self internals	A tour of Self VM internals: language recap; bytecodes; memory organization (heap, tags, maps, object layouts; use of C++ objects; oops); allocation; scavenging; root-finding; programming primitives and _Define; heap scanning.	6 Tagging	-
8	#8	9 Advanced interpretation 10 Dynamic compilation	Advanced interpretation: threading; indirect threading; stack caching; multiple dispatch tables; superinstructions; selective inlining; interpreter generation; the HotSpot interpreter(s). Dynamic compilation: design choices (compilation unit, when, optimization level); Just-In-Time compilation (including rant); simple speed metrics; example using expression language; managing compiled code: code cache; flushing; finding active methods; stack scanning; finding roots; safe points.	7 Write a jit	Ertl & Gregg (2004)
9	#9	11 Dynamic optimization	Our guest this week was Cliff Click, with Bits of advice for the VM writer followed by audience questions. Dynamic optimization: inline caches; single and PICs; optimizing primitives; customization; uncommon branches; splitting.	-	Any of: Paleczny, Vick & Click (2001); Click, Tene & Wolf (2005); Click & Rose (2002); Click & Paleczny (1995).
10	#10	12 Advanced optimization	Inlining; extended example putting it all together: customization, inlining, splitting, primitive handling, uncommon branches; tiered compilation; prediction; counters; decay; PICs and counters; recompilation (deciding on the top level; gathering a call tree; inlining); class hierarchy analysis; escape analysis; dynamic deoptimization; deopt points; deoptimizing the active frame and other frames; on-stack replacement; dependencies.	8 Inline caches	Hölzle, Chambers & Ungar (1992)
11	#11.1 #11.2	13 Mem.mgmt. part 3	Our guest this week was Lars Bak, who spoke about How Dart Learned From Past Object-Oriented Systems and then engaged in an extended Q&A. Memory management part 3: conservative (aka inexact) collection; GC metrics (throughput, overhead, footprint, pause times, MMU and BMU, energy).	-	Inside V8 Inside VMs
12	#12.1 #12.2	14 Metacircularity	This week's guest via Skype was Carl Friedrich Bolz, talking about The meta-tracing approach to virtual machine construction. Metacircular VMs: motivation; metacircular interpreters; compiling to C (Squeak); metacircularity -- with performance; Jikes; Klein; Maxine (template JIT compiler; snippets; inspector).	-	Gal, Probst & Franz (2006); Bolz et al. (2009)
13	#13	15 Truffle and Graal	A new approach: motivation; partial evaluation of specialized AST interpreters; example and pseudocode; node replacement; compiling ASTs; choice of implementation language; annotations and processors; Truffle: types; children; specializations; frames; root nodes; control flow (if, call and return); AST inlining; profiling; assumptions; objects.	9 Truffle implementation of Feeny	Würthinger et al. (2013)
14	#14	16 Truffle part 2	This week our guest was Thomas Würthinger who gave an extended demo of Truffle. Truffle part 2: objects; shapes and transitions; inline caches in Truffle.	-	Humer et al. (2014); Würthinger et al. (2012)
15	#15	17 Concurrency 18 Concurrent GC 19 Competition 20 Wrapup	Our guest this week was Michael Van De Vanter, talking about Solving the "Developer Tool Problem" for High-Performance Managed Runtimes. Also (in a packed final session): Concurrency: motivation; scheduling; synchronization; HotSpot fast locks; biased locking; code management; Concurrent GC: stacks; allocation; terminology; parallel collection; marking; work stealing; copying; sweeping; concurrent collection; challenges; tricolor invariants; color of mutator and allocator; SATB; Baker's algorithm. The Great Feeny Competition of 2015 -- results and awards; Wrap up: orphaned topics; next steps; thanks.	-	Van De Vanter (2015); Seaton, Van De Vanter & Haupt (2014)

The Whole Shebang

Here's a zip file containing all the slides, exercises and associated material. If you are a bona fide educator and wish to get the original files (in Keynote format), send me an email -- the animations are smoother in the original, and you'll probably want to change things for your own style and delivery. (Don't forget the attribution required by the license, and the other license requirements.) If you say "pretty please", I may even be willing to give you a copy in PowerPoint or anything else that can be exported from Keynote ;-).

The source of the various Feeny tools (written in Stanza) is here.

Check out this Visual Studio Code extension for Feeny, care of Daniel Rosenwasser (@drosenwasser).

Other courses

There aren't many courses on the web that cover similar material, but here are the ones I've found (all done by former colleagues!):

Gregor Richards' Waterloo courses on VMs for Dynamic Languages and Automatic memory management and GC.
Phil McGachey's Harvard course on Managed Environments.
Michael Haupt and Andreas Sewe taught a course on Virtual Machines at TU Darmstadt in 2010. Michael sent me a copy of his slides; I do not believe they are available on the web.
Lars Bak taught a VM course in Aarhus in 2002 (Design and Implementation of Object-Oriented Virtual Machines; thanks to Alan Harkreader for finding this archive)
Ben Titzer has a class (first run 2022) on Virtual Machines and Managed Runtimes at CMU.

Copyright details

All slides are Copyright © Oracle and Mario Wolczko, 2015-6, except those by Patrick Li (Feeny introduction and associated exercises), and the embedded graphics in the Deutsch-Schiffman introduction (taken from various web pages, mostly Wikipedia). Excerpts from papers and books (and the Hitch Hiker's Guide to the Galaxy) have been attributed in the slides (let me know if I missed an attribution). Copyright on the youtube videos is probably some mashup of all the people who appear, UC, and the recordist, Patrick Li. You should assume all rights are reserved except those which were surrendered or granted as part of the youtube upload process.

Acknowledgements

I'd like to express my thanks to the following:

Patrick Li, my T.A. for the course. Patrick devised the Feeny language used in the exercises, wrote the Lab exercises and the model answers, and did all the grading,
Prof. Jonathan Bachrach for the invitation to give the course,
The guest speakers (in order of appearance): Peter Deutsch, Allan Schiffman, David Ungar, Cliff Click, Lars Bak, Carl Friedrich Bolz, Thomas Würthinger and Michael Van De Vanter,
My management at Oracle Labs for supporting this effort,
Michael Haupt for sharing his VM course material,
Christian Wimmer for assistance with Truffle, and
All the students who participated, for their patience, enthusiasm, attention, questions, efforts and feedback.

Mario Wolczko
March 2016