In early 2015 I was honored to be invited to develop and present a graduate course on Virtual Machines at UC Berkeley. The result is CS294-113: Virtual Machines and Managed Runtimes, which was presented in the Fall of 2015.
This page contains the materials from that course. All materials
are Copyright © Oracle and Mario Wolczko, 2015-6, except
The materials can be used non-commercially under the following
Creative Commons license:
CS294-113: Virtual Machines and Managed Runtimes by Mario Wolczko is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Caveat: a few items are still to be completed.
The original web page advertising the course can be found here (of historical interest only, as it was written before the course was prepared and hence is not totally accurate).
Students are required to have a strong background in systems programming in C and machine-level operation (assembly and machine code), and a working knowledge of Java. Basic knowledge of compiler internals is recommended but not required.
The widespread adoption of FORTRAN in the 1950s and 1960s resulted in a plethora of high-level programming languages directly compiled to machine code, some of which still thrive (e.g. FORTRAN itself, as well as C and C++). However, in the 1970s and 1980s a different approach to execution gained in popularity, in which a layer of software continually intermediates between the high-level program and the machine. Most often called Virtual Machines, this approach initially gained popularity with the Pascal P-machines, Smalltalk-80's bytecode machine, and gained a huge boost with the emergence of Java and the JVM in the mid-1990s. Though Virtual Machines now dominate high-level language implementation, they have a reputation for being many orders of magnitude slower than traditionally compiled languages.
However, when coupled with dynamic compilation techniques, Virtual Machines can provide performance comparable to direct compilation while offering machine independent binary distribution, advanced memory management, better security, interactive program development and many other advantages. The objective of this seminar is to explore the design and construction of virtual machines by studying the history of the field, analyzing landmark systems and by hands-on construction and modification.
The presentation will take a mostly chronological approach, starting with early techniques and progressing through to the state of the art. Each week we will learn about the preeminent problems of a given era and how those problems were overcome. In the labs we will reprise some of these accomplishments through a graded series of exercises in which we build components of a virtual machine for an invented language. The initial exercises will implement basic techniques in C; later we will switch to a virtual machine framework known as Truffle which will provide sophisticated components that we can assemble and customize into a larger system.
The Berkeley lectures were recorded and the videos uploaded to YouTube. Note that it is raw, unedited amateur video -- don't expect professional production standards.
The neat boundaries between topics are not reflected in the videos: sometimes a topic ran over the allotted time and was finished the following week, so you may have to hunt in the videos to find a specific topic.
In the following table I have listed the lectures with links to the videos, the accompanying slides and the lab exercises. The exercises are placed next to the relevant material; in the actual course, they sometimes lagged by a week or two (typically, exercises were dispensed weekly; students were given two week for #5 and #9, and 3 weeks for #7).
2 AST interpretation
|Course overview; Introduction to VMs: terminology, taxonomy; some examples to illustrate the variety and architectures of VMs: JVM, VirtualBox, System/370 VM, Virtual PC, Rosetta, Dynamo, Crusoe; applications and advantages of VMs. Introduction to the Feeny teaching language. Introduction to interpretation using ASTs.||
2 Feeny AST intepreter support files (review)
|Smith & Nair (2005), Chapter 1|
3 VM anatomy
|Anatomy of a VM, using the JVM as an examplar: state (threads, heap, classes); class files; bytecodes; primitives and intrinsics; native libraries and JNI. Bytecode and bytecode interpretation: abstract machines; encodings; compiling to bytecode; interpreting bytecode; control flow: branches, call and return, method dispatch; serialization; performance.||
3 Feeny bytecode interpreter
4 Feeny bytecode compiler
support files (reviews)
|3||#3||5 JVM bytecode ISA||A tour of the JVM bytecode ISA: arithmetic, loads and stores, object creation and access, type checks, stack management, type conversion, control transfer, method invocation and return, detour: vtable dispatch, synchronization, exceptions, verification, types. Abstract interpretation; stack type maps.||See last slide.||
JVM Spec chapter 3
|4||#4 (Caution: last few minutes NSFW)||Introduction||This session was a discussion of the Deutsch-Schiffman Smalltalk implementation and associated paper, with L Peter Deutsch and Allan Schiffman present to answer questions.||-||Deutsch & Schiffman (1984)|
6 Dynamic languages
7 Memory management part 1
Dynamic languages: dispatch and storage challenges; boxing; object tables; primitives and
tagging; call target caches.
Memory management: allocation (static, stack, heap); list-based allocation; fragmentation; performance considerations; bump allocation; deallocation; GC; liveness; reachability; reference counting; (presentation of remaining slides left to next week).
5 Write a copying collector
7B Mem.mgmt part 2
7C Debugging hints
This week our guest via Skype was David Ungar, to discuss his
many contributions to VM technology.
After that we concluded the first part of memory management (not on video, alas -- we forgot to turn the camera on!): tracing collection; marking and mark stacks; sweeping; heap parsing; compaction; sliding compaction; forwarding pointer compaction; using a temporary object table; threaded compaction; copying collection; tri-color abstraction; incremental collection; Baker's algorithm; barriers; generational collection; scavenging; remembered sets and card marking; inexact collection.
Debugging hints for the collector exercise.
Ungar & Smith (1987)
|7||#7||8 Self internals||A tour of Self VM internals: language recap; bytecodes; memory organization (heap, tags, maps, object layouts; use of C++ objects; oops); allocation; scavenging; root-finding; programming primitives and _Define; heap scanning.||6 Tagging||-|
10 Dynamic compilation
Advanced interpretation: threading; indirect threading; stack caching;
multiple dispatch tables; superinstructions; selective inlining;
interpreter generation; the HotSpot interpreter(s).
Dynamic compilation: design choices (compilation unit, when, optimization level); Just-In-Time compilation (including rant); simple speed metrics; example using expression language; managing compiled code: code cache; flushing; finding active methods; stack scanning; finding roots; safe points.
|7 Write a jit||Ertl & Gregg (2004)|
|9||#9||11 Dynamic optimization||
Our guest this week was Cliff Click, with Bits of advice for the
VM writer followed by audience questions.
Dynamic optimization: inline caches; single and PICs; optimizing primitives; customization; uncommon branches; splitting.
Paleczny, Vick & Click (2001);
Click, Tene & Wolf (2005);
Click & Rose (2002);
Click & Paleczny (1995).
|10||#10||12 Advanced optimization||Inlining; extended example putting it all together: customization, inlining, splitting, primitive handling, uncommon branches; tiered compilation; prediction; counters; decay; PICs and counters; recompilation (deciding on the top level; gathering a call tree; inlining); class hierarchy analysis; escape analysis; dynamic deoptimization; deopt points; deoptimizing the active frame and other frames; on-stack replacement; dependencies.||8 Inline caches||Hölzle, Chambers & Ungar (1992)|
|11||#11.1 #11.2||13 Mem.mgmt. part 3||
Our guest this week was Lars Bak, who spoke about
How Dart Learned From Past Object-Oriented Systems and then
engaged in an extended Q&A.
Memory management part 3: conservative (aka inexact) collection; GC metrics (throughput, overhead, footprint, pause times, MMU and BMU, energy).
|12||#12.1 #12.2||14 Metacircularity||
This week's guest via Skype was Carl Friedrich Bolz, talking about
The meta-tracing approach to virtual machine
Metacircular VMs: motivation; metacircular interpreters; compiling to C (Squeak); metacircularity -- with performance; Jikes; Klein; Maxine (template JIT compiler; snippets; inspector).
Gal, Probst & Franz (2006);
Bolz et al. (2009)
|13||#13||15 Truffle and Graal||A new approach: motivation; partial evaluation of specialized AST interpreters; example and pseudocode; node replacement; compiling ASTs; choice of implementation language; annotations and processors; Truffle: types; children; specializations; frames; root nodes; control flow (if, call and return); AST inlining; profiling; assumptions; objects.||9 Truffle implementation of Feeny||Würthinger et al. (2013)|
|14||#14||16 Truffle part 2||
This week our guest was Thomas Würthinger who gave an extended demo
Truffle part 2: objects; shapes and transitions; inline caches in Truffle.
Humer et al. (2014);
Würthinger et al. (2012)
18 Concurrent GC
Our guest this week was Michael Van De Vanter, talking
the "Developer Tool Problem" for High-Performance Managed Runtimes.
Also (in a packed final session): Concurrency: motivation; scheduling; synchronization; HotSpot fast locks; biased locking; code management;
Concurrent GC: stacks; allocation; terminology; parallel collection; marking; work stealing; copying; sweeping; concurrent collection; challenges; tricolor invariants; color of mutator and allocator; SATB; Baker's algorithm.
The Great Feeny Competition of 2015 -- results and awards;
Wrap up: orphaned topics; next steps; thanks.
Van De Vanter (2015);|
Seaton, Van De Vanter & Haupt (2014)
I'd like to express my thanks to the following: