CPS 512 (Duke University) Distributed Systems

This course focuses on core concepts in distributed systems, using geo-distributed mega-services on cloud infrastructure as a motivating and driving example. Modern cloud applications are layered above common service platforms that handle the hard problems: providing elastic infrastructure, tracking groups of participating servers (views), distributing state and functions across a group, coordinating control and ownership of data, managing consensus, and recovering from server and network failures. The course focuses on the design of these service platforms and their abstractions.

This offering has a modest programming component and a substantial independent project, making it suitable for a broader audience than in previous semesters. We use a subset of the Java-based DSLabs. Since CPS 512 is a quals course, there are also two exams.

Topics. The course divides loosely into three parts. This semester we start with Internet naming (DNS) to illustrate multi-domain services and the concepts of principals, identity, authority, trust, governance, and secure communication. Part two moves into the challenges of stateful services and elastic scaling, illustrated with elements of the Google cloud network stack: Kubernetes service abstractions, request routing, load balancing, and coordination services. In part three we dive deeper into the foundational distributed systems topics that underlie these systems: distributed transactions, geo-replication, logical time and causality, eventual consistency with vector clocks, views and leader election, and consensus. As time allows, we explore how these concepts apply in blockchain and "Web 3" platforms and services.