Self-Managing Systems

Course 296.2: Self-Managing Systems, Spring 2006
Course Instructor: Shivnath Babu


Outline

This class is being offered for the first time, so its content will evolve as the course progresses. Broadly, the material will be organized into three modules:

Motivating Factors, Problems, and Applications

Proposed material in this module includes:
  1. Background on increasing scale and complexity in computing systems (e.g., compute clusters, Grids, service-oriented computing).
  2. Predicting performance problems and system failures ahead of time.
  3. Automatic root-cause identification for problems.
  4. Answering "what-if" questions, e.g., what will happen to performance if system memory is doubled?

Algorithms and Techniques

Proposed material in this module includes:
  1. Theoretical underpinnings, e.g., control theory, stochastic optimization, and decision theory.
  2. Systems perspective, e.g., instrumentation, distributed monitoring in Grids, and system support for adaptivity.
  3. Data analysis and machine learning for modeling application and system behavior.
  4. Combining structured (e.g., performance measurements), semi-structured (e.g., system logs), and unstructured (e.g., emails) data.
  5. How smart visualization techniques can help.

Usability and Design Considerations

Proposed material in this module includes:
  1. For which domains will self-managing systems work, and for which domains will they fail?
  2. Ease of use of self-managing systems.
  3. Can systems ever be made fully self-managing, and would you trust one that is?
  4. What are the implications of self-managing technology for designers of new systems?