This class is being offered for the first time, so its content will evolve as the course progresses. Broadly, the material
is organized into three modules:
Motivating Factors, Problems, and Applications
Proposed material in this module includes:
Background on increasing scale and complexity in computing systems
(e.g., compute clusters, Grids, service-oriented computing).
Predicting performance problems and system failures ahead of time.
Automatic root-cause identification for problems.
Answering "what-if" questions, e.g., what will happen to performance
if system memory is doubled?
Algorithms and Techniques
Proposed material in this module includes:
Theoretical underpinnings, e.g., control theory, stochastic
optimization, and decision theory.
Systems perspective, e.g., instrumentation, distributed
monitoring in Grids, and system support for adaptivity.
Data analysis and machine learning for modeling application and
system behavior.
Combining structured (e.g., performance measurements),
semi-structured (e.g., system logs), and unstructured (e.g.,
emails) data.
How smart visualization techniques can help.
Usability and Design Considerations
Proposed material in this module includes:
For which domains will self-managing systems work, and for which domains
will they fail?
Ease of use of self-managing systems.
Can systems ever be made fully self-managing, and would you trust
a self-managing system?
What are the implications of self-managing technology for
designers of new systems?