Course plan: Distributed (Operating) Systems II
This course will be a follow-up to DOS1, covering more distributed systems
principles, and examining distributed systems implementations and tradeoffs.
The focus will include real-time, fault-tolerance and system design aspects.
The course will (a) provide significant opportunities for students to define
their own learning goals within the area of distributed systems, and (b)
be largely self-directed and discussion-oriented rather than instructor-driven.
The course will have three components:
A. Principles
B. Project
C. Group Discussions
A. Principles
The "lecture" component of the course (roughly 50% of class time) will
be discussions on various principles of distributed systems design.
The lectures will outline some broad principles, and then invite discussion
on their actual application in practical contexts. The purpose will
be (a) to familiarize participants with the broad directions in an area
and indicate further directions, and (b) to enable participants to build their
own mental map of distributed systems design and the interplay between
the various principles and concerns.
The list of topics given below is only indicative. The actual
course content will be adaptive, i.e. topics will be added or deleted depending
on student interests.
-
Mapping the problem space: System models. Nature of application,
resource constraints, design goals, development contexts. Fundamental
differences in design orientation for each, and impact on solution choices.
-
Solution spaces and tradeoffs: The existence of multiple solutions to problems
such as coordination, replication or synchronization. Evaluating
alternatives and selecting among them.
-
Partitioning of functionality in a distributed environment. Factors
that impact task allocation. Strategies for task allocation and
their impacts.
-
Load balancing. Need/benefits, algorithms and challenges.
-
Network bandwidth allocation and scheduling. Bandwidth usage analysis.
-
Scheduling analysis. Meeting deadlines in a distributed environment.
-
Event ordering and synchronization. Impact on prioritization of transactions.
-
Concurrency control. Optimistic and pessimistic concurrency control.
Relationship with correctness and consistency. The concept of internal
and external consistency.
-
Use of state machines to model and analyze concurrent behavior.
-
Distributed testing and debugging. The use of logging and replay.
Harnesses for distributed systems testing.
-
Robustness, reliability and availability. Fault detection, diagnosis,
and recovery. Fault analysis. Graceful degradation.
-
Security (more in depth)
-
Distributed storage: shared memory, file systems.
-
Interfaces for distributed systems. The problems of concurrency.
-
Case studies (student presentations)
Suggestions for other topics to cover are welcome. We may not have time
to cover all of the above; which topics are covered, and in how much depth,
depends on interest.
B. Project
Students choose and carry out their own project (there is no "common" project
that all students are expected to do). The project may involve any
(possibly more than one) of the following:
-
Research study projects: reading up on theory in an area. Typically
this will be theory addressing a specific area of system design.
-
Implementation analysis projects: studying the implementation of an existing
system. Case studies of Mach, EJB, CORBA, COM, etc. are encouraged.
-
Implementation enhancement projects: adding features to an existing
system.
-
Implementation projects: building something.
Projects may be done either individually or in groups.
Each project will typically include the following aspects:
-
Identifying the objectives, key concerns and constraints.
-
Key concepts involved: The theoretical basis and ideas for the problem
and solution.
-
Identifying critical design drivers, space of possible solutions, and choices
made.
-
Actual implementation / study report
-
Critique / analysis of the results
It is expected that the project work will be presented and discussed with
the rest of the class periodically, and that the project report will be
made available to everyone else on an ongoing basis (see below).
C. Group Discussions
About 50% of class time will be spent on this. Students make presentations
and lead discussions on any topic of interest to them (very likely about
or related to their project). Typically, each team will make at least
two presentations during the course. "Presentations" are actually discussions,
where the team describes what it is doing, and others ask questions and
make comments. The purpose of these discussions is to provide case
studies where the application of the various design principles can be observed
and examined.
Evaluation
Based on:
-
Projects and/or term paper.
-
Presentations and leading group discussions.
-
Participation and insights brought in during group discussions.
-
Contributions to the course. Students are welcome to share their
knowledge, experiences and materials they find.
There will be significant self-evaluation and peer-evaluation.