Design of High Availability Systems & Software
 
   

Course Highlights:
This course examines the high-level design of embedded systems and software that are to provide their services at near-perfect availability.

High availability systems must tolerate both expected and unexpected faults. Their design is based on redundant hardware and software combined in ways that achieve “five-nines” (99.999%) or greater availability, equivalent to less than 1 second of downtime per day. Basic hardware N-plexing and voting issues are discussed, followed by an in-depth study of a number of fault tolerance techniques based on backward error recovery, including static N-version programming, Checkpoint-Rollback, Process Pairs, and Recovery Blocks. The class concludes with several forward error recovery techniques. Many real-world examples are presented.

This is not a general course on system or software design theory. It is tightly focused on the design of embedded systems and software that must make their services available at all times, with no more than about 5 minutes of downtime per year.
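The availability figures quoted above can be checked with simple arithmetic. The following is a minimal sketch, not course material: the function names are illustrative, and a 365-day year is assumed.

```python
SECONDS_PER_DAY = 24 * 60 * 60
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: non-leap year


def downtime_fraction(availability_percent):
    """Fraction of time the system is allowed to be unavailable."""
    return 1.0 - availability_percent / 100.0


def downtime_seconds_per_day(availability_percent):
    """Permitted downtime, in seconds, over one day."""
    return downtime_fraction(availability_percent) * SECONDS_PER_DAY


def downtime_minutes_per_year(availability_percent):
    """Permitted downtime, in minutes, over one (365-day) year."""
    return downtime_fraction(availability_percent) * MINUTES_PER_YEAR


if __name__ == "__main__":
    # Compare three-, four- and five-nines availability targets.
    for target in (99.9, 99.99, 99.999):
        print(f"{target}%: "
              f"{downtime_seconds_per_day(target):.3f} s/day, "
              f"{downtime_minutes_per_year(target):.2f} min/year")
```

For 99.999% availability this works out to roughly 0.86 seconds per day, or about 5.26 minutes per year.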


Course Objective:
The primary goal of this course is to give the participant the skills necessary to design software for real-time and embedded computer systems that must relentlessly provide service despite the occurrence of internal and external faults.  This is a very practical, results-oriented course that will provide knowledge and skills that can be applied immediately.


Who Should Attend:
This course is intended for practicing software architects, project managers and technical consultants who are responsible for designing, structuring and implementing the software for real-time and embedded computer systems that are required to continue providing service despite the occurrence of internal and external faults.

Course participants are expected to be familiar with general embedded and real-time software design. [This knowledge can be gained by attending a prerequisite embedded software design course such as "Architectural Design of Real-Time Software".]

Course Co-Requisite:

Many (but not all) high-availability systems are also safety-critical systems, which can threaten human safety or even human life if the system fails and remains unavailable for a significant period of time. For those high-availability systems that also have safety-critical requirements, we recommend taking the course "Design of Safety-Critical Systems and Software" at the same time as this course. The two courses have little overlap in content and offer complementary approaches and perspectives. It is also possible to combine these two one-day courses into a unified two-day course for presentation at customer sites.


Course Outline:

Definitions and Background

High Availability
Fault -> Error -> Failure
Single Points of Failure
Fault Tree Analysis
Exercise: Probabilistic Fault Tree Analysis

Underlying Principles

Fault Avoidance vs. Tolerance
Failure Curves
Redundancy
Replication vs. Functional Redundancy vs. Analytic Redundancy
Dynamic vs. Static Redundancy
Extended Example: Space Shuttle Software

Fundamental System-Level Design Patterns

Static Hardware Fault Tolerance
N-Plex Design
Exercise: MTBF, MTTF Calculations in Triple Modular Redundancy
Dynamic System Fault Tolerance
Redundant Pairs
Clusters
Cluster Failover Strategy Choices
Examples: Redundant Cluster Design
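
As a flavor of the Triple Modular Redundancy calculations listed above, here is a minimal sketch using the standard textbook model. It is not taken from the course material and assumes three identical, independent modules with a constant failure rate, a perfect voter, and no repair. All names are illustrative.

```python
import math


def module_reliability(failure_rate, t):
    """Reliability of one module at time t, assuming a constant failure rate."""
    return math.exp(-failure_rate * t)


def tmr_reliability(r):
    """Reliability of a 2-out-of-3 system built from modules of reliability r.

    The system works if all three modules work (r**3) or exactly two
    work (3 * r**2 * (1 - r)); the sum simplifies to 3r^2 - 2r^3.
    Assumes a perfect voter and independent module failures.
    """
    return 3 * r**2 - 2 * r**3


def simplex_mttf(failure_rate):
    """Mean time to failure of a single module: 1 / lambda."""
    return 1.0 / failure_rate


def tmr_mttf(failure_rate):
    """MTTF of TMR without repair: the integral of 3R^2 - 2R^3 is 5 / (6 * lambda)."""
    return 5.0 / (6.0 * failure_rate)


if __name__ == "__main__":
    lam = 1e-4  # hypothetical failure rate, failures per hour
    for hours in (100, 1000, 10000):
        r = module_reliability(lam, hours)
        print(f"t={hours} h: single={r:.4f}, TMR={tmr_reliability(r):.4f}")
    print(f"MTTF single={simplex_mttf(lam):.0f} h, TMR={tmr_mttf(lam):.0f} h")
```

Under these assumptions TMR is more reliable than a single module only while the module reliability stays above 0.5, and its MTTF without repair is actually lower than that of a single module. This is one reason mission time and repair strategy matter in N-plex design.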

Concepts for Backward Error Recovery

Design Diversity
Dynamic System Redundancy
Backward Error Recovery
Transactions
Checkpointing

System and Software Design Patterns for High Availability

Checkpoint-Rollback
Process Pairs
Recovery Blocks
Limitations of Backward Error Recovery Patterns
Forward Error Recovery Design Patterns
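
To give a rough idea of how checkpointing and recovery blocks fit together, the sketch below follows the commonly described recovery block scheme: save a checkpoint, run the primary alternate, apply an acceptance test, and on failure roll back and try the next alternate. It is an illustration only, not the course's implementation. The state is assumed to be a plain dictionary, and every function name here is hypothetical.

```python
import copy


class RecoveryBlockFailed(Exception):
    """Raised when every alternate fails the acceptance test."""


def recovery_block(state, alternates, acceptance_test):
    """Run alternates in order until one passes the acceptance test."""
    checkpoint = copy.deepcopy(state)  # checkpoint taken before any attempt
    for alternate in alternates:
        try:
            alternate(state)
            if acceptance_test(state):
                return state
        except Exception:
            pass  # an exception counts as a failed attempt
        # Backward error recovery: roll the state back to the checkpoint.
        state.clear()
        state.update(copy.deepcopy(checkpoint))
    raise RecoveryBlockFailed("all alternates failed the acceptance test")


def buggy_primary(state):
    # Hypothetical faulty primary: sorts, but loses the last element.
    state["data"] = sorted(state["data"])[:-1]


def simple_alternate(state):
    # Hypothetical, simpler alternate implementation.
    state["data"] = sorted(state["data"])


def acceptance_test(state):
    # Accept only an ordered result of the expected length.
    data = state["data"]
    ordered = all(a <= b for a, b in zip(data, data[1:]))
    return ordered and len(data) == state["expected_len"]


if __name__ == "__main__":
    state = {"data": [3, 1, 2], "expected_len": 3}
    result = recovery_block(state, [buggy_primary, simple_alternate],
                            acceptance_test)
    print(result["data"])
```

In this example the primary fails the acceptance test, the state is restored from the checkpoint, and the alternate then produces the accepted result. The quality of the acceptance test largely determines how much protection the pattern offers.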

C Language in Critical Systems

Software Robustness: MISRA-C, LINT, Static Code Analyzers
Exercise: C-Language Shenanigans

Final Examination.



INSTRUCTOR:  Dr. David Kalinsky
Dr. David Kalinsky has more than thirty years of experience in the design and construction of real-time and embedded computer systems software. He is a popular lecturer and seminar leader on technologies for embedded software development, appearing before audiences of professional engineers in North America, Europe and Israel. He regularly presents classes at the Embedded Systems Conferences on topics such as "Architectural Design of Device Drivers" and "Principles of High Availability Embedded Systems Design".

He has built and managed high-tech training programs on aspects of software engineering for the development of real-time and embedded systems for a number of Silicon Valley companies. He has also been involved in the design of many embedded medical and aerospace systems. In addition, he has developed and taught training courses on a number of major real-time operating systems (RTOSs), including VRTX, pSOS, VxWorks, OSEK/VDX, Nucleus, OSE and others. With this broad experience, he has trained thousands of embedded systems software engineers and architectural designers throughout the world.



Who We Are
We are a professional organisation providing training services to companies. We offer a comprehensive range of training courses, workshops and seminars covering every aspect of engineering.

We provide various training programs that meet the immediate and future needs of engineers. The training is delivered as seminars, hands-on workshops, project-based tutorials or a mixture of these, to bring the maximum learning benefit to the engineers.

Our Trainers
We have a quality pool of leading authorities, worldwide experts and fully trained professionals who constantly strive to uncover the pitfalls and best practices of modern technology development.
     
All rights reserved by
Omniscient International