Malicious- and Accidental-Fault Tolerance for Internet Applications
IST Research Project IST-1999-11583
1 January 2000 - 28 February 2003
Work on the MAFTIA architecture was led by Lisbon, but again involved most of the partners. The architecture was developed using a number of guiding principles: hybrid failure assumptions, recursive use of fault prevention and fault tolerance techniques, and the notion of trusting components to the extent of their trustworthiness. A crucial aspect of any fault-tolerant architecture is the fault model upon which the system architecture is conceived and component interactions are defined. MAFTIA is based on a composite fault model with hybrid failure assumptions, in which the presence and severity of vulnerabilities, attacks and intrusions vary from component to component. The failure assumptions are not merely postulated: they are enforced by the architecture and by the construction of certain trustworthy system components, and are thus substantiated.
Through the combined use of intrusion prevention techniques (i.e., attack
and vulnerability prevention and removal), and ultimately the implementation
of internal intrusion-tolerance mechanisms, we must achieve justified
confidence that each component of the system behaves as assumed, failing
in a controlled manner, i.e., that the component can be trusted because
it is trustworthy. It then becomes possible to implement intrusion-tolerance
mechanisms at the system level, using a mixture of arbitrary-failure (fail-uncontrolled
or non-trusted) and fail-controlled (or trusted) components. This task
is made easier because the controlled failure modes of some components
vis-à-vis malicious faults restrict the system faults those components
can produce. In effect, we have performed a form of fault prevention at the
system level: some kinds of system faults are simply not produced.
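To make the distinction between the two kinds of component concrete, the following is a minimal sketch of how a client might accept replies from replicated components under each assumption. It is an illustration only, not code from MAFTIA: the function names, the use of None to model a silent (failed) replica, and the parameter f (the assumed maximum number of faulty replicas) are all hypothetical.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def accept_fail_controlled(replies: Sequence[Optional[Hashable]]) -> Optional[Hashable]:
    """Assumption: a replica either replies correctly or stays silent (None).

    Under that assumption any reply that arrives can be used directly.
    """
    for reply in replies:
        if reply is not None:
            return reply
    return None  # every replica was silent


def accept_arbitrary(replies: Sequence[Optional[Hashable]], f: int) -> Optional[Hashable]:
    """Assumption: up to f replicas may reply with any value at all.

    A value is accepted only once f + 1 replicas report it, so that at
    least one of those reports comes from a correct replica.
    """
    counts = Counter(r for r in replies if r is not None)
    for value, count in counts.most_common():
        if count >= f + 1:
            return value
    return None  # not enough matching replies yet
```

With fail-controlled replicas a single reply suffices, whereas with arbitrary-failure replicas the client has to wait for matching replies. This is one simple way in which enforcing controlled failure modes removes a class of system-level faults that would otherwise have to be tolerated.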
This approach is explored in several ways within MAFTIA. In particular,
it is our rationale for implementing small trustworthy components that
are simple enough to be built and justifiably shown to be correct. This
allows us to construct implementations of fault-tolerant protocols that
are more efficient than protocol implementations that have to deal with
truly arbitrary assumptions, and more robust than designs that make controlled
failure assumptions without enforcing them.
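As a rough illustration of the efficiency argument, the sketch below compares the commonly cited minimum replica counts for agreement protocols under the two kinds of assumption: 3f + 1 replicas to tolerate f arbitrary faults, against 2f + 1 when faulty replicas can only fail in a controlled (crash-like) way. These are general textbook bounds, not figures taken from the MAFTIA protocols, and the function name and assumption labels are hypothetical.

```python
def min_replicas(f: int, assumption: str) -> int:
    """Return a commonly cited lower bound on group size for agreement.

    f          -- assumed maximum number of faulty replicas
    assumption -- "arbitrary" or "fail-controlled" (illustrative labels)
    """
    if f < 0:
        raise ValueError("f must be non-negative")
    if assumption == "arbitrary":
        return 3 * f + 1  # faulty replicas may behave in any way
    if assumption == "fail-controlled":
        return 2 * f + 1  # faulty replicas only stop or omit
    raise ValueError(f"unknown assumption: {assumption!r}")


if __name__ == "__main__":
    for f in (1, 2, 3):
        print(f, min_replicas(f, "arbitrary"), min_replicas(f, "fail-controlled"))
```

The saving only holds if the controlled failure assumption is actually true, which is why the text stresses that the assumption must be enforced rather than merely stated.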
The MAFTIA architecture includes three main instances of such trusted components. The first is based on a Java Card, and is a local component designed to assist the crucial steps of the execution of services and applications. The second is a distributed component (named the Trusted Timely Computing Base), based on appliance boards with private network adapters, which is designed to assist crucial steps of the operation of middleware protocols. The third is the use of trusted middleware components to provide a set of correct support services, whose provision is built on distributed fault-tolerance mechanisms, for example through agreement and replication amongst collections of participants on several hosts.
Figure 3 illustrates the overall MAFTIA architecture, which can be viewed along at least three different dimensions.
Figure 3: MAFTIA architecture dimensions
First, there is the hardware dimension, which includes the host and networking devices that make up the physical distributed system. Second, within each node, there are the local support services provided by the operating system and the run-time platform. These may vary from host to host in a heterogeneous system, and some services may not even be available on some hosts, or may have to be accessed via the network using protocols that provide an appropriate degree of trust. Third, there is the distributed software provided by MAFTIA: the layers of middleware, running on top of the run-time support mechanisms provided by each host; and MAFTIA's native services, depicted in the figure as authorisation, intrusion detection, and trusted third party services. Applications built to run on top of MAFTIA use the abstractions provided by the middleware and the application services to operate securely across several hosts, and/or to be accessed securely by users running on remote nodes, even in the presence of malicious faults.