The DOCKER Project at ISI


Helping People Connect Software

We are developing DOCKER (Distributed Operations of Code with Knowledge-based descriptions for Earthquake Research), a framework that ensures adequate use of earthquake simulation models and code through the use of a knowledge-based constraint reasoner. Although our research is motivated and driven by the analysis of seismic hazard, DOCKER is a domain-independent, Web-based tool that can be used for other applications as well.

DOCKER supports the publication and use of implemented simulation models through the following new capabilities:

System ensures appropriate use of models. Each simulation model is designed to take into account specific types of earth shaking phenomena, which results in constraints that the engineers using the model should respect. Seismic hazard analysis models are a good example: users must take into account 1) the type of tectonic region that the model is applicable to, 2) the type of faulting, 3) the magnitude range, 4) the distance range, and 5) the site types. By representing these constraints in a declarative and expressive language, the system can exploit state-of-the-art knowledge representation and reasoning techniques to detect violations of these constraints and to guide users in finding models that are appropriate to the particular site and earthquake forecast being considered.
Users have just-in-time access to documentation of models in order to make judgments about the appropriate use of models. Engineers who read the documentation months or years earlier may not recall all the details that the model authors specified. Backing up the constraints with the appropriate documentation sources will be very useful, especially when users need to judge the severity of constraint violations and whether they can be dismissed. In addition, it would be useful to accommodate constraints at different degrees of formalization. Some constraints, such as the basic types and bounds of input parameters, lend themselves to formalization. Others seem more appropriate for human consumption (e.g., "recordings with unknown or poor estimates of magnitude, mechanism, distance, or site excluded from data set"), so that the user can put the data into perspective.
New models are easily incorporated into the framework. Our environment should facilitate the publication and use of new models in order to accelerate scientific progress. In seismic hazard analysis in particular, new models and updates of existing models are likely to appear in the near and long term. Publishing models should be a painless process and should not require technical expertise in distributed computing or knowledge representation.
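To make the idea of declarative applicability constraints concrete, here is a minimal sketch in Python. It is not DOCKER's implementation, which relies on a knowledge-based reasoner; the class names, parameter names, numeric bounds, and documentation pointers below are all hypothetical and chosen only for illustration. Each constraint carries a pointer to the documentation passage that motivates it, so a violation can be reported together with its source.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class RangeConstraint:
    """A numeric input must lie within [low, high]."""
    parameter: str
    low: float
    high: float
    source: str  # hypothetical pointer to the supporting documentation

    def violated_by(self, inputs):
        return not (self.low <= inputs[self.parameter] <= self.high)


@dataclass(frozen=True)
class EnumConstraint:
    """A categorical input must be one of the allowed values."""
    parameter: str
    allowed: frozenset
    source: str  # hypothetical pointer to the supporting documentation

    def violated_by(self, inputs):
        return inputs[self.parameter] not in self.allowed


@dataclass
class ModelDescription:
    name: str
    constraints: list = field(default_factory=list)

    def violations(self, inputs):
        """Return every constraint that the given inputs violate."""
        return [c for c in self.constraints if c.violated_by(inputs)]


# Hypothetical model: the values are invented, not taken from any real model.
example_model = ModelDescription(
    name="ExampleAttenuationModel",
    constraints=[
        EnumConstraint("tectonic_region", frozenset({"active_shallow_crust"}), "doc, section A"),
        EnumConstraint("fault_type", frozenset({"strike_slip", "reverse"}), "doc, section B"),
        RangeConstraint("magnitude", 5.0, 7.5, "doc, section C"),
        RangeConstraint("distance_km", 0.0, 100.0, "doc, section D"),
        EnumConstraint("site_type", frozenset({"rock", "soil"}), "doc, section E"),
    ],
)

request = {
    "tectonic_region": "active_shallow_crust",
    "fault_type": "strike_slip",
    "magnitude": 8.1,  # outside the hypothetical magnitude range
    "distance_km": 40.0,
    "site_type": "rock",
}

for constraint in example_model.violations(request):
    print(f"{constraint.parameter} violates a constraint (see {constraint.source})")
```

Keeping the constraints as data rather than as checks buried in the model code is what lets a separate component inspect them, report violations, and point the user to the relevant documentation.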

Our initial implementation of DOCKER, which is used to publish and use simulation models (embodied in executable code), includes the following capabilities:

Integrates formal and informal representations of software components and their associated constraints: Constraints associated with the model can be captured with different degrees of formality, and are always grounded in documentation. When possible, models and their constraints are formally represented, and the system maintains a reference to the original documents that explain each constraint. These documents can be brought to the user's attention so the user can assess the severity of a constraint violation. This is also useful because some important constraints require an involved and detailed knowledge representation effort, may not yet be expressible with the existing ontologies, and must await further ontology development.
Enforces adequate use of components through knowledge-based constraint reasoning: DOCKER is integrated with Powerloom, a knowledge representation and reasoning system that provides an expressive language to describe model constraints. This enables the system to detect constraint violations and alert users when any models or code are invoked in an inappropriate context. With this framework, the system can also be proactive and support users in case of constraint violations by making suggestions about alternative settings that would respect those constraints. It also suggests the use of models that were not being considered by the user.
Supports access to models as layered services: DOCKER supports distributed access to models and code through a layered view of service-based interaction, where a software component is viewed as a service provider. It is designed to facilitate future integration with Grid services through the Open Grid Services Architecture (OGSA). DOCKER provides a higher layer of abstraction by providing a knowledge-based view of the capabilities of the services.
Facilitates model publication: DOCKER includes a user interface that facilitates publication of code by guiding users through simple steps that set up the system to automatically generate wrappers for the code, enabling it to function at the various service layers. DOCKER also automatically generates the user interface required to access and invoke the models, as well as the necessary constraint checks.
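The proactive behavior described above, suggesting alternative settings and alternative models after a violation, can be illustrated with a small Python sketch. This is only an approximation under strong assumptions: it handles numeric range constraints alone, whereas DOCKER delegates reasoning to Powerloom. The catalogue, model names, and bounds are hypothetical.

```python
# Hypothetical catalogue: model name -> {parameter: (low, high)}.
CATALOGUE = {
    "ModelA": {"magnitude": (5.0, 7.5), "distance_km": (0.0, 100.0)},
    "ModelB": {"magnitude": (6.0, 8.5), "distance_km": (0.0, 200.0)},
}


def violations(bounds, inputs):
    """Return the bounds of every parameter whose input value is out of range."""
    return {
        param: (low, high)
        for param, (low, high) in bounds.items()
        if not low <= inputs[param] <= high
    }


def suggest_settings(bounds, inputs):
    """For each violated parameter, propose the nearest in-range value."""
    return {
        param: min(max(inputs[param], low), high)
        for param, (low, high) in violations(bounds, inputs).items()
    }


def suggest_models(catalogue, inputs, exclude):
    """List other models whose constraints the inputs already satisfy."""
    return [
        name
        for name, bounds in catalogue.items()
        if name != exclude and not violations(bounds, inputs)
    ]


inputs = {"magnitude": 8.1, "distance_km": 40.0}
print(suggest_settings(CATALOGUE["ModelA"], inputs))        # {'magnitude': 7.5}
print(suggest_models(CATALOGUE, inputs, exclude="ModelA"))  # ['ModelB']
```

Clamping to the nearest bound is just one possible suggestion strategy; whether a changed setting is scientifically meaningful remains a judgment for the user.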

DOCKER's user interface is supported by any Web browser. A User Interaction Manager routes user requests to the appropriate internal components. A Constraint Reasoner accesses a Powerloom server to load the formal descriptions of the models and to check constraints against the overall SCEC ontologies. For each model, a Publisher automatically creates several layers of wrappers for the code provided by the user. The lowest layers support message transport, and for these DOCKER generates the corresponding WSDL description and SOAP handlers. At higher layers, DOCKER automatically generates declarative expressions of simple constraints (types, enumerations, and bounds) in Powerloom, as well as XML Schemas for compatibility with lower layers. More complex constraints can be added manually by knowledge engineers.
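As an illustration of how simple constraints (types, enumerations, and bounds) might be turned into XML Schema definitions, the following Python sketch builds `xs:simpleType` restrictions from a list of parameter descriptions. It uses only the standard library and standard XML Schema facets; the parameter names and values are hypothetical, and the sketch does not reproduce DOCKER's actual Publisher or its Powerloom output.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
ET.register_namespace("xs", XS)


def simple_type(name, base, low=None, high=None, allowed=None):
    """Build an xs:simpleType that restricts `base` by bounds or enumeration."""
    type_el = ET.Element(f"{{{XS}}}simpleType", {"name": name})
    restriction = ET.SubElement(type_el, f"{{{XS}}}restriction", {"base": base})
    if low is not None:
        ET.SubElement(restriction, f"{{{XS}}}minInclusive", {"value": str(low)})
    if high is not None:
        ET.SubElement(restriction, f"{{{XS}}}maxInclusive", {"value": str(high)})
    for value in allowed or ():
        ET.SubElement(restriction, f"{{{XS}}}enumeration", {"value": value})
    return type_el


def build_schema(parameters):
    """Wrap one simpleType per parameter description in an xs:schema element."""
    schema_el = ET.Element(f"{{{XS}}}schema")
    for description in parameters:
        schema_el.append(simple_type(**description))
    return schema_el


# Hypothetical parameter descriptions, as a publisher might enter them.
PARAMETERS = [
    {"name": "Magnitude", "base": "xs:double", "low": 5.0, "high": 7.5},
    {"name": "SiteType", "base": "xs:string", "allowed": ["rock", "soil"]},
]

schema = build_schema(PARAMETERS)
print(ET.tostring(schema, encoding="unicode"))
```

The same parameter descriptions could drive both the schema for the transport layers and the formal constraint expressions, which keeps the two views of a model consistent.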

DOCKER uses the IKRAFT Annotator [Gil and Ratnakar 02] to enable model publishers to link constraints, and any other salient free-text statements about their model, to the original documentation of the model (e.g., a technical publication). This enables end users to see the reasons behind each constraint in the model and to judge whether it is acceptable to override certain constraints if necessary.

Our plans for future work include better support for handling constraint violations, support for ontology development to capture relevant domain terminology, integration with the Grid infrastructure being developed within the SCEC/IT work, and incorporation of physics-based (Pathway 2) models into our framework.
