Knowledge Base Systems
The research focuses mainly on rule integration in databases and on parallel rule execution on multi-processor computers. An active object-oriented knowledge base system, called DEVICE, has been developed that efficiently integrates active or event-driven (ECA), production and deductive rules into a single Object-Oriented Database system. More specifically, an active OODB system has been extended with production and deductive rules by compiling the declarative condition of a high-level rule into a network of complex events that incrementally matches the condition at run-time. The network is similar to the RETE discrimination network, apart from two space-saving optimizations: event object re-use, when multiple rules have common conditions, and the virtual alpha node optimization, which does not store tokens in the network when they originate from a non-selective condition. The last event in the event network acts as the trigger of an ECA rule that hosts the original high-level rule. The action of the ECA rule depends on the type of the high-level rule.
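The incremental matching idea described above can be illustrated with a small sketch. This is not DEVICE code: the class names (`Database`, `AlphaNode`, `JoinNode`), the dictionary-based objects and the example rule are all hypothetical, and the network has a single join. It shows only the two optimizations named in the text: a condition node is shared when two rules register the same condition key, and a "virtual" node for a non-selective condition keeps no token memory and reads the class extent instead.

```python
from collections import defaultdict


class AlphaNode:
    """One-class condition. A virtual node is valid only for always-true tests."""

    def __init__(self, db, cls, test, virtual):
        self.db, self.cls, self.test, self.virtual = db, cls, test, virtual
        self.memory = []       # stored tokens (stays empty for virtual nodes)
        self.successors = []   # join nodes fed by this node

    def tokens(self):
        # A virtual node keeps no memory; the class extent is its token set.
        return self.db.extent[self.cls] if self.virtual else self.memory

    def signal(self, obj):
        if obj["class"] != self.cls or not self.test(obj):
            return
        if not self.virtual:
            self.memory.append(obj)
        for join in self.successors:
            join.signal_from(self, obj)


class JoinNode:
    """Last node of the (tiny) network: its matches trigger the rule action."""

    def __init__(self, left, right, test, action):
        self.left, self.right, self.test, self.action = left, right, test, action
        left.successors.append(self)
        right.successors.append(self)

    def signal_from(self, source, token):
        # Only the new token is joined against the other side (incremental).
        if source is self.left:
            for other in list(self.right.tokens()):
                if self.test(token, other):
                    self.action(token, other)
        else:
            for other in list(self.left.tokens()):
                if self.test(other, token):
                    self.action(other, token)


class Database:
    """Toy object store that raises an insertion event for every new object."""

    def __init__(self):
        self.extent = defaultdict(list)   # class name -> stored objects
        self.alpha_nodes = {}             # (class, condition key) -> shared node

    def alpha(self, cls, key, test, virtual=False):
        # Re-use the node when another rule already has the same condition.
        node = self.alpha_nodes.get((cls, key))
        if node is None:
            node = AlphaNode(self, cls, test, virtual)
            self.alpha_nodes[(cls, key)] = node
        return node

    def insert(self, cls, **attrs):
        obj = {"class": cls, **attrs}
        self.extent[cls].append(obj)
        for node in list(self.alpha_nodes.values()):
            node.signal(obj)
        return obj


db = Database()
fired = []
well_paid = db.alpha("emp", "salary>1000", lambda o: o["salary"] > 1000)
any_dept = db.alpha("dept", "any", lambda o: True, virtual=True)
JoinNode(well_paid, any_dept,
         lambda e, d: e["dept"] == d["name"],
         lambda e, d: fired.append((e["name"], d["name"])))

db.insert("dept", name="R&D")
db.insert("emp", name="Ann", salary=2000, dept="R&D")
db.insert("emp", name="Bob", salary=500, dept="R&D")
db.insert("emp", name="Cy", salary=3000, dept="Sales")
db.insert("dept", name="Sales")
print(fired)
```

In this sketch the rule fires for Ann when she is inserted and for Cy only once the Sales department arrives, and the department node never stores a token.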
Furthermore, a parallel active Object-Oriented Database (OODB) model, named PRACTIC, has been defined and its abstract machine has been studied. PRACTIC is based on inter-class and intra-class parallel processing of queries. A hierarchical multi-processor architecture and various data and knowledge declustering schemes have been studied in order to increase the speed-up and scalability of OO query processing. A prototype has been implemented on a transputer network and on a Unix workstation using CS-Prolog. In the future, it will also be ported to a cluster of workstations. Finally, algorithms for distributing the event and rule objects of the DEVICE system onto the abstract PRACTIC machine have been developed, along with parallel rule matching and asynchronous execution schemes for all three rule types.
The above research has also been applied to data warehousing. Specifically, the system InterBase-KB has been designed to support the data-integration component of a Data Warehouse by integrating the DEVICE knowledge base system (see above) with a multi-database system called InterBase*. The multi-database system integrates various component databases under a common query language; however, it does not provide schema integration or the other utilities necessary for Data Warehousing. The knowledge base system additionally offers a declarative logic language, with second-order syntax but first-order semantics, for integrating the schemas of the data sources into the warehouse and for defining complex, recursively defined materialized views. Furthermore, deductive rules are used for cleaning, integrity checking and summarizing the data imported into the Data Warehouse. The knowledge base system also supports efficient incremental maintenance of the Warehouse, without querying the data sources.
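To make the idea of a recursively defined materialized view that is kept up to date without re-reading the sources more concrete, here is a minimal sketch. It is an illustration under stated assumptions, not the InterBase-KB mechanism: the `ReachabilityView` class and its `insert_edge` method are hypothetical, the view is the transitive closure of a binary relation, and only insertions are handled.

```python
class ReachabilityView:
    """Materialized recursive view: path(x, y) holds if y is reachable from x.

    Hypothetical rules behind the view:
        path(X, Y) <- edge(X, Y)
        path(X, Y) <- edge(X, Z), path(Z, Y)
    """

    def __init__(self):
        self.paths = set()   # the stored (materialized) view contents

    def insert_edge(self, a, b):
        """Apply one inserted edge and return only the newly derived tuples."""
        # Everything that already reaches a, plus a itself.
        sources = {a} | {x for (x, y) in self.paths if y == a}
        # Everything already reachable from b, plus b itself.
        targets = {b} | {y for (x, y) in self.paths if x == b}
        # New paths are computed from the delta and the stored view only;
        # the base relation is never scanned again.
        delta = {(s, t) for s in sources for t in targets} - self.paths
        self.paths |= delta
        return delta


view = ReachabilityView()
view.insert_edge("a", "b")
view.insert_edge("c", "d")
new_paths = view.insert_edge("b", "c")
print(sorted(new_paths))
```

Deletions are deliberately left out: removing an edge requires knowing whether a path has an alternative derivation, which this sketch does not track.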
Semantic Web
The goal of the research is to explore alternative mappings of XML documents to object-relational and object-oriented databases, so that XML data and documents can be stored, queried and maintained inside a database system that offers reliability, flexibility and sharing among users. Furthermore, we explore alternative ways to query semi-structured XML data stored inside OODBs.
Currently, we have extended the infrastructure of an OO data warehouse (called InterBase-KB, see above) to handle semi-structured XML data as well. Specifically, we considered the problem of storing an XML document in an object database by automatically mapping the schema of the XML document to an object-oriented schema. Furthermore, we have developed the deductive rule language X-DEVICE for specifying queries and materialized views over the stored semi-structured data, as an extension of the deductive object-oriented language of the DEVICE system (see above).
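As a rough illustration of what mapping an XML document to an object-oriented schema can look like, the sketch below derives one class per non-leaf element. The mapping rules are assumptions made for this example and are not the X-DEVICE mapping: XML attributes and text-only child elements become string attributes, nested elements become references to other classes, and repeated children become list-valued attributes. The function name `derive_schema` and the sample document are hypothetical; mixed content is ignored.

```python
import xml.etree.ElementTree as ET


def derive_schema(root):
    """Return {class name: {attribute name: type description}}."""
    schema = {}

    def visit(elem):
        # A text-only element without XML attributes is mapped to a string.
        if len(elem) == 0 and not elem.attrib:
            return "string"
        attrs = schema.setdefault(elem.tag, {})
        for name in elem.attrib:
            attrs.setdefault(name, "string")
        # Count how often each child tag occurs inside this element.
        counts = {}
        for child in elem:
            counts[child.tag] = counts.get(child.tag, 0) + 1
        for child in elem:
            kind = visit(child)
            # Keep an attribute list-valued once any instance repeated it.
            seen_as_list = attrs.get(child.tag, "").startswith("list of")
            if counts[child.tag] > 1 or seen_as_list:
                kind = f"list of {kind}"
            attrs[child.tag] = kind
        return elem.tag

    visit(root)
    return schema


document = """
<library>
  <book year="1999"><title>A</title><author>X</author><author>Y</author></book>
  <book year="2001"><title>B</title><author>Z</author></book>
</library>
"""

schema = derive_schema(ET.fromstring(document))
for cls, attrs in schema.items():
    print(cls, attrs)
```

A schema derived from a single document instance is only as general as that instance; when a DTD is available it would be the more reliable input.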
Furthermore, we have developed R-DEVICE, a deductive object-oriented knowledge base system for querying and reasoning about RDF metadata. R-DEVICE transforms RDF triples into objects and uses a deductive rule language for querying and reasoning about them. More specifically, R-DEVICE imports RDF data into the CLIPS production rule system as COOL objects. The main difference between the RDF model and our object model is that properties are treated both as first-class objects and as attributes of resource objects. In this way, the properties of a resource are gathered together in one object, resulting in better query performance than a triple-based query model. Most other RDF storage and querying systems are based on a triple model, which scatters resource properties across several triples and requires several joins to query the properties of a single resource.
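The difference between the triple view and the object view can be sketched in a few lines. This is plain Python rather than CLIPS/COOL, and the function name `triples_to_objects`, the `used_by` field and the sample data with the `ex:` prefix are all hypothetical; the point is only that each resource ends up as one object holding all of its properties, while each property is also recorded as an object in its own right.

```python
from collections import defaultdict

triples = [
    ("ex:book1", "rdf:type", "ex:Book"),
    ("ex:book1", "ex:title", "Active Databases"),
    ("ex:book1", "ex:author", "ex:alice"),
    ("ex:book1", "ex:author", "ex:bob"),
    ("ex:alice", "rdf:type", "ex:Person"),
    ("ex:alice", "ex:name", "Alice"),
]


def triples_to_objects(triples):
    """Group triples by subject; also keep one record per property."""
    # resource -> property -> list of values (RDF properties may repeat)
    resources = defaultdict(lambda: defaultdict(list))
    # property -> information about the property itself
    properties = defaultdict(lambda: {"used_by": set()})
    for subject, predicate, obj in triples:
        resources[subject][predicate].append(obj)
        properties[predicate]["used_by"].add(subject)
    return resources, properties


resources, properties = triples_to_objects(triples)

# Object view: one lookup returns every property of the resource.
book = resources["ex:book1"]
print(book["ex:title"], book["ex:author"])

# Triple view: the same question needs one scan (or join) per property.
titles = [o for s, p, o in triples if s == "ex:book1" and p == "ex:title"]
authors = [o for s, p, o in triples if s == "ex:book1" and p == "ex:author"]
print(titles, authors)
```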
Content Based Information Retrieval
After almost 10 years of WWW evolution, searching for information on the web is still done in the same old keyword-based fashion. Recent attempts at more advanced information retrieval methods gave rise to various content description efforts, co-ordinated under the aegis of XML/RDF technology. We investigate knowledge-based techniques for using such data and metadata, since how they should be used, unlike their syntax, remains an open issue.
Early in our research in this area we designed COMFRESH, a common framework for expert systems and hypertext. COMFRESH is based on a Prolog inference engine that reasons on knowledge describing the content of hypertext pages. The knowledge representation formalism we used was Conceptual Graphs (CGs). This integration of Expert System and Hypertext technologies is powerful enough to provide, at the same time, traditional browsing, content-based browsing via dynamic and content-related links, and content-based querying.
The COMFRESH model gave rise to Smart Video Text (SVT), an intelligent annotation-based video data model. SVT utilizes CGs to capture the semantic associations among the concepts described in video annotations. This enables the basic annotation-based video data model to provide functionality beyond simple operator-based video query and retrieval. Namely, the CG layer allows hypertext-like browsing and natural language querying of the video data, based on the semantic relationships among video clips or logical video segments. Furthermore, the effectiveness of operator-based retrieval is greatly improved, because the CG layer naturally provides semantic term matching through its internal concept hierarchy, which serves as an ontology.
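Semantic term matching through a concept hierarchy can be sketched as follows. The hierarchy, the clip annotations and the function names (`is_kind_of`, `retrieve`) are invented for this example and do not come from SVT; a full Conceptual Graph would also carry relations between concepts, which are omitted here.

```python
# Hypothetical concept hierarchy: each concept points to its direct parent.
IS_A = {
    "sedan": "car",
    "car": "vehicle",
    "truck": "vehicle",
    "vehicle": "entity",
    "dog": "animal",
    "animal": "entity",
}

# Hypothetical annotations: the concepts mentioned for each video clip.
annotations = {
    "clip1": {"sedan", "road"},
    "clip2": {"dog"},
    "clip3": {"truck"},
}


def is_kind_of(concept, ancestor):
    """True if concept equals ancestor or lies below it in the hierarchy."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = IS_A.get(concept)   # None once the top is passed
    return False


def retrieve(term):
    """Clips annotated with the query term or with any of its specializations."""
    return sorted(
        clip
        for clip, concepts in annotations.items()
        if any(is_kind_of(c, term) for c in concepts)
    )


print(retrieve("vehicle"))
print(retrieve("car"))
```

A plain keyword match on "vehicle" would return nothing for these annotations, while the hierarchy lets the query reach the clips annotated with "sedan" and "truck".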