On-Line Thesis

PhD Thesis

Serena Sorrentino

Label Normalization and Lexical Annotation for Schema and Ontology Matching

The goal of this thesis is to propose and experimentally evaluate automatic and semi-automatic methods for label normalization and lexical annotation of schema labels. In this way, sharable semantics can be added to legacy data sources. Moreover, annotated labels are a powerful means of discovering lexical relationships among structured and semi-structured data sources. Original methods to automatically normalize schema labels and extract lexical relationships have been developed, and their effectiveness for automatic schema matching has been shown.
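As a rough illustration of what label normalization involves, the sketch below splits a schema label into tokens and expands abbreviations. It is not the method developed in the thesis: the function name `normalize_label` and the `ABBREVIATIONS` dictionary are hypothetical and chosen only for this example.

```python
import re

# Hypothetical abbreviation dictionary, invented for this example
# (not taken from the thesis).
ABBREVIATIONS = {
    "cust": "customer",
    "addr": "address",
    "qty": "quantity",
    "no": "number",
}


def normalize_label(label):
    """Split a schema label into lowercase tokens and expand known abbreviations."""
    tokens = []
    # First split on explicit separators: underscores, hyphens, whitespace.
    for part in re.split(r"[_\-\s]+", label):
        # Then split camelCase, acronyms, and digit runs inside each part.
        tokens.extend(re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", part))
    return [ABBREVIATIONS.get(t.lower(), t.lower()) for t in tokens]


print(normalize_label("custAddr"))
print(normalize_label("CUST_NO"))
```

The normalized tokens could then be looked up in a lexical resource to annotate the label, which is the step the abstract refers to as lexical annotation.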

Nana Mbinkeu Carlos

Query Optimization and Quality-Driven Query Processing for Integration Systems

This thesis addresses two core aspects of data integration: query processing and data quality. First, it proposes new techniques for optimizing the full outerjoin operation, which data integration systems use for data fusion. Second, it demonstrates how to achieve Quality-Driven Query Processing, in which the quality constraints specified in Data Quality Aware Queries are used to perform query optimization.
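To make the role of the full outerjoin in data fusion concrete, here is a minimal sketch, assuming each source is a list of records with a unique join key. The function name `full_outer_join` and the conflict rule (prefer the non-null left value) are assumptions made for this example, not the optimization techniques proposed in the thesis.

```python
def full_outer_join(left, right, key):
    """Fuse two record lists on `key`, keeping unmatched records from both sides.

    Assumes `key` is unique within each source (an assumption of this sketch).
    """
    right_by_key = {r[key]: r for r in right}
    seen = set()
    fused = []
    for l in left:
        seen.add(l[key])
        # Start from the matching right record, if any.
        merged = dict(right_by_key.get(l[key], {}))
        for attr, value in l.items():
            # Left value wins unless it is null and the right side has the attribute.
            if value is not None or attr not in merged:
                merged[attr] = value
        fused.append(merged)
    # Add right records that had no match on the left.
    for r in right:
        if r[key] not in seen:
            fused.append(dict(r))
    return fused


source_a = [{"id": 1, "name": "Ann", "city": None}, {"id": 2, "name": "Bob", "city": "Rome"}]
source_b = [{"id": 1, "city": "Modena"}, {"id": 3, "city": "Boston"}]
for record in full_outer_join(source_a, source_b, "id"):
    print(record)
```

Every object described by at least one source appears in the result, which is why the full outerjoin, rather than an inner join, fits data fusion.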

Antonio Sala

Data and Service Integration: Architectures and Applications to Real Domains

This thesis focuses on Semantic Data Integration Systems, with particular attention to mediator-based approaches, to perform data and service integration. One topic is the application of MOMIS to the bioinformatics domain, where different public databases are integrated to create an ontology of molecular and phenotypic cereal data. The main contribution, however, is a semantic approach to aggregated search of data and services. In particular, I describe a technique that, on the basis of an ontological representation of the data and services related to a domain, supports the translation of a data query into a service discovery process; the technique has also been implemented as a MOMIS extension. This approach can be described as a Service as Data approach, as opposed to Data as a Service approaches: informative services are treated as a kind of source to be integrated with other data sources, in order to enhance the domain knowledge provided by a Global Schema of data. Finally, new technologies and approaches for data integration have been investigated, in particular distributed architectures, with the aim of providing a scalable architecture for data integration. An integration framework for a distributed environment is presented that allows a data integration process to be carried out on the cloud.

Laura Po

Automatic Lexical Annotation: an effective technique for dynamic data integration

The thesis shows that lexical annotation is a crucial element in data integration. Lexical annotation makes it possible to discover new relationships among the elements of a schema or among elements of different schemas. Several methods for automatically annotating data sources are described and evaluated in different scenarios. Lexical annotation can also improve systems that discover matches between ontologies; some experiments applying lexical annotation to the output of a matcher are presented. Finally, the probabilistic annotation approach is introduced and its application to dynamic integration processes is illustrated.

Mirko Orsini

Query Management in Data Integration Systems: the MOMIS approach.  

Francesco Guerra

Dai Dati all'Informazione: il sistema MOMIS (From Data to Information: the MOMIS System)

The thesis describes the methodology for building the Global Virtual View (GVV) implemented in the MOMIS system. In particular, it focuses on the update problem and on the problem of managing multilingual sources. The MOMIS system is then compared with the main mediator-based systems and with the main matchers, and its use within the Semantic Web is proposed. Finally, the thesis describes the use of the system in e-commerce applications and in the European research projects WINK and SEWASIE.

Ilario Benetti

Knowledge Management for Electronic Commerce applications  

Alberto Corni

Intelligent Information Integration: The MOMIS Project

This thesis describes the work done during my Ph.D. studies in Computer Engineering. It is organized in two parts. The first and main part describes the research project MOMIS for the Intelligent Integration of heterogeneous information. It outlines the theory of Intelligent Integration and the design and implementation of the prototype that realizes the theoretical techniques. During my Ph.D. studies I stayed at Northeastern University in Boston, Mass. (USA). The second part of this document covers the work I did there with Professor Ken Baclawski in information retrieval, on the annotation of documents using ontologies and the retrieval of the annotated documents.

Maurizio Vincini 

Utilizzo di tecniche di Intelligenza Artificiale nell'Integrazione di Sorgenti Informative Eterogenee (Using Artificial Intelligence Techniques in the Integration of Heterogeneous Information Sources)

The doctoral thesis presents the MOMIS system (Mediator envirOnment for Multiple Information Sources) for integrating structured and semi-structured data sources according to the source federation approach. The system supports the semi-automatic definition of a single integrated schema that exploits the semantic information specific to each schema (here, the term schema denotes the set of metadata describing a data repository).

Domenico Beneventano

Uno Strumento di Inferenza nelle Basi di Dati ad Oggetti (Subsumption inference for Object-Oriented Data Models)

Object-oriented data models are being extended with recursion to gain expressive power. This complicates both the incoherence detection problem, which has to deal with recursive class descriptions, and the optimization problem, which has to deal with recursive queries on complex objects. In this PhD thesis, we propose a theoretical framework that addresses both problems. In particular, it can validate and automatically classify, within a database schema, (recursive) classes, views and queries organized in an inheritance taxonomy. The framework adopts the ODL formalism (an extension of the Description Logics developed in the area of Artificial Intelligence), which is able to express the semantics of complex object data models and to deal with cyclic references at the schema and instance level. It includes subsumption algorithms, which automatically place (recursive) views and queries in a specialization hierarchy, and incoherence algorithms, which detect incoherent (i.e., always empty) (recursive) classes, views and queries. Since three different styles of semantics (greatest fixed-point, least fixed-point and descriptive) can be adopted to interpret recursive views and queries, we first analyze and discuss the choice among them, and then give the subsumption and incoherence algorithms for each of the three semantics. We show that subsumption computation and incoherence detection appear to be feasible, since in almost all practical cases they can be solved by polynomial-time algorithms. Finally, we show how subsumption computation is useful for semantic query optimization, which uses semantic knowledge (i.e., integrity constraints) to transform a query into an equivalent one that may be answered more efficiently.

The PhD thesis is in Italian. Its content can be found in the following two papers:
  • D. Beneventano, S. Bergamaschi, "Incoherence and Subsumption for recursive views and queries in Object-Oriented Data Models", Data & Knowledge Engineering 21 (1997), pp. 217-252, Elsevier Science B.V. (North-Holland).
  • D. Beneventano, S. Bergamaschi, C. Sartori, "Description Logics for Semantic Query Optimization in Object-Oriented Database Systems", ACM Transactions on Database Systems 28 (2003), pp. 1-50.