Annotation and Mapping Discovery among Data Sources


  • Schema matching is the task of finding the semantic correspondences (mappings) between elements of two schemata



  • Approach: starting from the “hidden” meanings associated with schema labels (i.e. class and attribute names, also called terms), the MOMIS Data Integration system discovers lexical relationships among schema elements

  • Lexical Annotation of schema labels is the explicit assignment of meanings w.r.t. a reference lexical thesaurus (such as WordNet)
    • Manual Annotation is tedious and does not scale → Automatic or Semi-automatic Annotation

  • WSD (Word Sense Disambiguation) is the task of identifying the meaning of a word in its context by computational means
    The semi-automatic CWSD (Combined Word Sense Disambiguation) method:

    1. associates one or more WordNet meanings with each label

    2. combines two WSD algorithms: SD (Structural Disambiguation), which exploits the relationships derived from the schema, and WND (WordNet Domains Disambiguation), which exploits WordNet Domains
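
The two CWSD steps above can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the sense inventory `SENSES`, the relation set `RELATED`, the function names, and in particular the choice to combine SD and WND by taking the union of their results, which the text above does not specify. A real system would query WordNet and WordNet Domains instead of these hand-written dictionaries.

```python
from collections import Counter

# Hypothetical toy sense inventory: label -> {sense id: domain}.
SENSES = {
    "book": {"book#1": "publishing", "book#2": "commerce"},
    "author": {"author#1": "publishing"},
    "table": {"table#1": "furniture", "table#2": "mathematics"},
}

# Hypothetical lexical relations between senses (stand-in for WordNet relations).
RELATED = {frozenset({"book#1", "author#1"})}


def structural_disambiguation(label, schema_links):
    """SD stand-in: keep senses of `label` that are lexically related to a
    sense of another label linked to it in the schema."""
    chosen = set()
    for a, b in schema_links:
        if label not in (a, b):
            continue
        other = b if a == label else a
        for s in SENSES.get(label, {}):
            for t in SENSES.get(other, {}):
                if frozenset({s, t}) in RELATED:
                    chosen.add(s)
    return chosen


def domain_disambiguation(label, all_labels, top_k=1):
    """WND stand-in: keep senses of `label` whose domain is among the
    `top_k` most frequent domains over all candidate senses in the schema."""
    counts = Counter(
        domain for l in all_labels for domain in SENSES.get(l, {}).values()
    )
    prevalent = {domain for domain, _ in counts.most_common(top_k)}
    return {s for s, d in SENSES.get(label, {}).items() if d in prevalent}


def cwsd(label, all_labels, schema_links):
    """Combine both algorithms; the union is an assumption of this sketch."""
    return structural_disambiguation(label, schema_links) | domain_disambiguation(
        label, all_labels
    )


labels = ["book", "author", "table"]
links = [("book", "author")]  # e.g. a foreign key between the two classes
annotations = {label: cwsd(label, labels, links) for label in labels}
```

In this toy run "book" and "author" each receive one sense, while "table" receives none and would be left for manual annotation, which is one way to read "semi-automatic".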
  • Schema label normalization is the reduction of each label to a standardized form that can be easily recognized
    → abbreviation expansion and CN (Compound Noun) annotation
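
As a minimal sketch of what label normalization could look like, the snippet below splits a label into tokens, expands abbreviations, and flags multi-word labels as compound-noun candidates. The abbreviation dictionary, the function names, and the `is_compound` flag are hypothetical; the text above does not say how abbreviation expansion or CN annotation is actually carried out.

```python
import re

# Hypothetical abbreviation dictionary; a real one would be domain-specific.
ABBREVIATIONS = {
    "qty": "quantity",
    "cust": "customer",
    "addr": "address",
    "no": "number",
}


def tokenize(label):
    """Split a schema label on separators and camelCase boundaries."""
    tokens = []
    for part in re.split(r"[_\-\s]+", label):
        # Acronym runs, capitalized or lowercase words, and digit runs.
        tokens.extend(re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", part))
    return [t.lower() for t in tokens]


def normalize(label):
    """Expand known abbreviations and mark multi-word labels as
    compound-noun candidates for later annotation."""
    words = [ABBREVIATIONS.get(t, t) for t in tokenize(label)]
    return {"label": label, "words": words, "is_compound": len(words) > 1}


examples = [normalize(l) for l in ("CustAddr", "qty_ordered", "qty")]
```

For instance, `normalize("CustAddr")` yields the words `["customer", "address"]` and marks the label as a compound-noun candidate, whereas `normalize("qty")` yields the single word `["quantity"]`.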




