The problem of data semantics (or meaning) is a pressing issue both in effectively representing information about a domain in an information system (data and software) and in supporting large-scale sharing of information across multiple independent sources, such as those available over the Internet. Determining the meaning of data requires overcoming two challenges: (1) specifying semantics for a single information source; and (2) providing mechanisms for reconciling the semantic specifications provided by independent sources.
This research will propose new approaches to understanding the semantics of information sources, and will develop and evaluate methods to reconcile independent sources. It will build on my earlier work by focusing on three areas. First, my students and I will develop methods to enhance the semantics of information models by exploiting the inference capabilities afforded through the classification of instances. Second, we will apply the concept of classification-based inference to provide a semantic layer for the tagging mechanisms used in popular tools for annotating content in online social networks (e.g., del.icio.us and Flickr.com). Third, we will apply the methods developed in the first two areas to business contexts and evaluate them there.
In particular, we will use the classification principles to develop domain ontologies for the online travel industry in conjunction with an industry partner. As the accessibility and scope of networked resources grow, the need for effective methods and tools to locate and manage information is becoming more critical to both individuals and organizations.