Date: August 7, 2010
Abstract: Recognizing that two Semantic Web documents or graphs are similar and characterizing their differences is useful in many tasks, including retrieval, updating, version control, and knowledge base editing. We describe a number of text-based similarity metrics that characterize the relation between Semantic Web graphs, and we evaluate these metrics for three specific cases of similarity that we have identified: similarity in the classes and properties used while differing only in literal content, difference only in base-URI, and a versioning relationship. When one graph is judged to be a version of another, we generate a “delta” consisting of triples to be added to or removed from one graph to make the two equivalent. This method takes into account the text of the RDF graph’s serialization as a document, rather than relying solely on the document URI. We have prototyped these techniques in a system that we call Similis and evaluated its performance on several tasks using a collection of graphs from the archive of the Swoogle Semantic Web search engine.
Type: TechReport
Tags: delta, information retrieval, rdf, semantic web, semantic web graphs, similarity metrics