Page 89 - JOURNAL OF LIBRARY SCIENCE IN CHINA 2018 Vol. 42
088 Journal of Library Science in China, Vol. 8, 2016
generate RDF data. There are also some differences between the two tools. DB2Triples implements
the W3C R2RML standard, can access multiple relational databases in a single run, and can
generate RDF data for entities of multiple classes. Its disadvantages are that its ontology
mapping requires editing text-based configuration files in JSON and that it lacks a user-friendly
interface (Xia & Jin, 2015). OpenRefine, although not convenient for operating on data from
multiple tables at the same time, offers a WYSIWYG (What You See Is What You Get) user
interface. Therefore, DB2Triples is used to transform the collection genealogy data held in
multiple relational databases in SQL Server, and OpenRefine is used to transform the data of The
General Catalog of Chinese Genealogies, which is held in a single Excel table. Figure 2 takes the
ancestor and celebrity fields of the genealogical table as an example and presents the process of
transforming the data in the Excel table into RDF in Turtle format with the OpenRefine tool.
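The kind of row-to-triple mapping that such a tool performs can be illustrated with a minimal Python sketch. It is not the OpenRefine or DB2Triples implementation; the namespace `http://example.org/`, the class `ex:Genealogy`, the predicates `ex:ancestor` and `ex:celebrity`, and the column names are all assumptions made for illustration only.

```python
import csv
import io
from urllib.parse import quote

# Hypothetical namespaces and terms, chosen only for this illustration.
BASE = "http://example.org/genealogy/"
PREFIXES = (
    "@prefix ex: <http://example.org/ontology/> .\n"
    "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .\n"
)


def escape_literal(value: str) -> str:
    """Escape backslashes and double quotes for a Turtle string literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def rows_to_turtle(rows) -> str:
    """Emit one genealogy resource per table row, with ancestor and celebrity names."""
    lines = [PREFIXES]
    for row in rows:
        # Percent-encode the identifier so it is safe inside an IRI.
        subject = f"<{BASE}{quote(row['id'], safe='')}>"
        lines.append(f"{subject} a ex:Genealogy ;")
        lines.append(f'    rdfs:label "{escape_literal(row["title"])}" ;')
        lines.append(f'    ex:ancestor "{escape_literal(row["ancestor"])}" ;')
        lines.append(f'    ex:celebrity "{escape_literal(row["celebrity"])}" .')
    return "\n".join(lines) + "\n"


# A one-row stand-in for the spreadsheet; real data would be exported from Excel.
SAMPLE = (
    "id,title,ancestor,celebrity\n"
    "g001,Example Family Genealogy,Ancestor A,Celebrity B\n"
)
rows = list(csv.DictReader(io.StringIO(SAMPLE)))
turtle = rows_to_turtle(rows)
print(turtle)
```

In this sketch each row becomes one subject URI and each selected column becomes one predicate, which is the same pattern a mapping file or a graphical mapping editor expresses declaratively.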
2.3 The design based on the four principles of linked data and the implementation based
on semantic technologies
The design of the system follows the Four Principles of Linked Data. After investigating the
Cool URIs standard (Sauermann & Cyganiak, 2008) and linked data projects conducted by
governmental sectors and libraries internationally, and combining these with our actual demands,
we formulated The URI Design Specification of Shanghai Library and generated HTTP URIs for the
various entities in the genealogy data according to this specification. The descriptive
information for each entity is organized with the RDF abstract data model and encoded in standard
serialization formats. When the HTTP URI of an entity is visited, the relevant RDF information
about that entity is returned. The system also supports the content negotiation mechanism. When
a user visits with an ordinary browser, the system returns an HTML page for people to read.
When a semantic-web browser or semantic proxy (program) visits the URI, the system returns the
RDF data in the corresponding format (such as RDF/XML, RDF/Turtle, or JSON-LD) according to the
content format requested in the HTTP header.
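The content negotiation described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the system's actual code: the function names `parse_accept` and `negotiate` are hypothetical, only the `Accept` request header is considered, and an unmatched request falls back to HTML, whereas a real server might answer with 406 Not Acceptable.

```python
# Media types this sketch can serve, in server preference order.
# HTML comes first so that ordinary browsers get a readable page.
SUPPORTED = [
    "text/html",
    "text/turtle",
    "application/rdf+xml",
    "application/ld+json",
]


def parse_accept(header: str):
    """Return (media_type, q) pairs from an Accept header, highest q first."""
    entries = []
    for part in header.split(","):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        entries.append((media_type, q))
    # sorted() is stable, so equal q-values keep the client's order.
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


def negotiate(accept_header: str) -> str:
    """Pick the representation to return for a request to an entity URI."""
    for media_type, q in parse_accept(accept_header or "*/*"):
        if q <= 0:
            continue
        if media_type in SUPPORTED:
            return media_type
        if media_type == "*/*":
            return SUPPORTED[0]
        if media_type.endswith("/*"):
            prefix = media_type[:-1]  # e.g. "text/"
            for candidate in SUPPORTED:
                if candidate.startswith(prefix):
                    return candidate
    return SUPPORTED[0]  # assumption: fall back to HTML rather than 406


browser = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"
print(negotiate(browser))
print(negotiate("text/turtle"))
```

A browser-style header selects the HTML page, while a client that asks only for `text/turtle` receives the Turtle serialization of the same entity.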
The development of the system is based on the semantic technology framework. RDB2RDF
tools supporting W3C’s RDB2RDF standards are used to transform data from RDB or Excel
format into RDF/Turtle format, together with other data cleaning and transformation
tools such as OpenRefine. The RDF data generated by these tools are then loaded
into the RDF store (OpenLink Virtuoso) instead of a traditional relational database, and the data
interaction between the visualization layer and the storage layer is driven by SPARQL via Jena.
Data visualization tools such as SIMILE Timemap, Baidu ECharts and AMAP (高德地图) are used
to provide visualized data services to end users. The RDF data are stored in the RDF store
called OpenLink Virtuoso. Between the RDF store and the visualized presentation layer, data can be