Page 89 - JOURNAL OF LIBRARY SCIENCE IN CHINA 2018 Vol. 42

088 Journal of Library Science in China, Vol. 8, 2016


            generate RDF data. There are also some differences between the two tools. DB2Triples implements
            the W3C R2RML standard, can access multiple relational databases in a single run, and can
            generate RDF data for entities of multiple classes. Its disadvantages are that the ontology
            mapping must be configured by editing JSON configuration files as plain text, and that it
            lacks a user-friendly interface (Xia & Jin, 2015). OpenRefine, although not convenient for
            operating on data from multiple tables at the same time, offers a WYSIWYG ("what you see
            is what you get") user interface. Therefore, the DB2Triples tool is applied when transforming
            the collection genealogy data held in multiple relational databases in SQL Server, and the
            OpenRefine tool is applied when transforming the data of The General Catalog of Chinese
            Genealogies held in a single Excel table. Figure 2 takes the ancestor and celebrity data in the
            genealogical table as an example and presents the process of transforming the data in the
            Excel table into RDF in Turtle format using the OpenRefine tool.


            2.3  The design based on the four principles of linked data and the implementation based
            on semantic technologies


            The design of the system follows the Four Principles of Linked Data. After investigating the
            Cool URIs standard (Sauermann & Cyganiak, 2008) and other linked data projects conducted
            by international governmental sectors and libraries, and combining these with actual demands, we
            formulated The URI Design Specification of Shanghai Library and generated HTTP URIs for the various
            entities in the genealogy data according to this specification. The descriptive information for
            entities is organized using the RDF abstract data model and encoded in standard serialization formats.
            When the HTTP URI of an entity is visited, the relevant RDF information about the entity is
            returned. The system also supports content negotiation. When a user visits via an ordinary browser,
            the system returns an HTML page for people to read. However, when a semantic-web browser
            or semantic agent (program) is used to visit the URI, the system returns the RDF data in the
            corresponding format (such as RDF/XML, RDF/Turtle, JSON-LD) according to the content format
            requested in the HTTP header.
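
            The content negotiation behaviour described above can be sketched as follows. This is a
            simplified illustration, not the system's actual implementation: the function names and the
            fallback to HTML are assumptions, and only the HTTP Accept header is considered.

```python
# Minimal sketch of server-side content negotiation on the Accept header.
# Function names and the HTML fallback are assumptions for illustration.

# Media types the hypothetical server can produce for an entity URI.
SUPPORTED = {
    "text/html": "HTML page for human readers",
    "application/rdf+xml": "RDF/XML",
    "text/turtle": "RDF/Turtle",
    "application/ld+json": "JSON-LD",
}


def parse_accept(header: str):
    """Return (media_type, q) pairs sorted by descending quality value."""
    items = []
    for part in header.split(","):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q value as "not acceptable"
        items.append((media_type, q))
    # sorted() is stable, so equal q values keep their order in the header.
    return sorted(items, key=lambda item: item[1], reverse=True)


def choose_format(accept_header: str, default: str = "text/html") -> str:
    """Pick the media type to serve for the given Accept header."""
    for media_type, q in parse_accept(accept_header or ""):
        if q <= 0:
            continue
        if media_type in SUPPORTED:
            return media_type
        if media_type == "*/*":
            return default
    # Simplification: fall back to HTML instead of answering 406 Not Acceptable.
    return default


if __name__ == "__main__":
    print(choose_format("text/turtle"))
    print(choose_format("text/html,application/xhtml+xml,*/*;q=0.8"))
```

            A browser that sends text/html first receives the HTML page, while a program that asks for
            text/turtle or application/ld+json receives the matching RDF serialization of the same entity.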
              The development of the system is based on a semantic technology framework. RDB2RDF
            tools supporting W3C’s RDB2RDF standards are used to transform data
            from RDB or Excel format to RDF/Turtle format, together with other data cleaning and
            transformation tools such as OpenRefine. The RDF data generated by those tools are then loaded
            into an RDF store (OpenLink Virtuoso) instead of a traditional relational database, and the data
            interaction between the visualization layer and the storage layer is driven by SPARQL via Jena.
            Data visualization tools such as SIMILE Timemap, Baidu ECharts and AMAP (高德地图) are used
            to provide visualized data services to end users. The RDF data are stored in the RDF store
            OpenLink Virtuoso. Between the RDF storage and the visualized presentation layer, data can be