Page 157 - Journal of Library Science in China, Vol.45, 2019
156 Journal of Library Science in China, Vol.11, 2019
knowledge discovery and provides directions to researchers by pointing out that the integration of
KOS and cognitive computing will effectively improve the low precision of machine algorithms
caused by the lack of high-quality big data and semantic knowledge bases. In addition, the proposed
approach may enable breakthroughs in the deep cross-boundary fusion of multi-source heterogeneous
data and in automatic knowledge acquisition from them, and thereby greatly improve the
transformation of unstructured literature data into structured, semantic knowledge networks.
Cataloging from digitization to datafication
HU Xiaojing a,*
In the past decade, the field of cataloging has seen its greatest change since the invention of
Machine Readable Cataloging (MARC), extending from theoretical models and standards to
applications. This change is directly related to linked data technology and can be summarized as
a move from digitization to datafication, i.e., bibliographic data moving from machine-readable to
machine-actionable so that they can be integrated into the web. The cataloging community has
experienced important changes in concepts (from records to data), clarified previously confused
concepts (entities versus their names and descriptions), re-modeled bibliographic data, and
engaged in various experiments and programs.
First, the focus of cataloging has shifted from records to data. In theoretical models, IFLA
paid attention to “the basic level national bibliographic record” in Functional Requirements for
Bibliographic Records (FRBR), whereas Functional Requirements for Authority Data (FRAD) “focuses
on data, regardless of how it may be packaged”. A recordless environment is gradually taking
shape. In cataloging rules, Resource Description & Access (RDA) originally emphasized core
elements, but the new RDA (Toolkit Beta Site) abandons them. In metadata formats, the BIBFRAME
and RDA vocabularies clearly distinguish different kinds of data that are conflated in MARC.
Second, entities are now clearly distinguished from their names and descriptions.
The IFLA Library Reference Model (LRM) defines Nomen as an entity in its own right. Authority
control becomes entity management and no longer relies on the uniform form of a name. To
distinguish between entities (Real World Objects) and their descriptions (such as authority
records), MARC 21 adds a new subfield $1 that records the identifier of the entity itself.
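
The distinction between an entity and its description can be illustrated with a minimal sketch. It models one MARC 21 field as plain Python data and separates subfield $0 (which in MARC 21 carries the authority record control number or standard number, i.e. a pointer to a description) from subfield $1 (the Real World Object URI). The heading, both example.org URIs, and the helper name `split_identifiers` are hypothetical placeholders chosen for illustration; this is not a MARC parser.

```python
# One MARC field modeled as a tag plus an ordered list of (subfield code, value) pairs.
# The heading and both URIs are made-up placeholders, not real identifiers.
field_100 = {
    "tag": "100",
    "subfields": [
        ("a", "Example, Author"),
        # $0: identifies the authority record, i.e. a description of the entity
        ("0", "http://example.org/authorities/names/n0000001"),
        # $1: identifies the real-world object, i.e. the entity itself
        ("1", "http://example.org/entities/person/0000001"),
    ],
}


def split_identifiers(field):
    """Return (description_uris, entity_uris) found in one field (hypothetical helper)."""
    description_uris = [value for code, value in field["subfields"] if code == "0"]
    entity_uris = [value for code, value in field["subfields"] if code == "1"]
    return description_uris, entity_uris


descriptions, entities = split_identifiers(field_100)
print("identifies the description:", descriptions)
print("identifies the entity:", entities)
```

Keeping the two identifiers in separate subfields lets a consuming application link to the entity without mistaking the authority record for the thing it describes.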
Third, data are modeled as RDF vocabularies. Different vocabularies have different classes and
properties. Although the BIBFRAME and RDA vocabularies differ considerably in how they identify
classes or entities, BIBFRAME can be used with RDA as a content standard.
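
As a rough illustration of what modeling data with an RDF vocabulary means, the sketch below writes a few subject-predicate-object triples using the BIBFRAME 2.0 namespace and lists the classes they use. The example.org resource URIs and the helper name `classes_used` are assumptions made for illustration, and plain tuples stand in for a real RDF library.

```python
# Triples are (subject, predicate, object) tuples of URI strings.
BF = "http://id.loc.gov/ontologies/bibframe/"  # BIBFRAME 2.0 vocabulary namespace
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Placeholder resources: one work and one instance of that work.
triples = [
    ("http://example.org/work/1", RDF_TYPE, BF + "Work"),
    ("http://example.org/instance/1", RDF_TYPE, BF + "Instance"),
    ("http://example.org/instance/1", BF + "instanceOf", "http://example.org/work/1"),
]


def classes_used(triples):
    """Return the sorted set of classes asserted via rdf:type (hypothetical helper)."""
    return sorted({obj for _subj, pred, obj in triples if pred == RDF_TYPE})


for cls in classes_used(triples):
    print(cls)
```

The same resources described with the RDA vocabulary would be typed with RDA's own classes and linked with its own properties, which is why data in the two vocabularies do not line up one to one.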
Finally, datafication is being put into practice. The Library of Congress (LC)’s Bibliographic
Framework Initiative is in its final stage after several rounds of pilots. The Swedish National Library launched
* Correspondence should be addressed to HU Xiaojing, Email: xjhu@library.ecnu.edu.cn, ORCID: 0000-0002-1703-9724