Page 173 - Journal of Library Science in China, Vol.47, 2021

172   Journal of Library Science in China, Vol.13, 2021





              Secondly, at the level of detecting and extracting archival data objects, this study mainly
            applies deep-learning-based image recognition and natural language processing. In the image
            recognition task, entity detection is performed on WU Baokang’s photos from various periods,
            and the type of each entity in a photo is determined. This step focuses on identifying WU Baokang
            in photos, including group photos, and re-annotates important people such as WU Baokang and
            Селезнев. In addition, representative physical entities with narrative significance in the images,
            such as the Shanghai HSBC Building and the Information Building of Renmin University of China,
            are also targets of the image detection and entity labeling tasks.
              For natural language processing of the archival data, we use the BERT model to identify
            named entities in each volume of WU Baokang’s Academic Chronicle, focusing on the
            co-occurrence between WU Baokang and other named entities such as persons, times, places,
            and documents. Semantic relationships between entities are extracted and stored as associations
            centered on verbs. At the same time, text and photo content are cross-verified on the basis of
            time nodes, and the semantic relationships contained in the text data enrich the semantic content
            of the photos. This forms a knowledge-level fusion of the two data modalities, which provides
            necessary support for in-depth interpretation and context recognition of the photo archives, as
            shown in Figure 7.
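
The co-occurrence analysis and verb-centered storage described above can be illustrated with a minimal sketch. It assumes that named entities and (subject, verb, object) relations have already been extracted by an upstream model; the record layout, the entity labels (PER, LOC, DOC), the function names, and the placeholder values ("Person A", "Place X") are hypothetical and are not taken from the study.

```python
from collections import Counter, defaultdict

# Toy records standing in for per-sentence extraction output (assumed layout).
SENTENCES = [
    {"entities": [("Person A", "PER"), ("Place X", "LOC")],
     "triples": [("Person A", "visited", "Place X")]},
    {"entities": [("Person A", "PER"), ("Person B", "PER"), ("Place X", "LOC")],
     "triples": [("Person A", "met", "Person B")]},
    {"entities": [("Person B", "PER"), ("Document Y", "DOC")],
     "triples": [("Person B", "wrote", "Document Y")]},
]


def cooccurrence(sentences, focus):
    """Count how often each other entity shares a sentence with `focus`."""
    counts = Counter()
    for sentence in sentences:
        entities = set(sentence["entities"])  # count each entity once per sentence
        if focus in {name for name, _ in entities}:
            for name, label in entities:
                if name != focus:
                    counts[(name, label)] += 1
    return counts


def index_by_verb(sentences):
    """Store relations with the verb as the key: verb -> [(subject, object), ...]."""
    by_verb = defaultdict(list)
    for sentence in sentences:
        for subject, verb, obj in sentence["triples"]:
            by_verb[verb].append((subject, obj))
    return dict(by_verb)
```

Calling `cooccurrence(SENTENCES, "Person A")` yields the entities that appear alongside the focus person, and `index_by_verb(SENTENCES)` groups the extracted relations under their verbs so that they can be looked up by action.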





















                           Figure 7. Entity recognition results of WU Baokang’s Academic Chronicle
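
The time-node-based cross-verification between text and photo content can be sketched in the same spirit. The example assumes that each text relation and each photo carries a date; the function name `align_by_time`, the `tolerance_days` parameter, and all dates and identifiers are hypothetical placeholders rather than the study's actual procedure.

```python
from datetime import date


def align_by_time(text_events, photos, tolerance_days=0):
    """Attach to each photo the text relations dated within `tolerance_days` of it.

    text_events: list of (date, (subject, verb, object)) pairs
    photos: dict mapping photo id -> date
    """
    enriched = {}
    for photo_id, photo_date in photos.items():
        enriched[photo_id] = [
            triple
            for event_date, triple in text_events
            if abs((event_date - photo_date).days) <= tolerance_days
        ]
    return enriched


# Placeholder data for illustration only.
text_events = [
    (date(2000, 1, 1), ("Person A", "visited", "Place X")),
    (date(2000, 3, 1), ("Person A", "wrote", "Document Y")),
]
photos = {"photo_001": date(2000, 1, 2), "photo_002": date(2000, 6, 1)}
result = align_by_time(text_events, photos, tolerance_days=1)
```

A photo with no matching text relation, such as `photo_002` here, keeps an empty list, which flags it for manual context identification.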


              Finally, for context recognition of WU Baokang’s archival data, the preliminary research
            mainly relies on manual identification to define the context of photo archives at two levels:
            internal and external. Specifically, the team selected five doctoral and master’s students with
            archival research backgrounds and a command of archival research methods to refer to the
            International Council on Archives Records in Contexts Ontology (ICA RiC-O) proposed