Article Abstract

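Zhang Xiaolin. High-Quality Dataset Construction in AI+ Environment: Library's Opportunities and Challenges [J]. Journal of Library Science in China (中国图书馆学报), 2025, 51(6): 4-17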
High-Quality Dataset Construction in AI+ Environment: Library's Opportunities and Challenges
  
DOI:
Keywords (translated from Chinese): Artificial intelligence+; High-quality datasets; Data quality; Ecosystem; Library
Keywords (English): Artificial intelligence+; High-quality datasets; Data quality; Data ecosystem; Library
Funding:
Author and affiliation
Zhang Xiaolin, ShanghaiTech University, Shanghai 201210
Abstract views: 198
Full-text downloads: 163
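Abstract (translated from Chinese):
      This paper briefly summarizes the national strategic deployment for high-quality dataset construction, points out that such construction has become an important foundation for the full implementation of the “AI+” initiative, and introduces the multifaceted quality requirements that high-quality datasets must meet, along with the methodological requirements of the construction process and the requirements for the operational framework. In light of the national push to accelerate scenario innovation and the paradigm revolution brought about by AI4S and related developments, it proposes a high-quality dataset ecosystem adapted to three progressive layers: the basic cognition layer, the scenario understanding layer, and the action planning layer. On this basis, it proposes drawing on world-model thinking, borrowing open linked data mechanisms, and innovatively applying a variety of emerging technologies and methods to build a multi-layered, multi-dimensional high-quality dataset system that starts from diverse scenario applications and advances through both iteration and integration. Finally, it analyzes the advantages and challenges of libraries in high-quality dataset construction and offers some tentative suggestions. 3 figures. 33 references.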
English abstract:
This paper briefly summarizes the national strategies for the development of high-quality datasets (HQDs), pointing out that building HQDs has become an important foundation for the comprehensive implementation of the “artificial intelligence+” initiative. It then introduces the various quality requirements that HQDs need to meet, as well as the methodological requirements for construction and operation. Based on the nation's efforts to accelerate scenario-based innovation and the paradigm shifts brought about by initiatives such as AI4S, it proposes an HQD ecosystem that corresponds to three progressively advanced layers: the basic cognitive layer, the scenario understanding layer, and the action planning layer. On this basis, it provides a reference framework for such an ecosystem and suggests drawing on world-model thinking, leveraging open linked data mechanisms, and innovatively applying various emerging technologies and methods to build a multi-layered, multi-dimensional HQD ecosystem that evolves from diverse scenario applications through iterative integration.
Finally, it analyzes the advantages and challenges for libraries in the construction of HQDs and offers some recommendations. Advantages for libraries participating in HQD development include: a well-aligned mission and long-earned expertise in organizing high-quality knowledge resources; standardized practices in data annotation as their professional foundation; a commitment to open access and open science, which supports the integration and application of HQDs in complex scenarios; and a public service role and user engagement capabilities, which provide valuable leverage for collaborative work in diverse contexts. Challenges may include: lack of full-text content; multimodal data and their fusion; linked data standardization; AI readiness of data; scenario-based linking; and ecosystem collaboration. The following suggestions are offered for libraries participating in HQD development: repurpose and reposition themselves; accelerate the transformation of their data resources into HQDs; develop unique, specialized, and localized HQDs; embed themselves, in multiple ways, in the HQD development of diverse domains; and speed up organizational evolution. 3 figs. 33 refs.