Text and data mining (TDM) generally refers to the use of automated analysis techniques to extract patterns, trends, and other valuable information from text and data. It is a computer-based process that derives or organizes information from text or data. TDM is of great value for improving the efficiency of scientific research, accelerating scientific discovery, promoting innovation, and driving economic growth, but it is seriously hampered by market failures, legal uncertainty, and isolated information silos. Copyright uncertainty is the most significant legal obstacle facing TDM. However, current domestic research on this issue focuses on tracking developments in other countries and pays relatively little attention to Chinese law. This paper therefore analyzes whether TDM infringes copyright under Chinese copyright law, and concludes that under certain circumstances TDM involves acts controlled by copyright that the statutory grounds for exemption from infringement, such as copyright exceptions and contractual authorization, cannot completely cover. Drawing on international responses to the copyright issues raised by TDM and on the Copyright Law of China, the paper then puts forward a legislative proposal and several policy recommendations.
For the legislative solution, it is recommended that a TDM exception be added in the third revision of the Copyright Law of China, and that the clause embody the following: 1) it would apply only where justified by non-commercial purposes; 2) it would benefit only users who have lawful access to the data; 3) it would not apply if the output of the analysis substitutes for the pre-existing works or databases; 4) mentioning the source and the names of the authors would not be an obligation; 5) it could not be overridden by contractual terms or technical measures; and 6) it would not apply to tools designed for TDM, and would not affect the application of privacy, confidentiality, and special data protection rules.
For non-legislative solutions, in order to promote the application and development of TDM, it is recommended that copyright uncertainty be eliminated through the following measures: 1) respect and protect users who have lawful access to the data, on the premise that TDM does not compete as a substitute in the copyright holder's original market; 2) understand TDM through the “transformative use” paradigm developed in recent US judicial cases, and encourage TDM within the scope of fair use; 3) draw on the support of agencies and their alliances, which have extensive negotiation experience, funding capacity, and contracting ability, to insist that TDM within the scope of fair use cannot be overridden by contractual terms, and even to seek more extensive TDM rights in their contracts; 4) promote the development of open access to enable wider application of TDM technology, especially open access under licenses such as CC BY, CC0, ODC-By, and ODC-0. 2 figs. 26 refs.