Text mining tools for patent literature
Developed in R.
The dataset was obtained from PATENTSCOPE as an Excel file. An example of the dataset is saved in the folder excel_file.
The WIPO search page allows downloading up to 100 rows of patent data with abstracts, or up to 10 000 rows without them. Since 100 records are rarely representative enough to text mine a technical field, the abstracts have to be obtained from another web page. About 5 to 10% of the abstracts cannot be retrieved this way, but if the sample is large enough that should not pose a problem.
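As a rough illustration of how the missing abstracts could be handled before text mining, here is a minimal sketch in base R. The data frame `patents` and its column names `title` and `abstract` are assumptions made for this example; they are not the actual headers of the PATENTSCOPE export or of the scripts in this repository.

```r
# Hypothetical data frame standing in for the patent table after the
# abstracts have been fetched; column names are assumed for illustration.
patents <- data.frame(
  title = c("Solar cell", "Battery electrode",
            "Wind turbine blade", "Fuel cell stack"),
  abstract = c("A photovoltaic device with a coated layer.",
               NA,
               "",
               "A stack of fuel cells with shared cooling."),
  stringsAsFactors = FALSE
)

# Treat NA and empty or whitespace-only strings as missing abstracts.
is_missing <- is.na(patents$abstract) | trimws(patents$abstract) == ""

# Share of records for which no abstract was obtained.
missing_share <- mean(is_missing)

# Keep only the records that can actually be text mined.
patents_with_abstract <- patents[!is_missing, , drop = FALSE]

cat(sprintf("Missing abstracts: %d of %d (%.1f%%)\n",
            sum(is_missing), nrow(patents), 100 * missing_share))
```

Checking `missing_share` on the real dataset is a quick way to confirm that the loss stays in the expected 5 to 10% range before moving on to the analysis.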