groaking/sintautils

sintautils

Python utility package for scraping information on SINTA (Science and Technology Index)

A. Documentation

A.1. Author Verification

A.1.i. Authentication

The author verification menu is a restricted menu of SINTA. You must be registered as a university administrator and hold an admin credential in order to use this function. An author verification (AV) admin credential consists of an email-based username and a password.

To use the AV scraper, first import it. Then initialize a scraper object of class AV, passing the AV admin's username and password. Finally, log in with the scraper object to obtain a requests session cookie from the SINTA host.

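from sintautils import AV
# Initialize the scraper with the AV admin's username and password.
scraper = AV('admin@university.edu', 'password1234')
# Log in to obtain a requests session cookie from the SINTA host.
scraper.login()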

The same can be done in two lines by passing autologin=True when initializing the scraper:

from sintautils import AV
scraper = AV('admin@university.edu', 'password1234', autologin=True)
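
The examples above hard-code the credential for brevity. As a minimal sketch, the username and password could instead be read from environment variables before being passed to AV. The variable names SINTA_AV_USERNAME and SINTA_AV_PASSWORD and the helpers load_av_credentials and make_scraper are hypothetical and not part of sintautils; only the AV(username, password, autologin=True) call is taken from this document.

```python
import os


def load_av_credentials(env=None):
    """Return (username, password) read from environment variables.

    SINTA_AV_USERNAME and SINTA_AV_PASSWORD are hypothetical names
    chosen for this sketch; sintautils does not define them.
    """
    if env is None:
        env = os.environ
    username = env.get("SINTA_AV_USERNAME")
    password = env.get("SINTA_AV_PASSWORD")
    if not username or not password:
        raise RuntimeError(
            "Set SINTA_AV_USERNAME and SINTA_AV_PASSWORD before running."
        )
    # The AV admin username is email-based, so do a cheap sanity check.
    if "@" not in username:
        raise ValueError("The AV admin username should be an email address.")
    return username, password


def make_scraper(env=None):
    """Build a logged-in AV scraper from the environment (hypothetical helper)."""
    # Imported lazily so that the credential helper works without sintautils.
    from sintautils import AV

    username, password = load_av_credentials(env)
    return AV(username, password, autologin=True)
```

With both variables exported in the shell, scraper = make_scraper() would replace the two-line example above and keep the password out of the source file.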

B. To-Do

B.1. New Features

  • Add Scopus, community service, and research scrapers for each author.
  • Add Scopus, research, and community service sync per author.
  • Add scrapers for the IPR and books of each author.
  • Add a Garuda scraper per author.
  • Add an author info dumper.
  • Add an author info dumper, implemented with openpyxl, that outputs to an Excel/spreadsheet workbook file.

B.2. Bug Fixes

  • Google Scholar scraper: handle the case where an author has no publications.

B.3. Improvements

  • Bulk scraping of an author list: return a dict keyed by author ID instead of a plain list.
  • Move the _scrape_scopus, _scrape_wos, etc. functions to backend.py.
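
The first improvement above could be sketched roughly as follows. This is only an illustration of the intended return shape: the helper name index_by_author_id, the "id" key, and the record layout are assumptions made for this sketch, not the package's actual data structures.

```python
def index_by_author_id(records, id_key="id"):
    """Convert a list of per-author records into a dict keyed by author ID.

    Each record is assumed to be a dict holding the author ID under
    `id_key`; this layout is hypothetical.
    """
    indexed = {}
    for record in records:
        # Normalize IDs to strings so that 123 and "123" map to the same key.
        author_id = str(record[id_key])
        if author_id in indexed:
            raise ValueError(f"Duplicate author ID: {author_id}")
        indexed[author_id] = record
    return indexed


# Example with made-up records.
authors = [
    {"id": 1001, "name": "Author A"},
    {"id": 1002, "name": "Author B"},
]
by_id = index_by_author_id(authors)
print(by_id["1002"]["name"])  # prints "Author B"
```

Keying by author ID lets a caller look up one author's result directly instead of scanning the list.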

C. License Notice

This program is free software: you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by the
Free Software Foundation, either version 3 of the License, or (at your
option) any later version.

This program is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
for more details.

You should have received a copy of the GNU General Public License along
with this program. If not, see <https://www.gnu.org/licenses/>.