This repository provides support for downloading, analyzing, and gaining insights from the National Library of Medicine (NLM) baseline datasets. The focus is on identifying trends and patterns in biomedical papers published over the years.
- Download and preprocess NLM baseline datasets.
- Analyze publication trends across years.
- Visualize key insights from the data.
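As a rough illustration of the trend analysis listed above, the sketch below counts citation records per publication year in a single baseline file. It is not taken from the notebooks in this repository: the function name `count_citations_by_year` and the `SAMPLE` data are hypothetical, and the element path to the year (`MedlineCitation/Article/Journal/JournalIssue/PubDate/Year`) is an assumption about the MEDLINE/PubMed XML layout that should be checked against the current DTD.

```python
import gzip
import io
import xml.etree.ElementTree as ET
from collections import Counter

# Assumed location of the publication year inside each PubmedArticle record.
YEAR_PATH = "MedlineCitation/Article/Journal/JournalIssue/PubDate/Year"


def count_citations_by_year(xml_stream):
    """Count PubmedArticle records per publication year in one XML stream."""
    counts = Counter()
    # iterparse reads the stream incrementally instead of loading it at once.
    for _event, elem in ET.iterparse(xml_stream, events=("end",)):
        if elem.tag != "PubmedArticle":
            continue
        year = elem.findtext(YEAR_PATH)
        # Records without a <Year> element (for example, those that only
        # carry a free-text MedlineDate) are grouped under "unknown" here.
        counts[year or "unknown"] += 1
        elem.clear()  # drop the children of the record we just counted
    return counts


# Tiny hand-written sample standing in for a real baseline file.
SAMPLE = b"""<PubmedArticleSet>
<PubmedArticle><MedlineCitation><Article><Journal><JournalIssue>
<PubDate><Year>1999</Year></PubDate></JournalIssue></Journal></Article>
</MedlineCitation></PubmedArticle>
<PubmedArticle><MedlineCitation><Article><Journal><JournalIssue>
<PubDate><Year>1999</Year></PubDate></JournalIssue></Journal></Article>
</MedlineCitation></PubmedArticle>
<PubmedArticle><MedlineCitation><Article><Journal><JournalIssue>
<PubDate><Year>2001</Year></PubDate></JournalIssue></Journal></Article>
</MedlineCitation></PubmedArticle>
<PubmedArticle><MedlineCitation><Article><Journal><JournalIssue>
<PubDate><MedlineDate>1998 Dec-1999 Jan</MedlineDate></PubDate>
</JournalIssue></Journal></Article></MedlineCitation></PubmedArticle>
</PubmedArticleSet>"""

sample_counts = count_citations_by_year(io.BytesIO(SAMPLE))
for year, n in sorted(sample_counts.items()):
    print(year, n)
```

For a real gzipped baseline file, the same function can be given the handle returned by `gzip.open(path, "rb")`. Summing the resulting counters across all files would give the per-year totals that a trend plot needs.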
Clone the repository:
git clone https://github.com/aditya30051993/NLM-Baseline-Dataset-Insights.git
Navigate to the project directory and install the required packages:
pip install -r requirements.txt
Open the Jupyter Notebooks to explore the datasets and run the analyses:
jupyter notebook
See CONTRIBUTING.md for contribution guidelines.
For security-related issues, refer to SECURITY.md.
This project is licensed under the MIT License. See LICENSE for details.
This project uses datasets downloaded from the National Library of Medicine (NLM) Data Distribution program.
National Library of Medicine Data Distribution: The NLM Data Distribution program is the preferred access point for bulk downloading of the data for the products listed below. Downloading and use of these datasets is free of charge and implies agreement to the [Terms and Conditions](https://www.nlm.nih.gov/databases/download/terms_and_conditions.html). For questions or assistance regarding the data distributed within this program, visit the NLM Support Center or email custserv@nlm.nih.gov.
MEDLINE/PubMed: NLM produces a baseline set of MEDLINE/PubMed citation records in XML format for download on an annual basis. Each day, NLM produces update files that include new, revised, and deleted citations.
For more information, please refer to:
- NLM Copyright Policy: https://www.nlm.nih.gov/web_policies.html#copyright
- PubMed README: https://ftp.ncbi.nlm.nih.gov/pubmed/baseline/README.txt
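A minimal sketch of fetching one baseline file from the directory that hosts the PubMed README linked above. The helper names `baseline_filename` and `download_baseline_file`, the `data` destination directory, and the file naming pattern (`pubmed24n0001.xml.gz` style) are assumptions for illustration; confirm the current naming against the README before relying on it.

```python
import urllib.request
from pathlib import Path

# Directory that hosts the baseline README referenced above.
BASELINE_URL = "https://ftp.ncbi.nlm.nih.gov/pubmed/baseline/"


def baseline_filename(year_suffix, index):
    """Build a file name using the assumed pattern pubmedYYnNNNN.xml.gz."""
    return f"pubmed{year_suffix:02d}n{index:04d}.xml.gz"


def download_baseline_file(year_suffix, index, dest_dir="data"):
    """Download one baseline file unless a local copy already exists."""
    name = baseline_filename(year_suffix, index)
    dest = Path(dest_dir) / name
    dest.parent.mkdir(parents=True, exist_ok=True)
    if not dest.exists():
        # Network call; skipped when the file is already on disk.
        urllib.request.urlretrieve(BASELINE_URL + name, dest)
    return dest
```

Calling `download_baseline_file(24, 1)` would request the first file of the assumed 2024 baseline and store it under `data/`. Skipping files that already exist keeps repeated notebook runs from downloading the same large archive again.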
The following individuals are responsible for different sections of the project:
- @aditya30051993 - Owner and maintainer of the project.
If you use this project in your research, please cite it as follows:
@software{Gupta_NLM_Baseline_Dataset_Insights_2024,
author = {Gupta, Aditya Kumar},
title = {NLM Baseline Dataset Insights},
year = {2024},
url = {https://github.com/aditya30051993/NLM-Baseline-Dataset-Insights},
version = {0.1.0},
note = {Analyzing trends and insights in biomedical papers using NLM Baseline datasets},
license = {MIT}
}