Scripts for acquiring our MegaFoss dataset, a curated list of top open source projects that represent modern software development.
When we want to regenerate our repo list, run this:
python .\src\github\get_repo_list.py
git clone https://github.com/CVEProject/cvelistV5 cves
- Ensure PostgreSQL is installed and running
- Configure your database connection in
src/cve/config/postgres.ini
- Ensure you have a Python environment installed and activated
- Install the required packages by running
pip install -r requirements.txt
- Run the following command to create the database schema:
python src/cve/create_db_tables.py
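For orientation, here is a minimal sketch of how connection settings in an INI file such as src/cve/config/postgres.ini could be read with Python's standard configparser module. The section name `postgresql` and the keys shown are assumptions for illustration; check the actual file for the names the scripts expect.

```python
import configparser


def load_postgres_settings(ini_text, section="postgresql"):
    """Parse INI text and return the chosen section as a plain dict."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    if not parser.has_section(section):
        raise KeyError(f"section [{section}] not found")
    return dict(parser.items(section))


# Hypothetical file contents; the real postgres.ini may use other names.
EXAMPLE_INI = """
[postgresql]
host = localhost
port = 5432
database = cves
user = postgres
password = changeme
"""

settings = load_postgres_settings(EXAMPLE_INI)
# configparser returns every value as a string, including the port.
print(settings["host"], settings["port"])
```

To read the real file instead of the example text, pass in the result of `open("src/cve/config/postgres.ini").read()`.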
python src/cve/repos_to_nvd.py
This produces a CSV file, a file listing repos that need manual mapping, and a file listing repos that were not found in the NVD database.
python src/cve/list_patches.py
This prints a list of patches.
python src/cve/nvd_to_cve_id_assigner_name.py
This prints tuples of (cve_id, vendor).
python src/cve/cve_no_cwe.py
Output is written to cve/output/cve_no_cwe.txt, which lists the CVEs that have no CWEs.
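How the script finds these CVEs is not shown here. As a rough sketch of the idea, assuming records in the CVE JSON 5 layout used by cvelistV5 (where CWE identifiers sit under `containers.cna.problemTypes[].descriptions[].cweId`), a check could look like the following. The function names and the sample records are made up for illustration.

```python
def cwe_ids(record):
    """Collect CWE identifiers from the CNA container of a CVE record."""
    cna = record.get("containers", {}).get("cna", {})
    found = []
    for problem_type in cna.get("problemTypes", []):
        for description in problem_type.get("descriptions", []):
            cwe = description.get("cweId")
            if cwe:
                found.append(cwe)
    return found


def cves_without_cwe(records):
    """Return the IDs of records that carry no CWE identifier at all."""
    return [r["cveMetadata"]["cveId"] for r in records if not cwe_ids(r)]


# Made-up sample records, trimmed to the fields this sketch reads.
SAMPLE_RECORDS = [
    {
        "cveMetadata": {"cveId": "CVE-0000-0001"},
        "containers": {"cna": {"problemTypes": [
            {"descriptions": [{"cweId": "CWE-79", "lang": "en"}]}
        ]}},
    },
    {
        "cveMetadata": {"cveId": "CVE-0000-0002"},
        "containers": {"cna": {"problemTypes": [
            {"descriptions": [{"description": "n/a", "lang": "en"}]}
        ]}},
    },
]

print(cves_without_cwe(SAMPLE_RECORDS))
```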
Ensure you have the 'Master' and 'CWE_Relative_Map' tables from the spreadsheet downloaded as lists/rust_to_cwe.csv and lists/cwe_child_map.csv respectively. The former can be downloaded by running python src/cve/download_rust_cve_sheet.py
python src/cve/generate_pi_chart.py
Output can be printed to the console or saved to a file, and can optionally include the CVEs with no CWEs. The script prints tab-separated data that can be copied into the spreadsheet, which automatically updates the pie chart. It also prints data for specific projects and lists the CWEs that had no vote mapping.
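To illustrate what tab-separated output suitable for pasting into a spreadsheet looks like, here is a small sketch that counts CWE labels and emits one `label<TAB>count` row per CWE. The function name and the sample labels are hypothetical; the real script's columns may differ.

```python
from collections import Counter


def tab_separated_counts(cwe_labels):
    """Return 'label<TAB>count' rows, most frequent label first."""
    counts = Counter(cwe_labels)
    rows = [f"{label}\t{count}" for label, count in counts.most_common()]
    return "\n".join(rows)


# Made-up labels standing in for the CWEs assigned to a set of CVEs.
print(tab_separated_counts(["CWE-79", "CWE-787", "CWE-79"]))
```

Most spreadsheet applications split pasted text on tabs, so each row lands in two adjacent cells.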
Install MongoDB on your system, or have access to a system running MongoDB. You will also need to install the official toolkit here. Once both are installed, add their bin directories to your Path or PATH environment variable. These should look something like {install location}/MongoDB/Server/{version number}/bin and {install location}/MongoDB/Tools/{version number}/bin respectively.
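A quick way to confirm that both bin directories ended up on your PATH is to look the executables up with Python's `shutil.which`. This is only a sketch: the default tool names `mongod` and `mongoimport` are an assumption about which executables you need, so adjust the list to match the import script you plan to run.

```python
import shutil


def missing_tools(names=("mongod", "mongoimport")):
    """Return the executables from `names` that are not found on PATH."""
    return [name for name in names if shutil.which(name) is None]


missing = missing_tools()
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("All MongoDB executables were found on PATH.")
```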
Additionally, if you haven't already, download a collection of CVEs to use. We recommend cloning this repository for a wide range of CVEs, but be warned that it is very large (277k+ CVEs at time of writing)!
Open a command line window and type mongod --dbpath="{desired database folder location}". This should start a MongoDB instance on your computer at localhost:27017. If you would like to use a database that is already running, skip this step. If you would like to use a more complicated configuration, such as a different port, refer to the mongod documentation here. Once your database is running, DO NOT CLOSE THIS WINDOW until you are done, as closing it will shut down the database and end all connections to it.
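Before running an import, it can help to confirm that something is actually listening on the expected host and port. The sketch below only tests that a TCP connection can be opened; it does not verify that the listener is MongoDB. The function name is hypothetical, and the defaults mirror the localhost:27017 address mentioned above.

```python
import socket


def mongo_is_reachable(host="localhost", port=27017, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name lookup failures.
        return False


if mongo_is_reachable():
    print("A server is accepting connections on localhost:27017.")
else:
    print("Nothing is reachable on localhost:27017; is mongod running?")
```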
Open either src/cve/mongodb-import.ps1 or src/cve/mongodb-import.sh, whichever is appropriate for your system. Replace the default directory path with the path to your CVE collection folder. If you used the default configuration provided by mongod, this is all that needs to be changed. If the target database is not on localhost or is running on a port other than the default, you will need to change the mongoHost and mongoPort fields respectively.
Once your chosen script is set up for your system, you can open a new terminal window and run it. Be warned that if your CVE collection is very large, this may take a long time to run!
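The contents of the import scripts are not reproduced here. As a rough Python sketch of the same idea, assuming the import is done with the `mongoimport` tool, the function below walks a CVE folder and builds one command per JSON file without running anything. The database name `cve_db` and collection name `cves` are placeholders, and `mongo_host` and `mongo_port` mirror the mongoHost and mongoPort fields described above.

```python
from pathlib import Path


def build_import_commands(cve_dir, mongo_host="localhost", mongo_port=27017,
                          database="cve_db", collection="cves"):
    """Build one mongoimport command (as an argument list) per JSON file."""
    commands = []
    # rglob searches subfolders too; sorting keeps the order reproducible.
    for json_file in sorted(Path(cve_dir).rglob("*.json")):
        commands.append([
            "mongoimport",
            "--host", mongo_host,
            "--port", str(mongo_port),
            "--db", database,          # placeholder database name
            "--collection", collection,  # placeholder collection name
            "--file", str(json_file),
        ])
    return commands
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` once you have confirmed the names match your setup. Launching one process per file is slow for a very large collection, which is one reason a full import can take a long time.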