This project provides a Python script for extracting and saving filenames of JavaScript files (index.js) from a given webpage. It leverages the requests library for making HTTP requests and loguru for enhanced logging.
- URL Validation: Ensures the provided base URL is valid.
- Retry Mechanism: Automatically retries failed HTTP requests.
- JavaScript Filename Extraction: Uses regular expressions to locate JavaScript files in the HTML source of a webpage.
- Storage Support: Saves extracted filenames to a specified file.
- Customizable Base URL and Output File: The script allows configuring the base URL and output file path.
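The first two features can be illustrated with a minimal sketch. It is not the code from Index.py: the helper names is_valid_url and with_retries, the default of three attempts, and the fixed delay are all assumptions made for illustration. The retry helper takes any callable so that it can wrap a call such as requests.get.

```python
import time
from urllib.parse import urlparse


def is_valid_url(url: str) -> bool:
    """Return True if url has an http(s) scheme and a host part."""
    parsed = urlparse(url)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def with_retries(fetch, retries: int = 3, delay: float = 1.0):
    """Call fetch() until it succeeds or `retries` attempts have failed.

    `fetch` is a hypothetical zero-argument callable, for example
    lambda: requests.get(url, timeout=10).
    """
    if retries < 1:
        raise ValueError("retries must be at least 1")
    last_error = None
    for attempt in range(1, retries + 1):
        try:
            return fetch()
        except Exception as exc:  # real code would catch a narrower exception type
            last_error = exc
            if attempt < retries:
                time.sleep(delay)  # wait before the next attempt
    raise last_error
```

With requests installed, a call might look like with_retries(lambda: requests.get(BASE_URL, timeout=10)), run only after is_valid_url(BASE_URL) returns True.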
Ensure the following Python packages are installed:
requests
loguru
Install them via pip:
pip install requests loguru
- Clone this repository:
git clone https://github.com/Enukio/Update-Index.git
cd Update-Index
- Edit the script to set the correct BASE_URL and OUTPUT_FILE in the Index.py file:
BASE_URL = "https://example.com" # Replace with your target URL
OUTPUT_FILE = "./cgi" # File where filenames will be saved
- Run the script:
python Index.py
- The extracted filenames will be saved to the specified OUTPUT_FILE path.
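As a rough sketch of the saving step, the snippet below writes a list of filenames to the output path. The function name save_filenames and the one-filename-per-line layout are assumptions; the actual format written by Index.py may differ.

```python
from pathlib import Path


def save_filenames(filenames, output_file):
    """Write each filename on its own line (assumed format) and return the path."""
    path = Path(output_file)
    # Create the parent directory if it does not exist yet.
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text("\n".join(filenames) + "\n", encoding="utf-8")
    return path
```

For example, save_filenames(["index-abc123.js"], OUTPUT_FILE) would overwrite any existing file at that path.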
When run with a valid URL containing JavaScript files, the script will find <script> tags with src attributes like:
<script src="/assets/index-abc123.js"></script>
and extract the filename:
index-abc123.js
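The extraction step can be sketched with a regular expression along these lines. The pattern and the function name extract_index_filenames are assumptions chosen to match the example above, not necessarily the expression used in Index.py.

```python
import re

# Assumed pattern: a <script> tag whose src ends in /index<anything>.js.
# The capture group keeps only the filename, not the directory part.
SCRIPT_SRC = re.compile(
    r'<script[^>]+src=["\'][^"\']*/(index[^"\'/]*\.js)["\']',
    re.IGNORECASE,
)


def extract_index_filenames(html: str) -> list:
    """Return matching filenames in document order, without duplicates."""
    matches = SCRIPT_SRC.findall(html)
    return list(dict.fromkeys(matches))  # dict preserves first-seen order
```

This sketch ignores src values that carry a query string or have no leading path segment, so pages that use those forms would need a broader pattern.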
The script outputs detailed logs with colored formatting, including:
- The number of JavaScript files found.
- Success or error messages for file-saving operations.
- Issues with fetching the webpage or unexpected content types.
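One way to detect an unexpected content type before parsing is to inspect the Content-Type response header. The helper below is a hypothetical sketch; its name and the set of accepted media types are assumptions, and the check in Index.py may be different.

```python
def is_html_content_type(header_value) -> bool:
    """Return True if a Content-Type header value describes an HTML document."""
    # Drop parameters such as "; charset=utf-8" and normalise case.
    media_type = (header_value or "").split(";", 1)[0].strip().lower()
    return media_type in ("text/html", "application/xhtml+xml")
```

With a requests response object, the header is available as response.headers.get("Content-Type"), and a False result could be reported through loguru with logger.warning.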
- Assumes JavaScript filenames match the pattern /index*.js.
- Designed for basic extraction and may require adjustments for complex or dynamically loaded webpages.
This project is licensed under the MIT License.