Introduction: This Python script crawls a specified domain, clusters the crawled pages by text similarity using K-means clustering, and runs sentiment analysis on the resulting clusters. The output offers insight into the website's content structure, prevalent themes, and emotional tone.
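To make the clustering step concrete, here is a minimal sketch of grouping page texts by similarity with K-means. It assumes TF-IDF vectors as the text representation, which this README does not confirm, and the helper name cluster_pages and the sample texts are made up for illustration. It is not the code in main.py.

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer


def cluster_pages(texts, n_clusters=2):
    """Return one cluster label per page text (hypothetical helper)."""
    # Assumption: pages are represented as TF-IDF vectors,
    # with common English stop words dropped.
    vectors = TfidfVectorizer(stop_words="english").fit_transform(texts)
    # A fixed random_state keeps the grouping reproducible between runs.
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=0)
    return model.fit_predict(vectors).tolist()


# Made-up page texts standing in for crawled content.
pages = [
    "fresh pasta recipe with tomato sauce",
    "tomato sauce recipe for pasta night",
    "football match score and league table",
    "league table after the football match",
]

labels = cluster_pages(pages, n_clusters=2)
print(labels)
```

Pages that share vocabulary should receive the same label. The label numbers themselves are arbitrary, so only the grouping is meaningful.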
Prerequisites: Python 3.x and the following Python libraries: requests, beautifulsoup4, scikit-learn, matplotlib, afinn, nltk
Install the required libraries using: pip install requests beautifulsoup4 scikit-learn matplotlib afinn nltk
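After installing, a quick check like the one below can confirm that the libraries are importable. This is an optional sketch rather than part of the project, and the helper name missing_modules is hypothetical. Note that two packages are imported under a different name than the one used with pip: beautifulsoup4 is imported as bs4 and scikit-learn as sklearn.

```python
from importlib.util import find_spec

# Import names for the libraries listed above
# (beautifulsoup4 -> bs4, scikit-learn -> sklearn).
REQUIRED_MODULES = ["requests", "bs4", "sklearn", "matplotlib", "afinn", "nltk"]


def missing_modules(names):
    """Return the top-level module names that cannot be found (hypothetical helper)."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are installed.")
```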
Usage: Change to the directory of the downloaded project, then run the script: python main.py