devpatel30/web-crawler

Introduction: This Python script crawls a specified domain, clusters the crawled pages by text similarity using K-means, and runs sentiment analysis on each cluster. The results give insight into a website's content structure, prevalent themes, and emotional tone.
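To illustrate the clustering step described above, here is a minimal sketch that groups page texts with TF-IDF features and scikit-learn's KMeans. It is not taken from main.py: the function name cluster_pages, the sample URLs and texts, and the choice of TF-IDF as the text representation are all assumptions made for illustration. Crawling and sentiment scoring are left out so the sketch runs without network access.

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer


def cluster_pages(pages, n_clusters=2, seed=0):
    """Group page texts by TF-IDF similarity.

    `pages` maps a URL to the text extracted from that page.
    Returns a dict mapping each URL to an integer cluster label.
    (Hypothetical helper; the real script may be structured differently.)
    """
    urls = list(pages)
    texts = [pages[url] for url in urls]

    # Turn each page into a sparse TF-IDF vector, ignoring English stop words.
    vectorizer = TfidfVectorizer(stop_words="english")
    matrix = vectorizer.fit_transform(texts)

    # Fixed seed and explicit n_init keep the run reproducible.
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    labels = model.fit_predict(matrix)

    return {url: int(label) for url, label in zip(urls, labels)}


# Made-up sample data standing in for crawled pages.
sample_pages = {
    "https://example.com/pricing": "pricing plans monthly cost billing discount",
    "https://example.com/billing": "billing invoice cost payment pricing refund",
    "https://example.com/docs": "install guide api reference tutorial setup",
    "https://example.com/setup": "setup tutorial install configuration guide",
}

clusters = cluster_pages(sample_pages, n_clusters=2)
for url, label in clusters.items():
    print(label, url)
```

In a full pipeline, the text of each cluster could then be passed to a sentiment scorer such as afinn, which the prerequisites list.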

Prerequisites: Python 3.x and the following Python libraries: requests, beautifulsoup4, scikit-learn, matplotlib, afinn, nltk

Install the required libraries using: pip install requests beautifulsoup4 scikit-learn matplotlib afinn nltk

Usage: Change to the directory of the downloaded project, then run the script: python main.py
