A versatile Python script for scraping data from websites. This script automates data extraction, processes the information, and saves it in a structured format like CSV. Ideal for data collection, research, and analysis tasks.

chathumiamarasinghe/web-scraping

Academic Staff Scraper

This Python script scrapes academic staff information from the staff details page of the Faculty of Science website, University of Kelaniya. For each staff member it retrieves the name, position, room number, phone, fax, email, and specialization (if available), then exports the results to a CSV file.

Prerequisites

Make sure the following third-party Python packages are installed before running the script:

  • requests: For sending HTTP requests to fetch the webpage.
  • beautifulsoup4: For parsing the HTML content of the webpage.

The csv module, used to write the extracted data to a CSV file, is part of the Python standard library and needs no separate installation.

You can install the required packages using pip:

    pip install requests beautifulsoup4

How It Works

  • Extract Data from URL: The script sends a request to the webpage containing the academic staff details.
  • Parse HTML: It uses BeautifulSoup to parse the HTML and identify the relevant sections for staff data.
  • Retrieve Staff Information: For each academic staff member, the script extracts:
    • Name
    • Position
    • Room number
    • Phone number
    • Fax
    • Email
    • Specialization (scraped from a link if available)
  • CSV Output: The data is written to a CSV file named academic_staff.csv.
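
The steps above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the URL in STAFF_URL, the CSS classes (staff-card, name, position, profile, specialization, and so on), and the helper names are all assumptions, since the README does not describe the page's markup. It also assumes that "scraped from a link" means following a per-person profile link, and that missing fields are recorded as "Not available".

```python
import csv
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup

# Hypothetical placeholder: the real staff page URL is not given in this README.
STAFF_URL = "https://example.org/staff"
FIELDS = ["Name", "Position", "Room", "Phone", "Fax", "Email", "Specialization"]


def fetch_html(url, timeout=10):
    """Download a page and return its HTML, raising on HTTP errors."""
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()
    return response.text


def text_or_default(parent, selector, default="Not available"):
    """Return the stripped text of the first match, or a default if absent."""
    node = parent.select_one(selector)
    return node.get_text(strip=True) if node else default


def parse_staff(html, base_url="", fetch=None):
    """Build one dict per staff member from an assumed page layout."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for card in soup.select("div.staff-card"):  # assumed container class
        # Assumed: each field lives in an element whose class is the
        # lowercased column name, e.g. <span class="name">.
        row = {field: text_or_default(card, "." + field.lower())
               for field in FIELDS[:-1]}
        row["Specialization"] = "Not available"
        link = card.select_one("a.profile[href]")  # assumed profile link
        if link and fetch:
            # Follow the link and read the specialization from that page.
            profile_html = fetch(urljoin(base_url, link["href"]))
            profile = BeautifulSoup(profile_html, "html.parser")
            row["Specialization"] = text_or_default(profile, ".specialization")
        rows.append(row)
    return rows


def write_csv(rows, path="academic_staff.csv"):
    """Write the extracted rows to a CSV file with a header line."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)


if __name__ == "__main__":
    staff = parse_staff(fetch_html(STAFF_URL), base_url=STAFF_URL, fetch=fetch_html)
    write_csv(staff)
```

Passing the fetch function into parse_staff keeps the parsing logic testable against saved HTML without network access. The selectors would need to be adapted to the real page structure.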

Example Output

| Name | Position | Room | Phone | Fax | Email | Specialization |
| --- | --- | --- | --- | --- | --- | --- |
| Prof. Janaka Wijanayake | Professor | Room 201 | 011-2233445 | 011-2233446 | janaka@stu.kln.ac.lk | Computer Science |
| Dr. Thilini Mahanama | Senior Lecturer | Room 202 | 011-1234567 | Not available | thilinie@uni.lk | Physics |

Usage

  1. Clone or download the repository containing this script.
  2. Make sure you have Python installed on your system.
  3. Install the required Python libraries using the following command:
    pip install requests beautifulsoup4
