This project aims to scrape job listings for data analyst positions in 🇨🇦 Canada from Indeed using Python. It utilizes web scraping techniques to extract job titles, company names, and locations from the Indeed job search results.
As I was searching for a dataset of data analyst job postings in Canada, I encountered challenges in finding a comprehensive and up-to-date source. Unable to locate an existing dataset, I embarked on building a solution using Python. Leveraging online guides and seeking assistance from ChatGPT for debugging, I developed a web scraping script to extract job listings from Indeed. This project serves not only as a tool for personal use but also contributes to filling the gap in available datasets for data analyst job postings in Canada.
- Python
- Requests library for making HTTP requests
- BeautifulSoup library for parsing HTML content
- Pandas library for data manipulation and analysis
1. Navigate to final_code.py
2. Before running the code below, please ensure you have installed all required libraries separately.
# Step 1: Import the requests library (install it first if it is missing)
import requests
# Step 2: Import the BeautifulSoup class from the bs4 package
from bs4 import BeautifulSoup
# Step 3: Import the bs4 package itself
import bs4
# Step 4: Import the Pandas library under its usual alias
import pandas as pd
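To check whether the required libraries are already installed before running the script, something like the following sketch may help. The helper name `missing_packages` is hypothetical and not part of final_code.py; the pip package names (`requests`, `beautifulsoup4`, `pandas`) are the ones these libraries are published under.

```python
import importlib.util

# Maps each importable module name to the package name used with pip.
REQUIRED = {"requests": "requests", "bs4": "beautifulsoup4", "pandas": "pandas"}


def missing_packages(required=REQUIRED):
    """Return the pip names of required packages that cannot be found."""
    return [
        pip_name
        for module, pip_name in required.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Install first: pip install " + " ".join(missing))
    else:
        print("All required libraries are installed.")
```

If anything is reported as missing, install it with pip and run the check again.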
Edit the following line in the code, replacing 'your_own_link_here' with your actual user agent link:
url = f'your_own_link_here'
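As a rough sketch, the search link and the browser user agent could be kept as two separate values. Everything here is an assumption for illustration: the base URL `https://ca.indeed.com/jobs`, the query parameters `q`, `l`, and `start`, the helper `build_search_url`, and the placeholder user agent string are not taken from final_code.py, and the parameters Indeed accepts may differ.

```python
from urllib.parse import urlencode

# Assumed base URL for Indeed's Canadian job search; adjust if yours differs.
BASE_URL = "https://ca.indeed.com/jobs"

# Placeholder: paste the User-Agent string reported by your own browser.
HEADERS = {"User-Agent": "paste-your-browser-user-agent-here"}


def build_search_url(query, location, start=0):
    """Build a search URL; `start` is the offset of the first listing."""
    params = {"q": query, "l": location, "start": start}
    return f"{BASE_URL}?{urlencode(params)}"


url = build_search_url("data analyst", "Canada", start=15)
print(url)
```

The resulting `url` and `HEADERS` could then be passed to `requests.get(url, headers=HEADERS)`.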
You can adjust the number of pages you want to scrape and the number of job listings per page. For example:
for i in range(0, 70 * 15, 15): # Scrape 70 pages, each page has 15 listings
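To see what that loop iterates over, here is a small sketch that computes the same offsets with named constants. The names `PAGES` and `LISTINGS_PER_PAGE` are illustrative and do not appear in the original script.

```python
PAGES = 70               # number of result pages to scrape
LISTINGS_PER_PAGE = 15   # listings shown on each page

# Same sequence as range(0, 70 * 15, 15): one starting offset per page.
offsets = list(range(0, PAGES * LISTINGS_PER_PAGE, LISTINGS_PER_PAGE))

print(len(offsets))   # number of pages that will be requested
print(offsets[:3])    # offsets of the first three pages
print(offsets[-1])    # offset of the last page
```

Changing `PAGES` alone is enough to scrape fewer or more pages, since the step stays tied to the page size.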
Once you've made the necessary adjustments, run the script to scrape job listings from Indeed. The extracted data will be stored in a CSV file named job_listings.csv in the same directory.
The main steps involved in the analysis are:
- Extracting job titles, company names, and locations from Indeed job search results.
- Parsing the HTML content of the job listings page using BeautifulSoup.
- Transforming the extracted data into a structured format.
- Storing the data in a Pandas DataFrame.
- Saving the DataFrame to a CSV file for further analysis.
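The steps above can be sketched end to end on a small inline HTML sample. This is not the code from final_code.py: the markup, the CSS classes (`job-card`, `job-title`, `company`, `location`), and the helper names are invented for illustration, and Indeed's real page structure is different and changes over time.

```python
import pandas as pd
from bs4 import BeautifulSoup

# Made-up markup standing in for a downloaded search results page.
SAMPLE_HTML = """
<div class="job-card">
  <h2 class="job-title">Data Analyst</h2>
  <span class="company">Example Corp</span>
  <div class="location">Toronto, ON</div>
</div>
<div class="job-card">
  <h2 class="job-title">Junior Data Analyst</h2>
  <span class="company">Sample Inc</span>
  <div class="location">Vancouver, BC</div>
</div>
"""


def text_or_none(card, selector):
    """Return the stripped text of the first match, or None if absent."""
    node = card.select_one(selector)
    return node.get_text(strip=True) if node else None


def parse_listings(html):
    """Extract title, company, and location from each (hypothetical) job card."""
    soup = BeautifulSoup(html, "html.parser")
    rows = [
        {
            "title": text_or_none(card, ".job-title"),
            "company": text_or_none(card, ".company"),
            "location": text_or_none(card, ".location"),
        }
        for card in soup.select("div.job-card")
    ]
    return pd.DataFrame(rows, columns=["title", "company", "location"])


df = parse_listings(SAMPLE_HTML)

# With no path, to_csv returns the CSV text; pass "job_listings.csv" to write a file.
csv_text = df.to_csv(index=False)
print(csv_text)
```

Returning `None` for a missing field keeps one row per listing, so gaps show up as empty cells in the CSV instead of shifting columns.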
Through this project, I gained experience in:
- Web scraping techniques using Python.
- Parsing HTML content with BeautifulSoup.
- Data manipulation and analysis with Pandas.
Scraping job listings from Indeed provides valuable insights into the current job market for data analysts. This project demonstrates the power of web scraping and data analysis using Python for extracting useful information from online sources.