🐍 🇨🇦 Scrapes job listings from Indeed to extract data analyst positions in Canada. Empower your job search with Python web scraping and data analysis!


andrewhryn/DA_Indeed_Job_Scraping_Canada


🍲 Indeed Data Analyst Job Scraping Project using Python


🚀 Introduction

This project scrapes job listings for data analyst positions in 🇨🇦 Canada from Indeed using Python. It extracts job titles, company names, and locations from the Indeed job search results.

📋 Background

As I was searching for a dataset of data analyst job postings in Canada, I encountered challenges in finding a comprehensive and up-to-date source. Unable to locate an existing dataset, I embarked on building a solution using Python. Leveraging online guides and seeking assistance from ChatGPT for debugging, I developed a web scraping script to extract job listings from Indeed. This project serves not only as a tool for personal use but also contributes to filling the gap in available datasets for data analyst job postings in Canada.

🔧 Tools I Used

  • Python
  • Requests library for making HTTP requests
  • BeautifulSoup library for parsing HTML content
  • Pandas library for data manipulation and analysis

💻 How to Use This Code

1. Navigate to final_code.py

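2. Before running the code, install the required libraries separately (for example: pip install requests beautifulsoup4 pandas). The script then imports them as follows:

# Import the requests library for making HTTP requests
import requests

# Import the BeautifulSoup class from bs4 for parsing HTML
from bs4 import BeautifulSoup

# Import the bs4 package itself (only needed if the code refers to bs4 directly)
import bs4

# Import the pandas library for data manipulation
import pandas as pd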

3. Paste Your Own User Agent:

Edit the following line in the code, replacing 'your_own_link_here' with your own user agent string:

url = f'your_own_link_here'

NOTE: To find your user agent, search "my user agent" in Google.
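As a rough illustration of where a user agent usually goes when working with Requests, the sketch below attaches it as a `User-Agent` header and prepares the request without sending it. The helper name `build_request`, the placeholder user agent, and the example URL are assumptions made for illustration; they are not taken from final_code.py.

```python
import requests

# Placeholder value: replace it with the string shown when you search "my user agent".
USER_AGENT = "your_own_user_agent_here"


def build_request(url, user_agent=USER_AGENT):
    """Prepare (but do not send) a GET request that carries a User-Agent header."""
    request = requests.Request("GET", url, headers={"User-Agent": user_agent})
    return request.prepare()


# Hypothetical URL, used only to show the prepared headers.
prepared = build_request("https://example.com/jobs")
print(prepared.headers["User-Agent"])

# Sending the prepared request could then look like this:
# with requests.Session() as session:
#     response = session.send(prepared, timeout=10)
```

Preparing the request first makes it easy to inspect the headers before any network traffic happens.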

4. Customize the Number of Pages and Job Listings:

You can adjust the number of pages you want to scrape and the number of job listings per page. For example:

for i in range(0, 70 * 15, 15):  # Scrape 70 pages, each page has 15 listings
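To make the loop bounds concrete, here is a minimal sketch of how that range turns a page count into starting offsets. The helper names `page_offsets` and `page_url`, the example base URL, and the query parameter names `q` and `start` are assumptions for illustration, not details confirmed by final_code.py.

```python
from urllib.parse import urlencode

LISTINGS_PER_PAGE = 15  # listings per results page, as in the loop above


def page_offsets(num_pages, per_page=LISTINGS_PER_PAGE):
    """Return the starting offset of each results page."""
    return list(range(0, num_pages * per_page, per_page))


def page_url(base_url, query, offset):
    """Build a results-page URL; the parameter names q and start are assumptions."""
    return f"{base_url}?{urlencode({'q': query, 'start': offset})}"


offsets = page_offsets(3)
print(offsets)  # [0, 15, 30]
print(page_url("https://example.com/jobs", "data analyst", offsets[1]))
```

With 70 pages, the last offset is 69 * 15 = 1035, so changing the first factor in the range changes how many pages are visited.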

🏃 Run the Script:

Once you've made the necessary adjustments, run the script to scrape job listings from Indeed. The extracted data will be stored in a CSV file named job_listings.csv in the same directory.

📊 The Analysis

The main steps involved in the analysis are:

  1. Parsing the HTML content of the Indeed job search results page using BeautifulSoup.
  2. Extracting job titles, company names, and locations from the parsed results.
  3. Transforming the extracted data into a structured format.
  4. Storing the data in a Pandas DataFrame.
  5. Saving the DataFrame to a CSV file for further analysis.
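The steps above can be sketched roughly as follows. This is not the code from final_code.py: the sample HTML, the tag and class names (`job`, `title`, `company`, `location`), and the helper `parse_jobs` are all made up for illustration, and Indeed's real markup will differ.

```python
import pandas as pd
from bs4 import BeautifulSoup

# Made-up HTML standing in for a results page; the real markup will differ.
SAMPLE_HTML = """
<div class="job"><h2 class="title">Data Analyst</h2>
  <span class="company">Acme Corp</span><span class="location">Toronto, ON</span></div>
<div class="job"><h2 class="title">Junior Data Analyst</h2>
  <span class="company">Example Inc</span><span class="location">Vancouver, BC</span></div>
"""


def parse_jobs(html):
    """Extract title, company, and location from each (hypothetical) job card."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for card in soup.find_all("div", class_="job"):
        rows.append({
            "title": card.find("h2", class_="title").get_text(strip=True),
            "company": card.find("span", class_="company").get_text(strip=True),
            "location": card.find("span", class_="location").get_text(strip=True),
        })
    return rows


# Store the structured rows in a DataFrame, then serialize them as CSV text.
df = pd.DataFrame(parse_jobs(SAMPLE_HTML))
csv_text = df.to_csv(index=False)  # pass a path such as "job_listings.csv" to write a file
print(df)
```

Collecting one dictionary per listing keeps the column names in a single place and lets Pandas build the table in one step.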

📚 What I Learned

Through this project, I gained experience in:

  • Web scraping techniques using Python.
  • Parsing HTML content with BeautifulSoup.
  • Data manipulation and analysis with Pandas.

📣 Conclusion

Scraping job listings from Indeed gives a current snapshot of the job market for data analysts: which titles are posted, by which companies, and in which locations. This project shows how Python web scraping and data analysis can turn online search results into a reusable dataset.
