OpenTable Reviews Collection with Web Scraping for Machine Learning

Collect OpenTable reviews with Web Scraping

About the script

This repo provides a Python script that scrapes OpenTable reviews.
Each review carries 4 user ratings; the main script collects every review and its corresponding overall rating from the target restaurants.

Sample format:

                                                  review  overall rating
0      Great ambiance and service. Lots of menu choic...               3
1      Exceptional service, cuisine, ambience.  Windo...               4
2      Our server Darcy was wonderful!  She accommoda...               2
3      Great food choices for lunch and excellent ser...               3
4      Always reliable and great place to go for lunc...               4
...                                                  ...             ...
13438  Our first visit to Chophouse. We will not go b...               3
13439  Friendly and attentive service and the food an...               4
13440  My family and I had an amazing time! Not only ...               4
13441                                         Great food               4
13442  Great food and excellent service. We’ll be back!!               4

Usage

Find the target restaurant's page on OpenTable and go to page 2 of its reviews.
Copy the URL of review page 2, e.g. https://www.opentable.ca/r/chez-mal-manchester?page=2&sortBy=newestReview
Place the URLs in url_list for the training dataset and in eval_url for the validation dataset.
Run the main script. It saves the DataFrame as a CSV file with two columns: review and overall rating.
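The page 2 URL is useful because it exposes the page query parameter. The main script's own pagination logic is not shown here, so the following is only a minimal sketch of how such a URL could be turned into URLs for other pages. The helper name page_url is hypothetical and is not part of this repo.

```python
from urllib.parse import urlparse, parse_qs, urlencode, urlunparse


def page_url(review_url, page):
    """Return review_url with its `page` query parameter set to `page`.

    Hypothetical helper for illustration; the repo's script may paginate differently.
    """
    parts = urlparse(review_url)
    query = parse_qs(parts.query)      # e.g. {'page': ['2'], 'sortBy': ['newestReview']}
    query["page"] = [str(page)]        # overwrite only the page number
    return urlunparse(parts._replace(query=urlencode(query, doseq=True)))


url = "https://www.opentable.ca/r/chez-mal-manchester?page=2&sortBy=newestReview"
first_three = [page_url(url, n) for n in range(1, 4)]
```

Other query parameters, such as sortBy, are carried over unchanged.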

How to import the CSV as a dataset with the Hugging Face API

You can use the function below to read the .csv file:

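import pandas as pd
from datasets import Dataset

def load_data(path, name):
    # Read the scraped reviews; `name` is the split name, e.g. 'train'
    df = pd.read_csv(path)
    # Rename the columns to 'text' and 'label' for downstream training code
    df = df.rename(columns={'review': 'text', 'overall rating': 'label'})
    dataset = Dataset.from_pandas(df, split=name)
    return dataset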

For more examples, please refer to:
https://huggingface.co/docs/datasets/main/en/loading#pandas-dataframe
https://huggingface.co/docs/datasets/main/en/tabular_load
