This repo contains features extracted from Twitter content (NLP and sentiment analysis) about the 109 most advanced Smart-Cities worldwide (cf. Smart City Index 2020 by IMD Business School: https://www.imd.org/smart-city-observatory/smart-city-index/#:~:text=Singapore%2C%20Helsinki%20and%20Zurich%20have,%E2%80%9Csmart%E2%80%9D%20their%20cities%20are).
The features extracted from this dataset are assumed to be representative of the whole population of Smart-Cities existing today. They will be used as a point of comparison to train predictive models and to test probabilistic hypotheses on samples.
In this repo you can find:
- the raw .csv file of 110,862 Smart-City tweets
- three notebooks: feature extraction, statistics and correlation, and machine learning models
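
For orientation, here is a minimal sketch of what a bag-of-words count over tweets grouped by city can look like. It is not the code from the notebooks: the column names `city` and `text`, the sample rows, and the `BAG_OF_WORDS` vocabulary are all assumptions made for illustration, since the actual schema of the .csv file is not described here.

```python
import csv
import io
import re
from collections import Counter

# Hypothetical sample standing in for the raw .csv file. The real column
# names are not documented in this README, so "city" and "text" are assumptions.
SAMPLE_CSV = """city,text
Singapore,Smart mobility and open data for a smart city
Helsinki,Open data drives smart mobility
Zurich,Energy efficiency and smart grids
"""

# Hypothetical bag of words (keywords of interest), chosen for illustration only.
BAG_OF_WORDS = {"smart", "mobility", "data", "energy"}


def tokenize(text):
    """Lowercase a tweet and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def bow_counts(rows, vocabulary):
    """Count, per city, how often each vocabulary word occurs in its tweets."""
    counts = {}
    for row in rows:
        city_counter = counts.setdefault(row["city"], Counter())
        city_counter.update(t for t in tokenize(row["text"]) if t in vocabulary)
    return counts


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
counts = bow_counts(rows, BAG_OF_WORDS)
for city, counter in counts.items():
    print(city, dict(counter))
```

To try this on the real data, the in-memory sample would be replaced by opening the .csv file from this repo and adjusting the column names to match its header.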
To read more about the "Weight of BoWs for urban studies" technique, see https://github.com/DemocracyStudio/smartcityBoWs