Data mining and knowledge discovery from social media:
Implementing data analysis methods on data collected from Twitter
This repository contains the source code of an attempt at:
- collecting and storing tweets from Twitter's streaming API
- clustering any corpus of tweets (unsupervised)
- extracting the main topics of the resulting clusters
- measuring the effectiveness of my machine learning efforts.
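
As a rough illustration of the topic step in the list above, the sketch below ranks the most frequent terms in each cluster. It is not taken from this repository: the names `tokenize` and `top_terms_per_cluster`, the tiny stopword set, and the sample tweets and labels are all assumptions, and the cluster labels are presumed to come from whichever clustering step ran earlier.

```python
import re
from collections import Counter, defaultdict

# Hypothetical, deliberately tiny stopword list; a real run would need a fuller one.
STOPWORDS = {"the", "a", "an", "is", "in", "of", "to", "and", "for", "on"}

def tokenize(text):
    """Lowercase the text and keep alphabetic words and hashtags, minus stopwords."""
    tokens = re.findall(r"#?[a-z]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]

def top_terms_per_cluster(tweets, labels, n=3):
    """Return {cluster label: the n most frequent terms} for labelled tweets."""
    counts = defaultdict(Counter)
    for text, label in zip(tweets, labels):
        counts[label].update(tokenize(text))
    return {label: [term for term, _ in counter.most_common(n)]
            for label, counter in counts.items()}

# Made-up sample data: four tweets already assigned to two clusters.
tweets = [
    "The election results are in",
    "Election night: results of the vote",
    "Goal! What a match",
    "Great goal in the match tonight",
]
labels = [0, 0, 1, 1]

topics = top_terms_per_cluster(tweets, labels, n=2)
for label, terms in sorted(topics.items()):
    print(label, terms)
```

Raw term frequency is the simplest possible ranking; a weighting such as TF-IDF would usually suppress words that are common to every cluster.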
I also developed a method that removes duplicate documents from the corpus (the retweet problem).
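
The README does not describe how the duplicate removal works, so the following is only a minimal sketch of one common approach, not the method used in this repository. It assumes that a retweet differs from its original only by a leading `RT @user:` marker, a link, letter case, or punctuation; the function names `normalize` and `drop_duplicates` and the sample corpus are hypothetical.

```python
import re

def normalize(text):
    """Reduce a tweet to a canonical form so that a retweet matches its original."""
    text = text.lower()
    text = re.sub(r"^(rt\s+@\w+:?\s*)+", "", text)  # leading "RT @user:" markers
    text = re.sub(r"https?://\S+", "", text)         # links, which may be shortened differently
    text = re.sub(r"[^\w\s#@]", "", text)            # punctuation
    return " ".join(text.split())                    # collapse whitespace

def drop_duplicates(tweets):
    """Keep the first tweet seen for each normalized form, preserving order."""
    seen = set()
    unique = []
    for tweet in tweets:
        key = normalize(tweet)
        if key not in seen:
            seen.add(key)
            unique.append(tweet)
    return unique

# Made-up sample corpus: the second entry is a retweet of the first.
corpus = [
    "Big news today! http://t.co/abc",
    "RT @alice: Big news today! http://t.co/xyz",
    "Something else entirely",
]
unique = drop_duplicates(corpus)
print(unique)
```

Exact matching after normalization misses edited or truncated retweets; catching those would need a similarity measure over the text instead of a set lookup.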
This material constitutes the implementation part of my Bachelor's dissertation.
Bachelor Dissertation Page (GR)