My code and documentation from exploring NLP during an internship.
I attempted a content-based approach to fake-news detection: an algorithm extracts queries from a text and runs them through Google Search, then entailment is used to detect contradictions.
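
As a rough illustration of that flow, here is a minimal sketch, not the notebooks' actual code. Every name in it (`find_contradictions`, `toy_extract`, `toy_search`, `toy_nli`) is a hypothetical placeholder. The real query extractor, search call, and entailment model would be plugged in where the toy functions are, and treating the search result as the premise and the extracted query as the hypothesis is an assumption.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical interfaces; the real project may structure these differently.
QueryExtractor = Callable[[str], List[str]]
Searcher = Callable[[str], Iterable[str]]
NliModel = Callable[[str, str], str]  # (premise, hypothesis) -> label


def find_contradictions(
    text: str,
    extract_queries: QueryExtractor,
    search: Searcher,
    nli: NliModel,
) -> List[Tuple[str, str]]:
    """Return (query, snippet) pairs where a search result contradicts a query."""
    contradictions = []
    for query in extract_queries(text):
        for snippet in search(query):
            # Assumption: the search snippet is the premise, the query the hypothesis.
            if nli(snippet, query) == "contradiction":
                contradictions.append((query, snippet))
    return contradictions


# Toy stand-ins so the sketch runs without a model or network access.
def toy_extract(text: str) -> List[str]:
    # Treat each sentence as one query.
    return [s.strip() for s in text.split(".") if s.strip()]


TOY_INDEX = {"The moon is made of cheese": ["The moon is made of rock"]}


def toy_search(query: str) -> List[str]:
    # Look the query up in a fixed in-memory "search index".
    return TOY_INDEX.get(query, [])


def toy_nli(premise: str, hypothesis: str) -> str:
    # Crude rule: identical strings entail, anything else contradicts.
    return "entailment" if premise == hypothesis else "contradiction"


result = find_contradictions(
    "The moon is made of cheese. Water is wet.",
    toy_extract,
    toy_search,
    toy_nli,
)
```

Passing the three stages in as functions keeps the sketch independent of any particular search API or entailment model.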
- Final Presentation Slides: my slides probably explain it better.
- Plan.ipynb: the original plan; I couldn't finish it completely.
- research.ipynb: some of what I read up on.
- how_to_query_a_database_EX_edition.ipynb: A collation of the many embedding and similarity-measurement methods I tried while experimenting.
- The_Pipe.ipynb: Completed spaCy pipeline adapter for YangNLP's BERTSumEXT model, plus a demo of how the query extractor works.
- true_news_scraper.ipynb: Code that uses newspaper3k to scrape multiple news sources in parallel.
- ALBERT_for_SNLI.ipynb: Code for training the Transformers library's ALBERT model on the SNLI dataset.
- BERTsum.ipynb: Experimental attempt at essentially a VAE using transformers. It didn't work very well.
- LIAR_dataset_classifying_using_svm.ipynb: Classifying the LIAR dataset using SVM.
- Query_extraction_and_BERTSumEXT.ipynb: Experiments with using query extraction and BERTSumEXT.
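
For a sense of what scraping several sources in parallel can look like, here is a minimal sketch using only the standard library. It is an assumption about the general shape, not the code in true_news_scraper.ipynb: `scrape_in_parallel` and `fake_fetch` are hypothetical names, and `fake_fetch` stands in for whatever newspaper3k download-and-parse step the notebook performs.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def scrape_in_parallel(
    urls: List[str],
    fetch: Callable[[str], str],
    max_workers: int = 4,
) -> Dict[str, str]:
    """Fetch every URL on a thread pool and map each URL to its text."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the input URLs.
        texts = list(pool.map(fetch, urls))
    return dict(zip(urls, texts))


def fake_fetch(url: str) -> str:
    # Placeholder for a real download; no network access happens here.
    return f"article text from {url}"


articles = scrape_in_parallel(
    ["https://example.com/a", "https://example.com/b"],
    fake_fetch,
)
```

Threads suit this kind of work because the time is spent waiting on the network rather than on computation.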