Project work for the course Text Analysis and Retrieval at the University of Zagreb.
Regular texts (e.g., news) are often difficult to comprehend for non-native speakers, people with low literacy or intellectual disabilities, and people with language impairments (e.g., people with autism or aphasia, or people who are congenitally deaf). Stylistically ornate sentences that use sophisticated, infrequent vocabulary are particularly difficult for these groups. The goal of this project is to build a system for lexical simplification of text. Such a system should replace complex, hard-to-comprehend words with simpler words that have the same meaning (e.g., “to leverage scarce resources” -> “to use rare resources”).
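
The replacement step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's implementation: it treats low corpus frequency as a proxy for complexity, and `WORD_FREQ`, `SYNONYMS`, `simplify`, and the `min_freq` threshold are hypothetical names with toy values. A real system would derive candidates and frequencies from large corpora, a thesaurus, or word embeddings.

```python
import re

# Hypothetical toy resources, invented for this example only.
WORD_FREQ = {
    "to": 5000, "use": 900, "employ": 120, "leverage": 40,
    "rare": 300, "uncommon": 150, "scarce": 60, "resources": 500,
}
SYNONYMS = {
    "leverage": ["use", "employ"],
    "scarce": ["rare", "uncommon"],
}


def simplify(text, min_freq=100):
    """Replace infrequent words with their most frequent known synonym."""

    def replace(match):
        word = match.group(0)
        key = word.lower()
        # Words at or above the threshold are assumed to be simple enough.
        if WORD_FREQ.get(key, 0) >= min_freq:
            return word
        candidates = SYNONYMS.get(key, [])
        if not candidates:
            return word  # no known substitute, keep the original word
        best = max(candidates, key=lambda w: WORD_FREQ.get(w, 0))
        # Only substitute if the candidate is actually more frequent.
        if WORD_FREQ.get(best, 0) <= WORD_FREQ.get(key, 0):
            return word
        return best

    return re.sub(r"[A-Za-z]+", replace, text)


print(simplify("to leverage scarce resources"))
```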
- https://aclweb.org/anthology/P/P15/P15-2011.pdf
- http://academiccommons.columbia.edu/download/fedora_content/download/ac:159833/CONTENT/biran_brody_elhadad_acl2011.pdf
- http://www.aclweb.org/anthology/S12-1#page=379
The lexical simplification approach is evaluated on the dataset from SemEval 2012 Task 1.
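
SemEval 2012 Task 1 asks systems to rank a set of substitute words from simplest to most complex in a given context. The sketch below shows one simple way to compare a system ranking with a gold ranking: the fraction of substitute pairs that both rankings order the same way. It is an assumption-laden stand-in, not the task's official scorer, and it ignores ties in the gold ranking; `pairwise_agreement` is a hypothetical helper name.

```python
from itertools import combinations


def pairwise_agreement(gold, system):
    """Fraction of substitute pairs ordered identically in both rankings.

    Both arguments list the same substitutes, simplest first.
    Ties are not modelled in this simplified sketch.
    """
    gold_pos = {word: i for i, word in enumerate(gold)}
    sys_pos = {word: i for i, word in enumerate(system)}
    pairs = list(combinations(gold, 2))
    if not pairs:
        return 1.0  # zero or one substitute: nothing to disagree on
    agree = sum(
        (gold_pos[a] < gold_pos[b]) == (sys_pos[a] < sys_pos[b])
        for a, b in pairs
    )
    return agree / len(pairs)


gold = ["use", "employ", "leverage"]
system = ["use", "leverage", "employ"]
score = pairwise_agreement(gold, system)
```

Here the two rankings disagree only on the pair ("employ", "leverage"), so two of the three pairs agree.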