nel_errors

This project contains small scripts that were used for the error analysis work described in:

@inproceedings{brasoveanu2018lrec,
  author = {Adrian M.P. Bra{\c{s}}oveanu and Giuseppe Rizzo and Philipp Kuntschick and Albert Weichselbraun and Lyndon J.B. Nixon},
  title = {Framing Named Entity Linking Error Types},
  booktitle = {Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)},
  editor = {Nicoletta Calzolari and Khalid Choukri and Christopher Cieri and Thierry Declerck and Sara Goggi and Koiti Hasida and Hitoshi Isahara and Bente Maegaard and Joseph Mariani and Hélène Mazo and Asuncion Moreno and Jan Odijk and Stelios Piperidis and Takenobu Tokunaga},
  publisher = {European Language Resources Association (ELRA)},
  address = {Paris, France},
  isbn = {979-10-95546-00-9},
  language = {english},
  pages = {266-271},
  year = {2018},
  month = {may},
  date = {7-12},
  location = {Miyazaki, Japan},
  url = {http://www.lrec-conf.org/proceedings/lrec2018/summaries/612.html}
}

FOLDERS

  • src - source code files
  • batch - batch files used for running evaluations
  • guideline - annotation guideline
  • examples - examples discussed in the paper

DOCUMENTATION

Documentation

WORKFLOW

Current

  1. create runs from the annotators and extract gold standards from NIF (methods: get_annotations, get_dbpedia_type and convert_annotations)
    • run - usually contains the normal results directly in a format similar to TAC-KBP
    • run2 - contains normal results + surfaceForms
  2. keep only PER, ORG, LOC from the gold standard and runs
    • Why these 3 main types? They are the most commonly and widely used types (the motivation still needs to be spelled out).
  3. Analyze the gold standard (GS) entities with notype/type and check whether any of them actually belongs to one of the 3 types that we’re interested in. Hunt for the following:
  4. run neleval evaluate script
    • use normal runs
    • convert normal runs into the TAC-KBP format by using the converter.py script
    • get the precision, recall and F1 (P, R, F1) results for the different types of runs from the evaluate script
  5. run neleval analyze script
    • use the normal results + surfaceForms
  6. create superset of all errors - Google Docs
    • mergerun - creates the unified TAC-format tool output that is ready for merging. It always uses the same header.
    • create the combined run with a command of the form sort -u file1 file2 ... filen > combined.csv
    • combineruns - passes the individual runs of each tool through the combined runs and gets the correct counts for each error
  7. preselect only KB and GS errors
  8. Manual annotation of the errors
  9. Create tables/analytics with results
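
Steps 2 and 6 of the workflow above can be illustrated with a minimal Python sketch. It is not the code in src: the row layout (doc id, start, end, KB id, score, type, tab-separated) is only an assumption about a TAC-KBP-like format, and the names parse_run, keep_main_types, combine_runs and count_tools_per_row are hypothetical helpers introduced for the example.

```python
from collections import Counter

KEEP_TYPES = {"PER", "ORG", "LOC"}  # the three types kept in step 2
TYPE_COL = 5  # assumed position of the entity type column (hypothetical layout)


def parse_run(text):
    """Split tab-separated annotation lines into tuples, skipping blank lines."""
    return [tuple(line.split("\t")) for line in text.splitlines() if line.strip()]


def keep_main_types(rows):
    """Step 2: drop every row whose type is not PER, ORG or LOC."""
    return [r for r in rows if len(r) > TYPE_COL and r[TYPE_COL] in KEEP_TYPES]


def combine_runs(runs):
    """Step 6: sorted union of all rows, similar in spirit to `sort -u`."""
    return sorted(set().union(*runs.values()))


def count_tools_per_row(runs):
    """For each distinct row, count how many tools produced it."""
    counts = Counter()
    for rows in runs.values():
        counts.update(set(rows))
    return counts


# Made-up example rows, one run per tool.
aida_text = "\n".join([
    "doc1\t0\t5\tParis\t1.0\tLOC",
    "doc1\t10\t13\tIBM\t1.0\tORG",
    "doc1\t20\t24\tNIL\t1.0\tDATE",  # removed by the type filter
])
spotlight_text = "\n".join([
    "doc1\t0\t5\tParis\t1.0\tLOC",
    "doc1\t30\t35\tObama\t1.0\tPER",
])

runs = {
    "aida": keep_main_types(parse_run(aida_text)),
    "spotlight": keep_main_types(parse_run(spotlight_text)),
}
combined = combine_runs(runs)
counts = count_tools_per_row(runs)

for row in combined:
    print(counts[row], "\t".join(row))
```

Rows are kept as tuples so that identical annotations from different tools collapse into one entry of the combined run, while the counter records how many tools produced each entry.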

INTERFACE FOR CLIENTS

  • get_annotations - extracts annotations from the results
  • get_dbpedia_type - gets dbpedia types for a certain entity (it can use a list of types or an url)
  • convert_annotations(folder,run) - converts annotations to a format close to the TAC-KBP evaluation format
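
As a rough illustration of how DBpedia types could be reduced to the three types used in the workflow, here is a small sketch. It does not reproduce get_dbpedia_type: the mapping table and the coarse_type helper are assumptions made for the example, and the actual scripts may map types differently.

```python
DBO = "http://dbpedia.org/ontology/"

# Assumed mapping from DBpedia ontology classes to the three coarse types.
TYPE_MAP = {
    DBO + "Person": "PER",
    DBO + "Organisation": "ORG",
    DBO + "Place": "LOC",
}


def coarse_type(dbpedia_types):
    """Return PER, ORG or LOC for the first matching type URI, else None."""
    for type_uri in dbpedia_types:
        if type_uri in TYPE_MAP:
            return TYPE_MAP[type_uri]
    return None


person_types = [DBO + "Agent", DBO + "Person"]
work_types = [DBO + "Work"]

print(coarse_type(person_types))
print(coarse_type(work_types))
```

Entities for which the helper returns None would be the ones dropped when only PER, ORG and LOC are kept.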

CURRENT CORPORA

ENGLISH
  • Reuters128 -> OK
  • KORE50
GERMAN
  • RBB150 (RBB)
FRENCH
SPANISH

CURRENT ANNOTATORS

  • AIDA
  • Spotlight
  • Babelnet
  • Recognyze (German)

GERBIL RUNS

Later

  • Include NIL Clustering