EmbEval is a framework for evaluating an arbitrary number of word embeddings on an arbitrary number of tasks, in parallel.
To make the results easier to interpret, EmbEval uses graphs to visualize how the different types of embeddings perform on each task.
Install embeval with pip:
pip3 install embeval
embeval --help
Usage: embeval [OPTIONS] COMMAND [ARGS]...
Options:
  --help  Show this message and exit.
Commands:
  semantic-similarity
embeval semantic-similarity --help
Usage: embeval semantic-similarity [OPTIONS] EMBEDDING_DIR TESTSET_DIR
Options:
  --workers INTEGER               Number of worker processes to use.
  --output_path TEXT              Path to write output files to.
  --output_format [text|graph|both]
  --help                          Show this message and exit.
embeval semantic-similarity --output_path output/ embeddings/ testsets/
To extend the code with tasks not provided in the current implementation (contributions are most welcome), the following concepts must be implemented:
- Command (See Semantic Similarity Command) -- This makes your task available in the CLI and controls the flow of execution when the command is invoked. Click is used as the CLI package. The entrypoint for an extended application must import the main cli object and register all the available commands (See main).
- Processing Pipeline (See generics and Semantic Similarity Pipeline) -- This is where the producer, processor and consumer that execute tasks are implemented. The implementation uses the pseq library and follows its methodology.
- Store (See Semantic Similarity Store) -- Simple object to keep track of evaluation results obtained during the processing pipeline.
- Task (See Semantic Similarity Task) -- A task object that encapsulates the information shared across the pipeline, such as paths to files.
- Visualization (See text visualization) -- Defines a method of visualization.
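As a rough illustration of how the Task, Store, pipeline and visualization concepts above could fit together, here is a minimal sketch. It does not use pseq or Click and does not reflect EmbEval's actual classes: every name in it (`Task`, `Store`, `produce`, `process`, `run`, `render_text`) is hypothetical, the score is a placeholder rather than a real evaluation, and a standard-library thread pool stands in for the worker processes only to keep the example short.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Tuple


@dataclass(frozen=True)
class Task:
    """Hypothetical task: the information shared along the pipeline."""
    embedding_path: str
    testset_path: str


@dataclass
class Store:
    """Hypothetical store: keeps one score per (embedding, testset) pair."""
    results: Dict[Tuple[str, str], float] = field(default_factory=dict)

    def add(self, task: Task, score: float) -> None:
        self.results[(task.embedding_path, task.testset_path)] = score


def produce(embeddings: List[str], testsets: List[str]) -> Iterator[Task]:
    """Producer: emit one task per embedding/testset combination."""
    for emb in embeddings:
        for ts in testsets:
            yield Task(emb, ts)


def process(task: Task) -> Tuple[Task, float]:
    """Processor: placeholder score (assumption); a real task would load
    the files behind both paths and run the evaluation."""
    score = float(len(task.embedding_path) + len(task.testset_path))
    return task, score


def run(embeddings: List[str], testsets: List[str], workers: int = 2) -> Store:
    """Wire producer, processor and consumer together."""
    store = Store()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for task, score in pool.map(process, produce(embeddings, testsets)):
            store.add(task, score)  # consumer: record the result
    return store


def render_text(store: Store) -> str:
    """Visualization: one tab-separated line per result, sorted by key."""
    return "\n".join(
        f"{emb}\t{ts}\t{score:.1f}"
        for (emb, ts), score in sorted(store.results.items())
    )
```

A new task would typically replace the body of `process` with its own evaluation and expose `run` through a CLI command; the file names passed in (for example `run(["a.vec"], ["ws353.csv"])`) are arbitrary here.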
- ☐ Finish Semantic Similarity visualization.
- ☐ Integrate GLUE tasks via jiant framework.
Distributed under the GPL-3.0 License. See the LICENSE file for details.