Human evaluation of model-generated text is accurate, but expensive and slow for the purpose of model development. Evaluating the output of such systems automatically saves time, accelerates research on text generation tasks, and is free of human bias. We provide an in-depth review and comparison of traditional metrics based on n-gram word matching and recently published ones that compare textual embeddings. We also report the correlations of these metrics with human evaluation.
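
To make the comparison concrete, here is a minimal sketch (not the repository's actual code) of how such a correlation study can be run: it scores a few hypothetical system outputs with smoothed sentence-level BLEU, an n-gram matching metric from NLTK, and correlates the scores with made-up human ratings using Pearson's r from SciPy. The data and ratings are assumptions for illustration only.

```python
# A minimal sketch of the comparison described above: score system outputs
# with an n-gram metric (sentence-level BLEU via NLTK) and correlate the
# metric scores with human ratings (Pearson's r via SciPy).
# All data below are hypothetical, for illustration only.
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction
from scipy.stats import pearsonr

hypotheses = [
    "the cat sat on the mat",
    "a dog ran in the park",
    "she quickly read the book",
    "rain fell all day long",
]
references = [
    "the cat is sitting on the mat",
    "the dog was running in the park",
    "she read the book quickly",
    "it rained for the whole day",
]
human_scores = [4.0, 3.5, 4.5, 2.0]  # hypothetical adequacy ratings, 1-5 scale

# Smoothing avoids zero BLEU when a higher-order n-gram has no match.
smooth = SmoothingFunction().method1
bleu_scores = [
    sentence_bleu([ref.split()], hyp.split(), smoothing_function=smooth)
    for hyp, ref in zip(hypotheses, references)
]

# Correlation with human judgments: a higher |r| suggests the metric
# tracks human evaluation more closely.
r, p_value = pearsonr(bleu_scores, human_scores)
print(f"BLEU per sentence: {[round(s, 3) for s in bleu_scores]}")
print(f"Pearson r vs. human ratings: {r:.3f} (p = {p_value:.3f})")
```

An embedding-based metric can be compared the same way: only the scoring step changes, e.g. replacing the BLEU loop with F1 scores from the bert-score package.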
EshwarSR/AutomaticEvaluationMetrics
Automatic Evaluation for Multi Sentence Texts