NAIC Entity Matching Demonstrator

Applying Prompt Engineering for Entity Matching

This repository contains the code for experimenting with various prompt engineering techniques for entity matching.

Setup

Clone the repository.

Install the required packages using the following command:

    pip install -r requirements.txt

Download the datasets using the following command:

    python utils/download_datasets.py

Ensure that OPENAI_API_KEY is set in the environment variables.
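
A quick way to confirm the key is visible to Python before starting an experiment is sketched below. The helper name `get_api_key` is hypothetical and is not part of this repository; it only illustrates the check.

```python
import os


def get_api_key(env=None):
    """Return OPENAI_API_KEY from the environment, or raise a clear error.

    Hypothetical helper for illustration; not part of this repository.
    """
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; export it before running main.py"
        )
    return key


if __name__ == "__main__":
    try:
        get_api_key()
        print("OPENAI_API_KEY found")
    except RuntimeError as err:
        print(err)
```

Failing early with an explicit message avoids starting an experiment that would otherwise stop at the first API call.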

Notebook

The notebook Demonstrator.ipynb contains examples of how to run some experiments and visualize the results.

Running experiments

To run a zero-shot experiment on the abt-buy dataset using the natural prompt, 200 pairs, GPT-4o as the language model, and with context and the instruction "Be lenient in your judgement" added, run the following command:

    python main.py -d abt-buy -pf natural -n 200 -k 0 -llm gpt-4o -imp lenient -ctx

To run a few-shot experiment on the dblp_gs_dirty dataset using the tabular prompt, all pairs (if the requested number is larger than the dataset, the entire dataset is used), 10 examples in the prompt with a 30/70 split of positive and negative examples respectively, GPT-3 as the language model, the basic prompt format, and no subtle context, run the following command:

    python main.py -d dblp_gs_dirty -pf tabular -n 10000 -k 10 -llm gpt-3 -imp basic -ctp '(P,N)' -pn "30/70"
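
To make the arithmetic behind these options concrete, the sketch below shows one way a "30/70" ratio and 10 examples could translate into example counts, and how a pair count larger than the dataset could be capped. The functions `split_counts` and `effective_pairs` are hypothetical and written only for illustration; the actual logic in main.py may differ, for example in how it rounds.

```python
def split_counts(k, ratio):
    """Turn a ratio string such as "30/70" into (positives, negatives).

    Hypothetical helper: the two counts always sum to k.
    """
    pos_share, neg_share = (int(part) for part in ratio.split("/"))
    total = pos_share + neg_share
    if total <= 0:
        raise ValueError("ratio shares must sum to a positive number")
    positives = round(k * pos_share / total)
    return positives, k - positives


def effective_pairs(n, dataset_size):
    """Cap the requested number of pairs at the dataset size (assumed behavior)."""
    return min(n, dataset_size)


if __name__ == "__main__":
    print(split_counts(10, "30/70"))  # (3, 7)
    # The dataset size of 2500 is a made-up number for the example.
    print(effective_pairs(10000, 2500))  # 2500
```

Under these assumptions, `-k 10 -pn "30/70"` corresponds to 3 positive and 7 negative examples in the prompt.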

To get a list of all the available options, run the following command:

    python main.py --help
