Add pure local mode (#20)
* add local mode and readme updates

* ensure using newest lambdaprompt

* fix typo
bluecoconut authored May 12, 2023
1 parent b4d5f25 commit 5f79d21
Showing 3 changed files with 15 additions and 5 deletions.
README.md: 8 additions & 1 deletion
@@ -68,7 +68,14 @@ df['capitol'] = pd.DataFrame({'State': ['Colorado', 'Kansas', 'California', 'New

## Sketch currently uses `prompts.approx.dev` to help run with minimal setup

In the future, we plan to update the prompts at this endpoint with our own custom foundation model, built to answer questions more accurately than GPT-3 can with its minimal data context.
You can also use a few pre-built Hugging Face models directly (right now `MPT-7B` and `StarCoder`), which will run entirely locally (once you download the model weights from HF).
Do this by setting 3 environment variables:

```python
import os

os.environ['LAMBDAPROMPT_BACKEND'] = 'StarCoder'
os.environ['SKETCH_USE_REMOTE_LAMBDAPROMPT'] = 'False'
os.environ['HF_ACCESS_TOKEN'] = 'your_hugging_face_token'
```
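
For example (a minimal sketch, assuming the `df.sketch.ask` accessor described elsewhere in this README), the usual calls then run against the local model:

```python
import pandas as pd
import sketch

# Toy dataframe for illustration. With the environment variables above set
# first, prompts run against the local Hugging Face backend instead of
# prompts.approx.dev.
df = pd.DataFrame({'State': ['Colorado', 'Kansas'], 'Population': [5773714, 2937880]})
df.sketch.ask('Which state has the larger population?')
```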

You can also call OpenAI directly (and not use our endpoint) with your own API key. To do this, set 2 environment variables.

pyproject.toml: 4 additions & 1 deletion
@@ -17,12 +17,15 @@ dependencies = [
"datasketch>=1.5.8",
"datasketches>=4.0.0",
"ipython",
"lambdaprompt",
"lambdaprompt>=0.5.2",
"packaging"
]
urls = {homepage = "https://github.com/approximatelabs/sketch"}
dynamic = ["version"]

[project.optional-dependencies]
local = ["lambdaprompt[local]"]
all = ["sketch[local]"]

[tool.setuptools_scm]

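
With the new `local` and `all` extras added above, the optional local-model dependencies can be pulled in through pip's standard extras syntax, e.g. `pip install sketch[local]` (or `sketch[all]`).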
sketch/pandas_extension.py: 3 additions & 3 deletions
@@ -181,7 +181,7 @@ def call_prompt_on_dataframe(df, prompt, **kwargs):
return text_to_copy


howto_prompt = lambdaprompt.GPT3Prompt(
howto_prompt = lambdaprompt.Completion(
"""
For the pandas dataframe ({{ dfname }}) the user wants code to solve a problem.
Summary statistics and descriptive data of dataframe [`{{ dfname }}`]:
@@ -234,7 +234,7 @@ def howto_from_parts(
return code


ask_prompt = lambdaprompt.GPT3Prompt(
ask_prompt = lambdaprompt.Completion(
"""
For the pandas dataframe ({{ dfname }}) the user wants an answer to a question about the data.
Summary statistics and descriptive data of dataframe [`{{ dfname }}`]:
@@ -338,7 +338,7 @@ def apply(self, prompt_template_string, **kwargs):
raise RuntimeError(
f"Too many rows for apply \n (SKETCH_ROW_OVERRIDE_LIMIT: {row_limit}, Actual: {len(self._obj)})"
)
new_gpt3_prompt = lambdaprompt.GPT3Prompt(prompt_template_string)
new_gpt3_prompt = lambdaprompt.Completion(prompt_template_string)
named_args = new_gpt3_prompt.get_named_args()
known_args = set(self._obj.columns) | set(kwargs.keys())
needed_args = set(named_args)
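
The `lambdaprompt.Completion` constructor and `get_named_args()` used above come from lambdaprompt >= 0.5.2; below is a minimal sketch (with an illustrative template string) of how such a templated prompt is built and inspected:

```python
import lambdaprompt

# Illustrative template; Completion treats the {{ ... }} placeholders as
# named arguments, which apply() above resolves from dataframe columns
# and keyword arguments.
summarize = lambdaprompt.Completion(
    'Summarize column `{{ column_name }}` of dataframe `{{ dfname }}` in one sentence.'
)

# Reports the template variables that must be supplied when the prompt is
# called, here `column_name` and `dfname`.
print(summarize.get_named_args())
```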
