Merge branch 'master' into feat/huggingface-inference
soumik12345 authored Jan 17, 2025
2 parents 6138efd + 607f9c8 commit cb0c748
Showing 476 changed files with 28,581 additions and 8,486 deletions.
1 change: 1 addition & 0 deletions .github/CODEOWNERS
@@ -1,4 +1,5 @@
* @wandb/weave-team
/docs/ @wandb/docs-team @wandb/weave-team
weave-js/src/common @wandb/fe-infra-reviewers
weave-js/src/components @wandb/fe-infra-reviewers @wandb/weave-team
weave-js/src/assets @wandb/fe-infra-reviewers @wandb/weave-team
28 changes: 26 additions & 2 deletions .github/workflows/notify-wandb-core.yaml
@@ -6,14 +6,38 @@ name: Notify wandb/core
on:
  push:
    branches:
      - '**'
      - "**"
  workflow_dispatch:

permissions:
  packages: write

jobs:
  publish-package:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v2
      - name: Configure npm for GitHub Packages
        run: |
          echo "//npm.pkg.github.com/:_authToken=${{ secrets.GITHUB_TOKEN }}" >> weave-js/.npmrc
      - name: Publish package
        run: |
          cd weave-js
          yarn install --frozen-lockfile
          npm version 0.0.0-${{ github.sha }} --no-git-tag-version
          yarn generate
          cp package.json README.md .npmrc src/
          cd src
          if [ "${{ github.ref }}" = "refs/heads/master" ]; then
            npm publish
          else
            npm publish --tag prerelease
          fi
  check-which-tests-to-run:
    uses: ./.github/workflows/check-which-tests-to-run.yaml
  notify-wandb-core:
    needs: check-which-tests-to-run
    needs: [check-which-tests-to-run, publish-package]
    runs-on: ubuntu-latest
    steps:
      - name: Repository dispatch
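
The branch check in the `Publish package` step can be sketched in Python to make the two outcomes explicit. This is only an illustration of the shell logic shown above; the function names `package_version` and `npm_publish_command` are hypothetical and do not exist in the repository.

```python
def package_version(commit_sha: str) -> str:
    """Mirror `npm version 0.0.0-${{ github.sha }}`: one version per commit."""
    return f"0.0.0-{commit_sha}"


def npm_publish_command(git_ref: str) -> list[str]:
    """Return the publish command the step would run for a given ref.

    Pushes to master publish under npm's default dist-tag; every other
    ref publishes under the `prerelease` dist-tag.
    """
    command = ["npm", "publish"]
    if git_ref != "refs/heads/master":
        command += ["--tag", "prerelease"]
    return command


if __name__ == "__main__":
    print(package_version("cb0c748"))
    print(" ".join(npm_publish_command("refs/heads/master")))
    print(" ".join(npm_publish_command("refs/heads/feat/huggingface-inference")))
```

Because the workflow triggers on every branch (`"**"`), most runs take the `prerelease` path; only pushes to master publish under the default tag.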
90 changes: 54 additions & 36 deletions .github/workflows/test.yaml
@@ -81,11 +81,16 @@ jobs:
        env:
          CI: 1
          WANDB_ENABLE_TEST_CONTAINER: true
          LOGGING_ENABLED: true
        ports:
          - '8080:8080'
          - '8083:8083'
          - '9015:9015'
        options: --health-cmd "curl --fail http://localhost:8080/healthz || exit 1" --health-interval=5s --health-timeout=3s
          - "8080:8080"
          - "8083:8083"
          - "9015:9015"
        options: >-
          --health-cmd "wget -q -O /dev/null http://localhost:8080/healthz || exit 1"
          --health-interval=5s
          --health-timeout=3s
          --health-start-period=10s
    outputs:
      tests_should_run: ${{ steps.test_check.outputs.tests_should_run }}
    steps:
@@ -160,7 +165,10 @@ jobs:
      - uses: actions/setup-node@v1
        if: steps.check_run.outputs.should_lint_and_compile == 'true'
        with:
          node-version: '18.x'
          node-version: "18.x"
      - name: Configure npm for GitHub Packages
        run: |
          echo "//npm.pkg.github.com/:_authToken=${{ secrets.GITHUB_TOKEN }}" >> .npmrc
      - name: Run WeaveJS Lint and Compile
        if: steps.check_run.outputs.should_lint_and_compile == 'true'
        run: |
@@ -213,37 +221,37 @@ jobs:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version-major: ['3']
        python-version-major: ["3"]
        python-version-minor: [
            '9',
            '10',
            '11',
            '12',
            '13',
            "9",
            "10",
            "11",
            "12",
            "13",
            #
          ]
        nox-shard:
          [
            'trace',
            'trace_server',
            'anthropic',
            'cerebras',
            'cohere',
            'dspy',
            'groq',
            'google_ai_studio',
            'instructor',
            'langchain',
            'litellm',
            'llamaindex',
            'mistral0',
            'mistral1',
            'notdiamond',
            'openai',
            'vertexai',
            'scorers_tests',
            'pandas-test',
            'huggingface',
            "trace",
            "trace_server",
            "anthropic",
            "cerebras",
            "cohere",
            "dspy",
            "groq",
            "google_ai_studio",
            "instructor",
            "langchain",
            "litellm",
            "llamaindex",
            "mistral0",
            "mistral1",
            "notdiamond",
            "openai",
            "vertexai",
            "scorers_tests",
            "pandas-test",
            "huggingface",
          ]
      fail-fast: false
    services:
@@ -255,19 +263,28 @@ jobs:
        env:
          CI: 1
          WANDB_ENABLE_TEST_CONTAINER: true
          LOGGING_ENABLED: true
        ports:
          - '8080:8080'
          - '8083:8083'
          - '9015:9015'
        options: --health-cmd "curl --fail http://localhost:8080/healthz || exit 1" --health-interval=5s --health-timeout=3s
          - "8080:8080"
          - "8083:8083"
          - "9015:9015"
        options: >-
          --health-cmd "wget -q -O /dev/null http://localhost:8080/healthz || exit 1"
          --health-interval=5s
          --health-timeout=3s
          --health-start-period=10s
      weave_clickhouse:
        image: clickhouse/clickhouse-server
        ports:
          - '8123:8123'
          - "8123:8123"
        options: --health-cmd "wget -nv -O- 'http://localhost:8123/ping' || exit 1" --health-interval=5s --health-timeout=3s
    steps:
      - name: Checkout
        uses: actions/checkout@v3
      - name: Enable debug logging
        run: echo "ACTIONS_STEP_DEBUG=true" >> $GITHUB_ENV
      - name: Install SQLite dev package
        run: sudo apt update && sudo apt install -y libsqlite3-dev
      - name: Set up Python ${{ matrix.python-version-major }}.${{ matrix.python-version-minor }}
        uses: actions/setup-python@v5
        with:
@@ -294,6 +311,7 @@ jobs:
          WB_SERVER_HOST: http://wandbservice
          WF_CLICKHOUSE_HOST: weave_clickhouse
          WEAVE_SERVER_DISABLE_ECOSYSTEM: 1
          WANDB_API_KEY: ${{ secrets.WANDB_API_KEY }}
          GOOGLE_API_KEY: ${{ secrets.GOOGLE_API_KEY }}
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
          MISTRAL_API_KEY: ${{ secrets.MISTRAL_API_KEY }}
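
The `strategy.matrix` block earlier in this file's diff fans out into one job per combination of Python minor version and nox shard. A rough sketch of that expansion follows, assuming no `include` or `exclude` entries exist in the collapsed parts of the workflow; `expand_matrix` is a hypothetical helper, not part of the repository.

```python
from itertools import product

PYTHON_MAJOR = "3"
PYTHON_MINORS = ["9", "10", "11", "12", "13"]
NOX_SHARDS = [
    "trace", "trace_server", "anthropic", "cerebras", "cohere", "dspy",
    "groq", "google_ai_studio", "instructor", "langchain", "litellm",
    "llamaindex", "mistral0", "mistral1", "notdiamond", "openai",
    "vertexai", "scorers_tests", "pandas-test", "huggingface",
]


def expand_matrix() -> list[dict[str, str]]:
    """Build the cross product of Python versions and nox shards."""
    return [
        {"python-version": f"{PYTHON_MAJOR}.{minor}", "nox-shard": shard}
        for minor, shard in product(PYTHON_MINORS, NOX_SHARDS)
    ]


if __name__ == "__main__":
    jobs = expand_matrix()
    print(len(jobs))   # 5 versions x 20 shards = 100 jobs
    print(jobs[0])
```

With `fail-fast: false`, a failure in one of these combinations does not cancel the others, so a single flaky integration shard does not hide results for the rest.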
2 changes: 1 addition & 1 deletion .pre-commit-config.yaml
@@ -16,7 +16,7 @@ repos:
    hooks:
      - id: mypy
        additional_dependencies:
          [types-pkg-resources==0.1.3, types-all, wandb>=0.15.5]
          [types-pkg-resources==0.1.3, types-all, wandb>=0.15.5, wandb<0.19.0]
        # Note: You have to update pyproject.toml[tool.mypy] too!
        args: ["--config-file=pyproject.toml"]
        exclude: (.*pyi$)|(weave_query)|(tests)|(examples)
30 changes: 15 additions & 15 deletions dev_docs/BaseObjectClasses.md → dev_docs/BuiltinObjectClasses.md
@@ -1,4 +1,4 @@
# BaseObjectClasses
# BuiltinObjectClasses

## Refresher on Objects and object storage

@@ -79,11 +79,11 @@ While many Weave Objects are free-form and user-defined, there is often a need f

Here's how to define and use a validated base object:

1. **Define your schema** (in `weave/trace_server/interface/base_object_classes/your_schema.py`):
1. **Define your schema** (in `weave/trace_server/interface/builtin_object_classes/your_schema.py`):

```python
from pydantic import BaseModel
from weave.trace_server.interface.base_object_classes import base_object_def
from weave.trace_server.interface.builtin_object_classes import base_object_def

class NestedConfig(BaseModel):
    setting_a: int
@@ -116,7 +116,7 @@ curl -X POST 'https://trace.wandb.ai/obj/create' \
"project_id": "user/project",
"object_id": "my_config",
"val": {...},
"set_base_object_class": "MyConfig"
"object_class": "MyConfig"
}
}'

Expand Down Expand Up @@ -154,38 +154,38 @@ Run `make synchronize-base-object-schemas` to ensure the frontend TypeScript typ

### Architecture Flow

1. Define your schema in a Python file in the `weave/trace_server/interface/base_object_classes/` directory. See `weave/trace_server/interface/base_object_classes/test_only_example.py` as an example.
2. Make sure to register your schemas in `weave/trace_server/interface/base_object_classes/base_object_registry.py` by calling `register_base_object`.
1. Define your schema in a Python file in the `weave/trace_server/interface/builtin_object_classes/` directory. See `weave/trace_server/interface/builtin_object_classes/test_only_example.py` as an example.
2. Make sure to register your schemas in `weave/trace_server/interface/builtin_object_classes/builtin_object_registry.py` by calling `register_base_object`.
3. Run `make synchronize-base-object-schemas` to generate the frontend types.
* The first step (`make generate_base_object_schemas`) will run `weave/scripts/generate_base_object_schemas.py` to generate a JSON schema in `weave/trace_server/interface/base_object_classes/generated/generated_base_object_class_schemas.json`.
* The second step (yarn `generate-schemas`) will read this file and use it to generate the frontend types located in `weave-js/src/components/PagePanelComponents/Home/Browse3/pages/wfReactInterface/generatedBaseObjectClasses.zod.ts`.
* The first step (`make generate_base_object_schemas`) will run `weave/scripts/generate_base_object_schemas.py` to generate a JSON schema in `weave/trace_server/interface/builtin_object_classes/generated/generated_builtin_object_class_schemas.json`.
* The second step (yarn `generate-schemas`) will read this file and use it to generate the frontend types located in `weave-js/src/components/PagePanelComponents/Home/Browse3/pages/wfReactInterface/generatedBuiltinObjectClasses.zod.ts`.
4. Now, each use case uses different parts:
1. `Python Writing`. Users can directly import these classes and use them as normal Pydantic models, which get published with `weave.publish`. The Python client correctly builds the requisite payload.
2. `Python Reading`. Users can `weave.ref().get()` and the weave python SDK will return the instance with the correct type. Note: we do some special handling such that the returned object is not a WeaveObject, but literally the exact pydantic class.
3. `HTTP Writing`. In cases where the client/user does not want to add the special type information, users can publish base objects by setting the `set_base_object_class` setting on `POST obj/create` to the name of the class. The weave server will validate the object against the schema, update the metadata fields, and store the object.
3. `HTTP Writing`. In cases where the client/user does not want to add the special type information, users can publish builtin objects (the set of `weave.Object` classes provided by Weave) by setting the `builtin_object_class` setting on `POST obj/create` to the name of the class. The weave server will validate the object against the schema, update the metadata fields, and store the object.
4. `HTTP Reading`. When querying for objects, the server will return the object with the correct type if the `base_object_class` metadata field is set.
5. `Frontend`. The frontend will read the zod schema from `weave-js/src/components/PagePanelComponents/Home/Browse3/pages/wfReactInterface/generatedBaseObjectClasses.zod.ts` and use that to provide compile time type safety when using `useBaseObjectInstances` and runtime type safety when using `useCreateBaseObjectInstance`.
5. `Frontend`. The frontend will read the zod schema from `weave-js/src/components/PagePanelComponents/Home/Browse3/pages/wfReactInterface/generatedBuiltinObjectClasses.zod.ts` and use that to provide compile time type safety when using `useBaseObjectInstances` and runtime type safety when using `useCreateBaseObjectInstance`.
* Note: it is critical that all techniques produce the same digest for the same data - which is tested in the tests. This way versions are not thrashed by different clients/users.
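
The digest requirement in the note above can be illustrated with a small sketch. This is not Weave's actual digest algorithm, which this document does not show; `canonical_digest` is a hypothetical helper that only demonstrates why every client must serialize identical data identically before hashing.

```python
import hashlib
import json


def canonical_digest(val: dict) -> str:
    """Hash a JSON-serializable value in a key-order-independent way.

    Sorting keys and fixing the separators means two clients that build
    the same data in a different order still produce the same digest.
    """
    encoded = json.dumps(val, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


if __name__ == "__main__":
    from_python_client = {"name": "my_config", "nested": {"setting_a": 1}}
    from_http_client = {"nested": {"setting_a": 1}, "name": "my_config"}
    print(canonical_digest(from_python_client) == canonical_digest(from_http_client))
```

If the serialization differed between the Python client, the HTTP API, and the frontend, the same logical object would be stored as several versions, which is the thrashing the note warns about.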

```mermaid
graph TD
subgraph Schema Definition
F["weave/trace_server/interface/<br>base_object_classes/your_schema.py"] --> |defines| P[Pydantic BaseObject]
P --> |register_base_object| R["base_object_registry.py"]
P --> |register_base_object| R["builtin_object_registry.py"]
end
subgraph Schema Generation
M["make synchronize-base-object-schemas"] --> G["make generate_base_object_schemas"]
G --> |runs| S["weave/scripts/<br>generate_base_object_schemas.py"]
R --> |import registered classes| S
S --> |generates| J["generated_base_object_class_schemas.json"]
M --> |yarn generate-schemas| Z["generatedBaseObjectClasses.zod.ts"]
S --> |generates| J["generated_builtin_object_class_schemas.json"]
M --> |yarn generate-schemas| Z["generatedBuiltinObjectClasses.zod.ts"]
J --> Z
end
subgraph "Trace Server"
subgraph "HTTP API"
R --> |validates using| HW["POST obj/create<br>set_base_object_class"]
R --> |validates using| HW["POST obj/create<br>object_class"]
HW --> DB[(Weave Object Store)]
HR["POST objs/query<br>base_object_classes"] --> |Filters base_object_class| DB
end
@@ -203,7 +203,7 @@ graph TD
Z --> |import| UBI["useBaseObjectInstances"]
Z --> |import| UCI["useCreateBaseObjectInstance"]
UBI --> |Filters base_object_class| HR
UCI --> |set_base_object_class| HW
UCI --> |object_class| HW
UI[React UI] --> UBI
UI --> UCI
end
5 changes: 3 additions & 2 deletions dev_docs/RELEASE.md
@@ -6,7 +6,8 @@ This document outlines how to publish a new Weave release to our public [PyPI pa

2. You should also run through this [sample notebook](https://colab.research.google.com/drive/1DmkLzhFCFC0OoN-ggBDoG1nejGw2jQZy#scrollTo=29hJrcJQA7jZ); remember to install from master. You can also just run the [quickstart](http://wandb.me/weave_colab).

3. To prepare a PATCH release, go to GitHub Actions and run the `bump-python-sdk-version` workflow on master. This will:
3. To prepare a PATCH release, go to GitHub Actions and run the [bump-python-sdk-version](https://github.com/wandb/weave/actions/workflows/bump_version.yaml) workflow on master. This will:

- Create a new patch version by dropping the pre-release (e.g., `x.y.z-dev0` -> `x.y.z`) and tag this commit with `x.y.z`
- Create a new dev version by incrementing the dev version (e.g., `x.y.z` -> `x.y.(z+1)-dev0`) and commit this to master
- Both of these commits will be pushed to master
@@ -16,6 +17,6 @@ This document outlines how to publish a new Weave release to our public [PyPI pa

5. Verify the new version of Weave exists in [PyPI](https://pypi.org/project/weave/) once it is complete.

6. Go to GitHub, click the release tag, and click `Draft a New Release`. Select the new tag, and click generate release notes. Publish the release.
6. Go to the [GitHub new release page](https://github.com/wandb/weave/releases/new). Select the new tag, and click "Generate release notes". Publish the release.

7. Finally, announce that the merge freeze is over.
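
The two version changes described in step 3 can be sketched as follows. This is an illustration of the `x.y.z-dev0` -> `x.y.z` -> `x.y.(z+1)-dev0` rule only, not the code the `bump-python-sdk-version` workflow runs; the function names are hypothetical.

```python
import re

# Matches x.y.z with an optional -devN suffix.
_VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)(?:-dev(\d+))?$")


def drop_prerelease(dev_version: str) -> str:
    """Turn a dev version such as 1.2.3-dev0 into the release 1.2.3."""
    match = _VERSION_RE.match(dev_version)
    if match is None or match.group(4) is None:
        raise ValueError(f"expected x.y.z-devN, got {dev_version!r}")
    major, minor, patch = match.group(1, 2, 3)
    return f"{major}.{minor}.{patch}"


def next_dev_version(release_version: str) -> str:
    """Turn a release such as 1.2.3 into the next dev version 1.2.4-dev0."""
    match = _VERSION_RE.match(release_version)
    if match is None or match.group(4) is not None:
        raise ValueError(f"expected x.y.z, got {release_version!r}")
    major, minor, patch = (int(part) for part in match.group(1, 2, 3))
    return f"{major}.{minor}.{patch + 1}-dev0"


if __name__ == "__main__":
    release = drop_prerelease("1.2.3-dev0")
    print(release)
    print(next_dev_version(release))
```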
18 changes: 0 additions & 18 deletions docs/docs/guides/cookbooks/prod_dashboard.md

This file was deleted.

1 change: 0 additions & 1 deletion docs/docs/guides/cookbooks/summarization/.gitignore

This file was deleted.

2 changes: 1 addition & 1 deletion docs/docs/guides/core-types/datasets.md
@@ -13,7 +13,7 @@ This guide will show you how to:

## Sample code

<Tabs groupId="programming-language">
<Tabs groupId="programming-language" queryString>
<TabItem value="python" label="Python" default>
```python
import weave
28 changes: 28 additions & 0 deletions docs/docs/guides/core-types/env-vars.md
@@ -0,0 +1,28 @@
# Environment variables

Weave provides a set of environment variables to configure and optimize its behavior. You can set these variables in your shell or within scripts to control specific functionality.

```bash
# Example of setting environment variables in the shell
# (`export` makes them visible to programs started from this shell)
export WEAVE_PARALLELISM=10 # Controls the number of parallel workers
export WEAVE_PRINT_CALL_LINK=false # Disables call link output
```

```python
# Example of setting environment variables in Python
import os

os.environ["WEAVE_PARALLELISM"] = "10"
os.environ["WEAVE_PRINT_CALL_LINK"] = "false"
```

## Environment variables reference

| Variable Name | Description |
|--------------------------|-----------------------------------------------------------------|
| WEAVE_CAPTURE_CODE | Disable code capture for `weave.op` if set to `false`. |
| WEAVE_DEBUG_HTTP | If set to `1`, turns on HTTP request and response logging for debugging. |
| WEAVE_DISABLED | If set to `true`, all tracing to Weave is disabled. |
| WEAVE_PARALLELISM | In evaluations, the number of examples to evaluate in parallel. `1` runs examples sequentially. Default value is `20`. |
| WEAVE_PRINT_CALL_LINK | If set to `false`, call URL printing is suppressed. Default value is `true`. |
| WEAVE_TRACE_LANGCHAIN | When set to `false`, explicitly disable global tracing for LangChain. |
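
Environment variables are always strings, so values such as `false` and `10` have to be parsed before use. The sketch below shows one common way to do that. It is not Weave's own parsing code; `env_flag` and `env_int` are hypothetical helpers, and the accepted truthy spellings and the defaults passed in are assumptions.

```python
import os


def env_flag(name: str, default: bool) -> bool:
    """Read a boolean flag, falling back to `default` when the variable is unset."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    # Assumed truthy spellings; anything else counts as false.
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def env_int(name: str, default: int) -> int:
    """Read an integer setting, falling back to `default` when the variable is unset."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    return int(raw)


if __name__ == "__main__":
    os.environ["WEAVE_PARALLELISM"] = "10"
    os.environ["WEAVE_PRINT_CALL_LINK"] = "false"
    print(env_int("WEAVE_PARALLELISM", 20))
    print(env_flag("WEAVE_PRINT_CALL_LINK", True))
```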
4 changes: 2 additions & 2 deletions docs/docs/guides/core-types/media.md
@@ -9,7 +9,7 @@ Weave supports logging and displaying multiple first class media types. Log imag

Logging type: `PIL.Image.Image`. Here is an example of logging an image with the OpenAI DALL-E API:

<Tabs groupId="programming-language">
<Tabs groupId="programming-language" queryString>
<TabItem value="python" label="Python" default>

```python
@@ -83,7 +83,7 @@ This image will be logged to weave and automatically displayed in the UI. The fo

Logging type: `wave.Wave_read`. Here is an example of logging an audio file using OpenAI's speech generation API.

<Tabs groupId="programming-language">
<Tabs groupId="programming-language" queryString>
<TabItem value="python" label="Python" default>

```python
2 changes: 1 addition & 1 deletion docs/docs/guides/core-types/models.md
@@ -3,7 +3,7 @@ import TabItem from '@theme/TabItem';

# Models

<Tabs groupId="programming-language">
<Tabs groupId="programming-language" queryString>
<TabItem value="python" label="Python" default>
A `Model` is a combination of data (which can include configuration, trained model weights, or other information) and code that defines how the model operates. By structuring your code to be compatible with this API, you benefit from a structured way to version your application so you can more systematically keep track of your experiments.
