
feat: lab 3307 aau i use the kilillm functions to import llm data static #1841

Open · wants to merge 29 commits into base: main

Conversation

@RuellePaul (Contributor) commented Jan 8, 2025

new method kili.llm.import_conversations

note for reviewers
You can try creating an LLM static project and importing labeled conversations with this snippet:
from kili.client import Kili
from scripts.constants import LOCAL_KILI_API_KEY

kili = Kili(api_key=LOCAL_KILI_API_KEY, api_endpoint='http://localhost:4001/api/label/v2/graphql')

interface = {
  "jobs": {
      "CLASSIFICATION_JOB_AT_COMPLETION_LEVEL": {
          "content": {
              "categories": {
                  "TOO_SHORT": {
                      "children": [],
                      "name": "Too short",
                      "id": "category1"
                  },
                  "JUST_RIGHT": {
                      "children": [],
                      "name": "Just right",
                      "id": "category2"
                  },
                  "TOO_VERBOSE": {
                      "children": [],
                      "name": "Too verbose",
                      "id": "category3"
                  }
              },
              "input": "radio"
          },
          "instruction": "Verbosity",
          "level": "completion",
          "mlTask": "CLASSIFICATION",
          "required": 0,
          "isChild": False,
          "isNew": False
      },
      "CLASSIFICATION_JOB_AT_COMPLETION_LEVEL_1": {
          "content": {
              "categories": {
                  "NO_ISSUES": {
                      "children": [],
                      "name": "No issues",
                      "id": "category4"
                  },
                  "MINOR_ISSUES": {
                      "children": [],
                      "name": "Minor issue(s)",
                      "id": "category5"
                  },
                  "MAJOR_ISSUES": {
                      "children": [],
                      "name": "Major issue(s)",
                      "id": "category6"
                  }
              },
              "input": "radio"
          },
          "instruction": "Instructions Following",
          "level": "completion",
          "mlTask": "CLASSIFICATION",
          "required": 0,
          "isChild": False,
          "isNew": False
      },
      "CLASSIFICATION_JOB_AT_COMPLETION_LEVEL_2": {
          "content": {
              "categories": {
                  "NO_ISSUES": {
                      "children": [],
                      "name": "No issues",
                      "id": "category7"
                  },
                  "MINOR_INACCURACY": {
                      "children": [],
                      "name": "Minor inaccuracy",
                      "id": "category8"
                  },
                  "MAJOR_INACCURACY": {
                      "children": [],
                      "name": "Major inaccuracy",
                      "id": "category9"
                  }
              },
              "input": "radio"
          },
          "instruction": "Truthfulness",
          "level": "completion",
          "mlTask": "CLASSIFICATION",
          "required": 0,
          "isChild": False,
          "isNew": False
      },
      "CLASSIFICATION_JOB_AT_COMPLETION_LEVEL_3": {
          "content": {
              "categories": {
                  "NO_ISSUES": {
                      "children": [],
                      "name": "No issues",
                      "id": "category10"
                  },
                  "MINOR_SAFETY_CONCERN": {
                      "children": [],
                      "name": "Minor safety concern",
                      "id": "category11"
                  },
                  "MAJOR_SAFETY_CONCERN": {
                      "children": [],
                      "name": "Major safety concern",
                      "id": "category12"
                  }
              },
              "input": "radio"
          },
          "instruction": "Harmlessness/Safety",
          "level": "completion",
          "mlTask": "CLASSIFICATION",
          "required": 0,
          "isChild": False,
          "isNew": False
      },
      "TRANSCRIPTION_JOB_AT_COMPLETION_LEVEL": {
          "content": {
              "input": "textField"
          },
          "instruction": "Additional comments...",
          "level": "completion",
          "mlTask": "TRANSCRIPTION",
          "required": 0,
          "isChild": False,
          "isNew": False
      },
      "TRANSCRIPTION_MARKDOWN_JOB_AT_COMPLETION_LEVEL": {
          "content": {
              "input": "markdown"
          },
          "instruction": "Additional comments...",
          "level": "completion",
          "mlTask": "TRANSCRIPTION",
          "required": 0,
          "isChild": False,
          "isNew": False
      },
      "COMPARISON_JOB": {
          "content": {
              "options": {
                  "IS_MUCH_BETTER": {
                      "children": [],
                      "name": "Is much better",
                      "id": "option13"
                  },
                  "IS_BETTER": {
                      "children": [],
                      "name": "Is better",
                      "id": "option14"
                  },
                  "IS_SLIGHTLY_BETTER": {
                      "children": [],
                      "name": "Is slightly better",
                      "id": "option15"
                  },
                  "TIE": {
                      "children": [],
                      "name": "Tie",
                      "mutual": True,
                      "id": "option16"
                  }
              },
              "input": "radio"
          },
          "instruction": "Pick the best answer",
          "mlTask": "COMPARISON",
          "required": 1,
          "isChild": False,
          "isNew": False
      },
      "CLASSIFICATION_JOB_AT_ROUND_LEVEL": {
          "content": {
              "categories": {
                  "BOTH_ARE_GOOD": {
                      "children": [],
                      "name": "Both are good",
                      "id": "category17"
                  },
                  "BOTH_ARE_BAD": {
                      "children": [],
                      "name": "Both are bad",
                      "id": "category18"
                  }
              },
              "input": "radio"
          },
          "instruction": "Overall quality",
          "level": "round",
          "mlTask": "CLASSIFICATION",
          "required": 0,
          "isChild": False,
          "isNew": False
      },
      "TRANSCRIPTION_JOB_AT_ROUND_LEVEL": {
          "content": {
              "input": "textField"
          },
          "instruction": "Additional comments...",
          "level": "round",
          "mlTask": "TRANSCRIPTION",
          "required": 0,
          "isChild": False,
          "isNew": False
      },
      "TRANSCRIPTION_MARKDOWN_JOB_AT_ROUND_LEVEL": {
          "content": {
              "input": "markdown"
          },
          "instruction": "Additional comments...",
          "level": "round",
          "mlTask": "TRANSCRIPTION",
          "required": 0,
          "isChild": False,
          "isNew": False
      },
      "CLASSIFICATION_JOB_AT_CONVERSATION_LEVEL": {
          "content": {
              "categories": {
                  "GLOBAL_GOOD": {
                      "children": [],
                      "name": "Globally good",
                      "id": "category19"
                  },
                  "BOTH_ARE_BAD": {
                      "children": [],
                      "name": "Globally bad",
                      "id": "category20"
                  }
              },
              "input": "radio"
          },
          "instruction": "Global",
          "level": "conversation",
          "mlTask": "CLASSIFICATION",
          "required": 0,
          "isChild": False,
          "isNew": False
      },
      "TRANSCRIPTION_JOB_AT_CONVERSATION_LEVEL": {
          "content": {
              "input": "textField"
          },
          "instruction": "Additional comments...",
          "level": "conversation",
          "mlTask": "TRANSCRIPTION",
          "required": 0,
          "isChild": False,
          "isNew": False
      },
      "TRANSCRIPTION_MARKDOWN_JOB_AT_CONVERSATION_LEVEL": {
          "content": {
              "input": "markdown"
          },
          "instruction": "Additional comments...",
          "level": "conversation",
          "mlTask": "TRANSCRIPTION",
          "required": 0,
          "isChild": False,
          "isNew": False
      },
  }
}

project_id = kili.create_project(
  title="LLM Static Project (created with SDK)",
  description="Project Description",
  input_type="LLM_STATIC",
  json_interface=interface
)["id"]

conversations = [
{
  "externalId": "Fibonacci python function",
  "chatItems": [
    {
      "externalId": "48cf1705-7be7-4b35-b4f4-3a13e9eadc1e",
      "role": "SYSTEM",
      "content": "You are an assistant helping the user by providing precise answers."
    },
    {
      "externalId": "9c43a0ae-5a6e-439b-94f6-83f7d5618b22",
      "role": "USER",
      "content": "Can you give me a function to compute Fibonacci numbers?"
    },
    {
      "externalId": "d5596f35-1b0f-42cb-b529-684e23138141",
      "role": "ASSISTANT",
      "content": "Here is a recursive implementation for calculating Fibonacci numbers:\n\n```python\ndef fibonacci_recursive(n):\n    if n <= 1:\n        return n\n    return fibonacci_recursive(n - 1) + fibonacci_recursive(n - 2)\n```\n\nThis is simple to write but has a time complexity of \\( O(2^n) \\) due to redundant calculations. For large values of \\( n \\), this method becomes highly inefficient.",
      "modelName": "gpt-4o"
    },
    {
      "externalId": "94f676b7-c3be-45a8-bccf-36b56566b765",
      "role": "ASSISTANT",
      "content": "Here is an optimized implementation using matrix exponentiation, along with an explanation of why it is faster:\n\n```python\nimport numpy as np\n\ndef fibonacci_matrix(n):\n    def matrix_power(matrix, power):\n        result = np.identity(len(matrix), dtype=int)\n        while power:\n            if power % 2:\n                result = np.dot(result, matrix)\n            matrix = np.dot(matrix, matrix)\n            power //= 2\n        return result\n\n    if n == 0:\n        return 0\n    base_matrix = np.array([[1, 1], [1, 0]], dtype=int)\n    result_matrix = matrix_power(base_matrix, n - 1)\n    return result_matrix[0, 0]\n```\n\n### Why is Matrix Exponentiation Faster?\n\nThe recursive method recalculates values for overlapping subproblems, leading to exponential time complexity:\n\n\\[ T(n) = T(n-1) + T(n-2) + O(1), \\]\n\nwhich resolves to \\( O(2^n) \\). This inefficiency arises from redundant recalculations.\n\nThe matrix exponentiation method uses the property that Fibonacci numbers can be represented as a matrix multiplication:\n\n\\[ \\begin{bmatrix} F(n+1) & F(n) \\\\ F(n) & F(n-1) \\end{bmatrix} = \\begin{bmatrix} 1 & 1 \\\\ 1 & 0 \\end{bmatrix}^n. \\]\n\nUsing fast exponentiation (repeated squaring), the matrix power can be computed in \\( O(\\log n) \\) steps. For example, instead of multiplying the matrix \\( n \\) times, we:\n\n1. Square the matrix \\( \\lceil \\log_2 n \\rceil \\) times.\n2. Combine results for any remaining powers.\n\nThis drastically reduces the number of operations compared to the recursive method.\n\n### Comparison of Complexities\n\n- **Recursive Method**: \\( O(2^n) \\)\n- **Matrix Exponentiation**: \\( O(\\log n) \\)\n\nFor large \\( n \\), the difference is substantial. Consider \\( n = 50 \\):\n\n- Recursive method performs roughly \\( 2^{50} \\approx 10^{15} \\) operations.\n- Matrix exponentiation performs approximately \\( \\log_2(50) \\approx 6 \\) matrix multiplications, where each multiplication is \\( O(1) \\) for a fixed-size \\( 2 \\times 2 \\) matrix.\n\n### Example\nUsing \\( n = 10 \\):\n\n1. Recursive Method: Explores all possible paths in the Fibonacci tree, recalculating values like \\( F(8) \\) multiple times.\n2. Matrix Exponentiation: Squares the base matrix 4 times and performs 1 additional multiplication, yielding the result in far fewer steps.\n\nThis efficiency makes matrix exponentiation the preferred method for calculating Fibonacci numbers at scale.",
      "modelName": "gpt-o1"
    }
  ],
  "metadata": {}
},
{
  "externalId": "The sum of the integers",
  "chatItems": [
    {
      "externalId": "00000000-8b5c-4b6a-0000-5b232d8bcbb9",
      "role": "SYSTEM",
      "content": "You are an assistant helping the user by providing clear and concise answers."
    },
    {
      "externalId": "19d4a9e2-8b5c-4b6a-a27c-5b232d8bcbb9",
      "role": "USER",
      "content": "What is the sum of all integers up to infinity?"
    },
    {
      "externalId": "a82f76e4-7f47-4e45-a745-4b934f11f6f1",
      "role": "ASSISTANT",
      "content": "The series for the sum of all integers up to infinity, given by:\n\n\\[ S = 1 + 2 + 3 + 4 + \\dots \\]\n\ndiverges to infinity. This means the sum grows without bound and does not converge to a finite value.",
      "modelName": "gpt-o1"
    },
    {
      "externalId": "f3dcd3f0-5e0e-404c-9759-3c60a15e634e",
      "role": "ASSISTANT",
      "content": "While the series:\n\n\\[ S = 1 + 2 + 3 + 4 + \\dots \\]\n\ndiverges in the traditional sense, there exists a deeper mathematical interpretation through the **analytic continuation of the Riemann zeta function**. This assigns a finite value to this sum in a unique way.\n\nThe Riemann zeta function \\( \\zeta(s) \\) is initially defined for \\( \\text{Re}(s) > 1 \\) as:\n\n\\[ \\zeta(s) = \\sum_{n=1}^\\infty \\frac{1}{n^s}. \\]\n\nThrough a process called analytic continuation, \\( \\zeta(s) \\) can be extended to the entire complex plane, except at \\( s = 1 \\) (where it has a pole). For \\( s = -1 \\), the value of \\( \\zeta(-1) \\) is:\n\n\\[ \\zeta(-1) = -\\frac{1}{12}. \\]\n\nThus, using this extended framework:\n\n\\[ 1 + 2 + 3 + 4 + \\dots = -\\frac{1}{12}. \\]\n\n### Applications and Context\nThis surprising result is not meant to suggest that adding all positive integers in the traditional sense equals \\( -1/12 \\). Instead, it has profound implications in advanced mathematics and theoretical physics:\n\n1. **String Theory**: The value \\( -1/12 \\) arises in the computation of the energy of quantum strings, providing finite results where naive calculations yield divergences.\n2. **Quantum Field Theory**: Techniques like zeta regularization assign finite values to divergent sums, often leveraging \\( \\zeta(-1) \\).\n3. **Casimir Effect**: The Casimir effect calculates the force between two parallel plates due to quantum vacuum fluctuations. The series \\( 1 + 2 + 3 + \\dots \\) naturally appears in this context, and its regularized value of \\( -1/12 \\) provides a correction aligning with experimental results.\n\n### Example: Casimir Effect\nConsider two conducting plates placed in a vacuum. Quantum field theory predicts the energy density due to vacuum fluctuations, and summing these energies involves divergent series such as \\( S = 1 + 2 + 3 + \\dots \\). By assigning \\( \\zeta(-1) = -1/12 \\), the theory yields a finite and experimentally verified prediction of the force between the plates.\n\nThis demonstrates how the concept of \\( \\zeta(-1) \\) provides a mathematically consistent and physically meaningful interpretation in specific advanced scenarios.",
      "modelName": "gpt-4o"
    }
  ],
  "metadata": {}
},
{
  "externalId": "Caesar cipher decoding",
  "chatItems": [
    {
      "externalId": "48cf1705-7be7-4b35-b4f4-3a13e9eadc1e",
      "role": "SYSTEM",
      "content": "You are an assistant helping the user by providing clear and concise answers."
    },
    {
      "externalId": "7a4cde2c-95b4-4d3f-94fa-233a3c3a9171",
      "role": "USER",
      "content": "Can you decode this: Pmttw ewztl !"
    },
    {
      "externalId": "ab1e29bf-6b94-4b78-b920-8bfe5c2370f4",
      "role": "ASSISTANT",
      "content": "This seems like an encoded message, perhaps in a secret language of aliens or something far beyond human comprehension! The mystery is profound, but I'm unable to decode it. Maybe it contains a cosmic secret?",
      "modelName": "gpt-3"
    },
    {
      "externalId": "93dc1b4d-dc25-48cc-9ef6-13e01cbd45f2",
      "role": "ASSISTANT",
      "content": "The encoded message is a **Caesar cipher**, a simple encryption technique where each letter is shifted by a fixed number of places in the alphabet.\n\nTo decode \\( Pmttw\\ ewztl! \\), we determine that the shift key is **8** (each letter is shifted back by 8 positions). Decoding it gives:\n\n**Hello world!**\n\n### Python Code for Caesar Cipher Decryption\nYou can use the following Python code to decrypt any Caesar cipher by specifying the shift key:\n\n```python\ndef caesar_cipher_decrypt(ciphertext, shift):\n    decrypted_text = []\n    for char in ciphertext:\n        if char.isalpha():\n            # Determine if the character is uppercase or lowercase\n            base = ord('A') if char.isupper() else ord('a')\n            # Shift the character and wrap around using modulo\n            new_char = chr((ord(char) - base - shift) % 26 + base)\n            decrypted_text.append(new_char)\n        else:\n            # Non-alphabetic characters remain unchanged\n            decrypted_text.append(char)\n    return ''.join(decrypted_text)\n\n# Example usage\nciphertext = \"Pmttw ewztl !\"\nshift = 8\nprint(caesar_cipher_decrypt(ciphertext, shift))\n```\n\n### Explanation\n1. **Shift Key**: The Caesar cipher uses a fixed number to shift each letter. In this case, the shift key is \\( 8 \\).\n2. **Decoding Process**: Each letter is shifted backward by \\( 8 \\) positions in the alphabet, wrapping around from \\( A \\) to \\( Z \\) or \\( a \\) to \\( z \\) as needed.\n\n### Result\nRunning the code will correctly decode the message to:\n\n**Hello world!**",
      "modelName": "gpt-o1"
    }
  ],
  "label": {
    "completion": {
      "CLASSIFICATION_JOB_AT_COMPLETION_LEVEL": {
        "ab1e29bf-6b94-4b78-b920-8bfe5c2370f4": {
          "categories": [
            "TOO_SHORT"
          ]
        },
        "93dc1b4d-dc25-48cc-9ef6-13e01cbd45f2": {
          "categories": [
            "JUST_RIGHT"
          ]
        }
      },
      "CLASSIFICATION_JOB_AT_COMPLETION_LEVEL_1": {
        "ab1e29bf-6b94-4b78-b920-8bfe5c2370f4": {
          "categories": [
            "MINOR_ISSUES"
          ]
        }
      },
      "CLASSIFICATION_JOB_AT_COMPLETION_LEVEL_2": {
        "ab1e29bf-6b94-4b78-b920-8bfe5c2370f4": {
          "categories": [
            "MINOR_INACCURACY"
          ]
        }
      },
      "CLASSIFICATION_JOB_AT_COMPLETION_LEVEL_3": {
        "ab1e29bf-6b94-4b78-b920-8bfe5c2370f4": {
          "categories": [
            "MINOR_SAFETY_CONCERN"
          ]
        }
      }
    },
    "round": {
      "CLASSIFICATION_JOB_AT_ROUND_LEVEL": {
        "0": {
          "categories": [
            "BOTH_ARE_GOOD"
          ]
        }
      },
      "COMPARISON_JOB": {
        "0": {
          "code": "Is much better",
          "firstId": "93dc1b4d-dc25-48cc-9ef6-13e01cbd45f2",
          "secondId": "ab1e29bf-6b94-4b78-b920-8bfe5c2370f4"
        }
      }
    },
    "conversation": {
      "CLASSIFICATION_JOB_AT_CONVERSATION_LEVEL": {
        "categories": [
          "GLOBAL_GOOD"
        ]
      },
      "TRANSCRIPTION_JOB_AT_CONVERSATION_LEVEL": {
        "text": "Great conversation!"
      }
    }
  },
  "labeler": "test+fx@kili-technology.com",
  "metadata": {}
}
]

response = kili.llm.import_conversations(
  project_id=project_id,
  conversations=conversations
)
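
The conversation payloads above can also be assembled programmatically. A minimal hedged sketch, assuming only the field names visible in the snippet above (`externalId`, `chatItems`, `role`, `content`, `modelName`, `metadata`); the `build_conversation` helper is hypothetical, not part of the Kili SDK:

```python
import uuid

def build_conversation(external_id, turns, metadata=None):
    """Assemble one conversation dict in the shape used by the import snippet.

    `turns` is a list of (role, content, model_name) tuples; pass None as
    model_name for SYSTEM and USER messages, which carry no model.
    """
    chat_items = []
    for role, content, model_name in turns:
        item = {
            "externalId": str(uuid.uuid4()),  # unique id per chat item
            "role": role,
            "content": content,
        }
        if model_name is not None:
            item["modelName"] = model_name
        chat_items.append(item)
    return {
        "externalId": external_id,
        "chatItems": chat_items,
        "metadata": metadata or {},
    }

# Example usage: a one-round conversation with a single assistant reply
conversation = build_conversation(
    "Greeting example",
    [
        ("SYSTEM", "You are a helpful assistant.", None),
        ("USER", "Say hello.", None),
        ("ASSISTANT", "Hello!", "gpt-4o"),
    ],
)
```

A list of such dicts can then be passed as the `conversations` argument shown above.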

refactored LLM export

LLM static & dynamic export now uses the same format as import
Renamed the old llm_static-related code to llm_rlhf for clarity

note for reviewers
Try labeling & exporting an LLM static or dynamic project, then importing the result back into an LLM static project:
from kili.client import Kili
from scripts.constants import LOCAL_KILI_API_KEY

kili = Kili(api_key=LOCAL_KILI_API_KEY, api_endpoint='http://localhost:4001/api/label/v2/graphql')

kili.llm.export(project_id="cm66gj5qe01kol70weiqpdyz3")
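
Since export and import now share one format, a round trip mostly works as-is; the one wrinkle is that exported conversations may carry `label`/`labeler` fields, which you may want to drop before re-importing as unlabeled data. A hypothetical helper (the field names are taken from the import snippet above; `strip_annotations` is not an SDK function):

```python
def strip_annotations(conversations):
    """Return copies of exported conversations with annotation fields removed,
    so they can be re-imported into a fresh project as unlabeled assets."""
    cleaned = []
    for conv in conversations:
        conv = dict(conv)  # shallow copy; leave the export result untouched
        conv.pop("label", None)
        conv.pop("labeler", None)
        cleaned.append(conv)
    return cleaned

# Example with a minimal exported record
exported = [
    {
        "externalId": "demo",
        "chatItems": [],
        "label": {"conversation": {}},
        "labeler": "test+fx@kili-technology.com",
        "metadata": {},
    }
]
unlabeled = strip_annotations(exported)
```

The cleaned list can then be fed to `kili.llm.import_conversations` as in the first snippet; keep the fields instead if you want the labels imported too.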

Add new tutorial for LLM static import

[Screenshot attached: 2025-01-21 14:03]

Add unit test for both llm static & dynamic export

Add e2e test for import conversations

@RuellePaul RuellePaul self-assigned this Jan 8, 2025
@RuellePaul RuellePaul changed the title Feature/lab 3307 aau i use the kilillm functions to import llm data static feat: lab 3307 aau i use the kilillm functions to import llm data static Jan 14, 2025
same format for import static, export static, export dynamic
@RuellePaul RuellePaul force-pushed the feature/lab-3307-aau-i-use-the-kilillm-functions-to-import-llm-data-static branch from 486859e to ead637f Compare January 15, 2025 15:57

@RuellePaul RuellePaul force-pushed the feature/lab-3307-aau-i-use-the-kilillm-functions-to-import-llm-data-static branch from bfe38e5 to b5d8c70 Compare January 17, 2025 08:04