Random KeyError happening during code execution with correct code #1544

hbrantner-auvaria · 2025-01-23T10:18:22Z

System Info

pandas AI 2.4.2
mac OS 15.2
Python 3.12.8
AWS Bedrock as LLM with Claude 3.5 Sonnet v2

🐛 Describe the bug

This is the log that I get in my console if I ask my agent: "What is the current date?" I have no data sources attached ...
Why do I get the KeyError here? The code seems to be just fine. I do not understand why such an easy use case is not working.

You generated this python code:
current_date = datetime.datetime.now(datetime.timezone.utc)
formatted_date = current_date.strftime('%A, %B %d, %Y')
result = {'type': 'string', 'value': f"Today's date is {formatted_date}."}
It fails with the following error:
Traceback (most recent call last):
File "/Users/hannes_brantner/Library/Caches/pypoetry/virtualenvs/chatbot-fXCp8fhr-py3.12/lib/python3.12/site-packages/pandasai/pipelines/chat/code_execution.py", line 85, in execute
result = self.execute_code(code_to_run, code_context)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/hannes_brantner/Library/Caches/pypoetry/virtualenvs/chatbot-fXCp8fhr-py3.12/lib/python3.12/site-packages/freezegun/api.py", line 885, in wrapper
result = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/Users/hannes_brantner/Library/Caches/pypoetry/virtualenvs/chatbot-fXCp8fhr-py3.12/lib/python3.12/site-packages/pandasai/pipelines/chat/code_execution.py", line 174, in execute_code
exec(code, environment)
File "<string>", line 2, in <module>
KeyError: '__import__'

Do you have an idea why this happens?
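
A possible explanation, sketched outside pandasai: the traceback ends in exec(code, environment). If that environment carries a __builtins__ dict that lacks __import__, any import triggered while the code runs cannot be resolved, even when the generated code contains no import statement itself. The environment dict and the run helper below are hypothetical stand-ins for illustration, not pandasai code.

```python
import datetime

# Hypothetical sandbox: exec globals whose __builtins__ is a plain dict
# without "__import__". The datetime module is passed in pre-imported.
environment = {"__builtins__": {}, "datetime": datetime}


def run(code: str) -> str:
    """Execute code in a copy of the restricted environment and report the outcome."""
    try:
        exec(code, dict(environment))
        return "ok"
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"


# An explicit import statement looks up __import__ in the builtins dict.
print(run("import datetime"))  # ImportError: __import__ not found

# No import statement here, but strftime may import another module
# internally, and that lookup goes through the same builtins dict.
print(run("datetime.datetime.now(datetime.timezone.utc).strftime('%A, %B %d, %Y')"))
```

Depending on the interpreter version, the second call may fail with KeyError: '__import__', matching the traceback above, because the import happens inside the C implementation of strftime rather than in the generated code.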

dosubot · 2025-01-23T10:19:27Z

Hey @hbrantner-auvaria! I'm here to help you with your bug. I'm a bot that can assist you with bug fixes, answer your questions, and help you become a contributor. Let's tackle this issue together!

A similar issue has been reported where a KeyError: '__import__' occurs during code execution in PandasAI. The suggested solution is to ensure that the code does not contain any references to dangerous built-ins, including __import__. The CodeCleaning class in PandasAI is designed to check for such dangerous built-ins and remove them. Make sure that the _is_jailbreak method is correctly identifying and removing any references to __import__ in the generated code. Additionally, verify that the CodeCleaning step is properly integrated into the GenerateChatPipeline ^[1].

To continue talking to Dosu, mention @dosu.

hbrantner-auvaria · 2025-01-23T10:21:48Z

Also, when the agent is given information about a dataframe, the common type pd.StringDtype() is displayed as object; in my opinion it should be displayed as string. Furthermore, a nullable property for each column would be nice:

name: PortName
type: object
samples:
- Helsinki
- Stockholm
- Visby (Gotland)

hbrantner-auvaria · 2025-01-23T10:22:33Z

Another thing that I found out is that timestamps are incorrectly presented to the LLM in the samples:

name: EstimatedArrivalTime
type: datetime64[ns, UTC]
samples:
- &id002 !!python/object/apply:pandas._libs.tslibs.nattype.__nat_unpickle
- null
- !!python/object/apply:pandas._libs.tslibs.timestamps._unpickle_timestamp
- 1719210600000000000
- null
- *id001
- 10
- !!python/object/apply:pandas._libs.tslibs.timestamps._unpickle_timestamp
- 1719471600000000000
- null
- *id001
- 10

hbrantner-auvaria · 2025-01-23T10:23:12Z

If you point me to the relevant locations, I may be able to fix it myself. I think the df is inserted via the templates, but I currently do not have a good overview of the project.

dosubot · 2025-01-23T10:24:23Z

The conversion of pandas data types to string representations for the LLM is handled by the DataframeSerializer class. This class provides methods to serialize a pandas DataFrame into different string formats, such as JSON, YML, CSV, and SQL ^[1].

For the issue with pd.StringDtype() being displayed as object, it seems there isn't specific handling for StringDtype in the current codebase. You might need to modify the serialization logic to correctly identify and represent StringDtype as string.

Regarding the incorrect presentation of timestamps, the TimestampMixin in server/core/database/mixins/timestamp.py is responsible for handling and formatting timestamp data for the LLM ^[2]. You may need to adjust how timestamps are serialized to ensure they are presented correctly.

The templates used to insert dataframe information into the chat pipeline are located in pandasai/ee/agents/semantic_agent/prompts/templates/shared/dataframe.tmpl ^[3]. You can start by looking into these areas to make the necessary adjustments.

hbrantner-auvaria · 2025-01-23T10:27:10Z

@dosu Can I skip the code cleaning step?

dosubot · 2025-01-23T10:27:28Z

Skipping the code cleaning step is not recommended, as it helps prevent the execution of potentially dangerous code by removing unsafe built-ins like __import__. However, if you still want to bypass it, you would need to modify the code execution pipeline to exclude the CodeCleaning step. This involves altering the GenerateChatPipeline to remove or bypass the code cleaning logic. Be cautious with this approach, as it may expose your system to security risks.

hbrantner-auvaria · 2025-01-23T11:27:38Z

Furthermore, the following kinds of errors are also happening, @dosu:

the "Unfortunately, I was not able to answer your question, because of the following error:\n\n'__import__'\n" error from before
this return error "Unfortunately, I was not able to answer your question, because of the following error:\n\n'return' outside function (<string>, line 10)\n" - probably fixable with a system prompt
this wrong result type error - "Unfortunately, I was not able to answer your question, because of the following error:\n\n'str' object has no attribute 'get'\n" (should also be fixable with a system prompt)

What do you suggest?

dosubot · 2025-01-23T11:30:04Z

For the errors you're encountering:

KeyError: '__import__': This error is likely due to the presence of unsafe built-ins in the code. The CodeCleaning step in the GenerateChatPipeline is designed to remove such references. Ensure that this step is correctly identifying and removing __import__ from the generated code. Skipping this step is not recommended due to security concerns ^[1].
'return' outside function: This error can be addressed by using system prompts to guide the LLM in generating correct code structures. The ErrorPromptGeneration class can be configured to handle such errors by generating prompts that correct the code structure ^[2].
'str' object has no attribute 'get': This error suggests a type mismatch or incorrect handling of string objects. Similar to the previous error, system prompts can be used to guide the LLM in generating the correct code. The ErrorPromptGeneration class can help by providing context-specific prompts to correct this issue ^[2].

Consider reviewing the ErrorPromptGeneration class and its associated prompt classes to ensure they are configured to handle these specific errors effectively.
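
The last two errors can also be caught before and after execution with plain Python checks, independently of any prompt changes. The sketch below is a minimal illustration under assumptions: check_generated_code and check_result are hypothetical helper names, not part of pandasai, and the expected result shape is taken from the {'type': ..., 'value': ...} dicts in the generated code shown in this thread.

```python
def check_generated_code(code: str):
    """Return a syntax error description, or None if the code compiles."""
    try:
        compile(code, "<string>", "exec")
    except SyntaxError as exc:
        return f"{exc.msg} (line {exc.lineno})"
    return None


def check_result(result):
    """Return a problem description, or None if result has 'type' and 'value' keys."""
    if not isinstance(result, dict):
        return f"expected dict, got {type(result).__name__}"
    missing = {"type", "value"} - result.keys()
    if missing:
        return f"missing keys: {sorted(missing)}"
    return None


# A top-level return is rejected at compile time, before anything runs.
print(check_generated_code("x = 1\nreturn x"))  # 'return' outside function (line 2)

# A bare string result would later fail on .get(); a dict passes.
print(check_result("Today's date is ..."))  # expected dict, got str
print(check_result({"type": "string", "value": "ok"}))  # None
```

Either message could be fed back to the LLM as a correction hint instead of surfacing the raw exception to the user.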

hbrantner-auvaria · 2025-01-23T11:35:32Z

@dosu What is wrong here, I do not get it ...

You generated this python code:
current_date = datetime.datetime.now(datetime.timezone.utc).strftime('%A, %B %d, %Y')
result = {'type': 'string', 'value': f"Today's date is {current_date}."}
It fails with the following error:
Traceback (most recent call last):
File "/Users/hannes_brantner/Library/Caches/pypoetry/virtualenvs/chatbot-fXCp8fhr-py3.12/lib/python3.12/site-packages/pandasai/pipelines/chat/code_execution.py", line 85, in execute
result = self.execute_code(code_to_run, code_context)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/hannes_brantner/Library/Caches/pypoetry/virtualenvs/chatbot-fXCp8fhr-py3.12/lib/python3.12/site-packages/freezegun/api.py", line 885, in wrapper
result = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/Users/hannes_brantner/Library/Caches/pypoetry/virtualenvs/chatbot-fXCp8fhr-py3.12/lib/python3.12/site-packages/pandasai/pipelines/chat/code_execution.py", line 174, in execute_code
exec(code, environment)
File "<string>", line 1, in <module>
KeyError: '__import__'
Fix the python code above and return the new python code:

dosubot · 2025-01-23T11:37:37Z

The KeyError: '__import__' you're encountering suggests that the code execution environment is trying to access the __import__ function, which is likely being blocked by the security measures in place. The CodeCleaning class is designed to prevent the use of unsafe built-ins like __import__ to ensure security ^[1].

To address this issue, you can try the following:

Review the CodeCleaning Process: Ensure that the CodeCleaning step is correctly identifying and removing any references to __import__ in the generated code. This step is crucial for maintaining security, so it's important to verify that it's functioning as intended.
Modify the Code Execution Environment: If you need to use certain built-ins that are being blocked, you might consider adjusting the environment setup to allow them, but be cautious as this can introduce security risks.
Error Handling with Prompts: Use the ErrorPromptGeneration class to handle and correct errors. This class can generate prompts to guide the LLM in producing code that avoids such errors ^[2].

If the code itself is correct and the error persists, it might be necessary to look into how the execution environment is configured and whether any security settings are overly restrictive for your use case.

hbrantner-auvaria · 2025-01-23T11:39:41Z

Another example @dosu

You generated this python code:
current_date = datetime.datetime.now(datetime.timezone.utc)
current_date_start = current_date.replace(hour=0, minute=0, second=0, microsecond=0)
current_date_end = current_date.replace(hour=23, minute=59, second=59, microsecond=999999)
daily_events_df = dfs[5]
todays_workshops = daily_events_df[(daily_events_df['EventStart'] >= current_date_start) & (daily_events_df['EventStart'] <= current_date_end) & daily_events_df['EventCategory'].str.contains('kreativ', case=False, na=False)]
if len(todays_workshops) > 0:
    workshop_info = []
    for _, workshop in todays_workshops.iterrows():
        start_time = workshop['EventStart'].strftime('%H:%M')
        workshop_info.append(f"- {workshop['EventName']} at {start_time} in {workshop['EventLocation']}")
    response = "Today's workshops:\n" + '\n'.join(workshop_info)
else:
    response = 'There are no workshops scheduled for today.'
result = {'type': 'string', 'value': response}
It fails with the following error:
Traceback (most recent call last):
File "/Users/hannes_brantner/Library/Caches/pypoetry/virtualenvs/chatbot-fXCp8fhr-py3.12/lib/python3.12/site-packages/pandasai/pipelines/chat/code_execution.py", line 85, in execute
result = self.execute_code(code_to_run, code_context)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/hannes_brantner/Library/Caches/pypoetry/virtualenvs/chatbot-fXCp8fhr-py3.12/lib/python3.12/site-packages/freezegun/api.py", line 885, in wrapper
result = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/Users/hannes_brantner/Library/Caches/pypoetry/virtualenvs/chatbot-fXCp8fhr-py3.12/lib/python3.12/site-packages/pandasai/pipelines/chat/code_execution.py", line 174, in execute_code
exec(code, environment)
File "<string>", line 9, in <module>
File "pandas/_libs/tslibs/timestamps.pyx", line 1496, in pandas._libs.tslibs.timestamps.Timestamp.strftime
KeyError: '__import__'

dosubot · 2025-01-23T11:41:19Z

The KeyError: '__import__' error you're encountering suggests that the code execution environment is attempting to use the __import__ function, which is likely being blocked by the security measures in place. This is part of the security setup to prevent the use of unsafe built-ins ^[1].

To address this issue, consider the following steps:

Ensure Code Cleaning: Verify that the CodeCleaning class is correctly identifying and removing any references to __import__ in the generated code. This step is crucial for maintaining security and should be functioning as intended ^[2].
Review Execution Environment: If you need to use certain built-ins that are being blocked, you might consider adjusting the environment setup to allow them. However, be cautious as this can introduce security risks. The environment uses a whitelist of allowed built-ins, and you may need to modify this list if necessary ^[1].
Error Handling with Prompts: Use the ErrorPromptGeneration class to handle and correct errors. This class can generate prompts to guide the LLM in producing code that avoids such errors ^[1].

If the code itself is correct and the error persists, it might be necessary to look into how the execution environment is configured and whether any security settings are overly restrictive for your use case.

hbrantner-auvaria · 2025-01-23T13:34:13Z

Imports do not work inside exec, even after I disabled the code cleaning step and set security to 'none'. I do not understand where the problem comes from @ArslanSaleem @gventuri:

Generate python code and return full updated code:

INFO:pandasai.helpers.logger:Code generated:

import pandas as pd
import datetime
import pytz
def format_time(dt):
    if pd.isna(dt):
        return "Not specified"
    return dt.strftime("%H:%M")
# Get current date in UTC
current_date = datetime.datetime.now(pytz.UTC)
current_date_start = current_date.replace(hour=0, minute=0, second=0, microsecond=0)
current_date_end = current_date.replace(hour=23, minute=59, second=59, microsecond=999999)
# Filter events for today
daily_events_df = dfs[5]  # DailyEvents table
todays_events = daily_events_df[
    (daily_events_df['EventStart'] >= current_date_start) &
    (daily_events_df['EventStart'] <= current_date_end)
]
if len(todays_events) == 0:
    response = "There are no events scheduled for today."
else:
    # Sort events by start time
    todays_events = todays_events.sort_values('EventStart')
    
    # Create formatted event list
    event_list = []
    for _, event in todays_events.iterrows():
        event_str = (f"{format_time(event['EventStart'])} - {format_time(event['EventEnd'])}: "
                    f"{event['EventName']} ({event['EventLocation']})")
        if pd.notna(event['EventPriceInEur']):
            event_str += f" - {event['EventPriceInEur']}€"
        event_list.append(event_str)
    
    response = "Today's events:\n\n" + "\n".join(event_list)
result = {
    "type": "string",
    "value": response
}

INFO:pandasai.helpers.logger:Executing Step 4: CachePopulation
INFO:pandasai.helpers.logger:Executing Step 5: CodeExecution
ERROR:pandasai.helpers.logger:Failed with error: Traceback (most recent call last):
File "/Users/hannes_brantner/Library/Caches/pypoetry/virtualenvs/chatbot-fXCp8fhr-py3.12/lib/python3.12/site-packages/pandasai/pipelines/chat/code_execution.py", line 85, in execute
result = self.execute_code(code_to_run, code_context)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/hannes_brantner/Library/Caches/pypoetry/virtualenvs/chatbot-fXCp8fhr-py3.12/lib/python3.12/site-packages/freezegun/api.py", line 885, in wrapper
result = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/Users/hannes_brantner/Library/Caches/pypoetry/virtualenvs/chatbot-fXCp8fhr-py3.12/lib/python3.12/site-packages/pandasai/pipelines/chat/code_execution.py", line 174, in execute_code
exec(code, environment)
File "<string>", line 1, in <module>
ImportError: __import__ not found

hbrantner-auvaria · 2025-01-23T13:47:56Z

This seemed to fix it, after digging in the library code:
WHITELISTED_BUILTINS.append("__import__")

But that should not be so complicated ... or so error-prone in the default config
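
Why appending to the whitelist helps can be illustrated with a standalone sketch. It assumes the execution environment is built by copying only whitelisted names into a __builtins__ dict; ALLOWED_BUILTINS, build_environment, and try_run are hypothetical names for illustration and do not reproduce pandasai's actual implementation.

```python
import builtins
import datetime

# Hypothetical whitelist, standing in for WHITELISTED_BUILTINS.
ALLOWED_BUILTINS = ["len", "print", "str"]

CODE = (
    "import time\n"
    "result = datetime.datetime.now(datetime.timezone.utc).strftime('%Y')\n"
)


def build_environment(allowed):
    """Build exec globals that expose only the named builtins."""
    return {
        "__builtins__": {name: getattr(builtins, name) for name in allowed},
        "datetime": datetime,
    }


def try_run(allowed):
    """Run CODE with the given whitelist; return the result or the error type name."""
    env = build_environment(allowed)
    try:
        exec(CODE, env)
    except (ImportError, KeyError) as exc:
        return type(exc).__name__
    return env["result"]


print(try_run(ALLOWED_BUILTINS))                   # ImportError
print(try_run(ALLOWED_BUILTINS + ["__import__"]))  # the current year
```

The trade-off is that once __import__ is exposed, the generated code can import any module available to the interpreter, so this loosens the sandbox rather than fixing the datetime handling itself.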

gventuri · 2025-01-23T15:56:37Z

Hi @hbrantner-auvaria, we are aware that there are some stability issues with datetimes and are investigating a robust way to fix them!

Thanks a lot for the very detailed logs, much appreciated, we'll investigate further from there!

hbrantner-auvaria changed the title ~~Random~~ Random KeyError happening during code execution with correct code Jan 23, 2025

dosubot added the bug (Something isn't working) label Jan 23, 2025
