-
I am interested in multi-agent systems because they tackle complex problems by breaking them down into smaller tasks, particularly when handling unstructured data. Can a similar approach be applied to structured data that a user uploads? I would like to build skills such as data analysis and research analysis. Do you have any thoughts or suggestions?
Replies: 3 comments 2 replies
-
I know multiple users use AutoGen for structured data analytics.
-
The main challenge with structured data is scale and schema understanding. It gets more interesting when your data is too large to fit inside the context: you need to decide what information to give the agent (schema, sample data, etc.). You can also use different agents for different stages of your pipeline, e.g., a data cleaning agent, a data validation agent, a data science agent, a visualization agent, etc. Even more interesting, and very challenging, is data that is both huge and has an unusual schema from a very specialized domain (e.g., organic chemistry, materials science). Good luck, at that point you are doing research.
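A rough sketch of the "different agents for different stages" idea, in plain Python. Each function below is only a stand-in for an agent; the names `PipelineState`, `cleaning_stage`, `validation_stage`, `analysis_stage`, and the `name`/`score` columns are hypothetical and not part of AutoGen or any other library. In a real setup each stage would presumably wrap an LLM-backed agent.

```python
from dataclasses import dataclass, field


@dataclass
class PipelineState:
    """Data handed from one stage to the next, plus a log of what each stage did."""
    rows: list
    notes: list = field(default_factory=list)


def cleaning_stage(state):
    """Stand-in for a data cleaning agent: strip whitespace, drop fully empty rows."""
    cleaned = []
    for row in state.rows:
        row = {k: v.strip() for k, v in row.items()}
        if any(row.values()):
            cleaned.append(row)
    state.notes.append(f"cleaning: kept {len(cleaned)} of {len(state.rows)} rows")
    state.rows = cleaned
    return state


def validation_stage(state, required=("name", "score")):
    """Stand-in for a data validation agent: keep rows with all required fields set."""
    valid = [r for r in state.rows if all(r.get(c) for c in required)]
    state.notes.append(f"validation: {len(state.rows) - len(valid)} rows rejected")
    state.rows = valid
    return state


def analysis_stage(state):
    """Stand-in for a data science agent: compute a simple mean of 'score'.

    Assumes every remaining score parses as a number.
    """
    scores = [float(r["score"]) for r in state.rows]
    mean = sum(scores) / len(scores) if scores else float("nan")
    state.notes.append(f"analysis: mean score = {mean:.2f}")
    return state


def run_pipeline(rows):
    """Run the stages in order, passing the shared state along."""
    state = PipelineState(rows=rows)
    for stage in (cleaning_stage, validation_stage, analysis_stage):
        state = stage(state)
    return state
```

Calling `run_pipeline` on a list of dict rows returns the final state, and `state.notes` gives a per-stage trace. Keeping the stages as separate callables makes it easy to swap any one of them for an actual agent later.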
-
Take a look at Microsoft Lida. It's a reasonable first step in using LLMs to generate descriptions of datasets and perform analysis. I believe its primary use case is user-uploaded data.
The main challenge with structured data is scale and schema understanding.
To make things easier, you can start with small datasets that fit into the context of a GPT model and that use common schemas from well-known domains like movies, songs, people, etc.
Just feed the data into the model and see what it can do, combined with code writing and execution.
It gets more interesting when your data is too large to fit inside the context: you need to decide what information to give the agent (schema, sample data, etc.). You can also use different agents for different stages of your pipeline, e.g., a data cleaning agent, a data validation agent, a data science agent, a visualization agent, etc.
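One way to approach the "what information do you give to the agent?" question is to send a compact summary instead of the full table. Below is a minimal sketch using only the Python standard library. The helper names `infer_type` and `summarize_table` are made up for illustration, and the type inference is a crude heuristic, not how any particular framework does it.

```python
import csv
import io


def infer_type(values):
    """Guess a coarse column type from string values (heuristic, illustrative only)."""
    non_empty = [v for v in values if v not in ("", None)]
    if not non_empty:
        return "empty"
    for caster, name in ((int, "int"), (float, "float")):
        try:
            for v in non_empty:
                caster(v)
            return name
        except ValueError:
            continue
    return "str"


def summarize_table(csv_text, sample_rows=3):
    """Build a short text summary (row count, schema, sample rows) for a prompt."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    columns = reader.fieldnames or []
    lines = [f"rows: {len(rows)}", "columns:"]
    for col in columns:
        lines.append(f"  - {col}: {infer_type([r[col] for r in rows])}")
    lines.append(f"sample (first {min(sample_rows, len(rows))} rows):")
    for r in rows[:sample_rows]:
        lines.append("  " + ", ".join(f"{c}={r[c]}" for c in columns))
    return "\n".join(lines)


if __name__ == "__main__":
    data = "title,year,rating\nAlien,1979,8.5\nHeat,1995,8.3\n"
    print(summarize_table(data, sample_rows=1))
```

The resulting text stays roughly the same size no matter how many rows the dataset has, so it can go into the prompt while the agent writes and executes code against the full data.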
Even more interesting and ver…