Enhanced Monitoring and Stability #48

alimosaed · 2025-01-24T21:05:04Z

🆕 What's New?

Added a monitoring system for tracking workflow status
Implemented log rotation and archiving
Added MongoDB integration with support for typed data insertion and JSON handling
Introduced a text splitter component that uses LangChain for document chunking
Added support for custom user property keys in request-reply messaging
Implemented error queue configuration with a maximum depth setting
Added configurable reply topic placement in agent requests
Added an API for monitoring connection status
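
To illustrate the log rotation item above, here is a minimal sketch that uses only the Python standard library. It is not the connector's actual implementation: the logger name, file name, size limit, and backup count are assumptions chosen for the demo. `RotatingFileHandler` keeps numbered backup files rather than compressed archives, so the archiving part of this PR is not covered here.

```python
import logging
import os
import tempfile
from logging.handlers import RotatingFileHandler


def build_rotating_logger(log_path, max_bytes=1024, backup_count=3):
    """Return a logger whose file rolls over before it exceeds max_bytes."""
    logger = logging.getLogger("rotation_sketch")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep the demo output out of the root logger
    handler = RotatingFileHandler(
        log_path, maxBytes=max_bytes, backupCount=backup_count
    )
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    )
    logger.addHandler(handler)
    return logger


# Write enough records to force several rollovers.
log_dir = tempfile.mkdtemp()
log_path = os.path.join(log_dir, "connector.log")  # assumed file name
logger = build_rotating_logger(log_path)

for attempt in range(200):
    logger.info("connection attempt %d", attempt)

# Close the handlers so every file is flushed before inspecting the directory.
for handler in logger.handlers:
    handler.close()

log_files = sorted(os.listdir(log_dir))
print(log_files)
```

With these settings the directory ends up holding the active file plus at most three backups (`connector.log.1` to `connector.log.3`); older records are discarded once the backup count is reached.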

🔧 Improvements

Enhanced security by removing confidential information from logs
Improved logging system:
- Added connection/reconnection attempt logging
- Implemented log file size control and archiving
- Added exponential backoff for connection retry logging
Enhanced LLM request handling with retry mechanisms and cool-down policies:
- Added retries based on failure type
- Implemented cool-down based on error rate
- Added timeout handling
- Introduced NACK functionality for failed requests
Improved broker reconnection with a "forever" retry policy
Added WhiteSource configuration file support to the CI workflow
Enhanced error reporting during the startup process
Optimized dependencies:
- Removed LangChain and LiteLLM dependencies from core
Improved system shutdown handling:
- Added proper SIGINT and SIGTERM signal handling
- Enhanced thread termination in sleep mode
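
The retry behaviour listed above (retry by failure type, with exponentially growing waits) could look roughly like the sketch below. It is illustrative only: `RetryableError`, `backoff_delays`, and `call_with_retry` are hypothetical names, and the base delay, growth factor, and cap are assumed values rather than the connector's defaults.

```python
import time


class RetryableError(Exception):
    """Hypothetical marker for failures worth retrying (for example a timeout)."""


def backoff_delays(base=1.0, factor=2.0, cap=60.0):
    """Yield delays that grow exponentially and never exceed cap."""
    delay = base
    while True:
        yield min(delay, cap)
        delay *= factor


def call_with_retry(func, max_attempts=5, retry_on=(RetryableError,),
                    sleep=time.sleep):
    """Call func, retrying only on the exception types listed in retry_on."""
    delays = backoff_delays()
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last failure
            sleep(next(delays))
        # Any other exception type propagates immediately without a retry.


# Demo: fail twice, then succeed. Waits are recorded instead of slept.
attempts = []


def flaky():
    attempts.append(1)
    if len(attempts) < 3:
        raise RetryableError("temporary failure")
    return "ok"


waits = []
result = call_with_retry(flaky, sleep=waits.append)
print(result, waits)
```

Passing `sleep` as a parameter keeps the sketch testable without real delays. A "forever" policy like the one used for broker reconnection would replace the bounded loop with an unbounded one.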

🐛 Bug Fixes

Fixed infinite error logging loop during broker disconnection
Resolved graceful shutdown issues
Fixed startup error handling so that errors report where they occurred
Resolved broker reconnection timeout issues
Fixed error queue blocking by dropping messages when the queue is full
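
The error queue fix above can be pictured with a small bounded queue that drops new messages instead of blocking the producer. This is a sketch under assumptions, not the connector's code: `BoundedErrorQueue`, `max_depth`, and the `dropped` counter are hypothetical names.

```python
import queue


class BoundedErrorQueue:
    """Error queue that drops new messages once max_depth is reached."""

    def __init__(self, max_depth):
        # Note: queue.Queue treats maxsize <= 0 as unbounded.
        self._queue = queue.Queue(maxsize=max_depth)
        self.dropped = 0

    def put(self, message):
        """Enqueue without blocking; return False if the message was dropped."""
        try:
            self._queue.put_nowait(message)
            return True
        except queue.Full:
            self.dropped += 1
            return False

    def drain(self):
        """Remove and return every queued message in FIFO order."""
        items = []
        while True:
            try:
                items.append(self._queue.get_nowait())
            except queue.Empty:
                return items


errors = BoundedErrorQueue(max_depth=2)
accepted = [errors.put(msg) for msg in ("e1", "e2", "e3")]
drained = errors.drain()
print(accepted, errors.dropped, drained)
```

Dropping the newest message keeps the producer from stalling when nothing is consuming the error queue; the trade-off is that some error reports are lost, which is why the sketch counts them.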

Note: This release includes significant improvements to system stability, monitoring, and security, along with various bug fixes and performance enhancements.

* fix: add dependencies to the toml file
* fix: handled misconfigurations
* fix: resolve conflicts
* FEATURE: Enable stream overwrite for LLM Chat at the event level (#66)
* Allowing stream overwrite at event level for LLM Chat
* Added overwrite flag
* AI-95: Enhance request/response handling for streaming LLM access (#69)
* Changes for request/response for streaming LLM access
* Updated with main
* update

Co-authored-by: Edward Funnekotter <efunneko@gmail.com>

* fix: add exception handler
* fix: add exception handler

Co-authored-by: Art Morozov <artyom.morozov315@gmail.com>
Co-authored-by: Cyrus Mobini <68962752+cyrus2281@users.noreply.github.com>
Co-authored-by: Edward Funnekotter <efunneko@gmail.com>

* FEATURE: Enable stream overwrite for LLM Chat at the event level (#66)
* Allowing stream overwrite at event level for LLM Chat
* Added overwrite flag
* AI-95: Enhance request/response handling for streaming LLM access (#69)
* Changes for request/response for streaming LLM access
* Updated with main
* update

Co-authored-by: Edward Funnekotter <efunneko@gmail.com>

* Include stack dump if there is an error on startup

Co-authored-by: Art Morozov <artyom.morozov315@gmail.com>
Co-authored-by: Cyrus Mobini <68962752+cyrus2281@users.noreply.github.com>

* feat: add the forever retry
* feat: keep connecting
* feat: replace the reconnection
* ref: moved settings to a new yaml file
* feat: update documents
* ref: move common settings to base broker
* feat: generate documents
* fix: retrieve litellm config

* Added mongodb insert component
* type
* added search component
* applied comments
* updated docs
* Added the option to support custom keys for reply and metadata for request response user properties
* fixed issue
* Updated insert with type
* added docs
* added config value validation
* added value check for mongo insert

* feat: add monitoring component
* fix: resolve a bug
* fix: add sleep time
* fix: add sleep time
* feat: add readiness and handle excessive logs
* fix: handle sleep error
* fix: handle sleep error
* feat: gracefully exit
* feat: set the log back
* fix: rename log fields
* fix: disabled monitoring
* fix: resolve log naming
* fix: resolved logging issues
* fix: resolve log
* fix: resolve log
* feat: remove dependency to Langchain
* feat: update monitoring
* feat: drop error messages when the queue is full
* feat: add a text splitter component
* feat: updated docs
* fix: resolve graceful termination issues
* fix: remove payloads from logs
* feat: add the forever retry
* feat: keep connecting
* Feat: add monitoring
* feat: replace the reconnection
* feat: refactor monitoring
* feat: add connection metric
* convert connection to async
* get metrics enum
* add types of metrics
* use metrics rather than metric values
* fix bug
* update type
* convert monitoring output to dictionary
* fix bug
* feat: add connection status
* feat: add reconnecting status
* feat: add reconnecting log and handled signals
* fix: update status
* fix: update log
* fix: fix bug
* fix: fix bug
* fix: resolve connection logs
* fix: handle threads
* fix: update connection state machine
* feat: add prefix to the broker logs
* fix: synchronize logs with connection attempts
* fix: remove datadog dependency
* fix: cover an exception
* ref: upgrade to latest pubsub and replace a metric
* ref: encapsulate some variables
* ref: enable daemon for threads to close them safely
* ref: remove useless variable

* feat: add monitoring component
* fix: resolve a bug
* fix: add sleep time
* fix: add sleep time
* feat: add readiness and handle excessive logs
* fix: handle sleep error
* fix: handle sleep error
* feat: gracefully exit
* feat: set the log back
* fix: rename log fields
* fix: disabled monitoring
* fix: resolve log naming
* fix: resolved logging issues
* fix: resolve log
* fix: resolve log
* feat: remove dependency to Langchain
* feat: update monitoring
* feat: drop error messages when the queue is full
* feat: add a text splitter component
* feat: updated docs
* fix: resolve graceful termination issues
* fix: remove payloads from logs
* feat: add the forever retry
* feat: keep connecting
* Feat: add monitoring
* feat: replace the reconnection
* feat: refactor monitoring
* feat: add connection metric
* convert connection to async
* get metrics enum
* add types of metrics
* use metrics rather than metric values
* fix bug
* update type
* convert monitoring output to dictionary
* fix bug
* feat: add connection status
* feat: add reconnecting status
* feat: add reconnecting log and handled signals
* fix: update status
* fix: update log
* fix: fix bug
* fix: fix bug
* fix: resolve connection logs
* fix: handle threads
* fix: update connection state machine
* feat: add prefix to the broker logs
* fix: synchronize logs with connection attempts
* fix: remove datadog dependency
* fix: cover an exception
* ref: upgrade to latest pubsub and replace a metric
* feat: add retry and timeout to litellm
* feat: add nack
* fix: replace exception with exception type
* fix: remove useless exceptions
* Create pull_request_template.md
* fix: update the default nack
* ref: replace nack string status with enumerations
* ref: generate docs
* ref: remove default value
* ref: move common imports to a module
* ref: update imports
* ref: update import

cyrus2281 and others added 16 commits December 2, 2024 17:11

FEATURE: Enable stream overwrite for LLM Chat at the event level (#66)

f29c6e3

* Allowing stream overwrite at event level for LLM Chat
* Added overwrite flag

AI-95: Enhance request/response handling for streaming LLM access (#69)

1ca1c0e

* Changes for request/response for streaming LLM access
* Updated with main
* update

Co-authored-by: Edward Funnekotter <efunneko@gmail.com>

Chore: Enable Whitesource scan

264fce2

Merge branch 'SolaceLabs:main' into main

25d5e34

feat: drop error messages when the queue is full (#75)

4f3dfe9

Add a text splitter component (#76)

b335985

* feat: drop error messages when the queue is full
* feat: add a text splitter component
* feat: updated docs
* fix: return the original example

AI-354: Add configuration for broker-request-response for placing the reply topic in the message (#74)

f04ad51

* If requested, insert the response topic according to the response_topic_insertion_expression
* More fixes after testing

JDE: Add MongoDB insert component. (#78)

299c04a

* Added mongodb insert component
* type
* added search component
* applied comments
* updated docs

REQUEST-RESPONSE: Support custom keys for reply and metadata in request response user properties (#79)

8c42f7e

* Added the option to support custom keys for reply and metadata for request response user properties
* fixed issue

DATAGO-91907: Investigate Solace AI connector (other solace ai libs) whitesource scan results. (#80)

e996822

Investigate Solace AI connector (other solace ai libs) whitesource scan results. (#80)

Co-authored-by: John Corpuz <john.corpuz@solace.com>
