I came across an issue earlier that highlighted the way the tokenizer works. I'm referring to the try/catch here:
LlamaIndexTS/packages/core/src/node-parser/utils.ts
Lines 36 to 48 in 6f4549b
I noticed `natural` is being shimmed in, but it's the older version, which is buggy and makes the try/catch necessary. The new Sentence Splitter works quite well and doesn't have the issues that cause the tokenizer to fail.

Using a deprecated tokenizer and hiding its errors is quite messy, which isn't ideal.
`natural` is CommonJS only, so it makes sense that it's being shimmed in. But could we update to the latest `natural` and do something like the following inside `splitBySentenceTokenizer` while we wait for NaturalNode/natural#744? Would that break something in the packaging in LlamaIndexTS, @himself65?

We can still try/catch on the function, but at least make it not silent... `console.log` at the minimum?
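To make the "not silent" part concrete, here is a rough sketch of a wrapper that keeps the try/catch but logs the failure. It is not the actual LlamaIndexTS code: `SentenceTokenizerLike`, `Logger`, and `makeSentenceSplitter` are hypothetical names, and returning the whole text as a single chunk on failure is an assumed fallback. The tokenizer is passed in, so the sketch does not depend on any particular `natural` version or export.

```typescript
// Hypothetical minimal interface: any sentence tokenizer exposing
// tokenize(text) would fit here.
interface SentenceTokenizerLike {
  tokenize(text: string): string[];
}

// Hypothetical logger signature; defaults to console.log below.
type Logger = (message: string, error: unknown) => void;

// Builds a splitter that still guards the tokenizer with try/catch,
// but reports the failure instead of swallowing it.
function makeSentenceSplitter(
  tokenizer: SentenceTokenizerLike,
  log: Logger = (message, error) => console.log(message, error),
): (text: string) => string[] {
  return (text: string): string[] => {
    try {
      return tokenizer.tokenize(text);
    } catch (error) {
      log("Sentence tokenizer failed; returning the text unsplit", error);
      // Assumed fallback: hand the input back as a single chunk.
      return [text];
    }
  };
}
```

Passing the logger as a parameter would let callers route the message somewhere other than the console, while the default keeps the `console.log` minimum suggested above.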