[BUG] Problem with escaped characters in LLM answers #73
I understand the problem here. However, most of the time the AI isn't used to generate code, and it often doesn't use line breaks inside strings in the code it generates. It would also be very hard to tell whether a quote belongs to a string in a programming language or to other text (e.g. a book or a news article).
Well, I use AI (GPT-4) about 80% of the time to reason about or generate code, and it would be wonderful to be able to use it for that with actual private data (this is also why I don't want to send stuff to Duck Duck Go without being asked first). But I already tested and found that llama.cpp does not output the line feeds in the shell as actual line feeds; it prints the literal '\n' escape sequence.
It seems as if you escape the output of the shell in a way that does not let the frontend distinguish between an actual LF and the literal string '\n'. I understand that getting all the escapes right can sometimes be mind-twisting. You can probably fix this by reviewing your pipeline processing and adjusting the escaping it applies. If you can't find a solution, I may also look into it; I am pretty confident this can be solved.
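One way to avoid the ambiguity the comment describes is to let a standard serializer handle the escaping instead of doing it by hand. This is only a sketch under an assumption: the issue does not show the project's transport code, so the JSON round-trip below is hypothetical.

```python
import json

# Hypothetical: suppose the model produced a Go line containing the literal
# two-character sequence backslash + n (NOT a real line feed).
model_output = 'fmt.Println("hello\\nworld")'

# Serialize with json.dumps instead of hand-rolled escaping: the backslash
# itself gets escaped, so the frontend can tell '\n' apart from a real LF.
payload = json.dumps({"answer": model_output})

# The frontend's standard JSON parser recovers the text exactly as emitted.
roundtrip = json.loads(payload)["answer"]
assert roundtrip == model_output  # the literal \n survives the round trip
```

The point of the design is that each side only ever applies one well-defined escaping layer, so a literal `\n` in model output and a real line feed can never collide.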
Ok, thanks for letting me know! I did not know that. I'll look into this and see what I can do.
Describe the bug
When asking it to write a program, the LLM may output escaped characters such as the literal two-character sequence '\n'.
This will end up in the frontend as an actual line feed, which makes it invalid code.
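The bug can be illustrated with a minimal sketch. The issue does not show the actual pipeline code, so the un-escaping step below (Python's `unicode_escape` codec) is an assumption standing in for whatever the pipeline really does:

```python
# Hypothetical reproduction: the model emits the two characters
# backslash + n inside a Go string literal...
raw = 'fmt.Printf("%d\\n", i)'  # what the model actually produced

# ...and a naive un-escaping step in the pipeline turns it into a real LF.
broken = raw.encode().decode("unicode_escape")

# The Go source now contains an actual line break inside the string
# literal, which makes it invalid code.
print(broken)
```

Any un-escaping pass that cannot tell a literal `\n` in source code from an intended line feed will corrupt generated programs this way.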
To Reproduce
Using alpaca-native-13B-ggml.bin:
Prompt: Write a Go program that counts from 1 to 10
The prompt may need to be repeated to get a version where the LLM answers with something using fmt.Printf().
Expected behavior
The output should be correctly escaped for such code usage. I guess it is not easy to detect whether the model's output needs to be taken literally, since it appears to emit '\n' for line feeds either way. But maybe there is a solution I do not see.
Even if not perfect, it may be better to leave escapes untouched when the LLM output contains double quotes?
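The double-quote idea could be sketched as follows. This is a heuristic, not a complete parser (quoting rules differ between languages, and single quotes or backticks are not handled), and the function name is made up for illustration:

```python
import re

def unescape_outside_quotes(text: str) -> str:
    """Convert literal \\n to real line feeds, but only outside
    double-quoted regions, where it is likely part of source code."""
    # Split on double-quoted segments (allowing escaped characters inside).
    # With a capturing group, re.split keeps the quoted parts: they land at
    # odd indices, while text outside quotes lands at even indices.
    parts = re.split(r'("(?:[^"\\]|\\.)*")', text)
    return "".join(
        part if i % 2 else part.replace("\\n", "\n")
        for i, part in enumerate(parts)
    )

# The \n outside the quotes becomes a real line feed; the one inside
# the Go-style string literal is preserved as the two characters \ + n.
result = unescape_outside_quotes('line1\\nline2 "a\\nb"')
```

A real fix would likely need smarter handling (code fences, language detection), but even this rough rule avoids corrupting most generated string literals.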