How to build a transducer for longest suffix/prefix #133

nitnelave · 2021-12-01T10:06:52Z

Hi,

I'm working on autocompletion. Given a known completion C, I'd like to find the longest suffix of the input that is also a prefix of C.
Example: C = "banana"
input: "my name is alibaba"
output (overlap between the two): "ba" (just the length, 2, would be enough).

Intuitively, it seems like it should be possible to build such a transducer: feed it the input, and at every step it tells you the longest matching suffix so far, with every state being accepting. I'm just not sure how to build it with this library, because the interface for getting an output from an evaluation is not clear to me. Could you give me a hand there?
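For illustration, the step-by-step automaton described above can be sketched without this library, using the failure table from Knuth-Morris-Pratt matching. This is a swapped-in technique, not the fst crate's API, and the names `failure_table` and `longest_overlap` are hypothetical. The state is the number of bytes of C currently matched, so the state after the last input byte is the overlap length.

```rust
/// fail[i] is the length of the longest proper prefix of pattern[..=i]
/// that is also a suffix of it (the classic KMP failure table).
fn failure_table(pattern: &[u8]) -> Vec<usize> {
    let mut fail = vec![0; pattern.len()];
    let mut k = 0;
    for i in 1..pattern.len() {
        while k > 0 && pattern[i] != pattern[k] {
            k = fail[k - 1];
        }
        if pattern[i] == pattern[k] {
            k += 1;
        }
        fail[i] = k;
    }
    fail
}

/// Byte length of the longest suffix of `input` that is a prefix of `completion`.
fn longest_overlap(completion: &str, input: &str) -> usize {
    let pat = completion.as_bytes();
    if pat.is_empty() {
        return 0;
    }
    let fail = failure_table(pat);
    let mut state = 0; // bytes of `completion` matched by the current suffix
    for &b in input.as_bytes() {
        // A full match cannot be extended, so fall back before reading `b`.
        if state == pat.len() {
            state = fail[state - 1];
        }
        // Shrink the matched prefix until `b` extends it, or nothing is left.
        while state > 0 && pat[state] != b {
            state = fail[state - 1];
        }
        if pat[state] == b {
            state += 1;
        }
    }
    state
}

fn main() {
    // The example from this issue: the overlap is "ba", so this prints 2.
    println!("{}", longest_overlap("banana", "my name is alibaba"));
}
```

The sketch handles a single completion and runs in time linear in the input length. Covering many completions at once would need a different structure, which is where an FST or trie comes in.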

Essentially you would have a map of all the prefixes of C to their length, make that a repeating FST, and on evaluation take the max of the outputs (longest sequence).
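As a rough sketch of that idea in plain Rust, with a `HashMap` standing in for the FST (an assumption made only for illustration; the function names are hypothetical): map every prefix of C to its length, look up each suffix of the input, and keep the maximum.

```rust
use std::collections::HashMap;

/// Map every non-empty prefix of `completion` to its byte length.
fn prefix_lengths(completion: &str) -> HashMap<&str, usize> {
    completion
        .char_indices()
        .map(|(i, ch)| {
            let end = i + ch.len_utf8();
            (&completion[..end], end)
        })
        .collect()
}

/// Look up every suffix of `input` in the prefix map and keep the longest hit.
fn longest_overlap_by_lookup(prefixes: &HashMap<&str, usize>, input: &str) -> usize {
    input
        .char_indices()
        .filter_map(|(i, _)| prefixes.get(&input[i..]).copied())
        .max()
        .unwrap_or(0)
}

fn main() {
    let prefixes = prefix_lengths("banana");
    // Only the suffix "ba" is a prefix of "banana", so this prints 2.
    println!("{}", longest_overlap_by_lookup(&prefixes, "my name is alibaba"));
}
```

This tries every suffix of the input, so it is quadratic in the worst case. Only the last `completion.len()` bytes of the input can ever match, so restricting the lookups to that window would bound the work.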

BurntSushi · 2021-12-16T14:25:20Z

If you need to traverse the automaton directly, then you want a `raw::Fst`, and you'll likely want to make use of the `raw::Node` API.

I'm not sure I quite understand what you're trying to do, but you might consider storing your completions in reverse. That way, you can traverse the automaton starting from the end of the input and keep going until no further transition matches.
