zokier 2 days ago

I've been thinking that FST could be well suited for building routers for web frameworks. I.e. matching request path `/foo/42` to a set of route patterns like `/foo/{id}` etc. As I understand FST should be near perfect for this use and could potentially allow very good performance. Especially if you can construct the FST at compile time. Somewhat surprisingly I haven't seen anyone doing anything in that direction though, and FST literature is bit difficult if you don't have formal NLP background

  • rickette 2 days ago

    Radix Trees are (often?) used for this purpose, for example in Chi: https://github.com/go-chi/chi. Coincidentally FSTs and Radix trees share some similarities.

    • woodruffw a day ago

      > Coincidentally FSTs and Radix trees share some similarities

      Indeed -- I think a useful way to comprehend FSTs is as a radix/prefix trie that also allows suffix compaction. There's probably a formal correlation proof between the two, but I don't know it off the top of my head.

riedel a day ago

Actually pushdown transducers are even more fun (context free languages). 13 years ago I played around with visibly push down transducers for format conversion like grammar based binary compression on XML Schema. Can get you down a nice rabbit hole.

jyndbeb 2 days ago

I'm not quite sure I understand how they are applicable, since access patterns itself can be arbitrary complex

For example how is github.event.pull_request.labels[another_variable].name handled?

  • yorwba 2 days ago

    Since it only needs to classify the result as one of "fixed", "structured" or "arbitrary", it doesn't matter what the value of "another_variable" is, you'll always be in the same state at the end. Finite state transducers can handle an infinite number of possibilities with a finite number of states just fine.

    It's a bit strange though that the author constructs the FST to only recognize a finite list of strings. Probably some of the matching logic that could have been part of the FST is pushed into the normalizer instead.

    • woodruffw a day ago

      Author here.

      > Probably some of the matching logic that could have been part of the FST is pushed into the normalizer instead.

      Yep, exactly -- there's a great deal of prenormalization that in principle could be pushed into automata run over the FST instead. The reason I didn't do that is laziness, but it's the obvious next step :-)