Hey. I have made some DSLs before, and for all of them, most of which were in C, I have used Flex and Bison. But this time I wanna use Scheme, Cyclone Scheme to be exact. I can potentially use Flex/Bison this time too, because Cyclone has a strong FFI to C.
But I’d rather innovate. I have been writing down the language’s grammar in EBNF:
https://gist.github.com/Chubek/bd54df78fe1f71f46cb262ba990a209b
And my thinking is, why not turn this into a parser? I don’t mean something like BNFC, which translates BNF (not EBNF) into several parser and lexer specifications plus an AST. I want something that translates EBNF into Scheme.
But if you read the grammar, you will realize that there are some places where it’s not very descriptive or machine-friendly. For that reason, I think an LLM can help.
Now, I need your help. I am mostly active in systems programming, like Assembly and C stuff. I don’t know much about LLMs and this whole AI revolution. I did some work as an ‘ML-engineer guy’ (not an ML engineer, an ML-engineer guy, there’s a difference!), so I know how this whole thing works. I have also read MIT’s standard book on mathematical optimization.
But I definitely need to use a pretrained model here. My knowledge of mathematical optimization is useless when you need like 28 million to train a model that would aid you with this.
I don’t want to use an API. I wanna own my software. I do use ChatGPT as a search engine, but that’s about it, I never owned Google anyways!
I know about HuggingFace. What model there do you think would help me?
Also, how do these weights work? If I bind one DNN framework to Cyclone, will the weights trained by another DNN framework work in it too? Do people use frameworks not written in C, so I would have to like triple-bind it? I know both Google’s and Facebook’s are in C. However they are in ‘garbage c’. Well let’s deal with that later.
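To make the weights question concrete, here is a minimal sketch in Python (not Cyclone code) of what a weights file boils down to: named arrays of floats plus enough metadata to read them back. The file layout below is a toy format invented for this example, not any real framework's format, and `pack_weights` and `unpack_weights` are hypothetical names. The assumption it illustrates is that the numbers themselves are not tied to a framework. What a loader has to agree on is the byte layout, the tensor names and shapes, and the model architecture that consumes them.

```python
import json
import struct


def pack_weights(tensors):
    """Serialize {name: (shape, flat_values)} into one bytes blob.

    Toy layout (made up for illustration):
      8 bytes   little-endian u64, length of the JSON header
      N bytes   JSON header: name -> shape, byte offset, element count
      rest      raw little-endian float32 data
    """
    header = {}
    payload = b""
    for name, (shape, values) in tensors.items():
        header[name] = {
            "shape": list(shape),
            "offset": len(payload),
            "count": len(values),
        }
        payload += struct.pack(f"<{len(values)}f", *values)
    header_bytes = json.dumps(header).encode("utf-8")
    return struct.pack("<Q", len(header_bytes)) + header_bytes + payload


def unpack_weights(blob):
    """Inverse of pack_weights: recover {name: (shape, flat_values)}."""
    (header_len,) = struct.unpack_from("<Q", blob, 0)
    header = json.loads(blob[8:8 + header_len].decode("utf-8"))
    base = 8 + header_len  # start of the raw float32 data
    tensors = {}
    for name, meta in header.items():
        values = struct.unpack_from(
            f"<{meta['count']}f", blob, base + meta["offset"]
        )
        tensors[name] = (tuple(meta["shape"]), list(values))
    return tensors
```

Anything that can read bytes can parse a layout like this, C included, so in principle the binding only needs a reader for whatever format the model ships in plus an implementation of the same layers.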
Anyways, thanks for your help.
TL;DR:
I need an LLM that would be used in an EBNF -> Scheme parser generator.
Yes, I realized that a hand-rolled LP is best, especially after I read Design Concepts [2008, MIT] and learned how to represent ASTs in S-expressions. Thanks.
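For anyone landing here later, this is roughly what a hand-rolled parser that emits an S-expression AST can look like. It is only a sketch under assumptions: the grammar is a made-up arithmetic grammar (not the one in the gist), it is written in Python rather than Scheme to keep it short and runnable, and every name in it (`tokenize`, `Parser`, `parse`, `to_sexpr`) is hypothetical. Each EBNF rule becomes one method, and each `{ ... }` repetition becomes a `while` loop.

```python
import re

# Assumed toy grammar (EBNF):
#   expr   = term   { ("+" | "-") term } ;
#   term   = factor { ("*" | "/") factor } ;
#   factor = NUMBER | "(" expr ")" ;

TOKEN_RE = re.compile(r"\s*(\d+|[-+*/()])")


def tokenize(text):
    """Split the input into number and operator tokens."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def expr(self):
        node = self.term()
        while self.peek() in ("+", "-"):
            op = self.take()
            node = [op, node, self.term()]  # left-associative
        return node

    def term(self):
        node = self.factor()
        while self.peek() in ("*", "/"):
            op = self.take()
            node = [op, node, self.factor()]
        return node

    def factor(self):
        tok = self.take()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        if tok == "(":
            node = self.expr()
            if self.take() != ")":
                raise SyntaxError("expected ')'")
            return node
        if tok.isdigit():
            return int(tok)
        raise SyntaxError(f"unexpected token {tok!r}")


def parse(text):
    """Parse text into a nested-list AST, operator first."""
    parser = Parser(tokenize(text))
    node = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return node


def to_sexpr(node):
    """Render the nested-list AST as an S-expression string."""
    if isinstance(node, list):
        return "(" + " ".join(to_sexpr(child) for child in node) + ")"
    return str(node)
```

With this sketch, `to_sexpr(parse("1 + 2 * 3"))` gives `(+ 1 (* 2 3))`. In Scheme the nested lists would simply be the AST itself, with no rendering step.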