2025-09-21: AIFPL

Published: 2025-09-21

AIFPL improvements and reflections on LLM development processes

Improving the AIFPL internals

The original version of the code worked well, but Claude did a few slightly strange things to get everything working. In particular, the code took a somewhat odd approach to representing atoms.

To be fair, that design is actually quite good in one respect: it leverages native Python implementations in a way that keeps performance high. Unfortunately, that came at the cost of a clean implementation architecture.

I completely reworked this to clean up the architecture and to make more of the internals use immutable data structures. The immutability is a nice improvement because it makes the internals much easier to understand.
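As a minimal sketch of what I mean by immutable internals (using hypothetical class names rather than the actual AIFPL classes), frozen dataclasses give you value objects that simply can't be modified after construction:

    from dataclasses import dataclass

    # Hypothetical value types, sketched to illustrate the idea; these
    # are not the actual AIFPL classes.

    @dataclass(frozen=True)
    class Number:
        value: float

    @dataclass(frozen=True)
    class Symbol:
        name: str

    n = Number(42.0)
    n.value = 7.0  # raises dataclasses.FrozenInstanceError

Because nothing can change a value once it exists, any part of the evaluator can hold a reference to it without worrying about what some other part of the code might do to it later.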

An aside about Claude doing weird stuff

At one point, while I was having Claude fix a few tricky bugs related to recursion, it ended up completely undoing all the changes it had made up to that point. This was pretty frustrating (and quite expensive).

I've seen other people complain about this on social media, but I tend to see it more as an indication that the model has probably been trained on some less-than-ideal data: posts that say, "oh, I blew away all the new stuff and started again".

What was frustrating was that it then declared victory because the tests all passed when rerun against the old code, even though the whole point was to refactor that old code. It suggests the model doesn't have a strong sense of history, something I've discussed with a few people who have made similar observations over the last month or two.

Recursion causes headaches

One very frustrating period of debugging the new code was spent fixing a problem with recursion and mutual recursion. Claude spent several hours failing to make this work. It's not quite clear why, but I suspect it kept confusing itself because AIFPL has an interesting design choice. AIFPL doesn't have separate let and letrec forms; instead it just uses let and has the interpreter decide whether a binding is recursive or not. This feels much more natural to me, but it seemed to confuse Claude into trying to handle both cases the same way.

The key to resolving this was to start a new conversation with the AI about the merits of using a Y combinator to solve the recursive case, which led it to point out the two forms of let that normally exist in Lisp implementations. From there I could ask it to weigh the pros and cons of one form versus two, and it concluded we only needed one form, but with two implementations behind the scenes.

Once we had that, things "just worked".
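As a rough sketch of the single-form, two-implementation idea (this is illustrative Python, not the actual AIFPL evaluator): the interpreter checks whether any binding expression refers to one of the names being bound, and picks the plain or recursive strategy accordingly. Here referenced_names is a hypothetical helper that returns the symbol names appearing in an expression.

    # Sketch only: a single `let` form backed by two implementations.
    # `bindings` is a list of (name, expression) pairs, `evaluate` is the
    # expression evaluator, and environments are plain dicts.

    def bindings_are_recursive(bindings, referenced_names):
        names = {name for name, _ in bindings}
        return any(names & referenced_names(expr) for _, expr in bindings)

    def eval_let(bindings, body, env, evaluate, referenced_names):
        if not bindings_are_recursive(bindings, referenced_names):
            # Plain `let`: each binding is evaluated in the outer environment.
            new_env = dict(env)
            for name, expr in bindings:
                new_env[name] = evaluate(expr, env)
        else:
            # `letrec`-style: build the new environment first, with
            # placeholders, so the bindings (including mutually recursive
            # lambdas) can see each other by name.
            new_env = dict(env)
            for name, _ in bindings:
                new_env[name] = None
            for name, expr in bindings:
                new_env[name] = evaluate(expr, new_env)
        return evaluate(body, new_env)

The placeholder trick works because the recursive bindings are typically lambdas that close over the shared environment and aren't called until the body runs, by which point every placeholder has been replaced.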

The problem of being imprecise

I've been reflecting on how my original design concept got turned into working code, with everything passing, yet the code wasn't actually that "good".

It's sort of ironic to say it wasn't good, because it did exactly what it was supposed to do; it just did it in a really clunky way.

The problem is that even though I had a clear idea that I wanted a pure functional language, I didn't really think about what that should mean for the implementation. Immutable data structures, for example, are a much better fit for a pure language, but having never built this sort of software before, I didn't appreciate how important they would be.

A regular Lisp system has immutable atoms but mutable lists. A pure Lisp, however, ought to have immutable lists too. If we build an optimizing implementation later, escape analysis might show that a list is never reused and can thus be mutated safely.
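To illustrate (again as a sketch rather than the real AIFPL code), backing lists with Python tuples gives you immutable lists directly: every operation that "changes" a list actually builds a new one.

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class LispList:
        items: tuple  # a tuple, not a list, so the contents can't be mutated

        def cons(self, item):
            # Prepend by building a new list; the original is untouched.
            return LispList((item,) + self.items)

        def append(self, other):
            return LispList(self.items + other.items)

    xs = LispList((1, 2, 3))
    ys = xs.cons(0)  # LispList((0, 1, 2, 3)); xs is unchanged

An optimizing implementation could later use escape analysis to notice that a list like xs is never reused after the cons and quietly mutate it in place instead, without changing the language's semantics.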

That's where the code has ended up, in fact! We now have a totally pure type system and a straightforward, architecturally elegant implementation. Fortunately, with LLM support for the rework and refactoring, getting to this point only added 7 or 8 hours of work!

Another learning

A shortcut I sometimes take with Humbug, when I see the LLM getting near the end of a context window, is to delete some of the recent conversation and update the prompt that gave rise to it. This works great, but there's a caveat!

If the part of the conversation we're deleting caused any code changes, then the AI loses track of them! We must either force it to reload them, or else restart the conversation from scratch (but with the necessary context).