djhworld

thoughts


Using LLMs to turn scripts into applications

I’ve built a bunch of scripts and tools over the years that are largely hodge-podge, “good enough” monstrosities. They should never see the light of day, but they’re mostly useful, albeit awkward to use.

One of those tools was something I built to help me maintain my beancount file.

The tooling downloads transaction data from various financial institutions and then converts the contents into beancount format so that I can copy/paste the contents into my file.

The processing pipeline also included a rule engine of sorts that formatted and categorised transactions depending on various conditions defined in the rules, and this was all configured through one big, cumbersome JSON file.

It’s worked fine for a long time but frustratingly the “rule engine” part was becoming a pain to maintain, or even remember how to use in the first place. A small bit of JSON is cute, but JSON left unattended becomes something unspeakable, especially if you feed it after midnight.

I’ve never really taken the time to improve on it though; as ever, time seems to evaporate as you grow older.

So I thought I’d look into LLMs to help with this and move the rules engine into an application to help me manage the rules in a saner way.

Hello BeanEngine

This is what Claude (and I?) built. We called it BeanEngine.

It’s a web application that implements my rules engine and allows you to configure it through a nice UI.

The application has two functions:

  • An API where I can pass transactions, and it returns everything I need to convert them into beancount entries.
  • A UI to configure rules on how to process the transactions and classify what they are for.

Over the course of a few sessions spread across a few days (maybe 2-3 hours), Claude completed this for me.

Anyone who tells you these AI tools are one-shot miracle machines is lying to you. In my experience building this tool, there were a lot of iterations, and it felt more akin to supervising someone and sometimes telling them where they were wrong.

The UI

I made Claude build me a number of pieces of functionality:

  • A rule engine to process transactions, complete with search functionality and CRUD actions for rules.
  • An account mapping engine to map accounts to beancount accounts
  • A transfer rules engine to identify transactions that are transfers between personal accounts
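To give a flavour of what the account mapping and transfer rules do, here is a rough Python sketch. It is not the code Claude wrote for BeanEngine: the names (`ACCOUNT_MAP`, `TransferRule`, `find_transfer_target`) and the simple substring matching are assumptions made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical mapping from raw institution identifiers to beancount accounts.
ACCOUNT_MAP = {
    "creditcardprovider:-x12434": "Liabilities:CreditCard:CreditCardProvider",
    "foo-bank:12345": "Assets:Bank:Current:FooBank:Current",
}


@dataclass(frozen=True)
class TransferRule:
    """Hypothetical rule: money leaving `from_account` with a reference that
    contains `reference_contains` is a transfer into `to_account`."""
    from_account: str
    reference_contains: str
    to_account: str


TRANSFER_RULES = [
    TransferRule("foo-bank:12345", "bar-bank savings",
                 "Assets:Bank:Savings:BarBank:Savings"),
]


def map_account(raw: str) -> str:
    """Translate a raw account id, failing loudly when no mapping exists."""
    try:
        return ACCOUNT_MAP[raw]
    except KeyError:
        raise KeyError(f"no account mapping configured for {raw!r}") from None


def find_transfer_target(txn: dict) -> Optional[str]:
    """Return the destination account if a transfer rule matches, else None."""
    reference = txn["reference"].lower()
    for rule in TRANSFER_RULES:
        same_source = txn["from_account"] == rule.from_account
        if same_source and rule.reference_contains.lower() in reference:
            return rule.to_account
    return None
```

Anything that doesn't match a transfer rule returns `None` and would fall through to the normal rules.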

Here are some screenshots of the screens Claude built for me. Notice how the UI is fairly consistent between screens. All of this was designed by Claude, although I did tell it to implement dark mode.

Rules engine



Claude even added some import/export functionality, which I didn't even ask for but kept anyway.

Account mappings engine

Transfer rules engine

None of these systems is that complicated: they are just CRUD screens backed by a SQLite database. But they would have taken me days to build and, let’s be honest, building CRUD apps is tedious. With LLMs they’re fun to build because you get the LLM to do all the work.
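For anyone who hasn't built one of these in a while, this is roughly the shape of a SQLite-backed CRUD layer for rules, using Python's built-in `sqlite3` module. The table layout and function names are invented for this sketch; the real schema is whatever Claude came up with.

```python
import sqlite3

# Hypothetical schema: one row per rule, matched on a piece of text.
SCHEMA = """
CREATE TABLE IF NOT EXISTS rules (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    match_text TEXT NOT NULL,
    recipient TEXT NOT NULL,
    account TEXT NOT NULL
)
"""


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.row_factory = sqlite3.Row  # read columns by name
    conn.execute(SCHEMA)
    return conn


def create_rule(conn, match_text: str, recipient: str, account: str) -> int:
    cur = conn.execute(
        "INSERT INTO rules (match_text, recipient, account) VALUES (?, ?, ?)",
        (match_text, recipient, account),
    )
    conn.commit()
    return cur.lastrowid


def search_rules(conn, term: str) -> list:
    """Substring search over the text columns (LIKE ignores ASCII case)."""
    like = f"%{term}%"
    return conn.execute(
        "SELECT * FROM rules WHERE match_text LIKE ? OR recipient LIKE ? "
        "OR account LIKE ? ORDER BY id",
        (like, like, like),
    ).fetchall()


def update_rule_account(conn, rule_id: int, account: str) -> None:
    conn.execute("UPDATE rules SET account = ? WHERE id = ?", (account, rule_id))
    conn.commit()


def delete_rule(conn, rule_id: int) -> None:
    conn.execute("DELETE FROM rules WHERE id = ?", (rule_id,))
    conn.commit()
```

Calling `open_db()` with no arguments gives a throwaway in-memory database, which is handy for trying things out.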

What’s all this in service of?

Well, the API uses all these rules engines to process transactions.

For example, I can send this to the API:

[
  {
    "date": "2025-08-01",
    "from_account": "creditcardprovider:-x12434",
    "recipient": "PATREON",
    "amount": 4.8,
    "reference": "PAYPAL *PATREON MEMBERS 1234",
    "currency": "GBP"
  },
  {
    "date": "2025-08-01",
    "from_account": "creditcardprovider:-x12434",
    "recipient": "GOOGLE SERVICES",
    "amount": 12.99,
    "reference": "GOOGLE*YOUTUBE 1231231",
    "currency": "GBP"
  },
  {
    "date": "2025-08-09",
    "from_account": "creditcardprovider:-x12434",
    "recipient": "WAITROSE",
    "amount": 75,
    "reference": "WAITROSE 123423     ONLINE",
    "currency": "GBP"
  },
  {
    "date": "2025-08-08",
    "from_account": "foo-bank:12345",
    "recipient": "Daniel Harper",
    "amount": 200,
    "reference": "bar-bank savings account",
    "currency": "GBP"
  }
]

and it will respond with:

[
    {
        "date": "2025-08-01",
        "recipient": "Patreon",
        "reference": "Podcast sub",
        "from_account": "Liabilities:CreditCard:CreditCardProvider",
        "account": "Expenses:Fun:Subscriptions",
        "amount": 4.8,
        "currency": "GBP",
        "classification_type": "ml_prediction",
        "confidence": 0.9993922710418701,
        "original_from_account": "creditcardprovider:-x12434",
        "original_recipient": "PATREON",
        "original_reference": "PAYPAL *PATREON MEMBERS 1234"
    },
    {
        "date": "2025-08-01",
        "recipient": "YouTube",
        "reference": "Youtube Premium",
        "from_account": "Liabilities:CreditCard:CreditCardProvider",
        "account": "Expenses:Fun:Subscriptions",
        "amount": 12.99,
        "currency": "GBP",
        "classification_type": "ml_prediction",
        "confidence": 0.9989938139915466,
        "original_from_account": "creditcardprovider:-x12434",
        "original_recipient": "GOOGLE SERVICES",
        "original_reference": "GOOGLE*YOUTUBE 1231231"
    },
    {
        "date": "2025-08-09",
        "recipient": "Waitrose",
        "reference": "",
        "from_account": "Liabilities:CreditCard:CreditCardProvider",
        "account": "Expenses:Groceries",
        "amount": 75.0,
        "currency": "GBP",
        "classification_type": "rule_override",
        "confidence": 1.0,
        "original_from_account": "creditcardprovider:-x12434",
        "original_recipient": "WAITROSE",
        "original_reference": "WAITROSE 123423     ONLINE"
    },
    {
        "date": "2025-08-08",
        "recipient": "Daniel Harper",
        "reference": "bar-bank savings account",
        "from_account": "Assets:Bank:Current:FooBank:Current",
        "account": "Assets:Bank:Savings:BarBank:Savings",
        "amount": 200.0,
        "currency": "GBP",
        "classification_type": "transfer",
        "confidence": 1.0,
        "original_from_account": "foo-bank:12345",
        "original_recipient": "Daniel Harper",
        "original_reference": "bar-bank savings account"
    }
]

You may notice a few things about the response:

  • The recipient is sometimes rewritten to something cleaner (e.g. “GOOGLE SERVICES” becomes “YouTube”)
  • The reference is sometimes rewritten to something nicer (e.g. “Podcast sub”)
  • The beancount account the transaction should be posted to is returned (e.g. Expenses:Groceries)
  • There’s some indication of how the transaction was classified (a transfer between personal accounts, a rule override or an ML prediction)
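The three `classification_type` values in the response hint at how a transaction gets its account. Here is a simplified sketch of that decision. The order of precedence (transfer first, then an explicit rule, then the model) is my assumption for illustration, as are the function name and the shape of the `rule` dict, and account mapping is left out to keep it short.

```python
from typing import Callable, Optional, Tuple


def classify(
    txn: dict,
    transfer_target: Optional[str],
    rule: Optional[dict],
    predict: Callable[[str, float], Tuple[str, float]],
) -> dict:
    """Decide the target account for one transaction.

    Assumed precedence: transfer rule, then explicit rule, then ML model.
    """
    result = {
        "date": txn["date"],
        "recipient": txn["recipient"],
        "reference": txn["reference"],
        "amount": float(txn["amount"]),
        "currency": txn["currency"],
        # Keep the raw values so nothing is lost by the rewriting.
        "original_recipient": txn["recipient"],
        "original_reference": txn["reference"],
    }
    if rule is not None:
        # A rule may tidy up the display fields without deciding the account.
        result["recipient"] = rule.get("recipient", txn["recipient"])
        result["reference"] = rule.get("reference", txn["reference"])

    if transfer_target is not None:
        result.update(account=transfer_target,
                      classification_type="transfer", confidence=1.0)
    elif rule is not None and rule.get("account"):
        result.update(account=rule["account"],
                      classification_type="rule_override", confidence=1.0)
    else:
        # No rule decided the account, so fall back to the model's guess.
        account, confidence = predict(txn["recipient"], float(txn["amount"]))
        result.update(account=account,
                      classification_type="ml_prediction", confidence=confidence)
    return result
```

Passing `predict` in as a plain callable means the model can be replaced with a stub when testing the rules.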

My script can then convert these into nice, clean beancount transactions:

2025-08-01 * "Patreon" "Podcast sub" ; confidence: 1.00
  Liabilities:CreditCard:CreditCardProvider -4.80 GBP
  Expenses:Fun:Subscriptions

2025-08-01 * "YouTube" "Youtube Premium" ; confidence: 1.00
  Liabilities:CreditCard:CreditCardProvider -12.99 GBP
  Expenses:Fun:Subscriptions

2025-08-09 * "Waitrose" "" ; confidence: 1.00
  Liabilities:CreditCard:CreditCardProvider -75.00 GBP
  Expenses:Groceries

2025-08-08 * "Daniel Harper" "bar-bank savings account" ; transfer
  Assets:Bank:Current:FooBank:Current      -200.00 GBP
  Assets:Bank:Savings:BarBank:Savings
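The conversion itself is mostly string formatting. Here is a minimal sketch of that step, assuming the field names from the API response above. The function name `to_beancount` is made up, and this version doesn't escape quotes in the recipient or align the amounts.

```python
def to_beancount(entry: dict) -> str:
    """Render one API response entry as a two-posting beancount transaction."""
    if entry["classification_type"] == "transfer":
        note = "transfer"
    else:
        note = f"confidence: {entry['confidence']:.2f}"
    header = (
        f'{entry["date"]} * "{entry["recipient"]}" "{entry["reference"]}" ; {note}'
    )
    # Money leaves from_account, so that posting is negative. The second
    # posting has no amount: beancount fills it in to balance the transaction.
    source = f'  {entry["from_account"]} {-entry["amount"]:.2f} {entry["currency"]}'
    target = f'  {entry["account"]}'
    return "\n".join([header, source, target])
```

Leaving the amount off the second posting relies on beancount balancing a single incomplete posting per transaction automatically.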

Wait what, ML model?

You may also notice the response contains some mention of ML prediction.

I’m not an ML engineer and do not pretend to be, but I was having so much fun with this that I decided to let Claude loose on building a really simple classification model for me. Overkill? Yes. Probably a terrible model? Almost certainly. But hey, while we are here…

The model accepts

  • a recipient (e.g. GOOGLE SERVICES)
  • an amount (e.g. 12.99)

and attempts to predict the target account, e.g. Expenses:Fun:Subscriptions.

The model is trained on all my past beancount transactions. I got Claude to write the model trainer to accept a beancount file, extract the data, clean it up and then train a really, really simple neural network. That is probably overkill; Claude did say better classifiers exist.

This model is then stored in a pickle (.pkl) file, and I got Claude to write a simple system so I can upload new versions of the model via the UI and switch models without having to restart the application.
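Swapping models without a restart boils down to holding the active model behind a reference that can be replaced at runtime. Here is a minimal sketch of the idea. The `ModelRegistry` class and its method names are hypothetical, not what BeanEngine uses.

```python
import pickle
import threading


class ModelRegistry:
    """Holds the active model and lets a new one be swapped in at runtime."""

    def __init__(self):
        self._lock = threading.Lock()
        self._model = None
        self._version = None

    def activate(self, version: str, payload: bytes) -> None:
        """Deserialise an uploaded model and make it the active one."""
        # WARNING: pickle.loads can execute arbitrary code, so only load
        # files you produced yourself.
        model = pickle.loads(payload)  # deserialise before taking the lock
        with self._lock:
            self._model, self._version = model, version

    def current(self):
        """Return (version, model); both are None until one is activated."""
        with self._lock:
            return self._version, self._model
```

Because unpickling can run arbitrary code, this is only reasonable for a personal tool where I'm the only one uploading files.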

Did I need the ML model? No. You can achieve the same result just by defining the rules properly in the rules engine. But I figured it might be handy for the transactions that are less frequent.

Anyway

Just a blog post about something I found pretty neat.

I think what’s interesting to me is that these sorts of tools in my toolbox are mostly personal to me. I think the era of these LLM things gives us the opportunity to unlock more from them. Maybe that comes at the expense of the ball of spaghetti Claude has come up with, but I’m willing to make that trade-off for the time being, since I’m the only user that has to live with it.

Cheers xxx

Side note: I’m aware that beancount has importing functionality. I don’t really use it though; I prefer to maintain my beancount file myself. I’ve honed a bunch of techniques to speed this up over the years and I’m too stubborn to change. 🙂