Jon Walz

Coding with LLMs

Unpredictability is entertaining

Opinions fly easily on the internet. It's getting harder with every passing day to find valuable and convincingly accurate information on anything. With the rise of generative AI, the truth-needle in the knowledge-haystack becomes frustratingly smaller (and we need bigger magnets).

In early 2024, I discovered the thrill of generating code with ChatGPT (the word thrill here is intentional). Then came Claude 3.5 Sonnet, which rocked the developer scene and put a lot of focus on the viability of using these LLMs to boost productivity, and possibly even more importantly, creativity.

Fast forward a year and it's clear this new technology is here to stay. We now have automated assistants like Cursor, Windsurf, and (my favorite) Aider. One of my strongest observations after using these tools on a number of personal projects is the illusion of productivity. I will expand on this throughout this post, but to summarize: if your goal is to improve your software skills, be keenly aware of the time spent entertaining yourself versus the time spent actually learning your codebase and the nuances of your chosen technology stack.

As modern-day humans in a neoliberal capitalist environment, we are trained to think with a productive mindset, and we can easily be convinced our actions are productive (read Bullshit Jobs by David Graeber). Logically, the more time we spend mashing buttons (i.e. telling the LLM to complete tasks), the less time we spend properly learning the technology we are using to complete those tasks. We must then question, or perhaps more appropriately, choose the long-term goal of our actions.

I am not writing this to argue in favor of the button-mashing nature of coding with LLMs ("vibe coding," as the kids are calling it), nor for the importance of deeply learning technology, but rather to point out the importance of choosing what we truly want from these technologies and ultimately from ourselves. As an entrepreneur in the tech space, it's very helpful to have a background in software development so you can set realistic expectations when starting a software-related venture. These generative tools have given us a huge advantage in their ability to quickly create working prototypes. While you don't have to look far to find opinions for and against using these tools, one thing is clear: they are creating a lot of opportunity for entrepreneurs with enough curiosity, creativity, and tenacity to build something worth pursuing.

The thrill

Coding with LLMs provides a thrill. It's new. It's novel to the common person. Before I spent some time learning how LLMs actually work, it felt like magic. And as with any novel technology that arrives in the hands of the masses, there was a lot of excitement, speculation, and, probably the loudest, fear.

For those of us with a background in development who are regularly experimenting and building with these AI tools, what quickly becomes clear is how they struggle at scale: from overzealous, halluci-happy code generation to completely misunderstanding the requirements and causing a human-in-the-loop undo/redo cycle. It's like taming wild Tamagotchi horses.

All this said, I've learned some techniques to limit the number of headaches (and wasted minutes) when working with a larger codebase, thanks to the awesome experiments by others online. First of all, shout out to IndieDevDan, AiJason, Devin Kearns, Nate Herk, and sites like deeplearning.ai/. These are not fail-safe tips, since LLMs are non-deterministic and you will likely still encounter time-consuming outputs.

The Plan

One thing that has become quite clear is that having a plan will take you a long way (surprise?). In many of the tutorials and in my own experiments, the Product Requirements Document (PRD) drives much of the initial development. From a traditional software development perspective, this makes a lot of sense. Feeding the PRD into the LLM to provide all necessary context will certainly help the outcome. I experimented with adding the PRD to every task request (i.e. every prompt), and while it was surprisingly helpful, it would still often miss some context and produce code that already existed.

This is when I created a BEST_PRACTICES.md document where I can inform the LLM of the tech being used, explain how to use it, and provide examples. This, combined with the PRD, helped the LLM produce better code solutions, but you can see where this is going: the context tokens are growing fast.
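A minimal sketch of what this per-task context assembly might look like. The function names and the rough four-characters-per-token estimate are my own illustrative assumptions, not taken from any specific tool:

```typescript
// Hypothetical sketch: bundle the PRD and BEST_PRACTICES contents with a
// single task into one prompt. Section headings are arbitrary choices.
function buildPrompt(prd: string, bestPractices: string, task: string): string {
  return [
    "## Product Requirements", prd,
    "## Best Practices", bestPractices,
    "## Task", task,
  ].join("\n\n");
}

// Crude token estimate (~4 characters per token for English prose) to keep
// an eye on how fast the standing context grows as these files expand.
function estimateTokens(prompt: string): number {
  return Math.ceil(prompt.length / 4);
}
```

Logging `estimateTokens(buildPrompt(...))` before each request makes the "context tokens are growing fast" problem visible instead of surprising.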

Another approach was to ask the LLM (usually a thinking model) to produce a TODO list of tasks to be completed in order. I would ask it to complete a single task, provide the appropriate context (PRD, BEST_PRACTICES, file paths), and it could work out what needed to be done, with fairly good results. This idea now guides my development process when leveraging LLMs to code. It is common to break down tasks into smaller tasks when developing, and now I am experimenting with TDD to drive the tasks produced by LLMs, which in turn drive development.
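The single-task loop above can be sketched as a small helper. The `Task` shape and `nextTaskPrompt` name are hypothetical, for illustration only:

```typescript
// Hypothetical shape of one item from the LLM-generated TODO list.
interface Task {
  id: number;
  description: string;
  done: boolean;
}

// Pull the next incomplete task and wrap it with the standing context
// (PRD, BEST_PRACTICES, file paths). Returns null when the list is done.
function nextTaskPrompt(tasks: Task[], context: string): string | null {
  const task = tasks.find((t) => !t.done);
  if (!task) return null;
  return `${context}\n\nComplete ONLY this task:\n${task.description}`;
}
```

The "ONLY this task" phrasing reflects the point above: asking for one small step at a time keeps the model from wandering across the codebase.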

Tips

Credit to IndieDevDan for his course, which has guided a lot of my experiments and understanding. A tip that he and some others have pointed out is to be succinct with your prompts and to structure them well. Here is an example prompt:

CREATE utils.ts
  - CREATE function getUser(id: string): User
    - MIRROR getLocation function
      - use UserService to obtain user info
      - add proper null checks
      - return user info

It appears that LLMs are very good at pattern recognition and, as IndieDevDan explains, very good at responding to "information-rich keywords" such as CREATE, UPDATE, and MIRROR. This is a surgical example of using an LLM for one specific task, which can improve accuracy when dealing with larger codebases.
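For illustration, here is one plausible shape of the code such a prompt might yield. `UserService` and the `getLocation` pattern being mirrored are assumptions about the surrounding codebase, not real library APIs:

```typescript
// Hypothetical types and service implied by the prompt above.
interface User {
  id: string;
  name: string;
}

const UserService = {
  // Stand-in for a real lookup (database, HTTP call, etc.).
  fetchById(id: string): User | undefined {
    const users: Record<string, User> = { "1": { id: "1", name: "Ada" } };
    return users[id];
  },
};

// What "CREATE function getUser ... MIRROR getLocation" might produce:
// use UserService, add proper null checks, return the user info.
function getUser(id: string): User {
  const user = UserService.fetchById(id);
  if (user == null) {
    throw new Error(`User not found: ${id}`);
  }
  return user;
}
```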

The other approach is to use XML-flavored structured prompts within the PRD and BEST_PRACTICES files:

<example-1>
  function getUser(id: string): User
</example-1>
<example-2>
  function getStatus(id: string): Status
</example-2>
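Since these example tags follow a simple numbered pattern, generating them for the PRD or BEST_PRACTICES files can be automated. A small hypothetical helper:

```typescript
// Wrap few-shot code snippets in numbered XML-style tags, matching the
// <example-N> convention shown above.
function tagExamples(examples: string[]): string {
  return examples
    .map((ex, i) => `<example-${i + 1}>\n  ${ex}\n</example-${i + 1}>`)
    .join("\n");
}
```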

Dev Cycles

The concept of AI "taking our jobs" implies that our jobs don't change. Even during my few years (well, seven) of shipping code, it's hard to ignore how quickly the development experience has changed. From this same perspective, coding with LLMs becomes another change in our development experience. I agree with the sentiment that the human is still responsible for the code that is shipped. The computer cannot be held responsible; it is we who generate the code. It is the driver of the LLM who is ultimately responsible for the code merged into the codebase.

Thus it stands to reason that the driver must understand the intricacies of their vehicle. The software engineer must understand the software being generated, and the stronger that understanding, the more accurate the code will be. LLMs will become a vehicle for delivering one's knowledge and understanding, not the other way around.