AI Software Engineering

The Third Wish

1,000 stars in a cluster. Spectral type, mass, and position are randomly distributed. Initial velocities are random, but constrained to give a reasonable chance of a stable orbit. Stars ejected from the cluster are usually the result of a computational step size too large to accurately model near-collisions. Each frame is produced by computing the gravitational interaction between every unique star pair and rendering the updated cluster state.
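
For the curious, the heart of the computation is the classic O(n²) pairwise update – with 1,000 stars that’s 1000 × 999 / 2 = 499,500 unique pairs per frame. Below is a minimal C# sketch of one step, assuming made-up types and a naive Euler integrator; the actual engine has to treat step size far more carefully, for exactly the ejection reason noted in the caption:

    using System;

    // Illustrative sketch only: a small vector type, a star, and one
    // naive Euler step over every unique pair.
    public record struct Vec3(double X, double Y, double Z)
    {
        public static Vec3 operator +(Vec3 a, Vec3 b) => new(a.X + b.X, a.Y + b.Y, a.Z + b.Z);
        public static Vec3 operator -(Vec3 a, Vec3 b) => new(a.X - b.X, a.Y - b.Y, a.Z - b.Z);
        public static Vec3 operator *(double s, Vec3 v) => new(s * v.X, s * v.Y, s * v.Z);
        public double Length() => Math.Sqrt(X * X + Y * Y + Z * Z);
    }

    public class Star
    {
        public double Mass;
        public Vec3 Position;
        public Vec3 Velocity;
    }

    public static class Cluster
    {
        const double G = 6.674e-11; // gravitational constant

        // Advance the whole cluster state by one time step dt.
        public static void Step(Star[] stars, double dt)
        {
            var accel = new Vec3[stars.Length]; // zero-initialized

            // Each unique pair (i, j) contributes equal and opposite pulls.
            for (int i = 0; i < stars.Length; i++)
            {
                for (int j = i + 1; j < stars.Length; j++)
                {
                    Vec3 r = stars[j].Position - stars[i].Position;
                    double d = r.Length();
                    // If dt is too large relative to d, a near-collision
                    // produces a huge kick -- and an ejected star.
                    double f = G / (d * d * d); // 1/d^2 force, plus 1/d to normalize r
                    accel[i] += (f * stars[j].Mass) * r;
                    accel[j] += (-f * stars[i].Mass) * r;
                }
            }

            for (int i = 0; i < stars.Length; i++)
            {
                stars[i].Velocity += dt * accel[i];
                stars[i].Position += dt * stars[i].Velocity;
            }
        }
    }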

This video of a star cluster simulation was not generated by an AI. I mean… it could have been, I suppose, but that wasn’t really what I wanted to learn about. It was generated by an application I wrote while making use of a few different AI code assist tools. The application stack is:

  • EC2 Ubuntu instance
  • Basic UI in Node.js
  • Data storage on DynamoDB – a NoSQL table store
  • Motion computation and rendering engine in .NET Core/C#
  • Frame rendering with SkiaSharp (this step and the ffmpeg stitch are sketched just after the list)
  • Video generation with ffmpeg
  • CloudFront edge server backed by S3
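
For flavor, here is a rough sketch of the last two steps of that pipeline – drawing a frame with SkiaSharp and stitching frames with ffmpeg. The 1920×1080 canvas, the ScreenX/ScreenY coordinates, and the exact ffmpeg flags are illustrative assumptions, not the application’s actual code:

    using System.IO;
    using SkiaSharp;

    static class FrameWriter
    {
        // Write one frame to disk as a PNG. ScreenX/ScreenY are assumed,
        // pre-projected 2D coordinates; the 3D projection itself is elided.
        public static void RenderFrame((double ScreenX, double ScreenY)[] stars,
                                       int frameNo, string dir)
        {
            using var bitmap = new SKBitmap(1920, 1080);
            using var canvas = new SKCanvas(bitmap);
            canvas.Clear(SKColors.Black);

            using var paint = new SKPaint { Color = SKColors.White, IsAntialias = true };
            foreach (var s in stars)
                canvas.DrawCircle((float)s.ScreenX, (float)s.ScreenY, 1.5f, paint);

            using var image = SKImage.FromBitmap(bitmap);
            using var data = image.Encode(SKEncodedImageFormat.Png, 100);
            using var file = File.OpenWrite(Path.Combine(dir, $"frame_{frameNo:D5}.png"));
            data.SaveTo(file);
        }
    }

    // The numbered frames are then assembled into a video, e.g.:
    //   ffmpeg -framerate 30 -i frame_%05d.png -c:v libx264 -pix_fmt yuv420p cluster.mp4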

Since the earth was young, I’ve used the so-called “Three-Body Problem” (specifically, a star cluster simulation) as a foil for learning new languages, platforms, design paradigms, etc. It quickly runs into scale issues on several dimensions, which makes it a nice little problem that isn’t artificially simple. The first version I ever did was a DOS program written in C. ‘Nuf said about that. So it was only natural, when I wanted to explore AI, that I should turn to my old standby to see what I could see. This is still a work in progress, but it’s already gone much further than I frankly intended to take it. Initially I wanted to test a simple question: how would AI facilitate test-driven development? Given a test, how well would it generate code to satisfy the test?

I started with a Hello World-style “greeting” exercise. I wrote test cases for a few different names following a simple pattern that required only name substitution, and it dutifully produced code to handle them. Then I broke the pattern, and it started adding special-case logic. Fair enough. After a few rounds of this it was clear it would cheerfully go on adding ‘if’ blocks for eternity, so I asked it to refactor using a map instead – which it did. Alrighty then.
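
Reconstructed loosely, the before-and-after looked something like this – the specific names and greetings are illustrative stand-ins, not the code it actually generated:

    using System.Collections.Generic;

    static class Greeter
    {
        // Before: one special-case branch per broken pattern, forever.
        public static string GreetWithIfs(string name)
        {
            if (name == "Londo") return "Greetings Londo";
            if (name == "Delenn") return "Welcome Delenn";
            // ...another 'if' for every new test case...
            return "Hello " + name;
        }

        // After: the map-based refactor -- look up a greeting prefix,
        // fall back to a default, and append the name.
        public static readonly Dictionary<string, string> Greetings = new()
        {
            ["Londo"] = "Greetings",
            ["Delenn"] = "Welcome",
        };

        public static string Greet(string name) =>
            (Greetings.TryGetValue(name, out var prefix) ? prefix : "Hello") + " " + name;
    }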

Then something funny happened. I had been using the names of characters from “Babylon 5” for the test cases. I decided to go the other direction: add a greeting to the map and let it write the test case. What I added to the map was “Kosh” -> “Well bless your heart”. What it generated for the test assertion was “Hello ambassador Kosh”, instead of “Well bless your heart Kosh”. Now, if you’re not familiar with B5, Kosh is an ambassador. It was disappointing that it didn’t evaluate the change I had actually made to the code and write the assertion accordingly. Instead, it wrote the assertion for the code I should have written if I were being consistent. So, was it right… or was it wrong? It gave me a chuckle at any rate (the exchange is sketched below). Overall, I have to say it was not a seamless experience… but I was still getting the hang of things and decided to drag my “Star Cluster” problem into the ring.
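
In terms of the greeting sketch above, the exchange boiled down to the following (the xUnit-style assertion is illustrative):

    // The map entry I added by hand:
    Greeter.Greetings["Kosh"] = "Well bless your heart";

    // The assertion my code as written called for:
    Assert.Equal("Well bless your heart Kosh", Greeter.Greet("Kosh"));

    // The assertion it actually generated -- right for the code a
    // consistent author would have written, wrong for mine:
    Assert.Equal("Hello ambassador Kosh", Greeter.Greet("Kosh"));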

You really don’t want a play-by-play, so I’ll just lay my preliminary conclusions about AI code assist on the table. Sometimes it can be a little yip dog getting underfoot and widdling on your shoe. Sometimes it’s like Lassie, grasping complex instructions and bringing help to the rescue. Sometimes it’s like the Professor from Gilligan’s Island. And sometimes it’s more like Gomer Pyle. Building software these days challenges even experienced engineers with the need to stitch together frameworks that one rarely has a chance to genuinely master. It’s hard to overstate how useful AI can be for this kind of wrangling. Auto-completion behavior is hit-and-miss and lacks fluidity in actual use. On the other hand, I’ve occasionally been able to auto-complete my way through several screenfuls of code when a clear direction is established. This is the real deal. I do wonder how the cost model will settle out – it’s hard to believe the current price structure isn’t a loss leader. We can hope it remains a fee-based capability, so that the end user continues to be the customer and not the product (I’m looking at you, search engines).

One final thought: who benefits the most? Can it turn an entry-level engineer into a mid-level engineer? My own opinion… it doesn’t really work that way. The more skilled you are, the more you will be able to leverage these tools – just like any other tool. In other words, I think it can allow an entry-level engineer to go farther faster. But productivity isn’t the only measure of experience, and AI will not replace the native insightfulness that you bring to the table. Consider: the more you know what “good” looks like, the more you will be able to tease “good” out of the AI and recognize it when you see it. But it won’t keep you from digging yourself a hole faster and deeper than ever. For purposes of this investigation, I was particularly interested in trying to “run fast with knives” and allowed myself the luxury of treating it like a scratch refactoring (i.e., not worrying about producing production-level code). Given my objectives, that was appropriate. But the “third wish” is to undo the damage caused by the first two wishes. The next part of this investigation will be seeing how well AI tools lend themselves to operationalizing code.