Introduction: My Journey into AI-Powered Development

Can Artificial Intelligence truly write entire software programs without human intervention? Since December 2024, outside of my regular work, I've been on a mission to find out. I wanted to see just how far current AI tools could take the concept of "hands-free" coding. What started as an experiment quickly became a fascinating journey, as the capabilities of these tools evolved at an astonishing pace over just a few months.

In this post, I'll share my experiences: the tools I've tried, the workflows I've adopted, and the surprising results I've achieved. We'll look at what's genuinely possible with AI development today, where the limitations still lie, and what needs to improve. Think of this not as a formal survey, but as a practical snapshot from the trenches of pushing AI coding to its current limits.

Consider this document a time capsule. The AI landscape changes incredibly fast, so this reflects the state of things in May 2025. Hopefully, looking back on this in the future will highlight just how much progress continues to be made.

Early Explorations (December 2024)

My initial foray into AI coding was quite manual. The process revolved around a constant back-and-forth between the AI chat interface and my local development environment. Because the tools lacked built-in execution or file editing, I'd prompt the AI for code, copy the generated snippet, paste it into my editor (like Visual Studio Code), run it locally, and then copy any error messages back to the AI for debugging.

This copy-paste cycle became tedious, especially for larger programs or when the AI forgot previous context, requiring me to re-paste the same code sections repeatedly. There was no concept of state persistence or direct file access for the AI. Despite these hurdles, this basic loop was enough to get some initial working demos off the ground.
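To make the loop above concrete, here is a minimal Python sketch of the "run it locally, then carry the error back" step. It is an illustration only, not a tool I used at the time: the helper names `run_snippet` and `build_followup_prompt` and the prompt wording are assumptions. The last step (pasting the message into the chat window) was still manual.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_snippet(code: str, timeout: float = 10.0) -> tuple[bool, str]:
    """Run a pasted snippet in a fresh interpreter; return (ok, combined output)."""
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "snippet.py"
        script.write_text(code, encoding="utf-8")
        try:
            result = subprocess.run(
                [sys.executable, str(script)],
                capture_output=True,
                text=True,
                timeout=timeout,
            )
        except subprocess.TimeoutExpired:
            return False, f"Timed out after {timeout} seconds"
    return result.returncode == 0, result.stdout + result.stderr


def build_followup_prompt(code: str, output: str) -> str:
    """Format the message that would be pasted back into the chat window (hypothetical wording)."""
    return (
        "The following code failed when I ran it locally.\n\n"
        f"Code:\n{code}\n\n"
        f"Output:\n{output}\n\n"
        "Please fix the error and return the full corrected file."
    )


if __name__ == "__main__":
    generated = "print(1 / 0)\n"  # stand-in for a snippet copied from the chat
    ok, output = run_snippet(generated)
    if not ok:
        print(build_followup_prompt(generated, output))
```

Even with a helper like this, the model only ever sees what is pasted into the conversation, which is why lost context meant re-pasting the same code.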

Refining the Process and a Paradigm Shift (January - April 2025)

The exploration continued with new models and tools emerging:

Hitting Stride with Advanced Models (April - May 2025)

The release of Gemini 2.5 Pro felt like the second major paradigm shift. It offered performance comparable to top-tier models like Claude but at a fraction of the API cost. This made extensive agentic coding economically viable.

With a powerful and affordable model driving the agents, complex tasks became much easier. Gemini 2.5 Pro started "one-shotting" (generating correct code on the first try) many problems. This spurred me to create The Sandbox, a collection of AI-generated projects, as I felt a significant capability threshold had been crossed.

Moving Beyond Simple Demos: Complex Projects Take Shape

Armed with Roo Code and Gemini 2.5 Pro as the backend, I found that building relatively sophisticated applications became increasingly reliable.

Despite these successes, a key bottleneck remained: the lack of a direct feedback loop between the browser and the coding agent. Manually relaying JavaScript errors from the Chrome developer console back to Roo Code still slowed down the iteration cycle considerably.

Assessing the Current State: How Well Does AI Code?

So, where do things stand? The progress is undeniable, but the results are far from perfect.

Is it faster than coding yourself? The answer is nuanced. If a project involves technologies you're unfamiliar with, AI can drastically speed up the initial development, getting you to a working prototype much faster than learning everything from scratch. However, a lopsided version of the "80/20 rule" often applies: AI might handle the first 80% of the project in 1% of the time, but completing the final 20% (debugging, refining, handling edge cases) can take the remaining 99% of the effort.

What works exceptionally well is rapid prototyping and exploring new tech stacks. Furthermore, agentic coding, especially with a TDD approach, can lead to robust and well-structured applications. The downside is cost and time. The TDD process inherently consumes more tokens (and thus money) and requires more iterations as tests are written, failed, and passed. However, the resulting quality and maintainability might justify the investment, especially for larger projects. By updating design documents and tests before adding features, the AI agent maintains a better understanding of the overall architecture, improving its chances of success.
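The "written, failed, and passed" cycle can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not code from any of my projects: the `slugify` function and its test cases are hypothetical, and a real agent would run this loop across many files and iterations.

```python
import re
import unittest


def slugify_stub(title: str) -> str:
    """Step 1 (red): placeholder that exists only so the tests can run and fail."""
    raise NotImplementedError


def slugify(title: str) -> str:
    """Step 2 (green): the smallest implementation that satisfies the tests."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def make_suite(func) -> unittest.TestSuite:
    """Build the same test cases against whichever implementation is passed in."""

    class SlugifyTests(unittest.TestCase):
        def test_spaces_become_hyphens(self):
            self.assertEqual(func("Hello World"), "hello-world")

        def test_punctuation_is_dropped(self):
            self.assertEqual(func("  AI, Coding & TDD!  "), "ai-coding-tdd")

    return unittest.defaultTestLoader.loadTestsFromTestCase(SlugifyTests)


def passes(func) -> bool:
    """Run the suite quietly and report whether every test passed."""
    result = unittest.TestResult()
    make_suite(func).run(result)
    return result.wasSuccessful()


if __name__ == "__main__":
    print("stub passes:", passes(slugify_stub))  # red phase
    print("real passes:", passes(slugify))       # green phase
```

Each pass through the red and green phases is at least one extra round trip to the model, which is where the additional tokens and iterations come from.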

The ability of frontier models like Gemini 2.5 Pro to process and implement information from dense technical documents (like the ESA paper) is truly remarkable. It feels like science fiction. Reports from places like Google DeepMind suggest similar feats, such as feeding Gemini 2.5 Pro the original Q-learning paper and having it implement the algorithm to learn Pong autonomously. This level of comprehension and application is astounding.

Another powerful example is the Procedural Planet Editor. I tasked Gemini 2.5 Pro (using its Deep Research capabilities) with surveying planet generation techniques. I then fed this AI-generated report back to the agent and asked it to implement the methods in a demo. The initial version was largely complete, though the final polish, as usual, required significant fine-tuning.

Key Challenges and Considerations

Despite the successes, several significant challenges and "gotchas" remain when relying heavily on AI for development:

Looking Ahead: The Future of AI in Development

One thing seems certain: AI is rapidly becoming an indispensable tool in the software developer's toolkit. Whether it's writing unit tests, refactoring complex code, generating boilerplate, adding features, or even assisting with system design, Large Language Models (LLMs) are increasingly involved.

Right now, agentic coding excels at creating impressive prototypes and enabling developers to quickly learn and utilize unfamiliar technologies. However, it's still an open question whether today's tools can reliably build and maintain truly large-scale, mission-critical applications without significant human oversight and architectural guidance. Challenges around coordination, cost, context management, and ensuring architectural integrity remain.

But the pace of progress is breathtaking. By the time you read this article, the landscape may have shifted yet again. It's highly likely that using AI assistants throughout the entire software development lifecycle will become standard practice. While AI agents might not fully replace skilled human engineers in the near future, they are undeniably transforming the development process and dramatically amplifying what a single developer can achieve.

If you have thoughts to share on this topic, feel free to join the discussion on this X thread.

Want to explore the projects mentioned? Check out my collection of AI-coded experiments: The Sandbox.

You are reading a version of this article that Gemini rewrote to flow better while keeping all of the original content. The human-written version of this article can be found here.