My AI Coding Partner Almost Drove My Project Off a Cliff: A Cautionary Tale

  1. Core Scheduling Engine: A robust, asyncio-based scheduler.
  2. Intelligent Task Execution: A system that could analyze task outcomes and dynamically adjust parameters for future runs.
  3. Auto-Improving Task Generation: A feature where the AI could rewrite a task's underlying script based on performance metrics or failures.
  4. Human Ticketing System: When a task failed irrecoverably, the system would generate a "ticket," which a human could review, annotate with a solution, and feed back into the system's knowledge base.

The AI dutifully churned out code. The ticketing system was built, the self-improvement hooks were added, and the scheduler was wired up. On the surface, it looked like a massive success. The problem was that I was acting as a project manager, not an architect. I was specifying what to build, but I wasn't rigorously guiding how it was built and integrated.

The Collision with Reality: Four Horsemen of the AI-pocalypse

The rapid, un-reviewed development cycle papered over deep-seated issues. When I finally tried to run the integrated system, it collapsed. The root causes were not novel or exotic; they were classic software engineering failures, amplified and accelerated by AI.

1. The Siren's Call of Over-engineering

The AI doesn't have the business context or architectural foresight to say, "This is too much for one go." It is an incredibly powerful implementation engine. By asking for everything at once, I had inadvertently directed it to build a solution for a future that didn't exist yet, ignoring the immediate need for a stable foundation. The "intelligent" features were bolted onto a core that had never been pressure-tested, creating a solution in search of a problem.

2. The Technical Debt Avalanche

This is where the theoretical problems became painfully concrete. The AI, in its effort to satisfy all requests, made expedient choices that created fundamental conflicts.

The most glaring issue was the asyncio event loop conflict. Different modules, likely developed in separate AI prompts, were trying to manage the event loop independently. For example, the core scheduler might have been initialized with:

Problematic Code Snippet 1: Conflicting Event Loops

# In scheduler_core.py, generated by one prompt
import asyncio
from apscheduler.schedulers.asyncio import AsyncIOScheduler

class MainScheduler:
    def __init__(self):
        self.scheduler = AsyncIOScheduler()

    def run(self):
        self.scheduler.start()
        # This call blocks forever, running the loop.
        asyncio.get_event_loop().run_forever() 

# In ticketing_system.py, generated by another prompt
import asyncio

class TicketingSystem:
    async def process_ticket(self, ticket_data):
        # ... logic ...
        print("Processing ticket")

    def handle_failed_task(self, task_info):
        # This is the anti-pattern! It tries to run a new loop.
        asyncio.run(self.process_ticket(task_info))

When handle_failed_task was called by the running scheduler, it would crash with RuntimeError: asyncio.run() cannot be called from a running event loop. The AI, focusing on the local context of the ticketing system, used the convenient asyncio.run(), unaware that it was executing inside a larger, already-running event loop.
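The failure mode is easy to reproduce in isolation. This minimal sketch (the function names are illustrative, not the project's actual modules) shows the nested asyncio.run() call blowing up:

```python
import asyncio

async def process_ticket():
    return "ticket processed"

def handle_failed_task():
    # Anti-pattern: spins up a second event loop inside the first one.
    return asyncio.run(process_ticket())

async def main():
    try:
        handle_failed_task()
    except RuntimeError as exc:
        # asyncio.run() refuses to nest inside a running loop.
        print(f"RuntimeError: {exc}")

asyncio.run(main())
```

Run standalone, handle_failed_task works fine, which is exactly why the bug hides until integration: the crash only appears once the call happens inside an already-running loop.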

Furthermore, the apscheduler wiring introduced another layer of complexity. Depending on which scheduler class the AI reached for, jobs could run on a separate thread pool instead of on the event loop, an execution model that didn't mesh cleanly with the fully asynchronous design I envisioned and led to timing bugs and difficult-to-debug race conditions.

3. The Ghost in the Machine: The Peril of Unsupervised Development

My development process was flawed. I was treating the AI as an autonomous developer, giving it a list of features and expecting a coherent result. I had abdicated my role as the architect and reviewer.

A human developer, if asked to build all this, would have pushed back. They would have asked for clarification, proposed a phased rollout, and raised concerns about complexity. The AI did not. It simply executed, weaving a tangled web without the holistic understanding that comes from experience. Without regular, manual code reviews and integration testing at each step, I was blind to the accumulating architectural rot.

4. The Architectural Tangle

The end result was a tightly coupled monolith. The "Auto-Improving Task Generation" module had direct dependencies on the TicketingSystem's internal data structures. The MainScheduler knew intimate details about how the "Intelligent Task Execution" module worked.

Conceptual Problem: Tight Coupling

# Before: A tangled mess of dependencies
class MainScheduler:
    def __init__(self):
        # The scheduler directly instantiates its "smart" components
        self.improver = AutoTaskImprover()
        self.ticketer = TicketingSystem()

    def _execute_task(self, task):
        result = task.run()
        if not result.success:
            # Direct call into another module's implementation
            new_script = self.improver.analyze_and_suggest_fix(task.script, result.error)
            if new_script:
                task.update_script(new_script)
            else:
                # Another direct, deep call
                self.ticketer.handle_failed_task(task.info) 

Debugging was a nightmare. A failure in one component would cascade through the system, making it impossible to isolate the root cause. The system was not a collection of cooperating modules; it was a single, fragile machine.

The Recovery: Lessons for a Human-AI Partnership

Crawling back from this brink was an exercise in humility and a return to first principles. The recovery process gave me a clear framework for working with AI, one that leverages its power without succumbing to its pitfalls.

Lesson 1: Embrace Incrementalism (The "Crawl, Walk, Run" Method)

The first step was to tear it all down. I started again with a single, clear goal: build a rock-solid, simple, asynchronous task scheduler. No "intelligence," no "self-improvement." Just a stable core.

Only after this core was built, tested, and proven did I begin to add the AI features back, one by one. Each new feature was developed as a distinct, optional module, not a core component.

The Fix: A Modular, Pluggable Architecture

# After: A clean, decoupled design using dependency injection

# --- Core Scheduler (knows nothing about "smart" features) ---
class MainScheduler:
    def __init__(self, plugins=None):
        self.plugins = plugins or []

    def _execute_task(self, task):
        result = task.run()
        if not result.success:
            # The core only publishes an event, it doesn't know the consumers.
            self.publish_event('task_failed', task=task, result=result)

    def publish_event(self, event_type, **kwargs):
        for plugin in self.plugins:
            if hasattr(plugin, f"on_{event_type}"):
                getattr(plugin, f"on_{event_type}")(**kwargs)

# --- Optional Plugin ---
class AutoImprovementPlugin:
    def on_task_failed(self, task, result):
        # Logic to improve task is now isolated here
        print(f"Plugin: Analyzing failure for task {task.id}")
        # ...

# --- Main application wiring ---
core_scheduler = MainScheduler(plugins=[AutoImprovementPlugin()])
# Now the smart feature is an optional plugin, not a core dependency.

This approach keeps the core clean and allows features to be enabled, disabled, or replaced without affecting the rest of the system.
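To see the decoupling in action, here is a self-contained, runnable version of the same event dispatch (the plugin and event names are illustrative):

```python
# Minimal sketch of the event-publishing core from above.
class MainScheduler:
    def __init__(self, plugins=None):
        self.plugins = plugins or []

    def publish_event(self, event_type, **kwargs):
        # Dispatch to any plugin that opted in by defining on_<event_type>.
        for plugin in self.plugins:
            handler = getattr(plugin, f"on_{event_type}", None)
            if handler is not None:
                handler(**kwargs)

class AuditPlugin:
    """Hypothetical plugin that just records the events it sees."""
    def __init__(self):
        self.events = []

    def on_task_failed(self, task, result):
        self.events.append((task, result))

audit = AuditPlugin()
scheduler = MainScheduler(plugins=[audit])
scheduler.publish_event("task_failed", task="nightly_backup", result="disk full")
print(audit.events)  # → [('nightly_backup', 'disk full')]
```

The core never names a concrete plugin; swapping AuditPlugin for the real ticketing or self-improvement module requires no change to the scheduler.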

Lesson 2: The Human-in-the-Loop is Non-Negotiable

I changed my role from "project manager" to "lead architect and senior developer." The AI is my brilliant but inexperienced junior partner. My new workflow looks like this:

  1. Define a small, isolated task. (e.g., "Create a plugin class that logs task failures to a JSON file.")
  2. AI generates the code.
  3. I critically review every line. I check for anti-patterns, architectural mismatches, and incorrect assumptions.
  4. I refactor and integrate the code myself. I am the one who connects it to the main application, ensuring it adheres to the established architecture.
  5. I write the integration tests and commit.

This human-centric loop is the single most important change I made. It keeps me in control of the architecture and quality.
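As a concrete example of step 1, the kind of small, isolated task I now hand to the AI looks like this (a sketch; the plugin name and record fields are my own for illustration, not part of Nighthawks):

```python
import json
from datetime import datetime, timezone
from pathlib import Path

class FailureLogPlugin:
    """Hypothetical plugin: append each task failure as one JSON line."""
    def __init__(self, path="failures.jsonl"):
        self.path = Path(path)

    def on_task_failed(self, task, result):
        record = {
            "task_id": getattr(task, "id", None),
            "error": str(getattr(result, "error", "")),
            "logged_at": datetime.now(timezone.utc).isoformat(),
        }
        # Append-only JSON Lines keeps the plugin stateless and easy to review.
        with self.path.open("a", encoding="utf-8") as f:
            f.write(json.dumps(record) + "\n")
```

A task this small is easy to review line by line in step 3, and integrating it in step 4 is a one-line addition to the plugin list.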

Lesson 3: Simple is (Still) Better Than Complex

The asyncio problem was solved by enforcing a single, simple rule: there is only one event loop, and it is managed by the application's entry point. Modules and plugins must never call asyncio.run() or loop.run_forever(). Instead, they expose async functions that the main loop can await.

The Fix: A Single, Unified Event Loop

# In a plugin file (e.g., ticketing_plugin.py)
import asyncio

class TicketingPlugin:
    async def on_task_failed(self, task, result):
        # This function is now async and expects to be awaited
        await self.create_ticket(task.info)

    async def create_ticket(self, info):
        print(f"Creating ticket for {info}")
        # ... await async I/O operations ...
        await asyncio.sleep(0.1) 

# In the main application entry point
import asyncio

async def main():
    # Plugins are now designed to be awaited
    ticketing_plugin = TicketingPlugin()
    scheduler = MainScheduler(plugins=[ticketing_plugin])
    
    # The scheduler's `publish_event` would need to be async
    # and await the plugin calls.
    
    # ... startup logic ...
    await scheduler.run() # The main run function is now awaitable

if __name__ == "__main__":
    # The one and only place the event loop is run
    asyncio.run(main())
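One way to make the event dispatch loop-friendly is to await handlers when they return coroutines and call plain functions directly. This is a sketch of that idea, not the final Nighthawks code:

```python
import asyncio
import inspect

class MainScheduler:
    def __init__(self, plugins=None):
        self.plugins = plugins or []

    async def publish_event(self, event_type, **kwargs):
        # Await coroutine handlers; plain (sync) handlers run inline.
        for plugin in self.plugins:
            handler = getattr(plugin, f"on_{event_type}", None)
            if handler is None:
                continue
            result = handler(**kwargs)
            if inspect.isawaitable(result):
                await result

class TicketingPlugin:
    def __init__(self):
        self.tickets = []

    async def on_task_failed(self, task, result):
        await asyncio.sleep(0)  # stand-in for real async I/O
        self.tickets.append(task)

async def main():
    plugin = TicketingPlugin()
    scheduler = MainScheduler(plugins=[plugin])
    await scheduler.publish_event("task_failed", task="etl_job", result="timeout")
    print(plugin.tickets)  # → ['etl_job']

asyncio.run(main())
```

Because the core awaits plugins instead of letting them start loops of their own, the single-event-loop rule is enforced by construction rather than by convention.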

This architectural principle—simplicity—must be enforced by the human developer. An AI, optimizing for a local goal, may not choose the simplest global path.

Final Thoughts: The Pilot, Not the Passenger

AI development tools are not autonomous pilots for your projects; they are incredibly powerful copilots. They can handle complex maneuvers, process vast amounts of information, and execute instructions with superhuman speed. But the human developer must remain the pilot-in-command, responsible for the flight plan (architecture), the pre-flight checks (code reviews), and the ultimate direction of the journey.

My experience with Nighthawks taught me that the promise of AI is real, but it requires a new kind of discipline. We must resist the temptation to let it run unsupervised. Instead, we must guide it, question it, and integrate its output with the wisdom and foresight that only a human architect can provide.

By pairing our strategic oversight with the AI's tactical prowess, we can avoid flying into a storm of complexity and instead navigate toward building truly remarkable software.

Key Takeaways

  1. Start Simple: Build a solid foundation before adding intelligent features
  2. Human Review is Critical: All AI-generated code needs human architectural review
  3. Modular Design: Keep features optional and loosely coupled
  4. Incremental Development: Add one feature at a time and test thoroughly
  5. Architecture Matters: The human must remain the architect, not just the project manager

The future of software development lies not in replacing human judgment with AI, but in creating a partnership where each contributes their unique strengths to build better software, faster and more reliably.