AutoPackager: Multi-Agent Software Deployment

AI-powered autonomous software packaging and deployment platform

Problem Statement

Enterprise software deployment at scale is a coordination problem disguised as a technical one. In a 200,000-device environment, the manual packaging workflow looks like this: someone discovers a new software release, downloads it, reverse-engineers the installation parameters, builds a package configuration, tests it on a representative device, fixes the inevitable edge cases, and then deploys it through Intune. Each step requires human judgment, domain knowledge, and context switching. The cycle time from release to deployment is measured in days or weeks.

The typical enterprise solution is to hire more people or build more process. AutoPackager takes a different approach: treat the entire workflow as an orchestration problem, where each step is handled by a specialized agent that understands its domain, and the system coordinates their work through a state machine. The goal isn't full automation — it's to move the human decision point from "should I click this button" to "does this deployment plan make sense."

Architecture Overview

AutoPackager is built as a multi-agent system in which each agent is responsible for a specific phase of the deployment lifecycle. The agents are Discovery, Packaging, Testing, and Deployment.

Each agent operates independently but reports state changes to a central orchestrator built on Celery and Redis. The orchestrator maintains the workflow state machine, ensures agents don't step on each other, and provides observability into the entire pipeline.

┌─────────────────────────────────────────────────────────┐
│                   Orchestration Layer                   │
│              (Celery + Redis State Machine)             │
└─────────────────────────────────────────────────────────┘
      │              │              │              │
      ▼              ▼              ▼              ▼
┌───────────┐  ┌───────────┐  ┌───────────┐  ┌───────────┐
│ Discovery │  │ Packaging │  │  Testing  │  │Deployment │
│   Agent   │  │   Agent   │  │   Agent   │  │   Agent   │
└───────────┘  └───────────┘  └───────────┘  └───────────┘
      │              │              │              │
      ▼              ▼              ▼              ▼
┌───────────┐  ┌───────────┐  ┌───────────┐  ┌───────────┐
│    LLM    │  │    LLM    │  │    LLM    │  │    LLM    │
│Abstraction│  │Abstraction│  │Abstraction│  │Abstraction│
└───────────┘  └───────────┘  └───────────┘  └───────────┘
   │     │        │     │        │     │        │     │
   ▼     ▼        ▼     ▼        ▼     ▼        ▼     ▼
Claude  GPT    Claude  GPT    Claude  GPT    Claude  GPT

Multi-Agent Orchestration Design

The orchestration layer is where most of the interesting design decisions live. The core insight is that software deployment is not a linear pipeline — it's a state machine with branching, retries, and human-in-the-loop decision points.

We use Celery as the task execution engine and Redis as the state store. Each agent exposes a set of tasks (e.g., packaging.analyze_installer, testing.provision_vm) that the orchestrator can invoke. The orchestrator maintains a workflow graph in Redis that tracks:

The state machine design allows us to handle real-world complexity: if the Packaging Agent fails because a vendor changed their installer format, the system pauses at that state, alerts the operator, and waits for intervention. Once the issue is resolved, the workflow resumes from exactly where it stopped. No data is lost, no context is forgotten.
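
The pause-and-resume behaviour can be sketched roughly as follows. This is a minimal illustration, not AutoPackager's actual code: an in-memory dict stands in for the Redis state store, plain function calls stand in for Celery tasks, and the `Workflow` class, the state names, and the step functions are all hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Dict, List


class State(str, Enum):
    PENDING = "pending"
    RUNNING = "running"
    PAUSED = "paused"      # waiting for operator intervention
    COMPLETE = "complete"


# Hypothetical phase order, matching the four agents in the diagram.
PHASES: List[str] = ["discovery", "packaging", "testing", "deployment"]


@dataclass
class Workflow:
    """Tracks a workflow's position; `store` is a stand-in for Redis."""
    workflow_id: str
    steps: Dict[str, Callable[[dict], dict]]
    store: Dict[str, dict] = field(default_factory=dict)

    def _load(self) -> dict:
        return self.store.setdefault(
            self.workflow_id,
            {"state": State.PENDING, "next_phase": 0, "context": {}, "error": None},
        )

    def run(self) -> State:
        """Run phases from the saved position; pause on the first failure."""
        record = self._load()
        record["state"] = State.RUNNING
        record["error"] = None
        while record["next_phase"] < len(PHASES):
            phase = PHASES[record["next_phase"]]
            try:
                # Each step receives the accumulated context and returns additions.
                record["context"].update(self.steps[phase](record["context"]))
            except Exception as exc:
                # Position and context are kept, so a later run() resumes here.
                record["state"] = State.PAUSED
                record["error"] = f"{phase}: {exc}"
                return State.PAUSED
            record["next_phase"] += 1
        record["state"] = State.COMPLETE
        return State.COMPLETE


def demo_workflow() -> list:
    """Fail once in packaging, then resume after the 'operator' fixes it."""
    fixed = {"value": False}
    calls: List[str] = []

    def discovery(ctx: dict) -> dict:
        calls.append("discovery")
        return {"release": "example-app-1.2.3"}

    def packaging(ctx: dict) -> dict:
        calls.append("packaging")
        if not fixed["value"]:
            raise ValueError("unrecognised installer format")
        return {"package": ctx["release"] + ".pkg"}

    def testing(ctx: dict) -> dict:
        calls.append("testing")
        return {"test_passed": True}

    def deployment(ctx: dict) -> dict:
        calls.append("deployment")
        return {"deployed": True}

    wf = Workflow("wf-1", {"discovery": discovery, "packaging": packaging,
                           "testing": testing, "deployment": deployment})
    first = wf.run()        # pauses at packaging
    fixed["value"] = True   # simulated operator intervention
    second = wf.run()       # resumes at packaging; discovery is not re-run
    return [first, second, calls]
```

In the demo, the second `run()` picks up at the packaging phase with the discovery output still in the context, which is the property the paragraph above describes. A real implementation would persist the record in Redis rather than in process memory.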

Agent Communication Protocol

Agents communicate through a simple event-driven protocol. Each agent publishes events to Redis (e.g., packaging.complete, testing.failed) and subscribes to events from upstream agents. The orchestrator acts as the event router and enforces ordering constraints. This design keeps agents decoupled — the Testing Agent doesn't need to know how the Packaging Agent works, only that it produces a package artifact.
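
A rough sketch of the routing idea, under stated assumptions: the router below is an in-memory stand-in for Redis-backed messaging, and `EventRouter`, the prerequisite map, and the `discovery.complete` and `testing.complete` event names are hypothetical (only `packaging.complete` and `testing.failed` appear in the text above). It shows one way an orchestrator could refuse events that arrive before their upstream event.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List, Set

Handler = Callable[[str, dict], None]


class EventRouter:
    """In-memory stand-in for the orchestrator's event routing."""

    def __init__(self, prerequisites: Dict[str, str]) -> None:
        # Maps an event name to the event that must already have been seen
        # for the same workflow before it is accepted.
        self._prerequisites = prerequisites
        self._handlers: DefaultDict[str, List[Handler]] = defaultdict(list)
        self._seen: DefaultDict[str, Set[str]] = defaultdict(set)

    def subscribe(self, event: str, handler: Handler) -> None:
        self._handlers[event].append(handler)

    def publish(self, workflow_id: str, event: str, payload: dict) -> bool:
        """Route the event; return False if its prerequisite has not occurred."""
        required = self._prerequisites.get(event)
        if required is not None and required not in self._seen[workflow_id]:
            return False
        self._seen[workflow_id].add(event)
        for handler in list(self._handlers[event]):
            handler(workflow_id, payload)
        return True


def demo_router() -> list:
    router = EventRouter({
        "packaging.complete": "discovery.complete",
        "testing.complete": "packaging.complete",
        "testing.failed": "packaging.complete",
    })
    received: List[str] = []
    # The subscriber only needs the artifact, not how it was produced.
    router.subscribe(
        "packaging.complete",
        lambda wf, payload: received.append(f"{wf}:{payload['artifact']}"),
    )
    early = router.publish("wf-1", "packaging.complete", {"artifact": "app.pkg"})
    router.publish("wf-1", "discovery.complete", {"release": "1.2.3"})
    on_time = router.publish("wf-1", "packaging.complete", {"artifact": "app.pkg"})
    return [early, on_time, received]
```

The subscriber sees only the event name and payload, so the decoupling described above holds: a consumer depends on the shape of the artifact event, not on the producing agent.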

Technology Choices

Celery and Redis

We chose Celery for task orchestration because it's boring technology that solves the hard problems: task queuing, distributed execution, retry logic, and failure handling. Redis serves as both the message broker and the state store. The combination gives us:

The alternative would have been a workflow engine like Airflow or Temporal, but both felt like over-engineering for our use case. Celery's task-based model maps cleanly to our agent architecture, and the simplicity reduces the surface area for things to break.

LLM Abstraction Layer

Each agent uses LLMs for domain-specific reasoning: the Discovery Agent extracts release notes, the Packaging Agent interprets installer flags, the Testing Agent analyzes failure logs. We built an abstraction layer that supports both Claude and GPT models interchangeably.

The abstraction provides:

The key design decision was to keep the abstraction thin. We don't try to hide model-specific capabilities — if an agent needs Claude's tool use or GPT's function calling, it can use those features directly. The abstraction only handles the common path: send a prompt, get a response, handle errors.
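
To make the "common path" concrete, here is a minimal sketch of what a thin abstraction of this kind might look like. It makes no real SDK calls: `FakeClient` is a test double standing in for a Claude or GPT wrapper, and `LLMClient`, `LLMError`, and `complete_with_retry` are hypothetical names, not AutoPackager's actual interface.

```python
from dataclasses import dataclass
from typing import List, Optional, Protocol


class LLMError(Exception):
    """Single error type the agents handle, whatever the provider raised."""


class LLMClient(Protocol):
    name: str

    def complete(self, prompt: str) -> str:
        """Send a prompt and return the model's text response."""
        ...


@dataclass
class FakeClient:
    """Test double standing in for a real provider wrapper."""
    name: str
    failures_before_success: int = 0

    def complete(self, prompt: str) -> str:
        if self.failures_before_success > 0:
            self.failures_before_success -= 1
            raise LLMError(f"{self.name}: transient failure")
        return f"[{self.name}] {prompt}"


def complete_with_retry(client: LLMClient, prompt: str, attempts: int = 3) -> str:
    """Common path: call the model, retrying on LLMError up to `attempts` times."""
    last_error: Optional[LLMError] = None
    for _ in range(attempts):
        try:
            return client.complete(prompt)
        except LLMError as exc:
            last_error = exc
    raise LLMError(f"{client.name}: gave up after {attempts} attempts") from last_error


def demo_llm() -> List[str]:
    claude = FakeClient("claude", failures_before_success=1)
    gpt = FakeClient("gpt")
    prompt = "List the silent-install flags for this installer."
    # The calling code is identical for both providers.
    return [complete_with_retry(claude, prompt), complete_with_retry(gpt, prompt)]
```

Because the protocol covers only `complete`, nothing here prevents an agent from holding a provider-specific client and calling its native features directly, which matches the thin design described above.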

Why Not End-to-End Automation?

We deliberately kept human approval gates in the workflow. Full autonomy is possible but risky: a packaging error could break thousands of devices. The current design automates the tedious parts (downloading installers, extracting parameters, provisioning test VMs) and surfaces the decision points to humans (does this package configuration look correct? did the test pass?). This balance is meant to capture most of the efficiency gains at a fraction of the risk, the familiar 80/20 trade-off.
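
One way to picture an approval gate is as a small record that automated steps can request but only a human can resolve. The sketch below is illustrative only; `ApprovalGate`, `Decision`, and the method names are hypothetical, and the in-memory dicts stand in for whatever store the orchestrator would actually use.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict


class Decision(str, Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class ApprovalGate:
    """Holds a workflow at a decision point until a human records a decision."""
    decisions: Dict[str, Decision] = field(default_factory=dict)
    reviewers: Dict[str, str] = field(default_factory=dict)

    def request(self, workflow_id: str) -> None:
        # Automated steps call this when they reach a decision point.
        self.decisions.setdefault(workflow_id, Decision.PENDING)

    def decide(self, workflow_id: str, reviewer: str, approve: bool) -> None:
        if self.decisions.get(workflow_id) is not Decision.PENDING:
            raise ValueError(f"no pending approval for {workflow_id}")
        self.decisions[workflow_id] = (
            Decision.APPROVED if approve else Decision.REJECTED
        )
        self.reviewers[workflow_id] = reviewer

    def may_deploy(self, workflow_id: str) -> bool:
        # Deployment proceeds only on an explicit approval, never by default.
        return self.decisions.get(workflow_id) is Decision.APPROVED
```

The point of the design is the default: a workflow with no recorded decision, or a pending one, is never deployable.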

Outcomes and Lessons Learned

What Worked

What Was Hard

Lessons for Multi-Agent Systems

Building AutoPackager reinforced a few principles that apply to any multi-agent AI system:

Current Status

The public precursor to AutoPackager is available on GitHub and demonstrates the basic orchestration pattern. The LLM-powered version described here is in active development and not yet production-ready. The architecture is validated, the agent framework is built, and we're iterating on the LLM prompting strategy to improve reliability.

The goal is to move from "proof of concept" to "production system" by focusing on the reliability fundamentals: better error handling, more comprehensive testing, and tighter integration with Intune's API. Once those are in place, AutoPackager becomes a force multiplier for any organization managing software deployment at scale.