April 1, 2026

AI Thinking

How MightyBot Compiles Plain English into Deterministic Workflows

Summary: MightyBot's compilation pipeline transforms plain English policies into hybrid execution plans that combine deterministic code paths with structured LLM calls. The result: automation that runs identically every time, uses fewer tokens, and eliminates the trial-and-error loops that plague agent frameworks built on ReAct.


Most AI agent platforms treat execution as an emergent property. You give the agent a goal, it reasons about what to do, tries something, observes the result, and tries again. This is the ReAct pattern: Reason, Act, Observe, repeat. It works for demos. It fails in production.

The problem is predictability. When your "plan" materializes at runtime through iterative reasoning, you get different execution paths for the same input. You get token costs that vary by 10x depending on how many retries the agent needs. You get failures that are impossible to reproduce because the reasoning chain was nondeterministic. For enterprise workflows processing thousands of documents per day, this is a non-starter.

MightyBot takes a fundamentally different approach. Instead of letting agents figure out the plan at runtime, the platform compiles policies into deterministic execution plans before anything runs. The analogy to compiled vs. interpreted languages is deliberate: compiled execution trades flexibility for speed and predictability. For regulated industries where consistency matters more than ad hoc problem-solving, that tradeoff is the right one.

Policy Ingestion: Starting with Plain English

The compilation pipeline begins with a policy: a plain English description of the process the user wants to automate.

Here is a real example from construction lending:

"When a new loan application arrives, extract the borrower name, property address, loan amount, and insurance certificates. Verify that general liability coverage exceeds $2M. Flag any application missing required documents."

This is not pseudocode. It is not a prompt template. It is a business rule written by a domain expert who may never have written a line of code. The policy captures what needs to happen and what the success criteria are. It does not specify how to parse a PDF or which API to call.

The ingestion step normalizes the policy into a structured representation. It identifies entities (borrower, property, insurance certificate), actions (extract, verify, flag), conditions (coverage exceeds $2M), and failure modes (missing documents). This structured representation becomes the input to schema generation.
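A minimal sketch of what that normalized representation might look like, assuming a simple dataclass; the class and field names here are illustrative, not MightyBot's actual internal types:

```python
from dataclasses import dataclass

# Hypothetical sketch of the normalized policy representation produced
# by ingestion; names are illustrative assumptions.
@dataclass
class StructuredPolicy:
    entities: list       # e.g. "borrower", "property", "insurance certificate"
    actions: list        # e.g. "extract", "verify", "flag"
    conditions: list     # e.g. ("general_liability_coverage", ">", 2_000_000)
    failure_modes: list  # e.g. "missing required documents"

loan_policy = StructuredPolicy(
    entities=["borrower", "property", "insurance certificate"],
    actions=["extract", "verify", "flag"],
    conditions=[("general_liability_coverage", ">", 2_000_000)],
    failure_modes=["missing required documents"],
)
```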

Schema Generation: From Intent to Structure

The platform analyzes the structured policy and generates a typed schema that defines the contract for the entire workflow.

For the loan application policy above, the generated schema includes:

  • Input types: PDF document (loan application), PDF or image (insurance certificate)
  • Output types: Structured extraction result with borrower name (string), property address (address object), loan amount (currency), coverage amount (currency), compliance status (boolean), missing document list (array)
  • Validation rules: Coverage amount must be numeric and denominated in USD. Loan amount must be positive. Borrower name must be non-empty.
  • Required fields: Every output field is marked as required or optional based on the policy language. "Extract the borrower name" makes it required. "Include the co-borrower name if present" makes it optional.
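The generated schema might look something like the following sketch; the field names, types, and validation logic are assumptions drawn from the policy text above, not the platform's actual output:

```python
from dataclasses import dataclass, field
from typing import Optional

# Illustrative sketch of a generated typed schema for the loan policy.
@dataclass
class LoanExtraction:
    borrower_name: str                       # required: "extract the borrower name"
    property_address: str
    loan_amount_usd: float
    coverage_amount_usd: float
    compliant: bool
    missing_documents: list = field(default_factory=list)
    co_borrower_name: Optional[str] = None   # optional: "if present"

def validate(result: LoanExtraction) -> list:
    """Return validation errors; an empty list means the result is valid."""
    errors = []
    if not result.borrower_name:
        errors.append("borrower name must be non-empty")
    if result.loan_amount_usd <= 0:
        errors.append("loan amount must be positive")
    return errors
```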

The schema is not a black box. It is inspectable and editable. Engineers can review the generated types, adjust validation rules, add constraints the policy did not explicitly state, and version the schema alongside the policy. This is the first point where a human can verify that the platform understood the intent correctly, before any execution happens.

Schema generation also identifies ambiguities in the policy. If the policy says "verify that coverage is sufficient" without specifying a threshold, the platform flags this as an unresolved parameter and asks for clarification. This happens at compile time, not at runtime when a document is already being processed.

Execution Plan Compilation: The Hybrid Split

This is the core of the compilation pipeline. The platform takes the typed schema and determines, for each step, whether it can be handled deterministically or requires LLM reasoning.

Deterministic paths include:

  • Field extraction from structured documents (pulling named fields from forms with consistent layouts)
  • Numeric comparisons (coverage amount > $2,000,000)
  • Boolean logic (is the document present? does the value exceed the threshold?)
  • Data transformations (currency normalization, date parsing, address standardization)
  • Routing decisions based on document type or extracted values

These steps are compiled to code. They execute in milliseconds, cost zero tokens, and produce identical results every time.
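For the loan policy, the compiled deterministic steps could be as simple as the sketch below; the threshold comes from the policy, while the function signatures are assumptions:

```python
# Deterministic compiled steps: zero tokens, identical result every run.
COVERAGE_THRESHOLD_USD = 2_000_000  # from the policy: "exceeds $2M"

def verify_coverage(coverage_usd: float) -> bool:
    """Numeric comparison from the policy's coverage rule."""
    return coverage_usd > COVERAGE_THRESHOLD_USD

def flag_missing(required: set, present: set) -> list:
    """Boolean/set logic for the 'flag missing documents' rule."""
    return sorted(required - present)
```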

LLM-routed paths include:

  • Interpreting ambiguous document formats (an insurance certificate that uses non-standard layouts)
  • Extracting information from unstructured text (a cover letter describing the project scope)
  • Handling edge cases the policy did not anticipate (a document in a foreign language, a scanned image with poor OCR quality)
  • Making judgment calls that require contextual understanding (does this exclusion clause effectively void the coverage requirement?)

These steps use structured LLM calls with constrained outputs. The LLM is not asked to "figure out what to do." It is given a specific extraction or classification task with a defined output schema. The response is validated against the schema before the pipeline continues.

The compiled plan is a directed acyclic graph (DAG) where each node is either a deterministic function or a structured LLM call. The edges define data flow and dependencies. This graph is static: it does not change between executions. The only variation is which LLM-routed paths are activated based on the input data.
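A toy executor for such a graph might look like this sketch, assuming each node is either a deterministic function or a structured LLM call (stubbed here as plain lambdas):

```python
# Minimal sketch of executing a static DAG: nodes run once their
# upstream dependencies have produced values.
def run_plan(nodes, deps, inputs):
    """nodes: name -> callable; deps: name -> upstream names;
    inputs: name -> initial value. Returns all intermediate results."""
    results = dict(inputs)
    while len(results) < len(nodes) + len(inputs):
        for name, fn in nodes.items():
            if name not in results and all(d in results for d in deps.get(name, [])):
                results[name] = fn(*(results[d] for d in deps.get(name, [])))
    return results

nodes = {
    "extract_coverage": lambda doc: doc["coverage"],  # LLM-routed in practice
    "verify_coverage": lambda c: c > 2_000_000,       # deterministic
}
deps = {"extract_coverage": ["certificate"], "verify_coverage": ["extract_coverage"]}
out = run_plan(nodes, deps, {"certificate": {"coverage": 3_000_000}})
```

The graph itself never changes between runs; only the values flowing along its edges do.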

Optimization: Fewer Tokens, Faster Execution

Once the execution plan is compiled, the platform optimizes it for cost and speed.

Deterministic path optimization: Code-compiled steps are optimized using standard techniques: constant folding, dead code elimination, short-circuit evaluation. If a loan application is missing the insurance certificate entirely, the platform skips all downstream coverage verification steps rather than running them against empty data.
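The short-circuit behavior can be sketched as follows, with illustrative field names:

```python
# Sketch of short-circuit evaluation: when the certificate is absent,
# all downstream coverage checks are skipped rather than run on empty data.
def process_application(application: dict) -> dict:
    if "insurance_certificate" not in application:
        # Short circuit: nothing to verify against missing data.
        return {"compliant": False, "missing_documents": ["insurance_certificate"]}
    coverage = application["insurance_certificate"]["coverage"]
    return {"compliant": coverage > 2_000_000, "missing_documents": []}
```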

LLM call optimization: Multiple LLM steps that operate on the same document are batched into a single call with a combined output schema. Instead of making five separate API calls to extract five fields from an insurance certificate, the platform makes one call that extracts all five. This reduces latency (one round trip instead of five) and tokens (the document is included in the context once, not five times).
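Batching amounts to merging the per-field schemas into one combined output schema, roughly as in this sketch (field names are illustrative):

```python
# Sketch of batching: five per-field extraction schemas merged into one
# combined schema, so the document enters the LLM context once.
field_schemas = {
    "borrower_name": {"type": "string"},
    "property_address": {"type": "string"},
    "loan_amount_usd": {"type": "number"},
    "coverage_amount_usd": {"type": "number"},
    "policy_expiration": {"type": "string"},
}

combined_schema = {
    "type": "object",
    "properties": field_schemas,
    "required": sorted(field_schemas),
}
# One structured call with combined_schema replaces five single-field calls.
```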

Constrained outputs: Every LLM call specifies an output schema using structured generation (JSON mode with a defined schema). The LLM cannot produce freeform text; it must return a valid instance of the expected type. This eliminates parsing failures and retries caused by malformed responses.
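The validation step can be approximated by a tiny checker like the one below; production systems would use full JSON Schema validation plus constrained decoding, while this sketch only checks required keys and types:

```python
# Tiny sketch of validating an LLM response against an output schema.
def conforms(instance: dict, schema: dict) -> bool:
    type_map = {"string": str, "number": (int, float), "boolean": bool}
    if any(key not in instance for key in schema.get("required", [])):
        return False
    return all(
        isinstance(instance[key], type_map[spec["type"]])
        for key, spec in schema.get("properties", {}).items()
        if key in instance
    )
```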

Token budgeting: The platform estimates token usage for each LLM step at compile time based on expected input sizes and output schemas. This makes costs predictable before the workflow processes its first document. Compare this to ReAct agents, where token usage depends on how many reasoning loops the agent needs and cannot be estimated in advance.
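A back-of-the-envelope version of that estimate, assuming roughly four characters per token (a heuristic only; a real estimate would use the pinned model's tokenizer):

```python
# Rough compile-time token budget; the chars-per-token ratio and
# per-field output cost are assumed heuristics.
def estimate_tokens(expected_input_chars: int, schema_fields: int) -> int:
    prompt_tokens = expected_input_chars // 4   # document plus instructions
    output_tokens = schema_fields * 15          # assumed ~15 tokens per field
    return prompt_tokens + output_tokens

budget = estimate_tokens(expected_input_chars=20_000, schema_fields=5)
```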

Deployment: Versioned, Testable, Rollbackable

A compiled execution plan is an artifact, like a compiled binary. It has a version number, a hash, and a complete dependency manifest (which policy version, which schema version, which model version).
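A dependency manifest and content hash might be produced along these lines; the keys and version strings are illustrative assumptions, not MightyBot's actual artifact format:

```python
import hashlib
import json

# Sketch of a compiled-plan manifest with a content-addressed hash.
manifest = {
    "policy_version": "loan-intake@3",
    "schema_version": "loan-extraction@7",
    "model_version": "llm-2025-10-01",
}
plan_hash = hashlib.sha256(
    json.dumps(manifest, sort_keys=True).encode()
).hexdigest()
```

Because the hash covers every pinned dependency, two plans with the same hash are guaranteed to execute identically.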

Testing: Plans can be tested against historical data before deployment. Run 1,000 past loan applications through the new plan, compare outputs to the previous version, and flag any differences. This is regression testing for automation, something that is impossible with nondeterministic agent frameworks where rerunning the same input may produce different results.
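The comparison step reduces to diffing two deterministic functions over historical inputs, roughly as in this sketch:

```python
# Sketch of regression testing one compiled plan version against another,
# assuming both are deterministic callables over historical inputs.
def regression_diff(old_plan, new_plan, historical_inputs):
    """Return the inputs whose outputs differ between plan versions."""
    return [x for x in historical_inputs if old_plan(x) != new_plan(x)]

old = lambda coverage: coverage > 2_000_000
new = lambda coverage: coverage >= 2_000_000  # boundary behavior changed
diffs = regression_diff(old, new, [1_500_000, 2_000_000, 2_500_000])
# Only the boundary case is flagged for human review.
```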

Rollback: If a new policy version produces unexpected results in production, rollback is instant. Switch back to the previous compiled plan. There is no retraining, no prompt tuning, no hoping the agent "learns" from corrections.

Observability: Because the execution graph is static, monitoring is straightforward. Each node reports execution time, token usage (for LLM nodes), and output values. You can identify bottlenecks, track accuracy per step, and alert on anomalies. The why-trail for each execution links every output back to the specific policy rule, schema version, and source document that produced it.

Compiled vs. Interpreted: When Each Approach Wins

The compiled approach is not universally better. It is better for a specific and large category of enterprise use cases.

Compiled execution wins when:

  • The process is well-defined and repeatable (document processing, compliance checks, data extraction)
  • Consistency matters more than flexibility (regulated industries, financial services, insurance)
  • Volume is high enough that per-execution cost and latency matter (thousands of documents per day)
  • Auditability is required (every execution must be traceable and reproducible)
  • The cost of errors is high (incorrect loan decisions, missed compliance violations)

Interpreted/ReAct wins when:

  • The task is exploratory and poorly defined (research, open-ended analysis)
  • Each execution is genuinely unique (creative work, novel problem-solving)
  • The user is interacting conversationally and refining the goal in real time
  • Error tolerance is high and the cost of retries is low

Most enterprise automation falls squarely in the first category. The process is known. The inputs are structured (or semi-structured). The outputs have a defined schema. The rules are explicit. For these workflows, compilation is the right abstraction. You do not need an agent that "figures it out" at runtime. You need a system that executes a known plan reliably.

What This Means for Engineers Evaluating Platforms

If you are building automation for an enterprise use case, ask your platform vendor these questions:

  1. Is the execution plan static or dynamic? If the plan is generated at runtime, how do you ensure consistency across executions?
  2. Can you test a plan against historical data before deploying it? If not, how do you catch regressions?
  3. What is the token cost variance for the same input? If it varies by more than 20%, the platform is likely using iterative reasoning with retries.
  4. Can you roll back to a previous version instantly? If rollback requires redeployment or retraining, the platform is not treating execution plans as versioned artifacts.
  5. Can you inspect the execution plan before it runs? If the plan is opaque, you cannot verify that it will do what you intended.

MightyBot's compilation pipeline is designed to make all five of these questions answerable with a yes.


Frequently Asked Questions

Does the compilation pipeline work with any document type?

The pipeline handles PDFs, images, structured forms, and unstructured text. Document type affects which steps are routed to deterministic vs. LLM paths. A structured form with consistent field positions uses deterministic extraction. A freeform document with variable layouts routes to an LLM call with a constrained output schema. The pipeline adapts per document, but the execution plan itself remains static.

How long does compilation take?

Policy compilation typically completes in under 30 seconds. Schema generation and execution plan optimization happen once per policy version, not per document. Once compiled, the plan executes against individual documents in seconds. The upfront compilation cost is amortized across every subsequent execution.

Can I modify the compiled plan manually?

Yes. The generated schema and execution plan are both inspectable and editable. Engineers can adjust field types, add validation rules, override the deterministic/LLM routing for specific steps, and add custom preprocessing logic. The platform treats these modifications as part of the versioned artifact, so they are preserved across recompilations and included in rollback history.

What happens when the LLM model version changes?

Model updates do not automatically propagate to compiled plans. Each plan pins a specific model version in its dependency manifest. When a new model version is available, you can recompile the plan against the new model, run regression tests against historical data, and promote the updated plan only after verifying that outputs are consistent. This prevents silent behavior changes from upstream model updates.
