Lesson 01 · The Standard
How do you build an AI agent that doesn't break?
Most AI agents in the UAE fail within weeks, quietly. The fix is a standard, not a tool: would it still run if no one could ever touch it again? This is the Orbit Test.
Breaking is rarely loud. It is your AI quietly telling a customer the wrong thing, quoting the wrong price, answering an Arabic enquiry in clumsy English, and making a brand you spent years building look cheap. It can run fine for months, and you only find out it broke when the leads stop coming and the cash flow follows. This is how to build one that runs, and represents you, with no one watching, the way we build satellites.
Why most AI builds break
Every business in the UAE is bolting AI into its operations right now, and two things go wrong, every time. One: touch it and it collapses. Change a single piece, a price, a channel, a step, and the whole thing comes down, because nobody knows how to maintain it. Two: one small thing shifts and it topples. It is so fragile that the moment a dependency it does not control changes, and small things shift constantly, it falls over. Most builds here are a scraper duct-taped to an automation duct-taped to a chatbot. It works in the demo and breaks on you in a month.
This is why owners fear retainers. The word sells two different things: a subscription to the breakage, priced in because the vendor already knows it will fall over, or honest upkeep, the genuinely moving parts (an API, a price, a data format) maintained on a system that already works. Build it so it does not break, and the only retainer left is the honest one, for updates, not for breakage.
I learned this the hard way, from a drone to a satellite
My senior design project was a drone that I flew from a VR headset and that had to stabilise itself against every gust. Under pressure to graduate, I patched things together until it worked. It looked like it worked. I had no idea why. Then I went from a school project to a real mission, putting a UAE satellite into orbit, where you can never touch it again. Patching does not survive that. You design it to run untouched, or it dies up there. That discipline is exactly what an AI system in your business needs.
Begin with the end: fix two numbers first
Engineering starts at the end. Before you build, you fix two numbers. One: how long must it run, untouched? A satellite has a design life set by its mission. Your system needs the same. For most UAE businesses I set it at 6 to 12 months, running without breaking. Why that long? Any system you put into a business takes about six weeks to show its first real signs and around six months to reach full potential. Judge it sooner and you are judging noise.
Two: what must it produce to be worth the hassle? Every system pays back in money or time, and time converts to money. Your AI number is simple: monthly output × the value of each, minus the cost to run. A lead engine at 100 qualified leads, each worth 200 AED in pipeline, minus 1,000 AED to run, is an AI number of 19,000 AED a month. If you cannot name that number, you are about to build a hobby, not a system.
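The arithmetic above fits in a few lines. This is a minimal sketch; the function name `ai_number` and its parameter names are made up for illustration, and the figures are the lead-engine example from the paragraph above.

```python
def ai_number(monthly_output: int, value_each_aed: float, run_cost_aed: float) -> float:
    """Monthly output x the value of each unit, minus the monthly cost to run."""
    return monthly_output * value_each_aed - run_cost_aed


# The lead engine from the text: 100 qualified leads, 200 AED each, 1,000 AED to run.
lead_engine = ai_number(monthly_output=100, value_each_aed=200, run_cost_aed=1_000)
print(f"{lead_engine:,.0f} AED a month")  # prints: 19,000 AED a month
```

If you cannot fill in all three arguments for your own build, that is the gap to close before any work begins.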
Software, automation, AI: know where each one fits
A piece of software is just a computer doing a job a human would do; on its own it is dumb. Automation is that dumb software given a fixed set of rules, running the same way every time. AI is the layer of intelligence you add on top so it can handle what the rules never anticipated. Most of what people call “AI” is really automation with intelligence bolted into one part of it.
AI is probability, so you contain where it can act
Generative AI works by predicting the next word (strictly, the next token), one at a time, from everything before it. So it is genuinely useful but never fully certain. Yes, it hallucinates, and so do people on the job. The goal is not to remove the uncertainty; it is to contain where the AI is allowed to act. You do not hand AI the whole flow. You box it into a single sub-step: “write this one report,” not “run the whole onboarding.” The flow around it stays deterministic and boring, and boring is what survives.
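One way to picture containment is below. It is a sketch under stated assumptions, not a reference implementation: `PRICE_LIST_AED`, `FALLBACK_REPLY`, `draft_reply` and `handle_enquiry` are hypothetical names, the prices are invented, and `model` stands in for whatever AI call you use. The price lookup and routing are plain rules; the AI only drafts the wording, and a deterministic check decides whether that draft is allowed out.

```python
import re
from typing import Callable

# Hypothetical price list: the source of truth is data, never the model.
PRICE_LIST_AED = {"consultation": 150, "follow_up": 100}
FALLBACK_REPLY = "Thanks for your message. A member of our team will reply shortly."


def draft_reply(enquiry: str, price_aed: int, model: Callable[[str], str]) -> str:
    """The single boxed sub-step where the AI is allowed to act."""
    prompt = f"Write a short, polite reply to {enquiry!r}. The price is {price_aed} AED."
    try:
        text = model(prompt)
    except Exception:
        return FALLBACK_REPLY  # fail safe if the model call itself breaks
    # Deterministic check on the AI's output: the only number it may
    # mention is the real price, and the reply must stay short.
    numbers = set(re.findall(r"\d+", text))
    if numbers != {str(price_aed)} or len(text) > 600:
        return FALLBACK_REPLY
    return text


def handle_enquiry(enquiry: str, service: str, model: Callable[[str], str]) -> str:
    """The deterministic flow around the AI: lookup and routing are fixed rules."""
    price = PRICE_LIST_AED.get(service)
    if price is None:
        return FALLBACK_REPLY  # unknown service: hand over to a human
    return draft_reply(enquiry, price, model)
```

The check here is deliberately strict (any stray number sends the reply to the fallback). A real build would tune that rule, but the shape stays the same: the AI proposes, fixed rules decide.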
Test it the way we test a satellite
A satellite is tested in two phases. Initiation: shaken on a vibration table to survive launch. Operation: baked and frozen, roughly +120°C facing the sun and −150°C behind the Earth, in a thermal-vacuum chamber. Your AI does not face heat and cold. It faces people, who are arguably more unpredictable than space. So define your extremes and test against them: a WhatsApp voice note in Arabic, the most hostile or confused customer, a flood of enquiries during a Ramadan rush, a dependency that quietly changes its output. Under each, it must still hit the number, or fail safe.
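A tiny harness makes "define your extremes and test against them" concrete. Everything named here is an assumption for illustration: the `Extreme` record, the `FAIL_SAFE` marker, the message fields and the `toy_handler` stand-in are not from any real system. It covers input extremes only; a volume test such as a Ramadan rush would need a separate load test.

```python
from dataclasses import dataclass
from typing import Callable

FAIL_SAFE = "HANDOFF_TO_HUMAN"  # hypothetical marker for a safe handover


@dataclass
class Extreme:
    name: str
    payload: dict
    is_acceptable: Callable[[str], bool]  # what counts as a good result


def run_extremes(handler: Callable[[dict], str], cases: list[Extreme]) -> list[str]:
    """Return the names of extremes where the handler neither succeeded nor failed safe."""
    failures = []
    for case in cases:
        try:
            result = handler(case.payload)
        except Exception:
            failures.append(case.name)  # a crash is not a safe failure
            continue
        if result != FAIL_SAFE and not case.is_acceptable(result):
            failures.append(case.name)
    return failures


EXTREMES = [
    Extreme("arabic_voice_note", {"kind": "audio", "lang": "ar"},
            lambda r: r.startswith("BOOKED")),
    Extreme("hostile_customer", {"kind": "text", "text": "This is a scam!!!"},
            lambda r: r.startswith("REPLIED")),
    Extreme("dependency_changed", {"kind": "text", "text": None},
            lambda r: r.startswith("REPLIED")),
]


def toy_handler(message: dict) -> str:
    """Stand-in for the real system: it only understands non-empty text."""
    if message["kind"] != "text" or not message.get("text"):
        return FAIL_SAFE
    return "REPLIED: thank you for your message"


print(run_extremes(toy_handler, EXTREMES))  # prints: []
```

An empty list means every extreme either produced an acceptable result or failed safe. Any name left in the list is an open box.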
The Orbit Test: the standard everything is held to
One question decides it: would it run to its design life, hitting its number, with no one touching it? Hit one or two of these and it still breaks. It has to pass all five:
- □ OT-1 · Design life is set. A declared run time it must operate untouched. Six to twelve months, minimum.
- □ OT-2 · The AI number is defined. The money or time it must produce, named before any work begins.
- □ OT-3 · Automation vs AI is deliberate. Intelligence applied only where it earns its place, not sprayed across the flow.
- □ OT-4 · The AI is contained. Probabilistic behaviour bounded to one sub-step; the rest stays deterministic.
- □ OT-5 · It survives the extremes. Verified against the worst inputs, hostile users, volume, and dependency change.
Score a real build
A Business Bay clinic gets enquiries on WhatsApp and Instagram, in Arabic and English, and bought an AI that replies and books. Score it:
| Criterion | The clinic's build | Box |
|---|---|---|
| OT-1 Design life | Nobody set how long it must run untouched. The vendor never asked. | open |
| OT-2 AI number | The owner knows it: ~200 enquiries a month, ~150 AED in booked value each. | met |
| OT-3 Automation vs AI | The AI answers, qualifies and books, all of it, freely. | open |
| OT-4 Contained | The whole flow is the AI. Nothing around it is deterministic. | open |
| OT-5 Survives extremes | Never tested on an Arabic voice note, a promo rush, or an API change. | open |
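
The scorecard above can be tallied in a few lines. This is a sketch only: the `orbit_test` function and the dictionary layout are hypothetical, and the `True`/`False` values simply mirror the "met" and "open" boxes in the table.

```python
CRITERIA = ["OT-1", "OT-2", "OT-3", "OT-4", "OT-5"]


def orbit_test(scores: dict[str, bool]) -> tuple[bool, list[str]]:
    """A build passes only if all five boxes are met; return the open ones too."""
    open_boxes = [c for c in CRITERIA if not scores.get(c, False)]
    return len(open_boxes) == 0, open_boxes


# The clinic's build, as scored in the table: True = met, False = open.
clinic = {"OT-1": False, "OT-2": True, "OT-3": False, "OT-4": False, "OT-5": False}

passed, open_boxes = orbit_test(clinic)
print(passed, open_boxes)  # prints: False ['OT-1', 'OT-3', 'OT-4', 'OT-5']
```

A criterion missing from the dictionary counts as open, so an unscored box can never pass by accident.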
One box met, four open. It sails through the demo, then dies the first time a patient sends a voice note in Arabic during a Ramadan rush. That is the test doing its job: find the open boxes before you trust it with a customer, not after. Build it like a satellite, and the moment it launches, assume you can never go up and fix it.