A heuristic is the rule an expert uses to turn what they observe into a judgment. "His tone goes flat the second money comes up" is a signal. "That is stonewalling, and it is what ends the conversation" is the rule. What the coach can act on is the verdict. Signal, then rule, then verdict.
Raw signals are noise. The rule is where the coach's expertise actually lives, and it is the one part a model cannot fake from a transcript. It is what we sell, and it is what compounds. Get the rule wrong and everything downstream is a clever toy.
The rule is the coach. Everything else is plumbing around it.
The team argues about the small part. The defensible part is the big one underneath. That is the whole confusion.
Transcription gives you the signal. The rule is the part they pay for.
Gottman is in a library. EFT is in a library. Transcription and emotion scores are commodity APIs. Anyone can encode the rules. Anyone can buy the signals.
So the honest question Fred will ask, and should: why can't OpenAI ship this next month?
The verdict is not the defensible part. We should stop pretending it is.
The verdict leaves the box. The data never does. That boundary is the moat.
Clients record their real conversations between sessions; the coach sees what they never could. The read sells "clever AI." The pattern across their real life sells the moat. The trend is in v1; it is the whole point.
Client records at home, gets a kind read, shares it. The coach gets a heads-up with the cross-session trend. Press "Play the moment" inside.
The heuristic is the product they see. The infrastructure is why it keeps getting better in a way a wrapper can't. The verdict is how we acquire. The infrastructure is how it compounds.
Sell the wedge. Build the moat. Never confuse which is which.
The value isn't the agent. It's the system that stores what the agent produces and compounds it in our infrastructure.
This is the clarity. The confusion ends when everyone knows which job they are on.
The golden test pair is real: a 37-minute couples recording with a verbatim expected verdict, 6 signals, 1 verdict, 1 commitment, scored at recall ≥ 80%. Clearing it splits three ways, from the Jun 12 sync.
An agreed eval per agent: a clip in, the expected read out, relationship coach first. The lab maps a conversation into the cubby structure. Two are still failing and need a fix: monotone voice, over-rehearsed.
The CI that runs the evals against a vault on each push, already live for hiring. The deterministic acoustic checks, pace and loudness. A real thesis per signal, not transcript-to-LLM.
The lab environment, with Bren: turn a recording into the segments and cubbies the eval reads, and confirm the relationship-coach wiring.
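The deterministic acoustic checks named above, pace and loudness, can be sketched without any model in the loop. This is a minimal illustration, not the team's implementation: the function names, the word-timestamp input, and the choice of words per minute and RMS level in dBFS are all assumptions made for the example.

```python
import math


def words_per_minute(word_times: list[tuple[float, float]]) -> float:
    """Pace from (start_s, end_s) word timestamps, in spoken order.

    Assumption: timestamps come from whatever transcription step runs upstream.
    """
    if not word_times:
        return 0.0
    duration_s = word_times[-1][1] - word_times[0][0]
    if duration_s <= 0:
        return 0.0
    return len(word_times) * 60.0 / duration_s


def rms_dbfs(samples: list[float]) -> float:
    """Loudness as RMS level in dB relative to full scale.

    Assumption: samples are floats normalized to [-1.0, 1.0].
    """
    if not samples:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20.0 * math.log10(rms) if rms > 0 else float("-inf")
```

Both functions return the same number for the same input every time, which is what lets CI assert on them on each push.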
The moat scores differently: recordings retained, methods onboarded. Two scoreboards, two jobs.
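The first scoreboard, the golden-pair score, reduces to a recall check. A minimal sketch, assuming the 6 signals, 1 verdict, and 1 commitment are pooled into one expected set and matched by exact label; the `GoldenItem` type, the pooling, and the matching rule are hypothetical, not the agreed eval.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GoldenItem:
    kind: str   # "signal", "verdict", or "commitment"
    label: str  # hypothetical normalized label for the expected item


def recall(expected: set[GoldenItem], produced: set[GoldenItem]) -> float:
    """Share of expected items that the agent's read actually contains."""
    if not expected:
        raise ValueError("golden set is empty")
    return len(expected & produced) / len(expected)


def clears_golden(
    expected: set[GoldenItem],
    produced: set[GoldenItem],
    threshold: float = 0.80,  # the recall bar stated for the golden pair
) -> bool:
    return recall(expected, produced) >= threshold
```

Under these assumptions, a read that recovers 7 of the 8 expected items scores 0.875 and clears; 6 of 8 scores 0.75 and does not. Extra items in the read do not lower recall.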
Diagnostic read or therapeutic coaching as the wedge? Recommendation: diagnostic read, framework-backed. The sharper wow. Everything downstream follows this.
Relationship coach confirmed as the launch hero? It has the only fully-specified eval. Dating is the face; relationship is the provable engine.
Memory is in v1: the cross-session trend is the moat made visible. Decided. The open question is depth, not whether: how many recordings before the trend is trustworthy enough to put in front of a coach?
What anchors "right" when a relationship has no win/loss?Hiring had "who you actually hired." We need the equivalent ground truth for the verdict.
Your call on the read. The rulebook follows.
Lift the proven 3-layer engine, swap vocabulary.
Score against the golden. A number, not a vibe.
Capture, consent, compound. The flywheel turns.
The heuristic is the product they see. The infrastructure is why it compounds. That sentence is the company.