How to Tell If an AI Governance Rating Is Trustworthy
AI governance ratings are starting to appear, and most people cannot tell a trustworthy one from a confident-looking one. The trustworthiness of a rating is decided by its structure, not its branding. Five structural questions settle it: whether the rated system can grade itself, whether it is the only witness to what it did, whether it signs off on its own safety, whether it can actually be stopped, and whether the limits were set before the action or explained after. A rating that satisfies all five is built to be trusted. One that skips them is built to look trustworthy, which is a different and lesser thing.
A number is about to start following your company around.
Boards are asking for it. Insurers are pricing toward it. Regulators are circling it. The question "How well is this organization actually governing its AI?" is becoming a thing someone puts a score on. And where there is demand for a score, scores appear.
That is good. A market that can measure governance is healthier than one that can only argue about it.
But here is the problem with a score you did not produce: you have to trust it before you can use it. And most people have no way to tell a trustworthy governance rating from a confident-looking one.
So here is the test. Five questions. They are not about the rating's branding, its dashboard, or how many data points it claims. They are about its structure. A rating's trustworthiness is decided by how it is built, not by how good its number looks.
Why structure, not reputation
We have done this before, in every market that came to depend on a score someone else had to trust.
Lenders did not get to set their borrowers' credit scores, so an outside party did. Bond issuers did not get to rate their own bonds, so an outside party did. Manufacturers did not get to certify their own products as safe, so an outside party did. Every time, the same lesson: a score is only worth what the independence behind it is worth. The number is downstream of the structure.
AI governance is the newest version of the same story. So when someone hands you a governance rating, do not start with the number. Start with the structure. Five questions get you there.
The five questions
1. Can the thing being rated grade itself?
If the organization (or the vendor that sold it the AI) also produces the grade on whether that AI behaved, you do not have a measurement. You have a self-assessment with a number on it.
A trustworthy rating is produced by a party that does not benefit from the result. The grader and the graded are not the same entity, and not on the same payroll. If they are, the score may be honest, but you have no structural reason to trust it, and neither does anyone you would show it to.
A grade you give yourself is a claim. A grade someone independent gives you is evidence. Only one of those survives a hard question from a regulator.
2. Is the system the only witness to what it did?
A rating is only as good as the record it is built on. If the only account of what the AI did comes from the AI's own logs, and the operator can edit those logs, then the evidence underneath the score is editable. And an editable record is not a record. It is a story.
A trustworthy rating rests on evidence the rated system could not quietly change after the fact. Ask what the score is computed from. If the answer is "what they told us about themselves," you are one step removed from self-assessment again.
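One common way to make a record hard to change quietly is a hash chain whose latest hash is held by an outside party. The sketch below is illustrative only and is not a description of how any particular rating works. The names `append`, `verify`, `entry_hash`, and `GENESIS`, and the sample events, are all hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # hypothetical placeholder "previous hash" for the first entry


def entry_hash(prev_hash: str, event: dict) -> str:
    # Hash the previous entry's hash together with a canonical form of the event.
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append(log: list, event: dict) -> None:
    # Each new entry commits to everything that came before it.
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"event": event, "prev": prev, "hash": entry_hash(prev, event)})


def verify(log: list, trusted_head: str) -> bool:
    # Recompute the chain and compare its end to a head hash that was
    # recorded by a party other than the operator (an assumption of this sketch).
    prev = GENESIS
    for entry in log:
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["event"]):
            return False
        prev = entry["hash"]
    return prev == trusted_head


log = []
append(log, {"action": "approve_loan", "id": 1})
append(log, {"action": "deny_loan", "id": 2})
head = log[-1]["hash"]  # in this sketch, the independent party keeps this value

ok_before = verify(log, head)

# The operator quietly edits an earlier entry after the fact.
log[0]["event"]["action"] = "deny_loan"
ok_after = verify(log, head)
```

The design point is that the check depends on a value the operator does not control. Even if the operator recomputes every hash after an edit, the chain no longer ends at the head the outside party recorded.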
3. Does the system get to sign off on its own safety?
"We reviewed it and it is fine" is not a control. It is a sentence. Self-certification, the system or its owner declaring its own safety, carries no weight precisely because the party making the declaration is the party with the most to lose from declaring otherwise.
A trustworthy rating does not accept self-certification as input. It measures against a bar the operator did not get to set, and it does not let "we promise" stand in for "here is the evidence."
4. Can it actually be stopped?
Governance that cannot halt the thing it governs is theater. If there is no real off-switch, no point at which a human decision stops the system regardless of what the system "wants," then every other control is provisional.
A trustworthy rating checks for a stop that holds. Not a setting the system can override. Not a request the system can decline. A halt that is structurally above the system, not inside it.
5. Were the limits set before the action, or explained after?
This is the one most ratings skip, and it is the one that matters most.
Governance that arrives after the AI acts is not governance. It is a post-mortem. The limits on what a system is allowed to do have to be in force before it acts. Bound at the moment of decision, not reconstructed afterward to explain what happened.
A trustworthy rating distinguishes between "there was a policy" and "the policy was enforced before the action reached the world." The first is paperwork. The second is governance. A score that cannot tell them apart is measuring the paperwork.
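The difference between "there was a policy" and "the policy was enforced before the action reached the world" can be shown in a few lines. This is a minimal sketch under stated assumptions: `Limits`, `PolicyViolation`, `guarded_execute`, and the refund scenario are hypothetical names chosen for illustration, not part of any real system or standard.

```python
from dataclasses import dataclass


class PolicyViolation(Exception):
    """Raised when a requested action falls outside the pre-set limits."""


@dataclass(frozen=True)  # frozen: the limits cannot be mutated after creation
class Limits:
    max_amount: float
    allowed_actions: frozenset


def guarded_execute(action: str, amount: float, limits: Limits, execute):
    # The check runs BEFORE the side effect. If it fails, execute() is never called.
    if action not in limits.allowed_actions:
        raise PolicyViolation(f"action {action!r} is not permitted")
    if amount > limits.max_amount:
        raise PolicyViolation(f"amount {amount} exceeds limit {limits.max_amount}")
    return execute(action, amount)


performed = []  # stands in for "the world": everything that actually happened


def execute(action, amount):
    performed.append((action, amount))
    return "done"


limits = Limits(max_amount=1000.0, allowed_actions=frozenset({"refund"}))

result = guarded_execute("refund", 250.0, limits, execute)

try:
    guarded_execute("refund", 5000.0, limits, execute)
    blocked = False
except PolicyViolation:
    blocked = True
```

In this sketch the out-of-bounds refund never appears in `performed`, because the limit was bound at the moment of decision. A system that only logged the violation afterward would show the same policy on paper and a different record of what happened.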
What the five questions are really asking
Read them together and they collapse into one principle:
A system cannot be trusted to measure, witness, or certify itself. And the authority that bounds it has to sit outside it, and act before it does.
Governed. Attested. Measured. In that order, by someone who is none of the above.
That is the whole thing. Everything else, the dashboard, the data-point count, the five-dimension model, the color of the gauge, is downstream of those five structural facts. A rating that satisfies them is built to be trusted. A rating that skips them is built to look trustworthy, which is a different and lesser thing.
You do not need to be a security architect to apply this. You need to ask five questions and notice which ones get a real answer.
Why I am the one writing this
Because I am building the rating that is designed to pass its own test.
I am not going to tell you I am the referee. Anyone who declares himself the referee is just a louder vendor, and you have plenty of those. What I will tell you is that the test above is the one I hold my own work to: independent of the operator, built on evidence the system cannot edit, measured against a standard the operator did not write, with a stop that holds and limits that bind before the action. I call the structural rules behind it the Five Laws of AI Governance, and the measurement built on them AQ Score™, filed with the U.S. Patent and Trademark Office as a measurement standard. Governed. Attested. Measured by AQ Score. Three things that have to be true, graded by a party that is none of them.
I did not arrive at "you cannot trust a system that grades itself" from a whitepaper. I arrived at it the hard way, refusing to let unverifiable systems into places I am responsible for, and then building the standard that measures them.
So the next time a number arrives to follow your company around, do not ask how good the number looks. Ask the five questions. Then trust the score that survives them.