
Paying for judgement in an AI world

Nigel Watson

It's striking how frequently “judgement” crops up in discussions about AI adoption. I used the word a few times in a recent blog (Is an AI retention paradox hiding in equity vesting?) and it seems to have become the go‑to catch‑all for justifying a human presence somewhere in - or hovering over - the AI loop.
 
It has also become fashionable to observe that RemCos, boards and companies will increasingly want to reward judgement in an AI world.
 
The difficulty is that most incentive plans are still designed to pay for outputs, milestones and financial results, not the quality of the decisions that produced them. Judgement is often what boards say they value, but not what the performance scorecards are actually built to detect.
 
As AI raises the baseline for drafting, search, analysis and pattern recognition, the old distinction between the highest performer and the merely competent performer starts to narrow. When more people can produce acceptable output more quickly, the differentiator moves. It moves upward from volume of work to quality of decision.
 
That has obvious implications for executive pay and reward. But it does not mean RemCos should invent a new line on the bonus scorecard labelled “judgement” and allocate 15% to it.
 
That would not solve the problem. It would just rename it.

Judgement is not a trait. It is a pattern of decisions

In an AI context, judgement is not the same thing as doing the work manually. It is the executive capacity to frame the problem properly, decide where AI should and should not be used, interpret output in context, challenge weak or unsafe conclusions and take accountable ownership of the final call.

That is why it makes intuitive sense to say that judgement belongs in executive pay.

But only if it is translated into something more concrete than “leadership quality” or “thoughtful use of technology”.

If a company wants to reward judgement, it should really be rewarding decision stewardship.

That means asking questions such as:

  • Did management identify the right use cases?

  • Did it invest in the right things and stop the wrong ones?

  • Did it use AI where it improved decision quality and restrain it where the risk was too high?

  • Did it spot exceptions, edge cases and failure points?

  • Did it improve the operating model without surrendering control?

That is much more precise. It is also much more defensible.

This is mainly an annual bonus issue, not an LTIP issue

If judgement is going to feature in pay, the annual bonus is probably its natural home.

LTIPs are designed to measure longer-range value outcomes. They are good at tracking shareholder returns, sustained financial performance or multi-year strategic delivery. They are much less good at measuring the quality of interim decisions under uncertainty.

Judgement is revealed in the choices management makes during the year: how capital is allocated, which deployments are prioritised, what is escalated, what is overridden, what is stopped and what is pushed through despite imperfection because the balance of factors justifies it.

That is normally annual bonus territory.

Put it into the LTIP and one of three things usually happens. It becomes too remote. It becomes too subjective. Or it ends up duplicating other strategic metrics that are already there.

Do not pay twice for the same thing

There is an obvious objection here.

If better judgement leads to better profit, margin or execution, is that not already rewarded through the financial metrics?

Sometimes, yes.

That is why companies need to be careful not to pay twice. A separate judgement-related component only makes sense where it captures something that is not otherwise visible in the year-end numbers. That might include disciplined capital allocation, high-quality intervention, risk avoided, unsafe deployments stopped early, or decisions whose value is real but not yet fully reflected in annual financial output.

That is an important discipline. Otherwise, “judgement” becomes a second payment for the same underlying performance.

The metric is not the same as the target

This is where many plans lose shape.

A metric is what you are measuring.
A target is the standard to be achieved.
An assessment method is how the business decides whether performance has been met.

If those three things are not kept separate, the whole exercise quickly collapses into mush.

In this context, the metric is not “judgement” in the abstract. It is something more operational, such as:

  • quality of AI-enabled decision-making in defined business areas;

  • risk-adjusted execution of AI-led operating model change;

  • disciplined capital allocation across AI initiatives;

  • quality of oversight, intervention and escalation.

The target then needs to say what success actually looks like. For example:

  • delivery of three approved AI-enabled workflow redesigns in identified functions;

  • agreed improvements in cycle time, service capacity or margin;

  • no material control failure, regulatory defect or customer harm arising from those deployments;

  • evidence that weak initiatives were stopped, redesigned or delayed rather than forced through for appearances.

And then comes the assessment method.

If a company is serious about paying for judgement, it should not assess it through a year-end narrative alone. The criteria need to be set at the start of the year and the business should assess against contemporaneous evidence: management papers, project reviews, risk input, internal control reporting, incident logs, client or customer feedback and recorded intervention points. 

If companies want to pay for judgement, they should define it ex ante and assess it ex post against evidence, not against storytelling. 

And being blunt, this is hard. By its nature, judgement is qualitative. 

Which is precisely why sloppy design turns it into mush.

Strategic metric on the way up. Risk underpin on the way down

There are really two different compensation questions here.

The first is how to reward good AI-enabled decision-making where it creates positive business value.

That is where a strategic metric can help.

The second is how to prevent payout where that value was achieved in a poorly controlled way.

That is where an underpin, gateway or risk modifier matters.

The distinction is important.

A strategic metric can reward good prioritisation, sensible adoption and operating model improvement. But a risk underpin should sit beneath it to ensure that no amount of apparent efficiency or innovation earns a payout if there has been a material governance, compliance, conduct or control failure.

A plan that rewards AI-enabled efficiency without a meaningful control gateway is not really rewarding judgement at all. It is rewarding risk transfer.
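The interaction between the strategic metric and the risk underpin can be sketched as simple arithmetic. The function below is an illustrative sketch only: the component name, the 15% weighting and the binary gateway are invented for illustration, not a recommended design, and real plans would use the RemCo's own assessed scores and underpin tests.

```python
def ai_component_payout(strategic_score: float,
                        material_control_failure: bool,
                        weighting: float = 0.15) -> float:
    """Payout (as a fraction of maximum bonus) for a hypothetical AI component.

    strategic_score: assessed achievement against pre-set targets, 0.0 to 1.0.
    material_control_failure: the risk underpin; if True, the gateway closes.
    weighting: share of the annual bonus allocated to this component.
    """
    if material_control_failure:
        # The underpin sits beneath the metric: no amount of apparent
        # efficiency or innovation earns a payout through this component.
        return 0.0
    # Clamp the assessed score to [0, 1] before applying the weighting.
    return weighting * max(0.0, min(1.0, strategic_score))


# Strong delivery with clean controls: the component pays out in proportion.
print(ai_component_payout(0.8, False))  # roughly 0.12 of maximum bonus
# The same delivery with a material control failure: the underpin zeroes it.
print(ai_component_payout(0.8, True))
```

The point of the shape, rather than the numbers, is that the underpin is a gate, not a sliding modifier: it operates before the metric score is considered at all.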

Keep the weighting disciplined

Most companies should be wary of assigning too much of the annual bonus to this category.

If the weighting is too high, the business risks creating either subjectivity or a disguised discretionary pool. In most cases, this is better handled within an existing strategic or personal scorecard component than by allowing it to dominate the bonus.

The attraction of this area is obvious. It feels modern, strategic and important. But the same discipline still applies. Reward design should stay proportionate.

What companies are really trying to detect

The challenge here is not whether judgement matters. It plainly does.

The challenge is that traditional incentive structures are much more comfortable paying for visible outputs than for the quality of the choices behind them. AI makes that tension harder, not easier. When baseline production becomes easier, the premium shifts to discernment, challenge, prioritisation and accountability.

That means pay architecture needs to become better at identifying who actually improved the quality of the final decision.

Not who adopted the most tools.
Not who launched the most pilots.
Not who told the best story at year-end.

But who made the best calls about where AI created value, where it created risk and when human intervention mattered most.

That is much closer to judgement.

And if companies really want to reward it, they will need to design for it.

At Burges Salmon, we help boards and RemCos design incentive frameworks that reward judgement in a way that is disciplined, evidence‑based and defensible. In an AI world, judgement only belongs in pay if it has been deliberately designed for.
