“Clinical-grade” has become a marketing word. It gets attached to anything pointed at a hospital, and it rarely survives a second question. So it is worth saying plainly what we mean by it — because the gap between consumer AI and clinical-grade AI is not a matter of degree. It is a matter of kind.
Safety is the feature, not the constraint
Most software treats safety as a tax on shipping. In healthcare it is the product. A tool that is occasionally wrong in a low-stakes app is an annoyance; at the point of care, the same error rate means patients harmed. Clinical-grade means designing for the failure case first and the happy path second.
That shows up concretely: hazard logs, a named clinical safety officer, and adherence to DCB 0129 and DCB 0160, the NHS clinical risk management standards for building and for deploying health IT. Not because a framework demands it, but because that discipline is how you find the harm before it finds a patient.
Explainability, or it doesn't count
A recommendation a clinician cannot interrogate is a recommendation they should not trust. Clinical-grade AI shows its reasoning and cites its sources, so the person accountable for the decision can see why the system said what it said — and overrule it.
If we cannot explain why a system reached a conclusion, we treat that as a defect.
This is also what makes a tool defensible to a regulator. The same property that earns a clinician's trust earns an auditor's.
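One way to picture "unexplained means defective" in code is a recommendation type that cannot be built without reasoning and sources. This is a minimal sketch, not a description of any real product: every name in it (`Recommendation`, `Decision`, `review`, `UnexplainedRecommendation`) is hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


class UnexplainedRecommendation(ValueError):
    """Raised when a recommendation arrives without reasoning or sources."""


@dataclass(frozen=True)
class Recommendation:
    # Hypothetical fields: what the system advises, why, and on what basis.
    advice: str
    reasoning: str
    sources: tuple[str, ...] = ()

    def __post_init__(self) -> None:
        # A missing explanation is treated as a defect, not a warning.
        if not self.reasoning.strip():
            raise UnexplainedRecommendation("no reasoning supplied")
        if not self.sources:
            raise UnexplainedRecommendation("no sources cited")


@dataclass(frozen=True)
class Decision:
    # The accountable clinician's verdict, kept alongside the recommendation
    # so the record can be audited later.
    recommendation: Recommendation
    clinician: str
    accepted: bool


def review(rec: Recommendation, clinician: str, accepted: bool) -> Decision:
    """Record whether the clinician accepted or overruled the recommendation."""
    return Decision(recommendation=rec, clinician=clinician, accepted=accepted)
```

The design choice worth noticing is that the check lives in the constructor: an unexplained recommendation never exists long enough to reach a clinician, and an overruled one is recorded in exactly the same shape as an accepted one.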
Earning the bedside, in stages
Clinical-grade tools are not switched on. They are phased in:
- Shadow — the system observes and is measured, but acts on nothing.
- Supervised — it assists, with a clinician signing off every output.
- Live — it operates in the workflow, under continuous audit.
Each stage produces the evidence required to justify the next. Nothing reaches a patient on the strength of a demo.
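The three stages read naturally as a small state machine in which evidence is the only key to the next door. The sketch below assumes a deliberately simple rule, at least one recorded piece of evidence per stage, and all names (`Stage`, `Rollout`, `InsufficientEvidence`) are hypothetical; a real programme would set far richer criteria.

```python
from enum import Enum


class Stage(Enum):
    SHADOW = 1       # observes and is measured, acts on nothing
    SUPERVISED = 2   # assists, with a clinician signing off every output
    LIVE = 3         # operates in the workflow, under continuous audit


class InsufficientEvidence(RuntimeError):
    """Raised when a stage change is requested without supporting evidence."""


class Rollout:
    def __init__(self) -> None:
        self.stage = Stage.SHADOW
        # Evidence references gathered during each stage (illustrative only).
        self.evidence: dict[Stage, list[str]] = {}

    def record_evidence(self, reference: str) -> None:
        self.evidence.setdefault(self.stage, []).append(reference)

    def may_act(self) -> bool:
        # In shadow mode the system's output never reaches the workflow.
        return self.stage is not Stage.SHADOW

    def requires_sign_off(self) -> bool:
        return self.stage is Stage.SUPERVISED

    def advance(self) -> Stage:
        if self.stage is Stage.LIVE:
            raise ValueError("already live; there is no further stage")
        # Assumed rule: the current stage must have produced evidence.
        if not self.evidence.get(self.stage):
            raise InsufficientEvidence(f"no evidence recorded for {self.stage.name}")
        self.stage = Stage(self.stage.value + 1)
        return self.stage
```

A rollout built this way starts in shadow mode, refuses to advance until something has been recorded for the current stage, and cannot skip from shadow straight to live.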
The short version
Clinical-grade AI is software built to the standards of medicine: safe by design, explainable by default, and deployed only once it has earned the right. Everything else is just AI with a stethoscope drawn on it.