Semantic model sprawl is a symptom, not the disease

Semantic model sprawl isn't caused by having too many Power BI models. It's caused by having too many competing definitions of the business. Fix the numbers everyone trusts, and the model count largely takes care of itself.

There's a good piece doing the rounds on semantic model sprawl - the familiar enterprise pattern where report-first development quietly breeds dozens, then hundreds, of overlapping Power BI models. The advice is sound: thin reports over thick ones, certification tiers, separate model and report workspaces. I'd sign off on all of it.

But I want to push on the framing, because counting models misses what's actually gone wrong.

The count isn't the problem

Sprawl gets diagnosed as a hygiene issue - too many models, too much duplicated refresh, too much storage. Tidy it up and you feel better. But a hundred tidy models that each carry their own definition of "revenue" are not better governed than a hundred messy ones.

The real failure is that every one of those models encodes a private opinion about what the business measures. In a manufacturing setting it's rarely revenue that bites first - it's OEE (overall equipment effectiveness). Every plant builds its own availability report. Three of them count planned downtime differently. By the time the numbers reach a leadership review, the meeting isn't about what to do, it's an argument about whose number is right. That argument is the cost of sprawl. The storage bill is a rounding error next to it.
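To see how far apart two honest plants can land, here's a minimal sketch with made-up figures for a single shift. Both functions are illustrative conventions I've assumed for the example, not anyone's official definition: one measures availability against the whole shift, the other takes planned downtime out of the denominator first.

```python
def availability_vs_whole_shift(shift_min, planned_down_min, unplanned_down_min):
    """Convention A (assumed): planned downtime counts against availability."""
    run_time = shift_min - planned_down_min - unplanned_down_min
    return run_time / shift_min


def availability_vs_planned_production(shift_min, planned_down_min, unplanned_down_min):
    """Convention B (assumed): planned downtime is excluded from the denominator."""
    planned_production = shift_min - planned_down_min
    run_time = planned_production - unplanned_down_min
    return run_time / planned_production


# Illustrative figures only: a 480-minute shift, 60 minutes of planned
# maintenance, 36 minutes of unplanned stops.
shift, planned, unplanned = 480, 60, 36

a = availability_vs_whole_shift(shift, planned, unplanned)
b = availability_vs_planned_production(shift, planned, unplanned)

print(f"Convention A: {a:.1%}")  # 80.0%
print(f"Convention B: {b:.1%}")  # 91.4%
```

Same machine, same shift, and an eleven-point gap before anyone has made a mistake. Neither convention is wrong on its own; the problem is two models each calling its answer "availability".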

Governance usually aims at the wrong target

Most Power BI governance effort goes on the perimeter. Tenant settings. Sensitivity labels. Who's allowed to publish. Useful guardrails, all of them, and none of them touch the thing that determines whether two reports agree.

The artefact that decides that is the semantic model. One certified model that defines the measures, owned by a named person, with everything else built as a thin report on top of it. Get that right and you've governed the part that matters. Get it wrong and no amount of label taxonomy will save you, because the disagreement is baked into the definitions themselves.
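If you want to know where you stand today, the check is simple enough to sketch. The inventory below is hypothetical (the model and measure names are invented for illustration), and matching on measure name is a simplification: in practice two measures with different names can hide the same disagreement.

```python
from collections import defaultdict

# Hypothetical inventory: model name -> measures that model defines itself.
inventory = {
    "Core Operations": {"OEE Availability", "Scrap Rate"},
    "Plant 2 Availability": {"OEE Availability"},
    "Plant 5 Quality": {"Scrap Rate", "First Pass Yield"},
}


def models_by_measure(inventory):
    """Map each measure name to the list of models that define it."""
    found = defaultdict(list)
    for model, measures in inventory.items():
        for measure in sorted(measures):
            found[measure].append(model)
    return dict(found)


def contested_measures(inventory):
    """Measures defined in more than one model, i.e. competing definitions."""
    return {
        measure: models
        for measure, models in models_by_measure(inventory).items()
        if len(models) > 1
    }


for measure, models in contested_measures(inventory).items():
    print(f"{measure}: defined in {', '.join(models)}")
```

The length of that list is a more useful number than the model count: three models with two contested measures is a worse estate than thirty thin reports over one core.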

Certification only works if someone will say no

The three-tier certification model (uncertified, promoted, certified) is the right shape, but it fails when "certified" becomes a rubber stamp because nobody wants the awkward conversation. A certification process where everything gets certified is just a longer way of having no certification at all.

Certification is a governance function precisely because it includes the word no. No, that measure duplicates one we already have. No, that model can't be promoted until it has an owner who'll answer for it. If your process can't say that, you don't have governance.
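Those two refusals are easy to write down as a gate. This is a sketch under assumptions, not a Power BI API: `Submission` and `reasons_to_say_no` are hypothetical names, and the duplicate check compares measure names only, where a real review would compare the definitions behind them.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Submission:
    """A model put forward for certification (hypothetical shape)."""
    model: str
    owner: Optional[str]
    measures: frozenset


def reasons_to_say_no(submission, certified_measures):
    """Return every reason to refuse; an empty list means it can proceed."""
    reasons = []
    if not submission.owner:
        reasons.append("no named owner")
    # Name matching is a stand-in; a real review compares the definitions.
    for measure in sorted(submission.measures & certified_measures):
        reasons.append(f"'{measure}' duplicates a certified measure")
    return reasons


certified_measures = frozenset({"OEE Availability", "Scrap Rate"})
request = Submission(
    model="Plant 3 Downtime",
    owner=None,
    measures=frozenset({"OEE Availability", "Changeover Minutes"}),
)

for reason in reasons_to_say_no(request, certified_measures):
    print("No:", reason)
```

The code is the easy part. What matters is that the function is allowed to return a non-empty list, and that someone is prepared to read it out in the meeting.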

Self-service belongs on top, not instead

None of this is an argument for locking everything down. The teams that try to ban departmental modelling end up with shadow exports to Excel within a month, and now you've lost the lineage entirely. Composite models exist for a reason: connect to the certified core, add the local budget or target table the central model will never carry, and you get the last 20% without forking the definition of the first 80%.

That's the whole game. A governed core that everyone trusts, and freedom around the edges that builds on it rather than competing with it. Governance as the thing that makes self-service safe, not the thing that strangles it.

So before you launch a project to rationalise your model estate, ask the quieter question. Not "how many models do we have," but "for the handful of numbers this business actually runs on, is there one model that owns each, and one person who owns that model." If the answer is no, the sprawl was never the disease. It was the symptom.