Ask Your MMM Anything: Agentic Dashboards for Marketing Mix Models

AI Agents
Marketing Mix Models
Bayesian Statistics
Decision Making
A recap of our PyMC Labs webinar on Decision Lens, the agentic dashboard that lets stakeholders interrogate a marketing mix model directly, with the three problems it solves and the parts that are still hard.
Author

Luca Fiaschi

Published

May 11, 2026

Last week Andy Heusser and I ran a PyMC Labs webinar called Ask Your MMM Anything. The premise is the one I have been chewing on for most of my career: building the model is no longer the hard part. Letting the people who actually make spend decisions reach the model is. Below is the recap, with the slides that did most of the work and a live walkthrough of Decision Lens on a real MMM.

The bottleneck is not the model

The first time I felt this problem was as an analyst at Rocket Internet in Europe. The model was usually fine. The follow-up questions were what killed me.

“What if competition picks up?” “What if we cut paid social by ten percent?” “What about the Northeast only?” “What if we assume synergies after the acquisition?” Every one of those was a small ask on its own. Together they ate my week. I was, in effect, a human API to a model that the business already had.

The root cause is not bad analysts and not bad stakeholders. It is the path between them.

Slide 4 from the webinar: the traditional path from stakeholder to model takes days or weeks. With an agent in the loop it takes seconds.

I think of it as three problems stacked on top of each other:

  • Slow iteration. Every “what if” goes through a person, so the loop runs in days, not seconds.
  • Translation gap. Data scientists speak Python and posteriors. Stakeholders speak P&L and PowerPoint. Every handoff loses signal.
  • Misallocated talent. Your best modelers spend their time formatting one-off scenarios instead of improving the model.

For years I assumed this was a “build better dashboards” problem. It is not. Static dashboards capture the questions someone thought of months ago. The questions you actually need are the ones that arrive in the meeting.

Two products, two users

About a year ago we started building toward a single answer to this and quickly realized it had to be two products, because the users are not the same.

Slide 5: Decision Lab is for the people who build the model. Decision Lens is for the people who need to act on it.

Decision Lab is the workbench for data scientists. It is open source. You use it to build, validate, iterate, and set guardrails on the models that drive decisions. The statistical rigor lives here.

Decision Lens is the agentic dashboard for stakeholders. Once a model has been validated in Decision Lab, you hand it to Decision Lens, and a CMO or marketing lead can query it directly in plain language: scenarios, optimizations, and plain-English explanations of what the model is doing under the hood.

Both sit on top of PyMC and PyMC-Marketing, so the underlying machinery is real Bayesian inference, not an LLM pretending to do statistics. That matters more than it sounds.

We get asked all the time: why can’t I just point ChatGPT at my data? Two reasons. A general-purpose LLM has no access to your fitted posterior, so it hallucinates numbers. And it has no native way to quantify uncertainty. If you ask it to estimate a probability, the number you get back is not derived from any posterior, so it is essentially noise. The point of Decision Lens is to be the thin, well-instrumented layer that lets a language model talk to a real model without inventing anything.
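To make the uncertainty point concrete, here is a minimal sketch of what "a probability from the posterior" means in practice. It assumes you already have posterior draws for some quantity (for example, a channel's ROAS); the draws below are simulated stand-ins, and the function name `summarize_draws` is hypothetical, not part of Decision Lens or PyMC-Marketing.

```python
import random
import statistics


def summarize_draws(draws, threshold):
    """Summarize posterior draws: mean, 94% equal-tailed interval, P(value > threshold)."""
    ordered = sorted(draws)
    n = len(ordered)
    lower = ordered[int(0.03 * n)]        # approximate 3rd percentile
    upper = ordered[int(0.97 * n) - 1]    # approximate 97th percentile
    prob_above = sum(d > threshold for d in ordered) / n
    return {
        "mean": statistics.fmean(ordered),
        "interval_94": (lower, upper),
        "prob_above": prob_above,
    }


# Simulated stand-in for posterior draws of a channel's ROAS.
# In a real workflow these would come from the fitted model, not from random.gauss.
rng = random.Random(0)
roas_draws = [rng.gauss(1.3, 0.4) for _ in range(4000)]

summary = summarize_draws(roas_draws, threshold=1.0)
print(summary)
```

The probability here is just the fraction of draws above the threshold. That is the kind of number a language model cannot produce on its own, because it has no draws to count.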

What it looks like on a real MMM

For the webinar Andy ran Decision Lens on a fitted marketing mix model with spend across TV, direct mail, newspaper, social, paid search, and a few other channels. I will not narrate the whole demo (it is the most useful part of the video), but the shape of the interaction is worth pulling out.

The session opens with a model summary: total revenue over the analysis window, media-driven share, R², a note that the chains converged cleanly, and the top channel drivers. The agent already knows the user’s goal (Andy had told it earlier the goal was a 20% budget reduction), so the summary is contextualized to that.

From there the conversation moves quickly:

  • “Show me total spend by channel.” The agent runs the analysis and drops a chart into the chat. Andy drags it to the dashboard. The dashboard is no longer static. It is being assembled as the conversation goes.
  • “Run the budget optimization, but I can’t change TV.” This is the move that matters. Real business constraints almost never live inside the model. The agent absorbs the constraint, runs the optimization, and returns a recommendation that cuts direct mail substantially and shifts spend into other channels.
  • “Why did we increase spend on newspaper? It looked low-ROI to me.” The agent reads its own output, notes that newspaper does have a poor return per dollar but is also the only channel with headroom left because everything else is already saturated, and explains the trade-off in plain English. This is the kind of question that, on a normal Tuesday, would have generated a Post-it note on a data scientist’s monitor.
  • “Create an insight card summarizing this.” A short text artifact gets added to the dashboard so the next person opening it sees what was decided and why.
  • “Share this with my CMO.” The CMO gets an email with a link to the dashboard and a fresh agent of their own, so they can keep poking at the same model from their own angle.
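The "I can’t change TV" request and the newspaper explanation can both be illustrated with a toy allocation. What follows is a sketch under stated assumptions, not the optimizer Decision Lens runs: the exponential saturation curves, the channel numbers, and the greedy step-by-step allocation are all invented for illustration. The locked channel keeps its spend, and the remaining budget goes one step at a time to whichever free channel has the highest marginal return.

```python
import math

# Hypothetical saturation curves: channel -> (cap, scale). All numbers are invented.
CURVES = {
    "tv": (500.0, 200.0),
    "direct_mail": (120.0, 40.0),
    "social": (300.0, 100.0),
    "newspaper": (80.0, 150.0),
}


def response(spend, cap, scale):
    """Revenue from a channel under a simple exponential saturation curve."""
    return cap * (1.0 - math.exp(-spend / scale))


def marginal_return(spend, cap, scale):
    """Derivative of response() with respect to spend."""
    return (cap / scale) * math.exp(-spend / scale)


def allocate(total_budget, curves, locked, step=1.0):
    """Greedy allocation: locked channels keep their spend, the rest is handed
    out in increments of `step` to the free channel with the best marginal return."""
    remaining = total_budget - sum(locked.values())
    if remaining < 0:
        raise ValueError("locked spend exceeds the total budget")
    spend = {ch: locked.get(ch, 0.0) for ch in curves}
    free = [ch for ch in curves if ch not in locked]
    while remaining >= step and free:
        best = max(free, key=lambda ch: marginal_return(spend[ch], *curves[ch]))
        spend[best] += step
        remaining -= step
    return spend


def total_response(spend, curves):
    return sum(response(spend[ch], *curves[ch]) for ch in curves)


# "Run the budget optimization, but I can't change TV."
plan = allocate(total_budget=400.0, curves=CURVES, locked={"tv": 150.0})
print(plan, round(total_response(plan, CURVES), 1))
```

In this toy setup newspaper starts with the weakest marginal return, yet it still receives a little budget once direct mail and social are far enough along their curves. That is the same saturation logic the agent gave for the newspaper question.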

A few things worth noting. Every chart the agent produces comes from real code executed inside a Marimo notebook, not from numbers the LLM made up. The notebook is exportable, so a data scientist who is skeptical of an answer can open it and check the math. Andy’s point in the webinar was the right one: this is a developer-friendly compute environment, and the agent’s reasoning is fully auditable.

How we keep the agent from making things up

The reason most “ask your data” demos die in production is hallucination. We treat that as a first-class engineering problem, not a prompting problem.

A few of the layers that matter:

  • Pre-loaded model context. Before the user asks anything, the agent has the fitted model’s parameters, ROAS estimates, fit diagnostics, and training window in its context. It does not have to guess.
  • Code-grounded answers. All quantitative claims come from code the agent writes and executes in a sandboxed Marimo notebook, not from the LLM’s own arithmetic.
  • A validation node in the agent graph. We use LangGraph, and before any numeric answer goes back to the user, a validation step double-checks it against a gold-standard analysis. If it does not match, the system retries instead of shipping the bad answer.
  • One agent, many skills. The architecture is intentionally simple: a single agent with a curated library of skills compressed from our client work. The complexity is in the skills and the harness around the LLM, not in a sprawling multi-agent topology. We add agents only when there is a real reason to, which there usually is not.
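The validate-then-retry idea from the list above can be sketched in a few lines of plain Python. This is not LangGraph code and not our production node; `answer_with_validation`, the tolerance, and the attempt limit are hypothetical names and values chosen for illustration. The shape is the point: a candidate number only goes back to the user if it agrees with a reference calculation, and a mismatch triggers another attempt rather than a reply.

```python
import math


def answer_with_validation(generate, reference, max_attempts=3, rel_tol=0.01):
    """Call generate(attempt) until its number agrees with reference() within rel_tol.

    Returns a dict recording the value, how many attempts were used, and whether
    validation passed. An unvalidated value is never returned.
    """
    expected = reference()
    for attempt in range(1, max_attempts + 1):
        candidate = generate(attempt)
        if math.isclose(candidate, expected, rel_tol=rel_tol):
            return {"value": candidate, "attempts": attempt, "validated": True}
    return {"value": None, "attempts": max_attempts, "validated": False}


def flaky_generate(attempt):
    # Stand-in for agent-written analysis code: wrong on the first try, right afterwards.
    return 1250.0 if attempt == 1 else 1000.4


def gold_standard():
    # Stand-in for the trusted reference analysis.
    return 1000.0


result = answer_with_validation(flaky_generate, gold_standard)
failed = answer_with_validation(lambda attempt: 1250.0, gold_standard)
print(result, failed)
```

When every attempt fails, the sketch returns no value at all, which mirrors the design choice of refusing to ship a number that did not pass the check.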

None of this makes the system bulletproof. LLMs are stochastic, and you cannot drive that to zero. What it does mean is that when millions of dollars of marketing budget are riding on an answer, the answer comes from a system that has been engineered to catch its own mistakes, not from a chatbot guessing.

Everything is deployable inside a client’s own cloud, with sandboxed compute, K8s-native scaling, a permissions model that distinguishes admins from data scientists from stakeholders, and traces of every action. The data does not leave the client’s environment.

The parts that are still hard

I want to be honest about what is not solved.

Slide 10: the two failure modes that come up most often in real deployments.

The agent has to know what it does not know. A stakeholder asks, “what if a new competitor enters our market?” The model has no competitive variable, so the right behavior is not to invent an answer. It is to recognize the limit, explain it, and file a structured request back to the data science team to extend the model. We are still tightening that loop, and it is one of the more interesting parts of the work.
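One way to picture the "know what you do not know" behavior is a coverage check that runs before any scenario does. The sketch below is an assumption about how such a check could look, not a description of our implementation: `ModelExtensionRequest`, `check_scenario`, and the variable names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ModelExtensionRequest:
    """Structured request back to the data science team (hypothetical schema)."""
    question: str
    missing_variables: list
    note: str = "Scenario needs variables the fitted model does not include."


def check_scenario(question, scenario_variables, model_variables):
    """Return None if the model covers the scenario, else a request to extend it."""
    missing = sorted(set(scenario_variables) - set(model_variables))
    if missing:
        return ModelExtensionRequest(question=question, missing_variables=missing)
    return None


# Variables the fitted model actually contains (illustrative names).
MODEL_VARIABLES = {"tv", "direct_mail", "newspaper", "social", "paid_search"}

supported = check_scenario(
    "What if we cut paid search by ten percent?",
    scenario_variables={"paid_search"},
    model_variables=MODEL_VARIABLES,
)
unsupported = check_scenario(
    "What if a new competitor enters our market?",
    scenario_variables={"competitor_entry"},
    model_variables=MODEL_VARIABLES,
)
print(supported, unsupported)
```

The hard part in practice is the step this sketch skips: mapping a free-text question onto the variables it actually requires.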

Change management is a bigger blocker than the tech. Large organizations have a workflow for getting analytical answers. It involves tickets, decks, and meetings. Telling a marketing team they can now type into a chat box and get a posterior-backed answer in twelve seconds is, surprisingly, not always greeted with cheers. People have to relearn how to ask questions and how to trust the answers. That part of the deployment is hands-on and unavoidable.

There are smaller things too. The same question phrased two different ways will sometimes route through different skills, so we put effort into making the numbers stable even when the prose is not. And scenario planning is most useful when you can compare scenarios side by side, which is something we customize per client rather than ship as a one-size view.
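One generic technique for keeping numbers stable across phrasings is to reduce each request to a canonical scenario specification and key results on that, so two wordings of the same scenario reuse the same computation. The sketch below illustrates that idea only. It is an assumption, not a description of how Decision Lens does it, and `scenario_key` and `ScenarioCache` are hypothetical names.

```python
import hashlib
import json


def scenario_key(spec):
    """Hash a scenario spec in a canonical form, independent of key order."""
    canonical = json.dumps(spec, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class ScenarioCache:
    """Compute each distinct scenario once and reuse the result afterwards."""

    def __init__(self, compute):
        self._compute = compute
        self._results = {}
        self.computations = 0

    def get(self, spec):
        key = scenario_key(spec)
        if key not in self._results:
            self._results[key] = self._compute(spec)
            self.computations += 1
        return self._results[key]


def toy_compute(spec):
    # Stand-in for running the model under the scenario.
    return round(1000.0 * (1.0 + spec["change_pct"] / 100.0), 2)


cache = ScenarioCache(toy_compute)

# Two phrasings of the same question, parsed into specs with different key order.
spec_a = {"channel": "paid_social", "change_pct": -10}
spec_b = {"change_pct": -10, "channel": "paid_social"}

first = cache.get(spec_a)
second = cache.get(spec_b)
print(first, second, cache.computations)
```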

What changes for your team

Some people read this story as “you no longer need data scientists.” I think the opposite is more likely.

Slide 12: the role of the data scientist shifts from human API to capability builder.

When stakeholders can ask the model directly, the data science team gets pulled off the treadmill of one-off scenarios and back onto the work that actually compounds: improving the model, setting guardrails, expanding coverage to new use cases, hardening the system. That is harder work, and better work.

For the organization, the math is what the slide says. Roughly ten times more people end up touching the model without ten times more data scientists. Time to a decision collapses from days to seconds. Decisions get made in front of the same numbers instead of in parallel decks. And the model investments you already made get a lot more valuable because more of the business is actually using them.

Where this goes next

We picked MMM for the webinar because it is the most universally felt version of the problem. According to recent eMarketer data, 62% of advertisers say they are actively using or building media mix models, and a lot of those budgets are large enough that the decisions move real money. But Decision Lens is not just an MMM tool. The same architecture (a validated Bayesian model plus a tightly scoped agent on top) also works for forecasting, causal inference, customer lifetime value, pricing, and most of the quantitative machinery that currently lives behind a technical wall.

If you want to dig in further, three places to go from here:

The thing I keep coming back to is that the math does not have to be less rigorous for the model to be more accessible. If you do the engineering right, both can be true at the same time, and the model finally gets to do the job it was built for.