The Marketplace SDK Dogfood Loop
Ten numbered runs hardened the Marketplace skills; Run 10 (PageShot) was the first to ship in Pages. Later apps dogfooded the same learnings and returned different signals: new patches, scale proof, or regression proof when the skills already matched reality.
I — The shape of the approach
Skills aren't trustworthy until something real has been built against them. Each numbered run (Run 01–Run 10 in the ledger below) was a fresh scaffold that exercised one extension point end-to-end, surfaced gaps as patches, then fed those patches into the next run.
Run 10 is PageShot — the last row in that ledger and the first production run in this narrative: the first app taken from PRD through /ship inside a real Sitecore Pages tenant, built against the patched Marketplace skills. The ships that followed — QuickCopy, Component Atlas, and Last-Edit Trail — are not Run 11–13. They build on Run 10 and everything before it (same skill files, same pipeline, new PRDs) and each one still dogfoods the process: a real scaffold, a real iframe, real tenant traffic.
Did they all add value? Yes — dogfood value is not only “new patches.” QuickCopy returned catalogue patches when production diverged from what tests encoded (the response-shape class of bugs). Component Atlas returned scale and coverage — two extension points, tenant-wide graphs, heavier xmc.agent.* use. Last-Edit Trail returned regression proof: zero new skill patches, meaning the skills already matched reality for that build. Those are different signals; all three strengthen the loop.
II — The runs, one by one
Every row is a real Marketplace app build. Each one had to do something new — a new extension point, a new scaffold, a new module — so the skills had something fresh to fail against.
III — Inside the loop
The phases above answer what ran. This section answers how — the mechanism a critic should be able to reproduce.
How a finding becomes a patch
Every run follows the same ten-step contract: prep → execute the skill as written → instrument every SDK call site with structured request / ok / error logs → agent-side gates (typecheck, build, lint, unit tests, no escape hatches) → user-side checklist → record in the matching catalogue → enqueue patch candidates → apply patches to the skill files → update ledger → next run.
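The skill files don't prescribe a single logging helper, so the sketch below is only an illustration of the instrumentation step: a hypothetical `logged` wrapper (the name and tag format are ours, not the skills') that emits the request / ok / error lines the live phase tails.

```ts
// Hypothetical instrumentation helper; illustrates the request / ok / error
// discipline. The real call shapes live in the skill files.
type SdkCall<T> = () => Promise<T>;

let seq = 0;

async function logged<T>(tag: string, call: SdkCall<T>): Promise<T> {
  const id = ++seq;
  console.log(`[${tag}] request #${id}`);
  try {
    const result = await call();
    console.log(`[${tag}] ok #${id}`);
    return result;
  } catch (err) {
    console.error(`[${tag}] error #${id}`, err);
    throw err; // never swallow: the error line is the signal the loop needs
  }
}

// Illustrative usage at a call site (names are stand-ins):
// const ctx = await logged("pages.context", () => somePagesSdkCall());
```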
The artefact is the patch. Each one gets an ID and lands in a specific section of a specific skill, with a one-line lesson. The ledger records which run applied it and which deferred — with reason. That is the audit trail for the "42 patches" headline number — they are addressable, not aggregate.
Graduation gates
Four explicit sequencing rules — none implicit, none "when it feels ready".
| Gate | Rule |
|---|---|
| Synthetic → Live | An M-run (synthetic) must be done and agent-verified before its matching L-run (live) starts. Mixing code-gap signal with runtime-gap signal kills causality. |
| One run per sitting | Execute, record, patch. No batching. The next run is the regression test for the previous one's patches. |
| Re-dogfood cuts the line | An SDK bump or a freshly-applied patch fires a re-dogfood trigger. The triggered run runs before the planned next run. |
| Patch budget | More than five patch candidates in a single run pauses the phase until they're applied — otherwise the causal trail breaks. |
What we sort findings into
Findings split three ways at recording time, so live signal stays separable from code signal:
- Agent-verified — passes typecheck / build / lint / unit tests; call shapes match skill files verbatim; no forbidden casts.
- User-verified — Christian confirmed the Sitecore-side effect in SitecoreAI (item renamed, field updated, canvas refreshed).
- Live-observed — only visible with Claude tailing the dev-server log while Christian drove the portal at the same time.
Patches themselves bin into six categories:
- doc-gap
- anti-pattern
- type sloppiness (wrapper or upstream fix)
- architectural assumption (often a multi-skill rewrite — Run 7 rewrote three)
- sandbox / runtime constraint (PageShot's iframe <a download> block)
- upstream SDK bug (tracked + deferred, never silently absorbed)
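Put together, a recorded finding carries both a verification level and a patch category. Here is a minimal data-structure sketch; the field names are ours for illustration, since the real catalogue schema isn't shown in the ledger excerpts:

```ts
// Verification level assigned at recording time.
type Verification = "agent-verified" | "user-verified" | "live-observed";

// The six patch categories named above.
type PatchCategory =
  | "doc-gap"
  | "anti-pattern"
  | "type-sloppiness"
  | "architectural-assumption"
  | "sandbox-runtime-constraint"
  | "upstream-sdk-bug";

// Hypothetical catalogue entry; field names are illustrative.
interface PatchCandidate {
  id: string;              // addressable, per the "42 patches" audit trail
  skillFile: string;       // the skill file the patch lands in
  section: string;         // the section of that skill
  lesson: string;          // the one-line lesson
  category: PatchCategory;
  verification: Verification;
  appliedInRun?: number;   // which run applied it; unset while deferred
}
```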
Who does what in the live phase
The topology is fixed and load-bearing.
- Claude starts npm run dev as a background process, tails the dev-server log, confirms the route is reachable, and reports each expected log line as it fires.
- Christian drives the Cloud Portal and Sitecore UI, performs the scripted actions one at a time, and copies devtools console lines into chat when needed.
- Both must agree a discrepancy is real before it becomes a patch candidate.
The harness UI standard (per-test card, sticky log panel, manual-check checkboxes, explicit Init / Reset / Destroy buttons) is what makes this observable in real time. Skipping it isn't a shortcut — it makes the live phase blind, and a future run has to re-do the work with the harness in place.
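To make the standard concrete, here is a minimal sketch of one per-test card, assuming a plain React + Tailwind setup; the component, props, and class names are ours for illustration, not the harness's actual code:

```tsx
import { useState } from "react";

// One card per test: explicit lifecycle buttons, manual-check boxes, and a
// log panel that mirrors what Claude reads in the dev-server tail.
interface TestCardProps {
  title: string;
  init: () => Promise<void>;
  reset: () => Promise<void>;
  destroy: () => Promise<void>;
  manualChecks: string[]; // e.g. "item renamed", "canvas refreshed"
}

export function TestCard({ title, init, reset, destroy, manualChecks }: TestCardProps) {
  const [log, setLog] = useState<string[]>([]);
  const [done, setDone] = useState<Set<number>>(new Set());

  // Wraps a lifecycle action in the request / ok / error log discipline.
  const run = (label: string, fn: () => Promise<void>) => async () => {
    setLog((l) => [...l, `[${title}] ${label}: request`]);
    try {
      await fn();
      setLog((l) => [...l, `[${title}] ${label}: ok`]);
    } catch (err) {
      setLog((l) => [...l, `[${title}] ${label}: error ${String(err)}`]);
    }
  };

  const toggle = (i: number) =>
    setDone((prev) => {
      const next = new Set(prev);
      if (next.has(i)) next.delete(i);
      else next.add(i);
      return next;
    });

  return (
    <section className="rounded border p-4">
      <h3 className="font-semibold">{title}</h3>
      <div className="my-2 flex gap-2">
        <button onClick={run("init", init)}>Init</button>
        <button onClick={run("reset", reset)}>Reset</button>
        <button onClick={run("destroy", destroy)}>Destroy</button>
      </div>
      <ul>
        {manualChecks.map((check, i) => (
          <li key={check}>
            <label>
              <input type="checkbox" checked={done.has(i)} onChange={() => toggle(i)} /> {check}
            </label>
          </li>
        ))}
      </ul>
      {/* Sticky log panel: the human-readable mirror of the structured logs */}
      <pre className="sticky bottom-0 max-h-40 overflow-auto text-xs">{log.join("\n")}</pre>
    </section>
  );
}
```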
Regression coverage, in two words
Re-dogfood. Every patch survives to the next qualifying run or it gets ripped out. A patch that doesn't survive a fresh scaffold is worse than no patch — it's a false sense of safety. That's why Run 2 exists. QuickCopy's scaffold pass showed zero friction and validated earlier patches at compile time — then production still surfaced new unwrap lessons, which is also the loop closing, one layer deeper.
IV — The production apps
Each row below went through the full /create-prd → /architect →
/task-breakdown → /implement → /code-review → /test → /document →
/ship pipeline and shipped inside a real Sitecore Pages editor against a real
tenant. PageShot is Run 10 — the first production dogfood ship and the bridge
out of the ten-run table in section II. QuickCopy, Component Atlas, and Last-Edit
Trail are later dogfood ships that apply the same learnings (patched skills,
conventions, CI gates); they are not extra numbered ledger rows, but they still
exercise the methodology every time — evidence over sprint cycles.
V — What it added up to
Patch and catalogue numbers are from the skill ledger snapshot 2026-04-27. The four production apps above are the narrative continuation — new ships can add patches or add none, but they all extend the same feedback loop.
VI — What the loop actually taught us
The patches are the artefact. These are the ideas that crystallised during the ten numbered runs and stayed true through the production apps that followed.
Skills aren’t real until something fails against them.
Run 1 produced 11 doc-gaps in a single scaffold. Reading the docs would never have surfaced any of them — you have to build against the skills with type checking on.
Re-dogfood is the contract.
Every applied patch fires a re-dogfood trigger on the next qualifying run. A patch that doesn’t survive a fresh scaffold is worse than no patch — it’s a false sense of safety.
Live co-execution catches what code-only can’t.
The Standalone umbrella model, the AI payload-size ceiling, and the iframe sandbox trap all required a human + agent in the portal at the same time. None would have shown up under a type-check or a unit test.
Wrong assumptions are the highest-value bugs.
Run 7’s “Standalone has no XMC” framing was wrong. Catching it before a customer did — and rewriting three skills as a consequence — was worth more than any of the syntax patches.
The harness pattern was reusable.
Per-test cards, sticky log panel, manual-check checkboxes, structured tag-prefixed logs — built once in Run 3, reused in every harness after. The pattern itself became part of the skill set.
Real products test the skills holistically.
PageShot crossed scaffold, SDK, Agent API, OAuth, Blok, Tailwind, Geist, iframe sandboxing, and Permissions-Policy in a single build. QuickCopy validated that the stack could ship again without scaffold friction; Component Atlas scaled surface area (two extension points, tenant-wide graphs); Last-Edit Trail showed the same pipeline could run end-to-end with no new skill patches — the ultimate regression pass.
Fast context switching needs explicit async discipline.
When pages.context fires faster than version fetches complete, UI can flash the wrong page's data. Last-Edit Trail fixed this with a request-id guard (ADR-0010) — a pattern worth copying anywhere subscription-driven panels overlap slow queries.
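ADR-0010's actual code isn't reproduced here, but the guard pattern it names is simple enough to sketch; the function names below are stand-ins, assuming a pages.context-style subscription and a slow version fetch:

```ts
// Request-id guard: only the response belonging to the *latest* page switch
// is allowed to render, so the panel never flashes a stale page's data.
let latestRequestId = 0;

function onPageContextChanged(pageId: string): void {
  const requestId = ++latestRequestId; // stamp this fetch

  fetchVersions(pageId).then((versions) => {
    if (requestId !== latestRequestId) return; // superseded; drop silently
    renderVersions(versions);
  });
}

// Hypothetical stand-ins for the app's real fetch and render functions.
declare function fetchVersions(pageId: string): Promise<unknown[]>;
declare function renderVersions(versions: unknown[]): void;
```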