New: Baz Planner - eliminate entire classes of bugs before code is written

Code review is too late

Baz reviews the plan before the code exists, and turns the pull request into a confirmation.

Use Baz where you work

Plan - Backfill late-arriving events in the orders pipeline DATA-2187

Baz Planner · drafted from the DATA-2187 thread · Pending review · v3 · 2 minutes ago
No reviews yet · review requested from Sofia Keller

Context

The orders pipeline ingests events in hourly batches. The mobile SDK retries from a device queue, so a slice of events lands hours late and is currently dropped from fct_orders. The goal: capture late arrivals without double-counting. The five stages below change together, in three moves:

  1. Capture - record a processing timestamp on every event so late arrivals can be detected.
  2. Re-window - drive the incremental model off processing time with a bounded lookback.
  3. Backfill - idempotently re-merge the last 14 days so no day is left short.

1. Source

Stamp event_ingested_at (processing time) onto every row as it lands, in stg_orders. The producer already emits an event_time; this adds the warehouse arrival time alongside it.

2. Schema

Add a nullable event_ingested_at timestamp to the orders_landing table. No historical backfill of the column is required - new loads populate it, and the model treats nulls as on-time.

3. Incremental model

Switch the fct_orders incremental predicate from event_time to event_ingested_at with a 3-day lookback, and dedupe on order_id keeping the latest row.

    {% if is_incremental() %}
  -   where event_time > (select max(event_time) from {{ this }})
  +   where event_ingested_at >=
  +     (select max(event_ingested_at) from {{ this }}) - interval '3 days'
    {% endif %}
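To see why the predicate change matters, here is a minimal Python sketch of the two filters and the dedupe step. It is an illustration under stated assumptions, not the dbt model itself: the row dictionaries, the helper names, and the sample timestamps are all hypothetical.

```python
from datetime import datetime, timedelta

LOOKBACK = timedelta(days=3)  # mirrors the 3-day lookback in the plan


def by_event_time(incoming, table):
    """Old predicate: keep rows whose event_time is past the table's max event_time."""
    watermark = max(r["event_time"] for r in table)
    return [r for r in incoming if r["event_time"] > watermark]


def by_ingested_at(incoming, table):
    """New predicate: keep rows ingested within the lookback of the max event_ingested_at."""
    cutoff = max(r["event_ingested_at"] for r in table) - LOOKBACK
    return [r for r in incoming if r["event_ingested_at"] >= cutoff]


def dedupe_latest(rows):
    """Keep one row per order_id, preferring the most recently ingested row."""
    latest = {}
    for r in rows:
        current = latest.get(r["order_id"])
        if current is None or r["event_ingested_at"] > current["event_ingested_at"]:
            latest[r["order_id"]] = r
    return list(latest.values())


# Hypothetical state: fct_orders already holds an order that happened at 12:00.
table = [{
    "order_id": "A",
    "event_time": datetime(2024, 6, 3, 12),
    "event_ingested_at": datetime(2024, 6, 3, 13),
}]
# Hypothetical late event: it happened at 09:00 but the device delivered it at 18:00.
late = {
    "order_id": "B",
    "event_time": datetime(2024, 6, 3, 9),
    "event_ingested_at": datetime(2024, 6, 3, 18),
}

dropped = by_event_time([late], table)    # empty: 09:00 is behind the 12:00 watermark
captured = by_ingested_at([late], table)  # contains B: it arrived after the cutoff
merged = dedupe_latest(table + captured)  # one row each for A and B
```

Because the lookback re-selects rows that were already merged, the dedupe on order_id is what keeps the wider window from double-counting.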

4. Backfill job

Add an idempotent backfill for the last 14 days of partitions, guarded so it can never --full-refresh the production table. Re-runs upsert on order_id and converge to the same totals.
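A rough sketch of what "idempotent" and "guarded" mean here, in Python rather than the warehouse's own merge statement. The function names, the `target` and `full_refresh` parameters, and the sample rows are assumptions made for illustration; the 14-day partition window is left out to keep the example short.

```python
def guard_full_refresh(target, full_refresh):
    """Hypothetical guard: refuse a full refresh against the production table."""
    if target == "prod" and full_refresh:
        raise RuntimeError("full refresh is not allowed on the production table")


def upsert(table, rows):
    """Merge rows into a table keyed on order_id; an incoming row replaces the existing one."""
    for r in rows:
        table[r["order_id"]] = r
    return table


def backfill(table, source_rows, target="prod", full_refresh=False):
    """Run the guard first, then upsert the source rows."""
    guard_full_refresh(target, full_refresh)
    return upsert(table, source_rows)


# Hypothetical data: fct_orders has order A; the source also holds late order B.
fct_orders = {"A": {"order_id": "A", "amount": 10}}
source = [{"order_id": "A", "amount": 10}, {"order_id": "B", "amount": 5}]

backfill(fct_orders, source)
total_first = sum(r["amount"] for r in fct_orders.values())

backfill(fct_orders, source)  # re-run with the same input
total_second = sum(r["amount"] for r in fct_orders.values())
```

Both totals come out the same because the second run overwrites each order_id with an identical row instead of appending a copy.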

5. Downstream consumers

The revenue rollup and the finance dashboard both read fct_orders. Late merges revise historical daily totals, so notify #data-platform and move the dashboard to as-of semantics.
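One way to read "as-of semantics" is that the dashboard reports a day's total as it was known at a chosen point in time, so a later revision is visible rather than silent. The Python sketch below assumes that reading; the function name, the row shape, and the sample values are hypothetical.

```python
from datetime import date, datetime


def daily_total_as_of(rows, day, as_of):
    """Sum amounts for orders on `day`, counting only rows ingested at or before `as_of`."""
    return sum(
        r["amount"]
        for r in rows
        if r["event_time"].date() == day and r["event_ingested_at"] <= as_of
    )


# Hypothetical rows: order B happened on 3 June but only arrived on 4 June.
rows = [
    {"order_id": "A", "amount": 10,
     "event_time": datetime(2024, 6, 3, 12), "event_ingested_at": datetime(2024, 6, 3, 13)},
    {"order_id": "B", "amount": 5,
     "event_time": datetime(2024, 6, 3, 9), "event_ingested_at": datetime(2024, 6, 4, 2)},
]

day = date(2024, 6, 3)
before = daily_total_as_of(rows, day, datetime(2024, 6, 3, 23, 59))  # 10
after = daily_total_as_of(rows, day, datetime(2024, 6, 4, 23, 59))   # 15
```

The same day yields two totals depending on the as-of time, which is the revision the finance dashboard would need to surface.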

Verification

dbt build passes with the unique/not_null tests on order_id, row counts reconcile against source for a seeded late-event fixture, and no run triggers a partition full-refresh. Estimated 4 models - no destructive migration.
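The checks above can be approximated outside dbt as plain assertions. This Python sketch is not dbt's test implementation; the helper names and the seeded fixture, including a retried delivery of the late order, are assumptions.

```python
def check_not_null(rows, column):
    """True when every row carries a non-null value in `column`."""
    return all(r.get(column) is not None for r in rows)


def check_unique(rows, column):
    """True when no value in `column` appears more than once."""
    values = [r[column] for r in rows]
    return len(values) == len(set(values))


def counts_reconcile(source_rows, model_rows, key="order_id"):
    """True when the model has exactly one row per distinct key in the source."""
    return len({r[key] for r in source_rows}) == len(model_rows)


# Hypothetical seeded fixture: order B arrives late and is delivered twice by a retry.
source_fixture = [
    {"order_id": "A"},
    {"order_id": "B"},
    {"order_id": "B"},  # retried delivery of the same order
]
# What fct_orders is expected to hold after the merge and dedupe.
model_rows = [{"order_id": "A"}, {"order_id": "B"}]

not_null_ok = check_not_null(model_rows, "order_id")
unique_ok = check_unique(model_rows, "order_id")
reconcile_ok = counts_reconcile(source_fixture, model_rows)
```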

AI coding is foundational to how software gets built, and teams need a new verification layer - Baz is defining it.

Baz is more contextual, scalable, and dependable - refreshing for teams shipping AI-native software.

The same bug has two prices

Post-coding review catches mistakes after you’ve paid to make them. Planner moves the review to the plan: drafted from the ticket, checked against your architecture, gated by risk, and approved before an agent writes code. Teams report up to 65% less rework after merge, measured by reverts and hotfixes.

Caught in review
  1. write the code
  2. review finds it, you rework it
  3. re-test, re-review, switch back in
Caught in the plan
  1. ~0×: a plan edit, before the code exists
plan ∙ DATA-2187
baz planner extension
Claude Code · Baz Enterprise
~/orders-pipeline
> Plan DATA-2187: backfill late-arriving events in the orders pipeline. Don’t write code yet.
Routing the plan through Baz to check it against the warehouse + dbt project before I propose anything.
Baz (plan · data-loss & cost scan)
└ Done (21 tool uses · 39.2k tokens · 31s)
Proposed plan - late-event backfill in orders pipeline (DATA-2187)
1. Stamp event_ingested_at (processing time) onto every row in stg_orders.
2. Drive the fct_orders incremental model off event_ingested_at with a 3-day lookback.
3. Add an idempotent backfill for the last 14 days, guarded against --full-refresh in prod.
Baz: Filtering the incremental model on event_time silently drops late events - they land after the watermark. Switched the predicate to event_ingested_at so no rows are lost, and the lookback re-merges late arrivals idempotently.
−51% tokens vs. an unplanned session · 0 silent data-loss paths left open
Approve this plan before any pipeline code is written?

Review becomes confirmation, not correction

Whatever still reaches a pull request gets a final check from purpose-built agents with full context: behavior, requirements, APIs, architecture, production systems, and security.

Spec Reviewer Agent

Validates code against product requirements, designs, and expected behavior - catching gaps and deviations before they ship.

Jira · Figma · Linear
Requirements traced

Advanced Security Agent

Reasons across auth and network boundaries, infrastructure, pipelines, and application code to uncover vulnerabilities.

OWASP Top 10
Auth & infra boundaries

SRE Agent

Correlates repository changes with production telemetry to identify reliability, performance, and observability risks - then proposes fixes.

Datadog · Sentry
Telemetry-linked fixes

Fixer Agent

Automatically applies and validates safe code changes in an isolated runtime, turning review feedback into tested commits.

Isolated runtime
Tested commits

Join the team

Help us build the platform that reviews the plan, not just the code.

View all opportunities

Move your review upstream

Start free on a real repository, or talk to us and we’ll run Baz against a change from your backlog.