
Schema in the Data Layer: a Rust Tracer Bullet for DYFJ

The first meaningful Rust commit for DYFJ proves a stance worth defending — that the database, not the language, owns the contract.

Today I shipped the first meaningful Rust commit for DYFJ — a tracer bullet that writes one event into Dolt and reads it back, end-to-end, with compile-time-checked queries. About two hundred lines of code that prove a stance I care about: the schema lives in the data layer, not in language types.

It’s a small feature. What matters is what it locks in.

The category error I keep seeing

Most stacks I’ve worked in over the last few decades define their data contract in code: the framework’s native classes, types, or interfaces are the contract, and the database schema is treated as an export from them rather than the other way around. The database has a schema, the language has a parallel one, and they drift. The drift is silent until something breaks in production at 3am.

The AI-infrastructure version of this is worse, not better. Every “agent framework” I’ve looked at lately defines memory models, event shapes, and tool schemas as Python or TypeScript classes — and then exports them to JSON Schema or OpenAPI specs that pretend to be the contract. The class is the source of truth. The actual runtime is whatever interprets that class.

This is backwards. The data is going to outlive the language. Switch runtimes, swap orchestrators, replace the harness entirely — the data should still know what it is. So in DYFJ, the schema lives where the data lives: as Dolt DDL, committed to the repo in schema/. Every TypeScript or Rust binding is a consumer of that schema, not its definition.

Most projects that take this stance say it once in a README and move on. I wanted to enforce it.

The tracer bullet

schema/001_events.sql defines what an event looks like — OpenTelemetry correlation fields, security identity, event-type ENUM, the standard stuff. Eleven NOT NULL fields plus a nullable parent_span_id for root-span detection.

The Rust tracer bullet exposes two library functions:

pub async fn write(pool: &MySqlPool, event: &Event) -> Result<()>
pub async fn read_by_id(pool: &MySqlPool, event_id: &str) -> Result<Option<Event>>

Both use sqlx::query! and sqlx::query_as! macros — the variants that check the SQL at compile time against the actual database. If I write SELECT non_existent_field FROM events, the build fails. If I rename a column in the schema, the Rust code fails to compile until I update the queries. That’s the stance enforced at the language boundary.

For anyone cloning the repo without Dolt running, the .sqlx/ directory holds JSON cache files — pre-recorded query validations. Standard sqlx workflow: cargo build works offline using the cache; cargo sqlx prepare regenerates it whenever queries change.

The integration test at core/tests/schema_round_trip.rs does the actual round-trip:

$ cargo test -- --ignored
test round_trip_session_start_event ... ok

Generate a synthetic event, write it, read it back by primary key, assert field-by-field equality — including the nullable parent_span_id, which round-trips as None. That’s the proof.

On testing posture

Writing the failing test first paid exactly once, at the layer that mattered. Pure TDD as ritual is overkill for a 200-line binary. But the integration test is genuinely the success criterion, and writing it before the library functions existed forced the API to be defined by use rather than by guess. The compiler told me what to build. Three function signatures, one error type, one struct — that’s the API, and I never had to second-guess it because the test was already calling it. Unit tests came after the implementation, because the internals didn’t exist to test until I’d written them. Different layers, different timing. Not religion.

What’s next

The tracer bullet is the substrate’s foundation, not its surface. Next is extending it — more event types beyond session_start, batched writes, query helpers — and then the second Layer 0 stance gets pressure-tested in working code.

The repo is at github.com/bitspace-ai/dyfj. The commit log narrates the day’s work, including the wrong turns. That’s the point of working in public.