kusuka — to weave [Swahili]
You have multiple data sources measuring the same thing. Kusuka weaves them together and shows you exactly where the threads don't meet — where the gap is, why it exists, what kind of fix it needs, and how far it's spread.
Upload two spreadsheets to reconcile, paste JSON for a quick comparison, or build a full position matrix. Everything runs in your browser — no data leaves your machine.
Every monitoring tool asks "does this value look wrong?" Kusuka asks "do two independent paths agree?" That's a fundamentally harder question to fool.
Kusuka runs a Strand — a structured comparison of two data paths at every position in your pipeline. The output is a convergence matrix you can read like a scan.
Any two independent measurements of the same system. Source A vs Source B. Model output vs ground truth. This month vs last month. You choose the paths.
Every position — each row (entity) by each column (pipeline layer) — gets a convergence glyph. Agreement, approximation, drift, failure, or missing data. Nothing hides.
Block of nulls = source problem. Systematic column = volume issue. Gradual drift = formula error. Row-localised = orphan record. The shape tells you what to fix and who should fix it.
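The glyph matrix described above can be sketched in a few lines. This is an illustrative Python sketch, not Kusuka's implementation: the SMAPE thresholds and the '=', '~', 'd' symbols are assumptions, while 'o' (one-sided null) and 'X' (outright disagreement) follow glyphs named elsewhere on this page.

```python
def smape(a, b):
    """Symmetric mean absolute percentage error for one position (0 = identical)."""
    if a == b:
        return 0.0
    return abs(a - b) / ((abs(a) + abs(b)) / 2)

def glyph(a, b, approx=0.02, drift=0.25):
    """Assign a convergence glyph to one (row, layer) position.
    The thresholds here are illustrative assumptions, not Strand Spec values."""
    if a is None and b is None:
        return ' '   # missing on both paths
    if a is None or b is None:
        return 'o'   # one-sided null: data present on one path only
    s = smape(a, b)
    if s == 0.0:
        return '='   # exact agreement
    if s <= approx:
        return '~'   # approximation: within tolerance
    if s <= drift:
        return 'd'   # drift: systematic but bounded gap
    return 'X'       # failure: the two paths disagree outright

def matrix(path_a, path_b):
    """Build the convergence matrix: one glyph per entity (row) per layer (column)."""
    return [[glyph(a, b) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(path_a, path_b)]
```

Reading the matrix is then exactly the shape analysis above: a column of 'o' glyphs points at a source, a column of 'd' glyphs at a formula, a single row of 'X' glyphs at an orphan record.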
Run Kusuka regularly. The change in convergence between runs tells you whether your pipeline is healing or degrading — before downstream users notice.
The exact position and layer where the gap first appeared. Not "something is off" — "it started here, at this step, on this entity."
Gap classification. Block null = missing source. Systematic column = volume problem. Drift = formula error. The shape of disagreement IS the diagnosis.
Ripple tracking. A gap at layer 3 propagates to layers 4, 5, 6. Kusuka shows the full chain — so you fix the root, not the symptoms.
Every gap gets a class. Each class maps to a team and a fix type. Source team for nulls. Engineering for drift. Data for volume. No ambiguity about ownership.
Run Kusuka over time. The change in convergence between runs — positive, negative, or flat — tells you if your fixes are working before anyone downstream notices.
We validate Kusuka against real problems across unrelated domains. Each study below uses real data and is fully interactive. The engine doesn't change — the domains do.
36-edge traffic network simulated with SUMO. We blocked a road at minute 20. Kusuka identified the exact segment, separated sensor noise from real congestion, and tracked the wave across neighbouring roads.
72 hours of temperature data across 24 Kenyan weather stations. When a cold front arrived, satellite estimates lagged behind ground readings. Kusuka showed exactly which stations were in the anomaly zone.
120-minute window across the Rift Valley. M4.2 earthquake at minute 45, aftershock at minute 75. Kusuka tracked wave propagation, found monitoring blind spots, and separated instrument noise from real ground motion.
6 trading episodes across 7 pipeline layers (data, signal, gate, fill, position, exit, outcome). Kusuka found 83% convergence — and one orphan row from a deprecated code path that would have gone unnoticed.
Your domain not listed? Kusuka works anywhere you have two independent ways to measure the same system. Health, finance, agriculture, infrastructure, software pipelines — tell us what you're working with.
These aren't demos — they're real Kusuka runs against production systems and research pipelines. Each one found issues that would have gone undetected by traditional monitoring.
6 trading episodes validated across 7 pipeline layers — from raw data ingestion through signal generation, gate logic, order fill, position management, exit, and outcome recording.
100 lessons in a GNN-powered knowledge graph validated across 7 layers — from raw lesson text through structural embedding, GCN propagation, semantic grounding, cross-domain linking, human-applied confirmation, and final retrieval weight.
Neural machine translation output for 4 African languages (Kikuyu, Luo, Swahili, Gusii) validated by back-translating to English and measuring semantic round-trip fidelity.
127 prediction markets with 235 price snapshots and 601 price points, validated across a 5-layer pipeline from market creation through price capture, signal generation, episode tracking, and outcome resolution.
A community health data warehouse with 784K monthly performance records across 4 counties, validated across 4 KPIs by comparing the aggregation layer against the filtered BI layer that feeds dashboards used by 600+ field staff.
The registered_households column showed 312K one-sided nulls (o glyphs) — data present upstream but absent downstream. This isn't a bug; it's a design choice. But without Kusuka, nobody had quantified how much data the filter drops (95%). In a pipeline serving 600+ users, knowing the shape of what you're NOT showing is as important as verifying what you are.
A ClickHouse-backed dbt pipeline serving 600+ dashboard users across 5 regions. 4 independent Strand tests run against production data: metrics-vs-fact reconciliation, staging-to-intermediate preservation, entity dedup boundary, and 9-month temporal drift analysis.
The two layers agreed on children_assessed but diverged on 3 other KPIs. iccm_visits diverged because the metrics layer applies a two-gate filter (sick child + non-empty diagnoses) before counting, while a naive recount from the fact table skips the first gate. referred_visits diverged 3–10x because the metrics layer counts process-step flags within the sick-child cohort, while the fact table's referral flag captures all referrals, including non-ICCM pathways. Neither layer is wrong — they measure different populations. But without Strand testing, the gap was invisible.

dbt_kusuka brings Strand convergence into your existing dbt project; install it with dbt deps. Compare any two models — or any two raw SQL queries — column by column. Get the glyph matrix, SMAPE scores, gap classifications, temporal drift detection, and summary stats. All in SQL, all in your warehouse. No external calls. No dependencies.
Works with PostgreSQL, ClickHouse, BigQuery, Snowflake, DuckDB, and Redshift. Cross-database type casting handled automatically. Strand Spec v1.1 thresholds configurable via dbt vars.
Kusuka isn't just diagnostic — it's a gate. Set a convergence threshold. If two independent paths don't agree above that threshold, the pipeline blocks. No diverged data reaches production. No silent drift compounds overnight.
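The gate logic reduces to a threshold check over the glyph matrix. A minimal Python sketch, assuming one glyph per position; the function names, the 0.95 default, and the set of glyphs counted as converged ('=' exact, '~' approximate) are all illustrative assumptions, not the shipped implementation:

```python
def convergence(glyph_matrix, converged=('=', '~')):
    """Fraction of scored positions whose glyph counts as agreement.
    Positions missing on both paths (' ') are excluded from the score."""
    scored = [g for row in glyph_matrix for g in row if g != ' ']
    if not scored:
        return 0.0
    return sum(g in converged for g in scored) / len(scored)

def gate(glyph_matrix, threshold=0.95):
    """Block the pipeline when convergence drops below the threshold."""
    score = convergence(glyph_matrix)
    if score < threshold:
        raise RuntimeError(
            f"Convergence {score:.1%} below gate threshold {threshold:.0%}: blocking")
    return score
```

Wired into a build step, a raised error is what stops diverged data from reaching production.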
kusuka_converge — your dbt build fails if convergence drops below threshold. Ships as a generic test. One line in your schema.yml.
Run a Strand comparison on PR data vs main branch data. If any column diverges (X glyph), the merge blocks. Catch schema drift, formula errors, and data source changes before they land.
kusuka_no_drift + strand_temporal — track convergence across time periods. Not just "is it wrong now" but "is it getting worse?" dΞ/dt is the derivative of your pipeline health.
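The dΞ/dt idea can be sketched as a finite difference over per-run convergence scores. A minimal Python sketch, assuming one score per run; the helper names and the flat tolerance are illustrative assumptions, not the kusuka_no_drift implementation:

```python
def drift_rate(history):
    """Approximate dXi/dt from the two most recent runs.
    `history` is a list of (run_index, convergence_score) pairs."""
    (t0, c0), (t1, c1) = history[-2], history[-1]
    return (c1 - c0) / (t1 - t0)

def trend(history, flat=0.005):
    """Label the pipeline's trajectory: healing, degrading, or flat.
    The `flat` tolerance absorbs run-to-run noise and is an assumed value."""
    rate = drift_rate(history)
    if rate > flat:
        return "healing"
    if rate < -flat:
        return "degrading"
    return "flat"
```

A "degrading" label between runs is the early warning: convergence is still above threshold, but the direction says it won't stay there.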
Kusuka compares data paths. But data isn't always numbers — sometimes the gap between what someone says and what they do is the most important signal. These enrichments extend Strand into new territory.
We don't demo with fake data. We start with your actual sources, run a real Kusuka analysis, and show you what your system is missing. If it works — and it will — we build from there.
Give us access to your two data sources. We run Kusuka against your real system and deliver a diagnostic report showing what we found.
Continuous convergence monitoring. Kusuka runs against your live data and alerts you when the gap between paths changes — before downstream users notice.
Run Kusuka inside your own platform. Self-hosted, your data never leaves your infrastructure. Full control.
Start with a 30-day pilot. Real data, real results, real insight into what your system is missing.
Book a pilot

The Strand metaphor was born watching John Chore do a manual reconciliation. He called it reconciliation. But the pattern underneath — two strands, woven together, gaps visible at every position — that was DNA. Kusuka exists because the structure was always there. It just needed someone to stop and see it.