
Transformation Layer 2026: dbt Core 1.10 vs SQLMesh 0.150 vs Dagster 1.10

久保 真由美, Lead Analytics Engineer
2026-04-21 · 14 min read
dbt, SQLMesh, Dagster, Semantic Layer, MetricFlow, Cube.dev, CI/CD

The Three-Way Race in Analytics Transformation Tools: Where Things Stand in 2026

Up to around 2022, there was a quiet consensus that "dbt is what you use for data transformation." Since 2024, that consensus has clearly fractured. SQLMesh has demonstrated a genuine technical edge with virtual data environments and column-level lineage. Dagster has grown by enveloping dbt within an asset-centric orchestration model. And dbt Labs itself has fought back with the 2025 launch of the Fusion engine (a Rust-based dbt runtime) and a push to mature its Semantic Layer.

As of April 2026, KGA's recommendations to clients are split along scale and project type. Existing medium-to-large dbt projects (500+ models) should stay on dbt Core 1.10 + Fusion. Greenfield projects centered on PostgreSQL, BigQuery, or Snowflake where CI wait times are a real pain point should use SQLMesh 0.150. Projects that want a single orchestrator across the full data pipeline — ML, reverse ETL, BI refresh — should use Dagster 1.10 + dbt or Dagster 1.10 + SQLMesh.

dbt Core 1.10 and the Fusion Engine

dbt Fusion, which went GA at the end of 2025, rewrites the compilation and parsing phase of dbt — previously in Python — in Rust. On projects with around 1,000 models, `dbt parse` is 3–4× faster and `dbt compile` is 2–3× faster. This directly cuts CI time: at a KGA retail client, per-PR CI dropped from 18 minutes to 7 minutes.

Three things are worth highlighting in 1.10. First, the `microbatch` incremental strategy has stabilized. For time-series data ingestion, you can now handle both backfilling historical partitions and appending new batches within the same model definition, eliminating the dual-management overhead of Lambda-style architectures. Second, Python models support row-partition execution. When running dbt Python models in Snowpark, Databricks, or BigQuery DataFrames, partition-level parallelism kicks in automatically. Third, MetricFlow (the Semantic Layer core) has stabilized its definition file syntax — the metric YAML is now clean, having moved past the experimental constructs introduced in 1.9.

Weaknesses persist. Column-level lineage is available only in dbt Cloud, not the open-source version. Unit testing has arrived but is less expressive than SQLMesh's audits. Virtual environments (blue-green table switching) are not supported, so teams still rely on zero-copy cloning workarounds.

SQLMesh 0.150 and Virtual Data Environments

SQLMesh's primary differentiator is its virtual data environment. When a developer runs `sqlmesh plan dev` against a feature branch, SQLMesh doesn't physically copy tables — instead, it materializes only the changed models under new names and wraps everything else in logical views that reference production tables. This dramatically reduces the full-build cost during CI.
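The mechanism described above can be sketched in a few lines of Python. This is a conceptual model only, not SQLMesh's internals: the fingerprinting scheme, the `physical.` schema name, and the `plan_environment` function are assumptions made for illustration.

```python
import hashlib


def fingerprint(sql: str) -> str:
    """Short content hash standing in for a model version (hypothetical scheme)."""
    return hashlib.sha256(sql.encode("utf-8")).hexdigest()[:8]


def plan_environment(prod: dict[str, str], branch: dict[str, str], env: str) -> dict[str, dict]:
    """Map each model in the branch to a view in `env` and the physical table it points to.

    `prod` and `branch` map model name -> SQL text. A model whose SQL is
    unchanged resolves to the same physical table production already uses,
    so only changed or new models need to be built.
    """
    plan = {}
    for name, sql in branch.items():
        fp = fingerprint(sql)
        changed = name not in prod or fingerprint(prod[name]) != fp
        plan[name] = {
            "view": f"{env}.{name}",
            "points_to": f"physical.{name}__{fp}",
            "build": changed,
        }
    return plan


# Hypothetical two-model project where only `customers` was edited on the branch
prod_models = {"orders": "SELECT 1 AS id", "customers": "SELECT 2 AS id"}
branch_models = {"orders": "SELECT 1 AS id", "customers": "SELECT 2 AS id, 'x' AS tier"}
dev_plan = plan_environment(prod_models, branch_models, "dev")
```

In this sketch `dev.orders` is just a view over the table production already built, while `dev.customers` gets a new physical table. A real planner would also have to decide how to treat models downstream of a changed model, which is omitted here.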

Highlights from 0.150:

Column-level lineage in open source: The same level of column-level lineage that costs money in dbt Cloud is available in the open-source SQLMesh UI. Impact analysis shows exactly which downstream dashboard metrics would change if a column definition is modified.

Automatic optimization for incremental time-range models: SQLMesh detects which time intervals still need to be loaded and guarantees idempotent historical backfills, which makes these models more flexible than dbt's microbatch. Late-arriving facts can be handled declaratively.

Audits (data quality assertions): SQL-based audits run automatically before and after model updates, appends, and backfills. If an audit fails, the operation is rolled back, which makes production data corruption far less likely.

Native Python models: Equivalent to dbt Python models, but dependencies are resolved implicitly from type hints on function arguments.
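The audit-then-roll-back behavior from the highlights above can be illustrated with a small, self-contained sketch using Python's built-in `sqlite3` module. It is not SQLMesh code. It assumes the common convention that an audit is a query returning offending rows, so any returned row counts as a failure. The table, the audit query, and the `load_with_audit` helper are hypothetical.

```python
import sqlite3


def load_with_audit(conn: sqlite3.Connection, load_sql: str, audit_sql: str) -> bool:
    """Apply a load inside a transaction and keep it only if the audit finds no bad rows."""
    conn.execute("BEGIN")
    try:
        conn.execute(load_sql)
        failures = conn.execute(audit_sql).fetchall()
    except Exception:
        conn.execute("ROLLBACK")
        raise
    if failures:
        conn.execute("ROLLBACK")  # audit failed: discard the load
        return False
    conn.execute("COMMIT")
    return True


# isolation_level=None puts sqlite3 in autocommit mode so BEGIN/COMMIT are explicit
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")

# Audit: order amounts must not be negative
no_negative_amounts = "SELECT id FROM orders WHERE amount < 0"

good_load = load_with_audit(conn, "INSERT INTO orders VALUES (1, 10.0)", no_negative_amounts)
bad_load = load_with_audit(conn, "INSERT INTO orders VALUES (2, -5.0)", no_negative_amounts)
row_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

After both calls the table still holds only the first row: the second load violated the audit and was rolled back before it became visible.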

The main weakness in 2026 is ecosystem depth. The community asset library — equivalent to dbt-utils, dbt-expectations, dbt-snowflake-monitoring, and so on — is still thin, which means more custom code. Japanese-language documentation and internal training content are far sparser than what exists for dbt.

Dagster 1.10 and the Asset-Centric Approach

Dagster is fundamentally a Python-based orchestrator, but its software-defined assets concept — the ability to treat dbt, SQLMesh, Airbyte, Fivetran, Hightouch, and other tools as a unified asset graph — is its strongest differentiator.

Highlights from 1.10:

Support for both dbt and SQLMesh as assets: dbt and SQLMesh models can coexist in the same pipeline during a migration. At a KGA financial client, 800 existing dbt models are being migrated incrementally to SQLMesh, with Dagster managing both in a single view.

Pipes protocol expansion: A lightweight integration mechanism for jobs running in external environments — Databricks, EMR, Kubernetes — that works regardless of whether those jobs are written in Python, Scala, or Rust.

Mature declarative scheduling: A declarative SLA like "update asset A within 15 minutes of asset B being updated" is more intuitive than cron and automatically tracks dependency changes.

Asset checks: Data quality checks defined as first-class metadata on assets; failures automatically halt downstream updates.
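The way a failed asset check halts downstream updates can be sketched without Dagster at all. The snippet below is a conceptual model using only the standard library; it does not use Dagster's API, and the asset names, the `materialize` function, and the status strings are hypothetical.

```python
from graphlib import TopologicalSorter


def materialize(graph: dict[str, set[str]], checks: dict) -> dict[str, str]:
    """Walk assets in dependency order and skip anything downstream of a failed check.

    `graph` maps asset -> set of upstream assets.
    `checks` maps asset -> zero-argument callable returning True when the check passes.
    """
    status: dict[str, str] = {}
    for asset in TopologicalSorter(graph).static_order():
        upstream = graph.get(asset, set())
        if any(status[u] != "ok" for u in upstream):
            status[asset] = "skipped"  # an upstream asset is not healthy
            continue
        passed = checks.get(asset, lambda: True)()
        status[asset] = "ok" if passed else "check_failed"
    return status


# Hypothetical asset graph: ingestion -> staging -> metric -> BI refresh
asset_graph = {
    "raw_orders": set(),
    "stg_orders": {"raw_orders"},
    "daily_revenue": {"stg_orders"},
    "bi_dashboard": {"daily_revenue"},
}
asset_checks = {"stg_orders": lambda: False}  # simulate a failing quality check
run_status = materialize(asset_graph, asset_checks)
```

Here the failing check on `stg_orders` leaves `daily_revenue` and `bi_dashboard` untouched, so stale but valid data stays in place instead of being overwritten with bad data.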

The main weakness is the learning curve. Moving from Airflow's task-centric mental model to Dagster's asset-centric one is non-trivial and forces a team to rethink how it designs pipelines. For small projects that run purely on SQL, Dagster is clearly overkill.

Semantic Layer Implementation Strategy

A Semantic Layer is unavoidable in the 2026 data stack. BI tools, LLM agents, and reverse ETL all need to pull from a single authoritative metrics definition. The three main options:

dbt MetricFlow: The core engine of the dbt Semantic Layer. Tightly coupled to dbt models, with `semantic_model` and `metric` definitions in YAML exposed over JDBC/GraphQL. Tableau, Hex, Mode, Lightdash, and Metabase all integrate. The advantage is that metric changes are automatically included in the model CI cycle. The disadvantage is that the full Semantic Layer API requires a paid dbt Cloud plan, and concurrent connection limits are plan-dependent.

Cube.dev: A Semantic Layer-focused OSS/commercial tool that connects to dbt, SQLMesh, or raw SQL. Strong caching layer and access control. Includes an MCP server for LLM agents, allowing natural-language queries against metrics in a safe way. At a KGA e-commerce client, Cube.dev + dbt powers an LLM dashboard where teams query metrics in Slack.

SQLMesh Metrics: Introduced in SQLMesh 0.150. Metrics are defined in the same repository as models, with deep integration into column-level lineage. Less mature than MetricFlow or Cube.dev, but the natural choice for SQLMesh-based projects.
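Whichever option is chosen, the underlying idea is the same: every consumer resolves a metric by name against one definition instead of rewriting the aggregation. The sketch below illustrates that idea in plain Python. It does not reproduce the syntax of MetricFlow, Cube.dev, or SQLMesh; the `Metric` class, the registry, and the table name are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    name: str
    expression: str  # aggregate SQL expression, defined exactly once
    table: str

    def to_sql(self, group_by: tuple[str, ...] = ()) -> str:
        """Render a query for this metric, optionally grouped by dimensions."""
        dims = ", ".join(group_by)
        select = f"{dims}, " if dims else ""
        sql = f"SELECT {select}{self.expression} AS {self.name} FROM {self.table}"
        if dims:
            sql += f" GROUP BY {dims}"
        return sql


# Single registry shared by BI tools, LLM agents, and reverse ETL (hypothetical)
METRICS = {
    "revenue": Metric(name="revenue", expression="SUM(amount)", table="analytics.orders"),
}


def query(metric_name: str, group_by=()) -> str:
    """Entry point every consumer uses; nobody hand-writes the aggregation."""
    return METRICS[metric_name].to_sql(tuple(group_by))


total_sql = query("revenue")
daily_sql = query("revenue", ["order_date"])
```

Changing the definition of `revenue` in the registry changes it for every consumer at once, which is the property that makes a semantic layer worth the operational cost.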

CI/CD Patterns and Cost Monitoring

Best practices for operating transformation tools in 2026:

Blue-green deployment: Implemented with SQLMesh virtual data environments or, for dbt, with Snowflake zero-copy clones or BigQuery snapshots. Run a full build against production-equivalent data during PR review, then promote on approval.

Slim CI: Build only the models affected by the change. For dbt, run `dbt build --select state:modified+` with `--state` pointing at the production artifacts to compare against; in SQLMesh, this behavior is built in.

Model-level cost attribution: For Snowflake, join `QUERY_HISTORY` and `ACCESS_HISTORY`; for BigQuery, query `INFORMATION_SCHEMA.JOBS`; for Databricks, query `system.billing.usage`. Then cross-reference dbt model metadata to get per-model cost attribution. In 2026, dbt-snowflake-monitoring, SQLMesh's built-in cost tracker, and Dagster's asset insights each offer roughly equivalent visibility.

Per-PR cost budgeting: In KGA's GitHub Actions CI, we estimate the full-build cost for each PR and automatically add an approval reviewer when the estimated cost exceeds ¥500. This catches wasteful SELECT * queries and inefficient CTEs before they merge.
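Two of the practices above, Slim CI selection and the per-PR cost budget, can be sketched together in plain Python. This is an illustration under stated assumptions, not KGA's actual CI code: the checksums, the per-model cost estimates, and both function names are hypothetical, and a real setup would read them from build artifacts and warehouse query history.

```python
def modified_plus(prod_checksums: dict[str, str], pr_checksums: dict[str, str],
                  children: dict[str, list[str]]) -> set[str]:
    """Return models that are new or changed in the PR, plus all their descendants."""
    modified = {m for m, c in pr_checksums.items() if prod_checksums.get(m) != c}
    selected: set[str] = set()
    stack = list(modified)
    while stack:
        model = stack.pop()
        if model in selected:
            continue
        selected.add(model)
        stack.extend(children.get(model, ()))
    return selected


def needs_extra_reviewer(selected: set[str], est_cost_yen: dict[str, int],
                         budget_yen: int = 500) -> tuple[int, bool]:
    """Sum the estimated build cost of the selected models and compare it to the budget."""
    total = sum(est_cost_yen.get(m, 0) for m in selected)
    return total, total > budget_yen


# Hypothetical project: only stg_orders changed in this PR
prod = {"stg_orders": "a1", "fct_orders": "b1", "dim_customers": "c1", "rpt_revenue": "d1"}
pr = {"stg_orders": "a2", "fct_orders": "b1", "dim_customers": "c1", "rpt_revenue": "d1"}
children = {"stg_orders": ["fct_orders"], "fct_orders": ["rpt_revenue"]}
cost_yen = {"stg_orders": 120, "fct_orders": 260, "dim_customers": 90, "rpt_revenue": 180}

to_build = modified_plus(prod, pr, children)
total_cost, flag_for_review = needs_extra_reviewer(to_build, cost_yen)
```

In this example the change to `stg_orders` pulls in `fct_orders` and `rpt_revenue` but not `dim_customers`, and the estimated total crosses the ¥500 threshold, so the PR would be routed to an additional approver.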

Conclusion: The Recommended 2026 Stack

For greenfield projects, KGA's recommendation is Dagster + SQLMesh + Cube.dev. This stack combines virtual environments for fast CI, asset-centric orchestration, and a clearly scoped Semantic Layer, and the entire stack is open source.

If you have existing assets (300+ dbt models), sticking with dbt Core 1.10 + Fusion + MetricFlow is the rational choice for now. Consider dbt Cloud only if you need the Semantic Layer API and column-level lineage. If Airflow is already entrenched, there is no urgent reason to replace it for existing projects — but new ones are worth piloting in Dagster.

Regardless of tool choice, neglecting CI/CD automation, cost visibility, and centralized Semantic Layer management will result in a mountain of technical debt within five years. You should spend at least as much time on operational design as on tool selection — if not more.
