What is SAP Datasphere?
Definition: SAP Datasphere
SAP Datasphere (formerly SAP Data Warehouse Cloud) is a fully managed, cloud-native data platform offered by SAP. It combines a data warehouse, data lake, data integration, data federation, and a business semantics layer in a single service. It runs on SAP Business Technology Platform (BTP) and is available on AWS, Azure, and Google Cloud.
Launched in 2023 as the successor to SAP Data Warehouse Cloud, Datasphere is designed for organisations that run SAP ERP systems (S/4HANA, ECC, BW) and want to combine that data with other sources in a governed, modern cloud environment — without abandoning the SAP investment they already have.
The core promise: connect to anything, model with SAP semantics, and expose governed data to SAP Analytics Cloud and other BI tools. It is not just another cloud warehouse. It is SAP's attempt to make the entire data supply chain — from source to dashboard — a first-class SAP experience.
Architecture Overview
Datasphere is organised around four architectural pillars that work together:
- Data Spaces — isolated, governed workspaces per team or domain
- Data Builder — the data modelling and integration layer
- Business Builder — the semantic business layer
- Data Catalog — enterprise-wide data discovery and lineage
Underneath these layers sit two storage tiers: a managed SAP HANA Cloud instance (the hot tier for high-performance SQL) and a managed object store (the cold/lake tier, compatible with Delta Lake and Apache Parquet). Both tiers are fully abstracted from the user — you interact with them through the Datasphere UI or SQL.
Key architectural insight
Datasphere sits on top of SAP HANA Cloud, which means it inherits HANA's column-store engine, in-memory processing, and multi-model capabilities (relational + document + graph). This gives Datasphere a performance edge for SAP-native analytics compared to generic warehouses — but also means the platform is HANA-centric at its core.
Data Spaces: The Governance Unit
A Data Space is the fundamental unit of organisation and governance in Datasphere. Think of it as a department-level sandbox that has its own:
- Storage allocation (disk quota)
- Compute allocation (memory quota)
- Members and roles (Space Administrator, Modeler, Viewer)
- Connections to source systems
- Data assets (tables, views, flows, models)
Spaces communicate with each other through Cross-Space Sharing: a Space can share a view or table with another Space, but the consuming Space cannot modify the original asset. This enforces data product thinking: each Space owns and publishes its data, while other teams consume it as a read-only product.
This model aligns well with Data Mesh principles: distributed ownership, domain-driven design, and governed sharing — all within a single managed platform rather than separate infrastructure per domain.
Practical example: Space structure for a retail company
- FINANCE Space — owns GL postings, cost centre data (source: S/4HANA Finance)
- SALES Space — owns sales orders, pricing conditions (source: S/4HANA SD)
- SUPPLY_CHAIN Space — owns inventory, delivery data (source: SAP EWM)
- CENTRAL_REPORTING Space — consumes shared views from all three spaces, builds cross-domain KPIs for SAP Analytics Cloud
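To make the ownership rule concrete, here is a minimal Python sketch of the sharing model described above. It is not a Datasphere API: the `Space` class, its methods, and the `GLPostings` asset are hypothetical names made up for illustration, and only the Space names come from the retail example.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Space:
    """Toy model of a Space (hypothetical, not an SAP API)."""
    name: str
    assets: dict = field(default_factory=dict)     # assets this Space owns
    shared_in: dict = field(default_factory=dict)  # assets shared by other Spaces

    def publish(self, asset_name, rows):
        # Only the owning Space creates or changes an asset.
        self.assets[asset_name] = list(rows)

    def share(self, asset_name, consumer):
        if asset_name not in self.assets:
            raise KeyError(f"{self.name} does not own {asset_name}")
        # The consumer sees the asset under "<owner>.<asset>".
        consumer.shared_in[f"{self.name}.{asset_name}"] = self

    def read(self, qualified_name):
        owner = self.shared_in[qualified_name]
        asset_name = qualified_name.split(".", 1)[1]
        # Deep copy, so changes on the consumer side never reach the owner.
        return copy.deepcopy(owner.assets[asset_name])

    def update(self, name, rows):
        if name in self.shared_in:
            raise PermissionError(f"{name} is shared read-only")
        self.publish(name, rows)


finance = Space("FINANCE")
reporting = Space("CENTRAL_REPORTING")
finance.publish("GLPostings", [{"account": "400000", "amount": 1200.0}])
finance.share("GLPostings", reporting)

rows = reporting.read("FINANCE.GLPostings")
```

Calling `reporting.update("FINANCE.GLPostings", [])` raises `PermissionError`, which is the read-only contract in miniature: the consumer can build on the shared asset but cannot change it.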
Data Builder: Modelling and Integration
The Data Builder is where you model, transform, and integrate data. It offers several object types:
Local Tables
Persisted tables stored in the HANA Cloud or object store tier. You can design the schema in the graphical table editor or import a CSV. Local tables are the foundation of your data storage.
Views
SQL views or graphical views built on top of local tables, remote tables, or other views. The graphical editor lets you join, filter, aggregate, and add calculated columns without writing SQL — but you can always switch to SQL mode for full control.
-- Example: SQL view combining sales orders with product master from two spaces
SELECT
    so.SalesOrderID,
    so.CustomerID,
    so.NetValue,
    so.Currency,
    pm.ProductDescription,
    pm.ProductCategory
FROM "SALES"."SalesOrders" so
JOIN "CENTRAL_REPORTING"."ProductMaster" pm
    ON so.MaterialNumber = pm.MaterialNumber
WHERE so.DocumentDate >= '2026-01-01'
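If you want to see what such a view returns before you have a tenant, the same join can be tried locally. The sketch below uses Python's built-in `sqlite3` as a stand-in for HANA, so the Space qualifiers are dropped and the sample rows and the view name `V_SalesWithProduct` are invented. It illustrates the join and filter logic only, not Datasphere behaviour.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE SalesOrders (
    SalesOrderID TEXT, CustomerID TEXT, NetValue REAL,
    Currency TEXT, MaterialNumber TEXT, DocumentDate TEXT
);
CREATE TABLE ProductMaster (
    MaterialNumber TEXT, ProductDescription TEXT, ProductCategory TEXT
);
INSERT INTO SalesOrders VALUES
    ('SO-1', 'C-10', 250.0, 'EUR', 'M-1', '2026-02-15'),
    ('SO-2', 'C-11', 990.0, 'EUR', 'M-2', '2025-11-30');
INSERT INTO ProductMaster VALUES
    ('M-1', 'Trail shoe', 'Footwear'),
    ('M-2', 'Rain jacket', 'Outerwear');

-- Same shape as the view above, without the Space qualifiers.
CREATE VIEW V_SalesWithProduct AS
SELECT so.SalesOrderID, so.CustomerID, so.NetValue, so.Currency,
       pm.ProductDescription, pm.ProductCategory
FROM SalesOrders so
JOIN ProductMaster pm ON so.MaterialNumber = pm.MaterialNumber
WHERE so.DocumentDate >= '2026-01-01';
""")

# Only SO-1 passes the date filter; SO-2 is dated 2025.
rows = conn.execute("SELECT * FROM V_SalesWithProduct").fetchall()
```

The date filter works on ISO-formatted strings here because SQLite has no native date type; in HANA the column would normally be a real `DATE`.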
Analytic Models
Analytic Models are the bridge between the Data Builder and SAP Analytics Cloud. They expose measures, dimensions, hierarchies, and variables in a format that SAP Analytics Cloud (SAC) stories consume natively. In practice, an Analytic Model is the consumption-ready interface to your semantic layer: the object that BI tools query instead of the underlying views.
Entity-Relationship (ER) Models
A canvas view for documenting the relationships between tables and views — useful for understanding and communicating data model structure across teams.
Data Flows
Graphical ETL pipelines for loading and transforming data. Data Flows support joins, aggregations, projections, and write targets (local tables). They run in a Spark-like execution environment inside Datasphere. For simple loading scenarios, Data Flows remove the need for external ETL tools.
Replication Flows
Purpose-built for real-time or scheduled replication from SAP sources (S/4HANA, BW, ECC) into Datasphere local tables or the object store. Replication Flows use Change Data Capture (CDC) where available — meaning only changed records are replicated after the initial load, not the full table every time.
Why Replication Flows matter
Replication Flows are the fastest and most reliable way to get SAP transactional data into Datasphere. They handle delta loads, schema evolution, and partitioning automatically. For S/4HANA sources, they typically extract from CDS views or rely on SLT (SAP Landscape Transformation) for table-based replication, which gives near-real-time CDC.
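The idea behind a delta load is easy to show in a few lines. The following is a minimal sketch of applying change records to a target table after an initial full load. The `I`/`U`/`D` operation codes and the record layout are assumptions made for this example, not a documented Datasphere format.

```python
def apply_changes(target, changes, key="id"):
    """Apply CDC records to a target table held as {key_value: row}.

    Each change carries an "op" of "I" (insert), "U" (update) or
    "D" (delete) plus the row payload. These op codes are an
    assumption for this sketch.
    """
    for change in changes:
        row = change["row"]
        if change["op"] in ("I", "U"):
            target[row[key]] = row      # upsert by key
        elif change["op"] == "D":
            target.pop(row[key], None)  # deleting a missing key is a no-op
        else:
            raise ValueError(f"unknown op: {change['op']}")
    return target


# Initial load: a full copy of the source table.
target = {
    1: {"id": 1, "status": "OPEN"},
    2: {"id": 2, "status": "OPEN"},
}

# Delta load: only the changed records arrive.
delta = [
    {"op": "U", "row": {"id": 1, "status": "SHIPPED"}},
    {"op": "D", "row": {"id": 2}},
    {"op": "I", "row": {"id": 3, "status": "OPEN"}},
]
apply_changes(target, delta)
```

Three change records bring the target up to date, however large the source table is. That is the whole cost argument for CDC over repeated full loads.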
Transformation Flows
More powerful than Data Flows for complex multi-step transformations. Transformation Flows execute as SQL scripts in sequence and support looping, variables, and error handling. They are comparable to stored procedures orchestrated as a pipeline.
Business Builder: The Semantic Layer
The Business Builder is one of Datasphere's most distinctive features compared to generic cloud warehouses. It lets business users and analysts define business entities, metrics, and dimensions in business language — without writing SQL.
Business Entities
A Business Entity maps to a physical view or table, but adds semantic context: labels, descriptions, synonyms, and data category (fact, dimension, text). Business Entities are the "nouns" of your data model — Customer, Product, Sales Order.
Fact Models and Consumption Models
A Fact Model connects a fact Business Entity to its dimension associations. A Consumption Model exposes a set of Fact Models as a unified query surface for SAP Analytics Cloud. Users building stories in SAC navigate the Consumption Model's measures and dimensions — they never see raw table names.
KPIs and Metrics
The Business Builder supports defining reusable KPI definitions with targets, thresholds, and trend indicators. These KPIs feed directly into SAP Analytics Cloud's KPI dashboard features — no manual setup in the BI layer required.
This is the layer that SAP BW veterans will recognise most. In BW, InfoObjects, InfoCubes, and Queries served the same purpose. In Datasphere, the equivalent stack is: Local Table → View → Business Entity → Consumption Model → SAC Story.
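As a rough illustration of what a reusable KPI definition with a target, thresholds, and a trend indicator amounts to, here is a small Python sketch. The `Kpi` class, its field names, and the 90% / 75% thresholds are hypothetical and chosen only for the example. They do not reflect how the Business Builder stores KPIs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Kpi:
    """Reusable KPI definition where higher values are better (hypothetical)."""
    name: str
    target: float
    near_ratio: float = 0.90     # at least 90% of target: "near_target"
    warning_ratio: float = 0.75  # at least 75% of target: "warning"

    def status(self, actual):
        ratio = actual / self.target
        if ratio >= 1.0:
            return "on_target"
        if ratio >= self.near_ratio:
            return "near_target"
        if ratio >= self.warning_ratio:
            return "warning"
        return "critical"

    @staticmethod
    def trend(previous, current):
        if current > previous:
            return "up"
        if current < previous:
            return "down"
        return "flat"


revenue = Kpi("Monthly revenue", target=1_000_000)
status = revenue.status(920_000)     # 92% of target
trend = Kpi.trend(880_000, 920_000)  # compared with the previous month
```

Defining the thresholds once on the KPI, rather than in each dashboard, is what keeps every consumer of the metric in agreement on what "on target" means.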
Federation: Query Without Moving Data
Data Federation is a core capability that sets Datasphere apart from many cloud warehouses. Instead of copying data into Datasphere, you create a Remote Table that points to a table in an external system. When a view or model queries that remote table, Datasphere pushes the query down to the source and returns the result — the data never leaves the source system.
Supported remote sources include:
- SAP Systems: S/4HANA, BW, ECC, SAP HANA on-premise
- Cloud warehouses: Snowflake, Google BigQuery, Amazon Redshift, Azure Synapse
- Databases: SQL Server, Oracle, PostgreSQL, MySQL, IBM Db2
- Cloud storage: Amazon S3, Azure Data Lake Storage, Google Cloud Storage
- SaaS applications: Salesforce, SAP SuccessFactors, SAP Ariba
Remote Table vs Replicated Table
A Remote Table (federated) queries the source live — always fresh, zero storage cost, but query performance depends on the source. A Replicated Table copies the data into Datasphere local storage — faster queries, slightly stale (based on replication schedule), uses your storage quota. You can switch between the two modes on the same object.
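The freshness trade-off can be sketched in plain Python. Both classes below are hypothetical stand-ins: a list plays the role of the source system, and `refresh()` plays the role of a scheduled replication run.

```python
class RemoteTable:
    """Federated access: every read goes to the source (hypothetical class)."""

    def __init__(self, source):
        self._source = source

    def read(self):
        # Always reflects the current state of the source.
        return list(self._source)


class ReplicatedTable:
    """Local copy: reads are served from the snapshot taken at refresh time."""

    def __init__(self, source):
        self._source = source
        self._snapshot = []
        self.refresh()

    def refresh(self):
        # Stands in for a scheduled replication run.
        self._snapshot = list(self._source)

    def read(self):
        return list(self._snapshot)


source_orders = ["SO-1", "SO-2"]
remote = RemoteTable(source_orders)
replica = ReplicatedTable(source_orders)

source_orders.append("SO-3")     # a new order lands in the source system

remote_now = remote.read()       # includes SO-3 straight away
replica_before = replica.read()  # still the old snapshot
replica.refresh()
replica_after = replica.read()   # caught up after the refresh
```

The sketch leaves out the other half of the trade-off: each `remote.read()` costs a round trip to the source, while the replica answers from local storage.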
Federation is particularly powerful for SAP-to-non-SAP joins: you can join live S/4HANA sales orders (remote, federated) with a replicated Salesforce opportunity table in a single SQL view inside Datasphere, without any intermediate ETL job.
Open SQL Schema: Bringing Your Own Tools
An Open SQL Schema is a special schema inside a Datasphere Space that external tools can connect to directly via a standard ODBC/JDBC connection. This means you can:
- Write data into Datasphere from Python, dbt, or Spark
- Query Datasphere tables from any SQL client (DBeaver, DataGrip, Tableau)
- Run dbt models against Datasphere as the SQL engine
- Use third-party ETL tools (Fivetran, Matillion, Informatica) to load data
# dbt profile for SAP Datasphere via Open SQL Schema (HANA adapter)
datasphere_project:
  target: prod
  outputs:
    prod:
      type: hana
      driver: hdbcli
      host: <datasphere-host>.hanacloud.ondemand.com
      port: 443
      user: <technical_user>
      password: <password>
      schema: OPEN_SQL_SCHEMA_NAME
      threads: 4
The Open SQL Schema bridges the gap between the SAP-governed world and the open-source data engineering ecosystem. It is how teams using dbt or Python for transformations can still persist results in Datasphere and have them governed by Spaces and the Business Builder.
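For the Python route, a connection through SAP's `hdbcli` driver might look like the sketch below. The environment variable names (`DSP_HOST`, `DSP_USER`, `DSP_PASSWORD`), the helper functions, and the table name `MY_TABLE` are placeholders invented for this example. The placeholders in angle brackets mirror the dbt profile above and must be replaced with the details of your own Open SQL Schema user.

```python
import os


def connection_params(host, user, password, port=443):
    """Build keyword arguments for hdbcli's dbapi.connect().

    HANA Cloud endpoints expect TLS, hence encrypt=True.
    """
    return {
        "address": host,
        "port": port,
        "user": user,
        "password": password,
        "encrypt": True,
        "sslValidateCertificate": True,
    }


def fetch_rows(sql, params):
    # Imported lazily so this module loads even without hdbcli installed.
    from hdbcli import dbapi

    conn = dbapi.connect(**params)
    try:
        cursor = conn.cursor()
        cursor.execute(sql)
        return cursor.fetchall()
    finally:
        conn.close()


params = connection_params(
    host=os.environ.get("DSP_HOST", "<datasphere-host>.hanacloud.ondemand.com"),
    user=os.environ.get("DSP_USER", "<technical_user>"),
    password=os.environ.get("DSP_PASSWORD", "<password>"),
)

# Example call (needs real credentials and `pip install hdbcli`):
# fetch_rows('SELECT * FROM "OPEN_SQL_SCHEMA_NAME"."MY_TABLE"', params)
```

Reading credentials from the environment keeps them out of source control, which matters more than usual here because the technical user can write into a governed Space.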
Data Catalog: Enterprise Data Discovery
The Data Catalog is the enterprise metadata layer across all Spaces and connected sources. It provides:
- Search — find tables, views, and business entities by name or business term
- Lineage — trace data from a dashboard KPI back to the source system table
- Impact Analysis — see which downstream consumers are affected before you change a source table
- Business Glossary — define business terms and link them to technical assets
- Data Quality — attach quality rules and scores to assets
The Catalog is particularly valuable in large organisations where dozens of Spaces exist. Without it, discovering who owns what data and how it flows becomes nearly impossible at scale.
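Lineage and impact analysis are two directions of the same graph walk. The sketch below shows the idea on a hand-written lineage graph. The asset names are invented for the example, and the Catalog's real metadata model is certainly richer than a dictionary of edges.

```python
from collections import deque

# Edges point from an asset to the assets that read from it.
# All asset names are invented for this example.
LINEAGE = {
    "S4.VBAK": ["SALES.SalesOrders"],
    "SALES.SalesOrders": ["SALES.V_SalesOrderEnriched"],
    "SALES.V_SalesOrderEnriched": ["CENTRAL_REPORTING.AM_Revenue"],
    "CENTRAL_REPORTING.AM_Revenue": ["SAC.RevenueStory"],
}


def downstream(asset, lineage):
    """Impact analysis: every asset reachable from `asset`, breadth-first."""
    seen, queue, order = {asset}, deque([asset]), []
    while queue:
        current = queue.popleft()
        for consumer in lineage.get(current, []):
            if consumer not in seen:
                seen.add(consumer)
                order.append(consumer)
                queue.append(consumer)
    return order


def upstream(asset, lineage):
    """Lineage: trace an asset back to its sources by inverting the edges."""
    inverted = {}
    for source, consumers in lineage.items():
        for consumer in consumers:
            inverted.setdefault(consumer, []).append(source)
    return downstream(asset, inverted)


impacted = downstream("SALES.SalesOrders", LINEAGE)
origin = upstream("SAC.RevenueStory", LINEAGE)
```

Here `impacted` lists everything that could break if `SALES.SalesOrders` changes, and `origin` walks from the SAC story back to the source table.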
SAP Analytics Cloud: The Native BI Layer
SAP Analytics Cloud (SAC) is the natural BI consumer of Datasphere. The integration is tight and first-class:
- SAC connects live to Datasphere Analytic Models and Consumption Models — no data export required
- SAC stories consume hierarchies, KPIs, and variables defined in the Business Builder
- Planning models in SAC can write back into Datasphere local tables (actuals vs plan comparison)
- Single sign-on and unified governance across the SAP BTP tenant
For organisations already licensed for SAC, the Datasphere + SAC stack is a coherent replacement for the classic SAP BW + BEx Query + Analysis for Office setup — cloud-native, no ABAP development required, and significantly faster to deploy.
That said, Datasphere also exposes standard OData and SQL endpoints, so Power BI, Tableau, and Looker can connect to it as well — you are not locked into SAC as the only BI tool.
SAP Datasphere vs. Snowflake, Databricks, and Azure Synapse
| Dimension | SAP Datasphere | Snowflake | Databricks | Azure Synapse |
|---|---|---|---|---|
| Primary strength | SAP ecosystem integration + semantic layer | Multi-cloud SQL warehouse, data sharing | Lakehouse, ML/AI, open formats | Azure-native, hybrid analytics |
| SAP source connectors | Native (S/4HANA, BW, ECC via CDC) | Via third-party (Fivetran, Qlik) | Via third-party (ADF, Fivetran) | Via Azure Data Factory |
| Business semantic layer | Built-in (Business Builder) | None native (use dbt semantic layer) | Unity Catalog + dbt / AtScale | None native |
| Data federation | Strong (100+ connectors, live query) | Iceberg external tables, limited live | Unity Catalog foreign catalogs | PolyBase / Synapse Link |
| ML / AI workloads | Basic (via SAP AI Core integration) | Snowflake ML Functions, Cortex | Full MLflow, AutoML, GPU clusters | Azure ML integration |
| Open format support | Delta Lake, Parquet (lake tier) | Apache Iceberg | Delta Lake (native) | Parquet, Delta, Iceberg |
| Pricing model | Capacity units (storage + compute bundled) | Credits (compute) + storage separate | DBUs (compute) + cloud storage | DWUs + vCores + storage |
| Best for | SAP-heavy organisations | Multi-cloud SQL, data sharing | Data engineering, ML, open data | Azure-native, Microsoft shops |
The honest verdict: Datasphere is not trying to beat Snowflake or Databricks at their own game. It targets a specific sweet spot — organisations running SAP at scale that want a governed, integrated analytics layer without building custom connectors and semantic layers themselves. For non-SAP organisations, Snowflake or Databricks are almost always a better fit.
Migrating from SAP BW to Datasphere
SAP BW (BW/4HANA or classic BW on HANA) veterans will find some familiar concepts in Datasphere, but the migration is not a 1:1 lift-and-shift. Here is how the conceptual mapping works:
| SAP BW Object | Datasphere Equivalent |
|---|---|
| InfoObject (Characteristic) | Business Entity (Dimension) |
| InfoObject (Key Figure) | Measure in Analytic Model |
| InfoCube / CompositeProvider | View + Analytic Model |
| DataStore Object (DSO) | Local Table (with delta merge) |
| BEx Query | Consumption Model |
| Process Chain | Data Flow / Task Chain |
| Transformation (BW) | Transformation Flow |
| SAP BW InfoProvider | Analytic Model |
SAP provides a BW Bridge feature inside Datasphere that lets you run BW/4HANA objects natively — so you do not have to migrate everything on day one. You can progressively refactor BW objects into native Datasphere objects at your own pace, while keeping BW-dependent processes running.
Migration strategy recommendation
Start with Replication Flows to bring source data into Datasphere local tables. Rebuild your most-used BEx Queries as Consumption Models. Use the BW Bridge for complex transformations that are still actively maintained in BW. Gradually decommission BW objects as you rebuild them natively. Do not try to migrate everything at once.
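One way to turn that recommendation into a worklist is to run your BW object inventory through the mapping table. The sketch below is only a starting point under stated assumptions: the inventory fields (`monthly_runs`, `actively_maintained`), the object names, and the rule that actively maintained transformations stay on the BW Bridge are simplifications invented for this example.

```python
# Mapping taken from the table above.
BW_TO_DATASPHERE = {
    "InfoObject (Characteristic)": "Business Entity (Dimension)",
    "InfoObject (Key Figure)": "Measure in Analytic Model",
    "InfoCube / CompositeProvider": "View + Analytic Model",
    "DataStore Object (DSO)": "Local Table",
    "BEx Query": "Consumption Model",
    "Process Chain": "Data Flow / Task Chain",
    "Transformation (BW)": "Transformation Flow",
}


def plan(inventory):
    """Split BW objects into (rebuild_now, keep_on_bridge)."""
    rebuild, bridge = [], []
    for obj in inventory:
        live_transformation = (
            obj["type"] == "Transformation (BW)" and obj["actively_maintained"]
        )
        if live_transformation or obj["type"] not in BW_TO_DATASPHERE:
            bridge.append(obj["name"])
        else:
            target = BW_TO_DATASPHERE[obj["type"]]
            rebuild.append((obj["name"], target, obj["monthly_runs"]))
    # Most-used objects first.
    rebuild.sort(key=lambda item: item[2], reverse=True)
    return [(name, target) for name, target, _ in rebuild], bridge


inventory = [
    {"name": "ZQ_MARGIN", "type": "BEx Query",
     "monthly_runs": 35, "actively_maintained": False},
    {"name": "ZQ_REVENUE", "type": "BEx Query",
     "monthly_runs": 420, "actively_maintained": True},
    {"name": "ZT_PRICING", "type": "Transformation (BW)",
     "monthly_runs": 30, "actively_maintained": True},
]
rebuild_now, keep_on_bridge = plan(inventory)
```

In this example the heavily used `ZQ_REVENUE` query lands at the top of the rebuild list, while `ZT_PRICING` stays on the BW Bridge until someone has time to refactor it.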
When Does SAP Datasphere Make Sense?
Datasphere is a strong fit when...
- Your organisation runs SAP S/4HANA, ECC, or BW and wants to unify that data with other sources
- You need near-real-time CDC replication from SAP systems without building custom pipelines
- Your BI team uses SAP Analytics Cloud and wants a governed semantic layer without ABAP
- You are on the path from BW to cloud and want a migration-friendly environment
- You need data federation across many sources without centralising all data
- Compliance requires keeping certain data on-premise — federation lets you query it without moving it
Look elsewhere when...
- You have no SAP systems — the main value driver disappears
- You need heavy ML/AI and Python workloads — Databricks is a better fit
- You want maximum SQL performance at scale across non-SAP data — Snowflake or BigQuery are stronger
- You need open-source ecosystem tooling as first-class citizens (Airflow, Spark, Delta, dbt) — Databricks Lakehouse wins here
- Your budget is tight — Datasphere's SAP licensing model can be expensive for smaller organisations
Getting Started with SAP Datasphere
SAP offers a free trial of Datasphere on SAP BTP. Here is the recommended path to get value quickly:
- Create a Space — define your team's governance boundary and assign roles
- Set up a connection — connect to your S/4HANA sandbox, a Snowflake trial, or upload a CSV
- Create a Remote Table or Local Table — bring in a small dataset (e.g., sales orders)
- Build a View — join two tables graphically or with SQL
- Create an Analytic Model — define measures and dimensions on your view
- Connect SAC — build a story on the Analytic Model and explore the data
The graphical, no-code interface means you can complete this in under an hour without touching SQL. That low barrier to entry is intentional — Datasphere is designed for both data engineers and business analysts.
SQL access for power users
-- Query Datasphere from any SQL client via the Open SQL Schema
-- Connection: HANA ODBC to <tenant>.hanacloud.ondemand.com:443
SELECT
    YEAR(OrderDate) AS OrderYear,
    ProductCategory,
    SUM(NetAmount) AS TotalRevenue,
    COUNT(DISTINCT SalesDocumentNumber) AS OrderCount
FROM "SALES_SPACE"."V_SalesOrderEnriched"
GROUP BY YEAR(OrderDate), ProductCategory
ORDER BY OrderYear DESC, TotalRevenue DESC
Strengths and Known Limitations
Strengths
- Deep SAP integration — CDC replication from S/4HANA and BW is best-in-class
- Business semantic layer — Business Builder closes the gap between IT and business users
- Federation first — query across 100+ sources without mandatory data movement
- BW Bridge — smooth migration path from BW without big-bang rewrites
- Governed Data Spaces — data mesh architecture natively supported
- No infrastructure management — fully managed, auto-scaling on BTP
Known limitations to factor in
- SAP-centric — the platform's value multiplies with SAP source systems; without them it is an expensive HANA Cloud wrapper
- Python/Spark ecosystem — first-class Python notebooks and Spark are absent; Data Flows are simpler than Databricks notebooks
- dbt support is indirect: it works via the Open SQL Schema, but not through an officially supported adapter
- UI complexity — the platform has many layers (Data Builder, Business Builder, Catalog, Spaces) with a steep learning curve for newcomers
- Pricing opacity — Capacity Units are not always easy to forecast; involve SAP licensing when scoping costs
- Community size — significantly smaller open-source community compared to Snowflake or Databricks
Conclusion
SAP Datasphere is a mature, feature-rich platform that solves a real problem: unifying SAP operational data with the broader enterprise data landscape in a governed, cloud-native environment. Its combination of native SAP connectors, CDC replication, data federation, and the Business Builder semantic layer is genuinely unique in the market.
If your organisation is SAP-heavy — running S/4HANA for core operations and SAP Analytics Cloud for reporting — Datasphere is worth serious evaluation. It removes the need for custom integration middleware, provides a governed analytics layer without ABAP, and gives you a clear migration path from BW.
If your data stack is mostly non-SAP, you are better served by Snowflake, Databricks, or BigQuery — all of which have larger ecosystems, better ML capabilities, and lower licensing overhead. The right answer is not universal; it depends on where your data lives and who your users are.
Frequently Asked Questions
What is SAP Datasphere?
SAP Datasphere is SAP's unified cloud data platform (formerly SAP Data Warehouse Cloud) that combines a data warehouse, data lake, data integration, data federation, and a business semantics layer in a single fully managed service on SAP Business Technology Platform (BTP).
What is a Data Space in SAP Datasphere?
A Data Space is a governed, isolated workspace inside Datasphere where a team or business domain manages its own data assets, storage quota, compute quota, and access permissions. Spaces share data with each other through controlled Cross-Space Sharing.
How does SAP Datasphere differ from SAP BW?
SAP BW is an on-premise data warehouse optimised for SAP-centric reporting with ABAP-based development. Datasphere is cloud-native, supports open standards (SQL, Parquet, Delta), connects to non-SAP sources, and replaces ABAP transformations with graphical flows and SQL. BW objects can be imported via the BW Bridge for a gradual migration.
Can SAP Datasphere connect to non-SAP sources?
Yes. Datasphere has 100+ connectors including Snowflake, BigQuery, Redshift, Azure Synapse, SQL Server, Oracle, PostgreSQL, Salesforce, Amazon S3, ADLS, and more. Federation lets you query remote data without copying it into Datasphere.
Is SAP Datasphere a replacement for Snowflake or Databricks?
No. Datasphere is optimised for SAP-heavy organisations. It provides deep S/4HANA and BW integration that Snowflake and Databricks do not offer. For non-SAP organisations, Snowflake or Databricks are almost always a better fit. Many large organisations run Datasphere alongside Snowflake or Databricks rather than replacing one with the other.
Does SAP Datasphere support dbt?
Indirectly. Datasphere's Open SQL Schema exposes a HANA Cloud SQL endpoint that dbt-hana-adapter can connect to. It is not an officially SAP-supported integration, but it works for teams that want to use dbt models alongside native Datasphere objects.