Entity

Data Model

The structured data schema of entities, relationships, and metrics that enables analytics.

Last updated: February 2026
Data current as of: February 2026

Why This Object Matters for AI

AI-driven data integration and data quality both depend on the data model, and analytics accuracy requires consistent modeling.

Data & Analytics Capacity Profile

Typical CMC levels for data & analytics in SaaS/Technology organizations.

Formality: L3
Capture: L3
Structure: L3
Accessibility: L3
Maintenance: L2
Integration: L3

CMC Dimension Scenarios

What each CMC level looks like specifically for Data Model. Baseline level is highlighted.

L0

The analytics data model exists only in the heads of the two engineers who built the original tracking implementation. When a new analyst asks 'what tables should I join to calculate monthly active users?' the answer is 'ask Derek, he built it.' There is no schema documentation, no entity-relationship diagram, and no written definition of what each table or metric represents. Derek is on vacation for two weeks, so the analyst waits.

None — AI cannot generate queries or analysis against the analytics data model because no schema documentation, entity definitions, or relationship descriptions exist in any accessible form.

Document the core analytics data model — write down the primary entities (users, events, sessions, accounts), their key fields, and how they relate to each other, even as a simple text document or wiki page.
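Even this first pass can be machine-checkable. As a minimal sketch (all table, field, and join names below are illustrative, not from any real schema), a small catalog of entities and relationships already answers the "what do I join?" question without waiting for Derek:

```python
# Minimal analytics data model catalog: primary entities, key fields,
# and documented join conditions. All names are illustrative.
CATALOG = {
    "users": {
        "fields": ["user_id", "account_id", "created_at"],
        "relates_to": {"accounts": "users.account_id = accounts.account_id"},
    },
    "events": {
        "fields": ["event_id", "user_id", "event_name", "occurred_at"],
        "relates_to": {"users": "events.user_id = users.user_id"},
    },
    "accounts": {
        "fields": ["account_id", "plan", "signed_up_at"],
        "relates_to": {},
    },
}

def join_path(entity_a: str, entity_b: str):
    """Return the documented join condition between two entities, if any."""
    direct = CATALOG.get(entity_a, {}).get("relates_to", {}).get(entity_b)
    if direct:
        return direct
    # Relationships are documented once; check the reverse direction too.
    return CATALOG.get(entity_b, {}).get("relates_to", {}).get(entity_a)
```

A new analyst (or an AI assistant) can now look up `join_path("events", "users")` to see how to join event data to users for an MAU calculation.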

L1

The analytics data model has scattered documentation — a README in the dbt repository describes some core tables, a Confluence page from last year lists event taxonomy categories, and inline SQL comments explain a few complex joins. A new analyst finds the README but it was last updated eight months ago and references tables that have been renamed. The Confluence page covers acquisition events but not product usage events. Piecing together the full data model requires reading documentation in three places and cross-referencing with the actual database schema.

AI could parse existing documentation fragments to understand parts of the analytics data model, but cannot build a complete picture because the scattered documents are inconsistent, outdated, and cover different subsets of the schema.

Consolidate all analytics data model documentation into a single maintained source — a schema catalog or wiki section that covers every entity, its fields, relationships, and the business definitions of key metrics.
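Consolidation can start with a gap report: compare the tables each scattered source documents against the live schema to see what is covered, what is undocumented, and which docs reference renamed tables. A sketch, with hypothetical source and table names:

```python
# Gap report across scattered documentation sources. Source names
# ("readme", "wiki") and table names are hypothetical.
def coverage_report(schema_tables, doc_sources):
    """schema_tables: set of live table names.
    doc_sources: mapping of source name -> set of tables it documents."""
    documented = set().union(*doc_sources.values()) if doc_sources else set()
    report = {
        source: sorted(tables & schema_tables)
        for source, tables in doc_sources.items()
    }
    report["undocumented"] = sorted(schema_tables - documented)
    # Docs that reference tables no longer in the schema (e.g. renamed).
    report["stale"] = sorted(documented - schema_tables)
    return report
```

Running this against the real schema turns "the README is eight months old" from a vague complaint into a concrete worklist.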

L2

The analytics data model is documented in a dedicated schema catalog — dbt model descriptions cover every table, field descriptions explain each column, and a wiki page defines the core metrics (MAU, activation rate, churn). The analytics team updates the catalog when they add new models. But the documentation describes the technical schema without explaining the business semantics — the catalog says 'user_activated_at: timestamp' but doesn't define what 'activated' means in business terms or which product actions qualify.

AI can generate SQL queries against the documented analytics data model and explain what each table contains. Cannot yet translate business questions into accurate queries because the documentation lacks business-context definitions for key metrics and derived entities.

Add business-context definitions to every analytics entity and metric — document not just the technical schema but the business meaning, calculation methodology, and known caveats for each metric so that the data model speaks the language of its business users.
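One way to sketch this: pair every metric's technical definition with its business meaning and caveats in a single structure. The metric, qualifying events, and caveat below are illustrative examples, not a real definition:

```python
from dataclasses import dataclass, field

@dataclass
class MetricDefinition:
    """A metric documented with business semantics, not just a column type."""
    name: str
    business_meaning: str   # what the metric means to a business user
    calculation: str        # methodology, in plain language or SQL
    caveats: list = field(default_factory=list)

# Illustrative entry answering "what does 'activated' actually mean?"
ACTIVATED = MetricDefinition(
    name="user_activated_at",
    business_meaning="Timestamp of the user's first qualifying product action "
                     "(e.g. created a project or invited a teammate).",
    calculation="MIN(occurred_at) over events WHERE event_name IN "
                "('project_created', 'teammate_invited')",
    caveats=["Qualifying actions have changed over time; early cohorts "
             "use the original definition."],
)
```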

L3 (Current Baseline)

The analytics data model documentation is comprehensive, current, and business-contextualized. Every entity has a technical schema definition paired with a business description. Metrics include calculation methodology, known edge cases, and refresh cadence. An analyst can ask 'how is net revenue retention calculated, and which accounts are excluded?' and find the authoritative answer in the data model catalog without asking a person. The documentation stays current because it's embedded in the dbt project and updated alongside code changes.

AI can translate natural-language business questions into accurate SQL queries, explain metric definitions with full business context, and flag when a requested analysis involves known edge cases. Cannot yet validate that query results are semantically correct — only syntactically valid.

Formalize the analytics data model as a machine-readable semantic layer — structured entity definitions, metric calculation rules, and validated relationships encoded as queryable configuration rather than human-readable documentation.
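A toy version of such a semantic layer, assuming illustrative metric and table names: the metric is queryable configuration, and the SQL is generated from it rather than written by hand.

```python
# Metrics encoded as configuration rather than prose. Names are illustrative.
SEMANTIC_LAYER = {
    "monthly_active_users": {
        "table": "events",
        "expression": "COUNT(DISTINCT user_id)",
        "time_column": "occurred_at",
    },
}

def compile_metric(metric: str, month: str) -> str:
    """Translate a metric request into SQL via the semantic layer."""
    spec = SEMANTIC_LAYER[metric]
    return (
        f"SELECT {spec['expression']} AS {metric} "
        f"FROM {spec['table']} "
        f"WHERE DATE_TRUNC('month', {spec['time_column']}) = DATE '{month}'"
    )
```

Because the join logic and calculation live in configuration, every consumer (BI tool, AI agent, analyst) gets the same definition, and changing a metric means changing one entry rather than hunting down every hand-written query.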

L4

The analytics data model is a formal semantic layer — entities, relationships, and metric definitions are encoded as machine-readable configuration. A BI tool or AI agent doesn't need to know SQL to answer business questions — it queries the semantic layer, which translates business concepts to the appropriate joins and calculations. The semantic layer validates query logic against defined relationships, preventing common mistakes like double-counting revenue across overlapping product lines. Business users ask questions in plain language and get semantically correct answers.

AI can autonomously translate business questions into semantically correct analyses, validate query logic against the formal data model, and explain results with full business context. Can detect and prevent analytical errors that would be invisible in raw SQL.

Implement a self-documenting analytics data model that auto-generates schema documentation, metric definitions, and relationship maps from the live database schema and query patterns — eliminating manual documentation maintenance.
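The core of such a system is a drift check: diff the live schema against the documented one and flag what needs (re)generation. A minimal sketch with illustrative table and column names:

```python
# Drift check behind a self-documenting catalog: compare live schema
# against documented schema. All table/column names are illustrative.
def schema_drift(live, documented):
    """live, documented: mapping of table name -> set of column names."""
    drift = {
        "new_tables": sorted(set(live) - set(documented)),
        "dropped_tables": sorted(set(documented) - set(live)),
        "new_columns": [],
    }
    for table in sorted(set(live) & set(documented)):
        for col in sorted(live[table] - documented[table]):
            drift["new_columns"].append(f"{table}.{col}")
    return drift
```

Run on a schedule (or on every deploy), the drift report becomes the trigger for generating inferred entity definitions and suggested business descriptions, so documentation maintenance shifts from manual effort to review.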

L5

The analytics data model is self-documenting. Schema changes, new metrics, and evolving relationships are automatically reflected in the semantic layer as they occur. When an engineer adds a new event type to the tracking SDK, the data model catalog updates itself with the inferred entity definition, suggests business-context descriptions based on naming conventions and usage patterns, and validates the new entity's relationships to existing tables. The data model documentation is never stale because it is generated from the living system.

Can autonomously maintain, document, and evolve the analytics data model. AI detects schema changes, generates documentation, validates relationships, infers business semantics, and keeps the semantic layer synchronized with the live database — all without manual documentation effort.

Ceiling of the CMC framework for this dimension.

