-
Notifications
You must be signed in to change notification settings - Fork 18
Description
Add support for dimensional hierarchies to enable drill-down/roll-up operations and better aggregate reuse. This feature would just be a thin metadata layer on top of DJ's dimension nodes, but would formally encode whether a set of dimension nodes have a hierarchical relation.
Problem
Currently, DJ users work with flat dimensions and must manually specify aggregation levels in their queries. Some implicit hierarchies can be built with dimension links, but there is no formal way to express that country > state > city represents a hierarchy. This means that DJ can't leverage hierarchical relationships to optimize drill-down or roll-up operations (for example, we can't easily reuse a pre-aggregated materialization with daily data for monthly or quarterly rollups).
Solution
Add a hierarchy metadata layer, where hierarchy definitions are a new entity type that references existing dimension nodes and formalizes their hierarchical structure.
Database Schema
CREATE TABLE hierarchies (
id BIGINT PRIMARY KEY,
name VARCHAR UNIQUE NOT NULL,
description TEXT
-- No dimension reference needed - levels handle it
);
CREATE TABLE hierarchy_levels (
id BIGINT PRIMARY KEY,
hierarchy_id BIGINT REFERENCES hierarchies(id),
name VARCHAR NOT NULL,
dimension_node_id BIGINT REFERENCES nodes(id), -- Points to existing dimension node
level_order INTEGER NOT NULL,
grain_columns JSONB, -- For single-dimension hierarchies
UNIQUE(hierarchy_id, name)
);Note that grain_columns is optional, and in many cases we can leverage the existing infra around dimension links, which already define the join relationships between any two hierarchical levels. All we need to do in these cases is store the order of the levels.
Example
A geography hierarchy can be created from either a single dimension node or multiple dimension nodes, depending on how it was modeled originally.
Multi-Dimension Pattern:
hierarchy_type: multi_dimension
levels:
- name: country
dimension_node: default.country_dim
# grain_columns: null (uses primary key from dimension node)
- name: state
dimension_node: default.state_dim
# grain_columns: null (uses primary key from dimension node)
- name: city
dimension_node: default.city_dim
# grain_columns: null (uses primary key from dimension node)Single-Dimension Pattern:
hierarchy_type: single_dimension
levels:
- name: country
dimension_node: default.location_dim # Same dimension node for all levels
grain_columns: [country] # Just country column
- name: state
dimension_node: default.location_dim # Same dimension node
grain_columns: [country, state] # Country + state columns
- name: city
dimension_node: default.location_dim # Same dimension node
grain_columns: [country, state, city] # All three columnsAPI Endpoints
# Hierarchy management
POST /hierarchies/ # Create hierarchy
GET /hierarchies/ # List hierarchies
GET /hierarchies/{name} # Get hierarchy details
PUT /hierarchies/{name} # Update hierarchy
DELETE /hierarchies/{name} # Delete hierarchy
GET /hierarchies/{name}/levels # Get hierarchy levelsChecklist
Core Setup
- Database schema for hierarchies + levels
- Basic CRUD API endpoints, which includes hierarchy validation logic for joins
- UI configuration support
- YAML configuration support
- Hierarchy metadata is exposed via GraphQL
Query Generation Optimization
- Hierarchy-aware query building (e.g., automatic rollup from finer to coarser levels)
- Multi-hierarchy materialization strategies