Add multi-dimensional cost model with causal reasoning for issue #2351 #2381
Summary
This PR addresses #2351 by implementing a multi-dimensional cost model for HEIR's layout optimization. I realize this may be outside the box compared to what you were thinking, and I'm completely open to simplifying if this feels like too much too soon, but I wanted to share the approach I landed on after digging into the compiler literature and thinking about how to make the cost model extensible.
The Problem
From your comment on #2351, I understood the key challenges as:
My Approach
After reading Muchnick's compiler book (Chapter 17 on instruction scheduling) and seeing how classical compilers handle heterogeneous costs, I thought about applying causal reasoning to model FHE operation costs. Here's the thinking:
Core insight: Operations don't have fixed costs; their cost depends on why they're expensive:
So instead of flat cost tables, I built a causal graph that models these dependencies explicitly.
The Causal DAG (as implemented in CausalGraph.cpp)
Edge weights (the actual causal relationships coded):
Direct causal effects:
- `op_type → noise_growth` (weight 2.0 for multiply, 0.1 for add)
- `op_type → depth_consumed` (weight 1.0 for multiply)
- `op_type → relinearization` (weight 1.0 for multiply)
- `op_type → key_switching` (weight 1.0 for rotate)
- `rotation_offset → key_switching` (weight 1.0)
- `noise_growth → noise_level` (weight 1.0)
- `key_switching → latency` (weight 10.0, expensive!)
- `key_switching → memory` (weight 5.0, rotation keys are large)
- `relinearization → latency` (weight 8.0)
- `relinearization → memory` (weight 3.0)
- `depth_consumed → depth_remaining` (weight -1.0)
- `depth_remaining → latency` (weight -5.0, low budget triggers bootstrap)
- `critical_path → latency` (weight 2.0 penalty)
- `parallelizable → throughput` (weight 3.0 boost)
- `memory_pressure → latency` (weight 1.5, cache misses)

Confounding paths (not direct causes):

- `backend → latency, noise_growth, key_switching` (implementation differences)
- `scheme → noise_growth, depth_consumed` (algorithmic differences)
- `security_params → latency, memory, noise_level` (parameter choices)
- `hardware_config → parallelizable, latency` (more threads = faster)

Why this structure matters:
Multiplication is expensive through three causal paths: noise growth, relinearization (→ latency and memory), and depth consumption (→ depth_remaining → latency, where a low budget triggers a bootstrap).
Rotation is expensive through one main path: key switching (→ latency and memory, since rotation keys are large).
Backend affects everything but is a confounder, not a cause; we model it separately so learned weights don't confuse correlation with causation.
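To make the arithmetic concrete, here is a minimal sketch of how weighted causal edges could be stored and how effects combine along paths. The names and structure are illustrative, not the actual CausalGraph.cpp API, and for simplicity the `op_type` node is collapsed into per-op source nodes:

```cpp
#include <cassert>
#include <map>
#include <string>

// Minimal sketch of weighted causal edges and path-based effect
// computation. Illustrative only, not the actual CausalGraph.cpp API.
struct CausalGraph {
  // cause -> (effect -> edge weight)
  std::map<std::string, std::map<std::string, double>> edges;

  void addEdge(const std::string &cause, const std::string &effect,
               double weight) {
    edges[cause][effect] = weight;
  }

  // Total effect of `cause` on `target` in a linear causal model: sum over
  // all directed paths of the product of edge weights along each path.
  double totalEffect(const std::string &cause,
                     const std::string &target) const {
    if (cause == target)
      return 1.0;
    auto it = edges.find(cause);
    if (it == edges.end())
      return 0.0;
    double total = 0.0;
    for (const auto &[next, weight] : it->second)
      total += weight * totalEffect(next, target);
    return total;
  }
};

CausalGraph buildFheCostGraph() {
  CausalGraph g;
  // Direct causal effects from the table above. Confounders (backend,
  // scheme, security_params, hardware_config) are deliberately NOT edges
  // here; they are modeled separately.
  g.addEdge("mul", "noise_growth", 2.0);
  g.addEdge("add", "noise_growth", 0.1);
  g.addEdge("mul", "depth_consumed", 1.0);
  g.addEdge("mul", "relinearization", 1.0);
  g.addEdge("rotate", "key_switching", 1.0);
  g.addEdge("noise_growth", "noise_level", 1.0);
  g.addEdge("key_switching", "latency", 10.0);
  g.addEdge("key_switching", "memory", 5.0);
  g.addEdge("relinearization", "latency", 8.0);
  g.addEdge("relinearization", "memory", 3.0);
  g.addEdge("depth_consumed", "depth_remaining", -1.0);
  g.addEdge("depth_remaining", "latency", -5.0);
  return g;
}
```

With these weights, `totalEffect("mul", "latency")` sums the relinearization path (1.0 × 8.0) and the depth path (1.0 × −1.0 × −5.0) to 13.0, while `totalEffect("rotate", "latency")` is 10.0 via the single key-switching path.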
What's Implemented
1. Multi-Dimensional Cost Tracking
2. Multiplicative Depth Visitor (Fixed)
Parallel to `RotationCountVisitor` (from PR #2347), now correctly tracks FHE depth:

- `mul`: +1 depth only if both operands are ciphertext
- `power(n)`: +ceil(log2(n)) depth only if base is ciphertext
- `rotation`/`add`: +0 depth (automorphisms don't consume depth)

3. Causal Cost Model Foundation
4. Context-Aware Cost Adjustments
After computing base cost from causal DAG, we adjust for context:
- `latency *= 1.5` (can't parallelize)
- `latency /= min(thread_count, 4)`
- `latency += 45ms` (bootstrap!)
- `latency *= 1.3` (cache misses)

Why Causal Reasoning?
What's NOT Implemented (Yet)
This is foundation only:
I wanted to get feedback on the approach before going further. That said, I do have an idea for representing certain causal vertices at different float precision, per our office-hours talk, as well as the potential to implement some form of ERM (see https://proceedings.neurips.cc/paper/2020/file/95a6fc111fa11c3ab209a0ed1b9abeb6-Paper.pdf for some inspiration off the top of my head).
Testing
Alternative: Simpler Approach
If this feels too complex, I can simplify to:
The causal framework is appealing because it naturally handles the config/backend/parallelism interactions you mentioned, but I completely understand if you'd prefer starting simpler.
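For concreteness, the config/parallelism interactions boil down to the context adjustments from section 4 above. The `Context` struct below is hypothetical, the constants come straight from that list, and I'm assuming the 1.5× multiplier applies when an op can't be parallelized while the thread divisor applies when it can:

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical context struct; adjustment constants are the ones from the
// "Context-Aware Cost Adjustments" section.
struct Context {
  bool parallelizable;
  int threadCount;
  bool depthBudgetExhausted; // low depth budget forces a bootstrap
  bool memoryPressure;       // expect cache misses
};

double adjustLatency(double latencyMs, const Context &ctx) {
  if (!ctx.parallelizable)
    latencyMs *= 1.5; // can't parallelize
  else
    latencyMs /= std::min(ctx.threadCount, 4); // parallel speedup, capped
  if (ctx.depthBudgetExhausted)
    latencyMs += 45.0; // bootstrap!
  if (ctx.memoryPressure)
    latencyMs *= 1.3; // cache misses
  return latencyMs;
}
```

For example, a 10 ms op that can't parallelize and has exhausted its depth budget comes out at 10 × 1.5 + 45 = 60 ms.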
Questions for You
Thanks for your patience with this somewhat experimental approach; happy to iterate based on your feedback!
Related Issues
Addresses #2351
Builds on #2347 (rotation counting)
Signed-off-by: bon-cdp [email protected]
Critical Fix: Multiplicative Depth Tracking
Update (latest commit): Fixed `MultiplicativeDepthVisitor` to match the behavior of `RotationCountVisitor` from PR #2347.

The Bug: Previously counted ALL multiplications as depth +1, even plaintext-ciphertext multiplications (which are free in FHE).
The Fix: Now tracks secret status through the DAG and only counts ciphertext-ciphertext multiplications:

- plaintext × ciphertext: depth 0 (free scalar multiplication in FHE) ✓
- ciphertext × ciphertext: depth 1 (expensive, increases noise) ✓

Impact: Cost models now report CORRECT FHE depths. For example:

- `W×x` (W plaintext, x ciphertext): depth 0 (was 1)
- `z²` (z ciphertext): depth 1 (correct)

This brings `MultiplicativeDepthVisitor` in line with how `RotationCountVisitor` handles plaintext rotations (which also cost nothing).
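These rules can be sketched as a small secret-status-propagating evaluator. This is illustrative only; the real `MultiplicativeDepthVisitor` walks the IR and reads secretness from its type annotations rather than from a hand-built struct:

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Sketch of the fixed rules: secret status propagates through the
// expression DAG, and only ciphertext-ciphertext multiplication
// consumes depth.
struct Value {
  bool isCiphertext; // secret status
  int depth;         // multiplicative depth consumed to produce this value
};

Value mul(const Value &a, const Value &b) {
  bool bothCiphertext = a.isCiphertext && b.isCiphertext;
  // plaintext x ciphertext is a free scalar multiply; ct x ct costs +1
  return {a.isCiphertext || b.isCiphertext,
          std::max(a.depth, b.depth) + (bothCiphertext ? 1 : 0)};
}

Value power(const Value &base, int n) {
  // square-and-multiply: ceil(log2(n)) ct-ct multiplications if base is ct
  int cost =
      base.isCiphertext ? static_cast<int>(std::ceil(std::log2(n))) : 0;
  return {base.isCiphertext, base.depth + cost};
}

Value rotate(const Value &v) { return v; } // automorphism: depth-free
```

For the examples above: `mul(Value{false, 0}, Value{true, 0})` (plaintext W times ciphertext x) yields depth 0, and squaring that ciphertext result yields depth 1.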