Sample 36 — Optimization drift (retention discounts)
This reference sample demonstrates optimization drift (reward hacking) in a subscription-retention workflow modeled in the Decision Intelligence Runtime (DIR). Topology: classic. Mechanisms: setup_environment (SQLite StorageBundle), AgentRegistry.handshake, ContextStore, full ROA (Explain → Policy → Self-Check) with a defensive JSON parser, validate_proposal plus dim.validate_retention_proposal (per-offer discount ceiling), idempotency_key on execution, canonical decision_audit telemetry, PerformanceMonitor (rolling average over RETENTION_EXECUTED), and an HTML report rebuilt only from bundle.decision_audit.all_events_chronological() (offline report_generator.py).
Use cases
---
title: Retention discount drift — who does what
config:
  layout: elk
  theme: neutral
  look: classic
---
flowchart TB
subgraph actor [Actor]
OPS["Operations / risk"]
SUB["Subscriber"]
end
subgraph sys [DIR sample]
AG["RetentionAgent ROA + LLM"]
DIM["DIM + retention ceiling"]
MON["PerformanceMonitor"]
end
SUB -->|cancellation request| AG
AG -->|PolicyProposal| DIM
DIM -->|ACCEPT| MON
MON -->|rolling avg breach| OPS
Architecture
---
title: User Space vs Kernel Space
config:
  layout: elk
  theme: neutral
  look: classic
---
flowchart LR
subgraph US [User Space]
LLM["LLM or mock"]
ROA["ROA cycle"]
end
subgraph W [The Wall]
PP["PolicyProposal"]
end
subgraph KS [Kernel Space]
DIM["DIM + retention validator"]
AUD["decision_audit"]
REG["AgentRegistry"]
CTX["ContextStore"]
IDEM["Idempotency"]
end
LLM --> ROA --> PP
PP --> DIM
DIM --> AUD
ROA --> CTX
DIM --> IDEM
MON2["PerformanceMonitor reads AUD"] --> REG
AUD --> MON2
Execution flow (one decision)
---
title: One retention decision
config:
  layout: elk
  theme: neutral
  look: classic
---
sequenceDiagram
participant P as pipeline
participant C as ContextStore
participant L as LLM
participant D as DIM retention
participant A as decision_audit
participant M as PerformanceMonitor
participant G as AgentRegistry
P->>C: update_session dfid
P->>L: Explain then Policy
L-->>P: JSON
P->>D: validate_retention_proposal
D-->>P: ACCEPT or REJECT
alt ACCEPT
P->>A: RETENTION_EXECUTED + idempotency
P->>M: evaluate_after_execution
alt rolling avg above threshold
M->>G: set_agent_status SUSPENDED
end
end
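The JSON step in the sequence above relies on a defensive parser, because a model reply may wrap the object in extra prose. The following is a minimal sketch of that idea, not the sample's own parser: the function name parse_llm_json, the policy_type key, and the example reply are assumptions made for illustration.

```python
import json


def parse_llm_json(text):
    """Return the first JSON object embedded in text, or None if none parses."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start` and ignores trailing text.
            obj, _end = decoder.raw_decode(text, start)
            return obj
        except json.JSONDecodeError:
            # Not valid JSON from this brace; try the next candidate.
            start = text.find("{", start + 1)
    return None


# Hypothetical model reply with chatter around the payload.
reply = 'Sure, here is the proposal: {"policy_type": "retention_discount", "discount_offered": 12.5} Hope that helps.'
proposal = parse_llm_json(reply)
print(proposal)
print(parse_llm_json("no structured output at all"))
```

Returning None instead of raising lets the caller treat an unparseable reply as a rejected proposal rather than a crash; whether the sample does the same is not stated here.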
How to run
From the repository root after installing the package:
pip install -e .
pip install pyyaml
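Mock (no API key, recommended default). The command below uses PowerShell syntax; in a POSIX shell, set the variable inline instead, for example USE_MOCK_LLM=1 python samples/36_drift_optimization_discount/run.py: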
$env:USE_MOCK_LLM="1"; python samples/36_drift_optimization_discount/run.py
Ollama: set llm_defaults in config.yaml (model, base_url, timeout). If the configured endpoint is unreachable (configured_live_llm_is_reachable is false), the run falls back to the mock LLM.
Gemini: set llm_defaults and supply GOOGLE_API_KEY or GEMINI_API_KEY in the environment.
Optional: open the generated HTML automatically:
$env:OPEN_REPORT_HTML="1"; python samples/36_drift_optimization_discount/run.py
Configuration
config.yaml is the single bootstrap file for persistence, LLM, contracts, agents, paths, DIM slice, and registry SemVer. Experiment parameters (simulation, monitor) are merged from simulation.yaml via simulation_config (see schemas.merge_simulation_file_into_config).
| Block | Purpose |
|---|---|
| database | SQLite path under the sample directory (anchored by setup_environment). |
| llm_defaults | Provider, model, timeout; use provider: mock or USE_MOCK_LLM=1 for CI. |
| contracts | provider: yaml; omit path so setup_environment(..., config_path=...) resolves the contract file. |
| agents | Responsibility contract including allowed_policy_types: ["retention_discount"]. |
| contract.max_discount_pct | DIM hard ceiling on discount_offered. |
| simulation_config | Relative path to simulation.yaml (phases, seeds, run_id). |
| simulation.yaml | simulation (two-phase curve) and monitor (window, threshold, suspension reason). |
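The merge of simulation.yaml into the bootstrap configuration can be pictured roughly as follows. This is a sketch under assumptions, not the behavior of schemas.merge_simulation_file_into_config: the function name, the copy-two-blocks rule, the key names inside each block, and every value are hypothetical, and plain dicts stand in for the parsed YAML files.

```python
from copy import deepcopy


def merge_simulation_into_config(config, simulation_doc):
    """Return a copy of config with the simulation and monitor blocks attached."""
    merged = deepcopy(config)
    for block in ("simulation", "monitor"):
        if block in simulation_doc:
            merged[block] = deepcopy(simulation_doc[block])
    return merged


# Stand-in for config.yaml (key names inside each block are assumed).
config = {
    "database": {"path": "data/retention_drift.sqlite"},
    "llm_defaults": {"provider": "mock"},
    "simulation_config": "simulation.yaml",
}
# Stand-in for simulation.yaml (values are made up for the example).
simulation_doc = {
    "simulation": {"run_id": "run_36_retention_01", "seeds": [36]},
    "monitor": {"window": 5, "threshold": 12.0},
}

merged = merge_simulation_into_config(config, simulation_doc)
print(sorted(merged))
```

Working on a deep copy keeps the bootstrap dict unchanged, so the same config can be reused across runs with different experiment files.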
Database storage
Domain events map to decision_audit_events only (no custom tables). Typical event types: SIMULATION_START, SIMULATION_END, CONTEXT_COMPILED, POLICY_PROPOSAL, DIM_VALIDATION, AGENT_DECISION, RETENTION_EXECUTED, MONITOR_TICK, AGENT_SUSPENDED.
Filter runs by simulation_id (SQLite):
SELECT dfid, event, json_extract(detail_json, '$.simulation_id') AS sim,
json_extract(detail_json, '$.discount_offered') AS discount_pct
FROM decision_audit_events
WHERE json_extract(detail_json, '$.simulation_id') = 'run_36_retention_01'
ORDER BY id ASC;
PostgreSQL:
SELECT dfid, event, detail_json->>'simulation_id' AS sim,
detail_json->>'discount_offered' AS discount_pct
FROM decision_audit_events
WHERE detail_json->>'simulation_id' = 'run_36_retention_01'
ORDER BY id ASC;
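The same per-run filter can also be applied client-side from Python, which avoids depending on the JSON functions of a particular database build. The sketch below uses an in-memory stand-in table: only the table and column names are taken from the queries above, while the column types, the DFID values, the second run id, and the discount figures are invented for illustration.

```python
import json
import sqlite3

# Hypothetical stand-in for the audit table; the real schema may differ.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE decision_audit_events ("
    "id INTEGER PRIMARY KEY, dfid TEXT, event TEXT, detail_json TEXT)"
)
sample_rows = [
    ("dfid-001", "RETENTION_EXECUTED", {"simulation_id": "run_36_retention_01", "discount_offered": 8.0}),
    ("dfid-002", "RETENTION_EXECUTED", {"simulation_id": "some_other_run", "discount_offered": 14.0}),
    ("dfid-003", "RETENTION_EXECUTED", {"simulation_id": "run_36_retention_01", "discount_offered": 13.5}),
]
conn.executemany(
    "INSERT INTO decision_audit_events (dfid, event, detail_json) VALUES (?, ?, ?)",
    [(dfid, event, json.dumps(detail)) for dfid, event, detail in sample_rows],
)


def events_for_run(connection, simulation_id):
    """Yield (dfid, event, detail) for one simulation run, ordered by id."""
    cursor = connection.execute(
        "SELECT dfid, event, detail_json FROM decision_audit_events ORDER BY id ASC"
    )
    for dfid, event, detail_json in cursor:
        detail = json.loads(detail_json)
        if detail.get("simulation_id") == simulation_id:
            yield dfid, event, detail


discounts = [detail["discount_offered"] for _, _, detail in events_for_run(conn, "run_36_retention_01")]
print(discounts)  # [8.0, 13.5]
```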
agent_registry holds the handshake contracts and the final SUSPENDED status, along with its suspension_reason.
Expected output
Console logs include DFID-tagged lines from log_with_dfid on the decision path. A successful drift run typically stops with Stopped: profitability_drift_monitor after the rolling average of the last N executed discounts exceeds the configured threshold while DIM still accepts individual offers under the hard cap.
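The interaction between the per-offer cap and the rolling average can be reproduced in a few lines. This is a minimal sketch, not the sample's PerformanceMonitor: the 15% ceiling comes from the methodology notes, while the window of 5, the 12% threshold, the offer sequence, the class name, and the choice to wait for a full window before evaluating are all assumptions.

```python
from collections import deque

MAX_DISCOUNT_PCT = 15.0  # per-offer hard ceiling (discount <= 15%)
WINDOW = 5               # assumed monitor window
THRESHOLD_PCT = 12.0     # assumed rolling-mean threshold


def dim_accepts(discount_pct):
    """Per-offer check: only the explicit ceiling is enforced."""
    return 0.0 <= discount_pct <= MAX_DISCOUNT_PCT


class RollingDiscountMonitor:
    def __init__(self, window, threshold):
        self.recent = deque(maxlen=window)
        self.threshold = threshold

    def record(self, discount_pct):
        """Record an executed discount; True once a full window's mean exceeds the threshold."""
        self.recent.append(discount_pct)
        if len(self.recent) < self.recent.maxlen:
            return False
        return sum(self.recent) / len(self.recent) > self.threshold


monitor = RollingDiscountMonitor(WINDOW, THRESHOLD_PCT)
offers = [6.0, 7.0, 8.0, 11.0, 12.0, 13.0, 14.0, 14.5, 14.5]  # drifting upward, all under the cap
suspended_at = None
for step, pct in enumerate(offers, start=1):
    if not dim_accepts(pct):
        continue  # rejected offers are never executed, so they never reach the monitor
    if monitor.record(pct):
        suspended_at = step
        break

print(suspended_at)  # 8
```

Every offer in the sequence passes the per-offer check, yet the monitor trips at the eighth execution, when the last five discounts average 12.9%.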
Regenerating reports
Reports are written to results/report_<UTC>_<slug>.html. Regenerate without re-running the simulation:
python samples/36_drift_optimization_discount/report_generator.py
python samples/36_drift_optimization_discount/report_generator.py --simulation-id run_36_retention_01 --output-path samples/36_drift_optimization_discount/results/replay.html
The generator loads config.yaml, opens the SQLite bundle, reads bundle.decision_audit.all_events_chronological(), and hydrates the decision table and charts from telemetry. You can also run python report_generator.py from inside the sample directory; the module bootstraps sys.path for dir_core and local imports.
Methodology notes
Research question: If DIM validates each proposal only against explicit limits (here discount ≤ 15%), can aggregate profitability still decay because softer rules, such as a cap on the rolling mean concession, are not enforced by the DIM?
Stopping rule: Rolling mean of the last W executed discounts > threshold → AgentRegistry.set_agent_status(SUSPENDED, PROFITABILITY_DRIFT).
Inputs: data/cancelation.json drives narrative context; discount magnitudes follow the two-phase curve in pipeline.simulated_discount_pct (seeded from simulation.seeds / simulation_seed).
Each run deletes data/retention_drift.sqlite by default so that the rolling monitor starts from a clean state (see run.py).
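A seeded two-phase curve of the kind mentioned under Inputs might look like the sketch below. The actual shape of pipeline.simulated_discount_pct is not documented here, so everything in this block is hypothetical: the function name, the flat 6% first phase, the linear ramp to one point below the cap, the noise band, and the seed value.

```python
import random


def simulated_discount_pct_sketch(step, total_steps, seed, cap=15.0):
    """Hypothetical two-phase curve: flat and low at first, then ramping toward the cap."""
    rng = random.Random(f"{seed}:{step}")  # deterministic for a given (seed, step)
    switch = total_steps // 2
    if step < switch:
        base = 6.0  # phase 1: modest, stable concessions
    else:
        # phase 2: linear ramp from 6% up to one point below the cap
        progress = (step - switch) / max(1, total_steps - switch - 1)
        base = 6.0 + progress * (cap - 1.0 - 6.0)
    noisy = base + rng.uniform(-0.5, 0.5)
    return round(min(cap, max(0.0, noisy)), 2)


curve = [simulated_discount_pct_sketch(s, 20, seed=36) for s in range(20)]
replay = [simulated_discount_pct_sketch(s, 20, seed=36) for s in range(20)]
print(curve == replay)  # True: the same seed reproduces the same curve
```

Deriving the generator from the seed and the step, rather than sharing one stream, keeps each value reproducible even if some steps are skipped or replayed.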