Observability & learning
Turn your execution history into actionable improvements — detect patterns, auto-generate rules, and build a memory that makes every future run smarter.
Difficulty: Intermediate
Duration: 30 minutes
What you’ll build: A self-improving DojOps setup with learned rules from your project’s error patterns and persistent cross-session memory backed by LanceDB vector embeddings
What you’ll learn
- Analyze your execution history to surface missed optimization opportunities
- Filter insights by security, cost, or workflow focus
- Mine recurring error patterns and convert them into rules injected into future LLM prompts
- Manage the `.dojops/rules.md` file — both auto-generated and manual entries
- Use LanceDB-backed cross-session memory in `dojops chat`
- Inspect and clear memory when starting fresh on a project
Prerequisites
- DojOps 1.1.6 installed (`npm i -g @dojops/cli`)
- A project initialized with `dojops init`
- At least 5–10 prior executions (scans, generates, or debug runs) in your history
Run a few commands now if you’re starting fresh:
dojops scan
dojops "Create a Dockerfile for a Node.js app"
dojops --debug-ci "ERROR: npm ci failed"
Workshop steps
Step 1: Run opportunity detection
dojops insights reads your execution history and surfaces patterns you might have missed. It looks for commands you ran that would have benefited from additional flags, recurring failures, and cost inefficiencies.
dojops insights
┌ Opportunity Detection
│
│ Analyzed: 47 executions, 12 scans, 5 plans
│
│ Opportunities found: 4
│
│ 1. Unused --fix flag
│ You ran `scan` 8 times without --fix.
│ Auto-remediation could have resolved 23 findings automatically.
│ → Try: dojops scan --fix
│
│ 2. Repeated kubectl dry-run failures
│ Kubernetes configs failed dry-run validation on the same
│ issue (missing resource limits) in 4 of 5 executions.
│ → Consider adding default resource limits to your templates.
│
│ 3. Model cost optimization
│ 83% of your generations used gpt-4o, but 60% were simple
│ tasks (Dockerfiles, .gitignore) that gpt-4o-mini handles well.
│ → Try: dojops config alias fast gpt-4o-mini
│
│ 4. Scan coverage gap
│ You have Terraform configs but never ran `scan --iac`.
│ Checkov could catch misconfigurations before apply.
│ → Try: dojops scan --iac
│
└ 4 opportunities identified
Each entry tells you what happened in your history, why it matters, and exactly what to run to address it. The suggestions aren’t generic advice — they’re derived from your actual command history.
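As a sketch of how this kind of detection can work, the unused `--fix` opportunity above reduces to counting matching history entries. The `history` list, its field names, and the `detect_unused_fix` helper below are all hypothetical illustrations, not DojOps internals:

```python
# Hypothetical history entries; the real store lives under .dojops/
# and its schema isn't documented in this workshop.
history = [
    {"command": "scan", "flags": [], "autofixable": 3},
    {"command": "scan", "flags": ["--fix"], "autofixable": 0},
    {"command": "scan", "flags": [], "autofixable": 5},
]

def detect_unused_fix(entries):
    """Flag repeated `scan` runs that skipped --fix even though
    auto-remediation could have resolved findings."""
    missed = [e for e in entries
              if e["command"] == "scan"
              and "--fix" not in e["flags"]
              and e["autofixable"] > 0]
    if len(missed) >= 2:
        total = sum(e["autofixable"] for e in missed)
        return (f"You ran `scan` {len(missed)} times without --fix; "
                f"{total} findings were auto-fixable. Try: dojops scan --fix")
    return None

print(detect_unused_fix(history))
```

The same shape of rule (filter history, count, suggest a command) covers the model-cost and coverage-gap opportunities too.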
Step 2: Filter insights by category
Running insights without flags shows everything. When you want to focus on a specific area, use a category filter.
dojops insights --security
┌ Security Opportunities
│
│ 1. Scan coverage gap
│ You have Terraform configs but never ran `scan --iac`.
│ Checkov could catch misconfigurations before apply.
│ → Try: dojops scan --iac
│
│ 2. gitleaks not configured
│ You're scanning for vulnerabilities but skipping secret detection.
│ → Try: dojops scan --secrets
│
└ 2 security opportunities identified
dojops insights --cost
┌ Cost Opportunities
│
│ 1. Model cost optimization
│ 83% of your generations used gpt-4o.
│ 60% were simple configs that gpt-4o-mini handles equally well.
│ → Try: dojops config alias fast gpt-4o-mini
│
│ 2. JSON scanner output not enabled
│ Text scanner output uses ~38% more tokens than JSON mode.
│ → Try: dojops config set scan.jsonOutput true
│
└ 2 cost opportunities identified
The --workflow filter surfaces process issues like skipped verification steps or commands that frequently fail and get re-run.
Step 3: Act on the recommendations
Each insight includes a concrete command. Run them:
# Address the unused --fix insight
dojops scan --fix
# Address the cost optimization
dojops config alias fast gpt-4o-mini
# Address the IaC coverage gap
dojops scan --iac
After you run these, dojops insights will update its analysis. The opportunities you’ve resolved won’t reappear — unless the underlying pattern re-emerges in future runs.
Step 4: Mine error patterns from history
dojops learn does something different from insights. Instead of looking for missed flags, it mines recurring failures — errors that appear in multiple executions and likely have a common root cause. Start with a summary:
dojops learn summary
┌ Learning summary
│
│ Total executions: 47
│ Error patterns: 3 (2 unresolved)
│
│ Recent:
│ ✓ 2026-03-20 generate Create a Redis config
│ ✗ 2026-03-20 scan security scan failed
│ ✓ 2026-03-19 plan Set up CI for Node app
│
└
View execution patterns to see what commands you run most and their success rates:
dojops learn patterns
┌ Execution patterns
│
│ generate 22 runs 91% ok ~3s → github-actions-specialist, dockerfile-specialist
│ scan 12 runs 83% ok ~8s
│ plan 8 runs 75% ok ~5s → planner
│ apply 5 runs 60% ok ~12s
│
└
View learned error rules — recurring failures grouped by fingerprint:
dojops learn rules
┌ Learned rules (3)
│
│ #1 scan (4x) Trivy finds CVE-2024-1234 in lodash < 4.17.21 unresolved
│ #2 apply (3x) terraform init fails: Missing AWS_REGION unresolved
│ #3 scan (5x) hadolint DL3008 pin apt versions ✓ Pinned versions in Dockerfile
│
└
DojOps auto-learns error patterns from every failure. On future runs, these patterns inform the LLM’s system prompt so it can avoid repeating the same mistakes.
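Grouping failures "by fingerprint" typically means normalizing away volatile details (versions, paths, addresses) so near-identical errors collapse into one pattern. The `fingerprint` helper below is an illustrative sketch of that idea, not DojOps' actual algorithm:

```python
import re
from collections import defaultdict

def fingerprint(error: str) -> str:
    """Normalize an error message so recurring failures that differ
    only in volatile details group under one pattern. Sketch only."""
    fp = error.lower()
    fp = re.sub(r"0x[0-9a-f]+", "<hex>", fp)         # hex addresses
    fp = re.sub(r"\d+\.\d+(?:\.\d+)?", "<ver>", fp)  # version numbers
    fp = re.sub(r"/[\w./-]+", "<path>", fp)          # file paths
    return fp

errors = [
    "Trivy finds CVE-2024-1234 in lodash < 4.17.21",
    "Trivy finds CVE-2024-1234 in lodash < 4.17.20",
    "terraform init fails: Missing AWS_REGION",
]

groups = defaultdict(list)
for e in errors:
    groups[fingerprint(e)].append(e)

for fp, occurrences in groups.items():
    print(f"{len(occurrences)}x  {fp}")
```

The two Trivy lines differ only in the patched version, so they share a fingerprint and count as one recurring pattern with two occurrences.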
Step 5: Resolve and dismiss patterns
When you fix the root cause of a recurring error, mark it as resolved so DojOps stops flagging it:
dojops learn resolve 2 "Added AWS_REGION to .env and CI pipeline"
✓ Marked pattern #2 as resolved.
If a pattern is noise (not a real issue), dismiss it entirely:
dojops learn dismiss 1
✓ Dismissed pattern #1.
Resolved patterns stay in the database with their resolution note. Dismissed patterns are deleted. Use dojops learn rules to see the current state.
You can also add team knowledge as project notes using dojops memory add:
dojops memory add "Always use Node 20 LTS in CI workflows" --category conventions
dojops memory add "Internal Docker registry is registry.company.com" --category conventions
These notes are surfaced in the LLM’s context during relevant operations.
Step 6: Filter and review patterns
Filter learned rules by command type to focus on specific areas:
dojops learn rules --type scan
┌ Learned rules (1)
│
│ #3 scan (5x) hadolint DL3008 pin apt versions ✓ Pinned versions in Dockerfile
│
└
The DL3008 rule is still listed because resolved patterns are retained with their resolution note. Since the versions were pinned and the error stopped recurring, no new occurrences accumulate.
Get machine-readable output for CI integration:
dojops learn patterns --output json
This outputs the full pattern data, including success rates, durations, and agent routing information.
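A CI job could consume that JSON to gate on regressions, for example failing the build when a command's success rate drops. The field names below (`command`, `runs`, `success_rate`) are assumptions based on the table shown earlier; check the real output shape before relying on them:

```python
import json

# Assumed shape of `dojops learn patterns --output json`:
# a list of per-command objects (hypothetical field names).
raw = """[
  {"command": "generate", "runs": 22, "success_rate": 0.91},
  {"command": "scan",     "runs": 12, "success_rate": 0.83},
  {"command": "apply",    "runs": 5,  "success_rate": 0.60}
]"""

def failing_commands(payload, threshold=0.75):
    """Return commands whose success rate falls below a CI threshold."""
    return [p["command"] for p in json.loads(payload)
            if p["success_rate"] < threshold]

# In CI, pipe the real command output in and fail the job on any hit:
print(failing_commands(raw))  # ['apply']
```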
Step 7: Start a chat session with memory enabled
DojOps chat uses LanceDB vector embeddings stored at .dojops/memory/ to remember context across sessions. The memory is project-scoped — each project directory has its own memory index.
Start a chat:
dojops chat
If you’ve had previous sessions in this project, DojOps recalls relevant context before you type anything:
┌ DojOps Chat (memory enabled)
│
│ Recalled context from 3 previous sessions:
│ · Last Terraform setup used S3 backend with DynamoDB locking (Mar 10)
│ · Project uses GitHub Actions with Node 20/22 matrix (Mar 08)
│ · Preferred Kubernetes namespace: "production" (Mar 05)
│
◆ How can I help?
│
│ > Set up a new Terraform module for RDS
│
│ Based on your S3 backend and DynamoDB locking pattern from
│ your previous Terraform config, I'll use the same backend
│ block for this module...
└
The model receives recalled memories as additional context. You don’t repeat yourself session after session — conventions you stated once are applied automatically.
Step 8: Observe memory in practice across sessions
Here’s the part that matters most: memories survive session exit. Close chat, come back days later, and the context is still there.
Session 1:
dojops chat
> "We use Terraform modules from our internal registry at registry.company.com/tf-modules"
> "All modules must support both us-east-1 and eu-west-1"
# (exit)
Session 2 — three days later:
dojops chat
> "Create a new VPC module"
┌ DojOps Chat
│
│ Creating VPC module with:
│ source: registry.company.com/tf-modules/vpc
│ (recalled from session on Mar 17)
│
│ Including provider configuration for us-east-1 and eu-west-1
│ (recalled from session on Mar 17)
│
│ ...
└
The model applied both conventions without being reminded. The vector search found the relevant memories and included them in context.
Step 9: Inspect memory status
Check what’s stored in the memory index for the current project:
dojops memory
┌ Project Memory
│
│ Entries: 23
│ Storage: .dojops/memory/
│ Index size: 156 KB
│
│ Recent memories:
│ "S3 backend with DynamoDB lock for all Terraform" (Mar 17)
│ "GitHub Actions CI with Node 20/22 matrix" (Mar 15)
│ "Use internal registry at registry.company.com" (Mar 17)
│ "Production namespace, staging for pre-release" (Mar 12)
│ "resource limits required on all k8s resources" (Mar 14)
│
└ 23 memories across 8 sessions
The index size grows with use. At 156 KB it’s negligible, and LanceDB handles much larger indexes efficiently.
Step 10: Clear memory for a fresh start
If you’re reusing a project directory for a completely different purpose, or you want to remove a wrong convention that got memorized:
dojops memory clear
┌ Memory Cleared
│
│ Removed: 23 entries
│ Freed: 156 KB
│
└ Future sessions start with a clean slate
You can’t selectively delete individual memories from the CLI yet — it’s all-or-nothing. If you need to correct a specific misconception the model has, the fastest workaround is to explicitly state the correction in chat: “We no longer use the S3 backend — we’ve switched to Terraform Cloud.” That correction gets memorized and will override the older entry in future retrievals.
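One common way such an override works, sketched here with hypothetical entries and scoring (DojOps' actual retrieval weighting isn't documented in this workshop), is to decay similarity by age so a newer correction outranks a stale convention:

```python
from datetime import date

# Hypothetical memory entries with precomputed similarity scores.
memories = [
    {"text": "Use the S3 backend with DynamoDB locking for Terraform",
     "similarity": 0.82, "stored": date(2026, 3, 10)},
    {"text": "We no longer use the S3 backend; we switched to Terraform Cloud",
     "similarity": 0.79, "stored": date(2026, 3, 20)},
]

def rank(entries, today, half_life_days=14):
    """Decay similarity by entry age so a fresher correction wins
    when two entries score close on raw similarity."""
    def score(e):
        age_days = (today - e["stored"]).days
        return e["similarity"] * 0.5 ** (age_days / half_life_days)
    return sorted(entries, key=score, reverse=True)

top = rank(memories, today=date(2026, 3, 21))[0]
print(top["text"])  # the newer Terraform Cloud correction ranks first
```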
The feedback loop
These features combine into a cycle that improves over time:
Run commands
│
▼
dojops insights → spot missed flags and coverage gaps
│
▼
dojops learn → .dojops/rules.md auto-injected into future prompts
│
▼
dojops chat → .dojops/memory/ recalls project conventions
│
▼
Better output on the next run
The loop is self-reinforcing. More history means more accurate patterns. More patterns mean better rules. Better rules mean more relevant LLM output.
Try it yourself
Challenge 1: Add three manual rules to .dojops/rules.md that capture your team’s actual conventions — things like preferred base images, naming conventions, or environment requirements. Run dojops "Create a Dockerfile for a Python app" before and after adding the rules and compare the output.
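For Challenge 1, manual entries might look like the following. The headings and wording are illustrative, not a documented format; adapt them to whatever structure your existing .dojops/rules.md already uses:

```markdown
# Project rules

## Conventions (manual)
- Base all Node.js images on node:20-bookworm-slim; never use :latest tags.
- Pin GitHub Actions to full commit SHAs in CI workflows.
- Every Kubernetes manifest must declare CPU and memory resource limits.
```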
Challenge 2: Start two separate dojops chat sessions on the same project, one hour apart. In the first session, tell the model about a project-specific constraint. In the second, ask a question where that constraint would affect the answer — without repeating it. Verify whether memory retrieved the right context.
Challenge 3: Run dojops learn on a project that has at least 10 executions. Identify one pattern in the output that you can fix at the source rather than just documenting as a rule. Fix it, run dojops learn --refresh, and confirm the pattern no longer appears.
Troubleshooting
dojops insights returns “no opportunities found” immediately
You need at least a few executions in your history. Run dojops history to check how many entries exist. If the history is empty, run a scan and a couple of generate commands first.
dojops learn doesn’t detect a pattern you know is recurring
Pattern detection uses a threshold — an error must appear in at least 3 executions to be flagged. If you have fewer than 5 executions total, results will be sparse. Also check that the relevant commands are in .dojops/ history (not a different directory).
Memory in chat isn’t recalling context you know you stated
Memory recall uses vector similarity search. If the question you’re asking is phrased very differently from how you stated the original context, the similarity score may fall below the recall threshold. Try rephrasing the question more closely to the original wording, or explicitly re-state the context in the current session.
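To build intuition for why phrasing matters, here is a toy similarity check. The bag-of-words embedding and the 0.3 cutoff are stand-ins for the real learned embeddings and recall threshold, which aren't documented here:

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def embed(text, vocab):
    """Toy bag-of-words embedding over a fixed vocabulary. Real vector
    memory uses a learned model, but thresholding behaves the same:
    phrasings far from the stored wording score low."""
    v = [0.0] * len(vocab)
    for tok in text.lower().split():
        if tok in vocab:
            v[vocab[tok]] += 1.0
    return v

stored_text = "terraform s3 backend with dynamodb locking"
close_query = "set up the terraform s3 backend"
far_query = "kubernetes ingress annotations"

tokens = sorted(set(f"{stored_text} {close_query} {far_query}".lower().split()))
vocab = {tok: i for i, tok in enumerate(tokens)}

stored_vec = embed(stored_text, vocab)
close_vec = embed(close_query, vocab)
far_vec = embed(far_query, vocab)

THRESHOLD = 0.3  # hypothetical recall cutoff
print(cosine(stored_vec, close_vec) >= THRESHOLD)  # shared wording: recalled
print(cosine(stored_vec, far_vec) >= THRESHOLD)    # no overlap: not recalled
```

The closely phrased query clears the cutoff because it shares several tokens with the stored memory; the unrelated query shares none and scores zero, which is exactly the failure mode re-stating the context works around.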
.dojops/memory/ is missing or empty
Memory is written only after a dojops chat session ends cleanly. If you killed a session with kill -9 or a crash, the memory index may not have been flushed. Run dojops memory to check. If the directory is missing entirely, run dojops init to recreate the project structure.
What you learned
DojOps gets more accurate on your specific project the more you use it. The insights engine surfaces what you’re missing based on your actual history rather than generic best practices. The learning engine converts recurring failures into rules that pre-inform the LLM before it generates a response — so it avoids the same mistake automatically. Cross-session memory backed by LanceDB means you stop repeating context that the model should already know. Together, these three features form a feedback loop where each run contributes to better results on the next one.
Next steps
- Token Analytics & Cost Control — Monitor and reduce LLM costs per provider and skill
- Security Audit & Remediation — Full scanning walkthrough with all 10 scanners
- Interactive Chat Sessions — Slash commands, agent pinning, and conversation management