NEMORIS
COMING SOON
PRE-LAUNCH ALPHA
The goal: Nemoris agents don't just respond — they remember, reflect, learn from mistakes, sleep and wake up sharper, keep their promises, and anticipate what you'll need before you ask.
Everything built so far:
Per-turn debug snapshots: every turn now stores a redacted snapshot of input, prompt assembly, context hydration, provider call, and output. Inspect with /trace input, /trace prompt, /trace output, or /trace json.
Terminal trace inspector: nemoris trace --view prompt|output|json --offset N for full turn replay from the CLI.
Gateway GET /trace endpoint: trace snapshots available over the local HTTP gateway for external tooling and eval pipelines.
Checklist reply resolution: structured task completion tracking with acceptance criteria for multi-step agent work.
Progress pings: long-running operations emit periodic status updates instead of going silent.
Skill audit log: lifecycle tracking for skill load, execution, and failure events.
Broadcast message tool: agents can message multiple peers in a single operation.
Job prep and manual job state: deterministic preparation steps before scheduled jobs, with manual trigger support.
Initiative engine scoring hardened: temporal pattern observation wired correctly, method name mismatch fixed, anticipatory task surfacing more reliable.
Stream buffer reliability: edge cases around partial tool-use blocks and message boundary detection resolved.
Context inspector now falls back to latest stored trace snapshot when live context is unavailable.
Public hygiene checker script added for pre-publish safety checks.
3,049 tests passing across unit, reliability, and dogfood suites.
Streaming preview: progressive Telegram message updates during LLM generation. Rate-limited edits, cursor indicator, code fence safety.
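The rate-limit and code-fence-safety rules above can be sketched roughly as follows (the interval, helper names, and cursor glyph are assumptions for illustration, not Nemoris's actual implementation):

```typescript
const MIN_EDIT_INTERVAL_MS = 1500; // assumed Telegram-friendly cadence

// Only push a message edit if enough time has passed since the last one.
function shouldEdit(lastEditAt: number, now: number): boolean {
  return now - lastEditAt >= MIN_EDIT_INTERVAL_MS;
}

// A partial message may end mid code block; close the fence so the
// markdown parser does not reject the edit.
function fenceSafe(partial: string): string {
  const fences = partial.split("```").length - 1;
  return fences % 2 === 1 ? partial + "\n```" : partial;
}

// Append a cursor indicator to signal that generation is still running.
function withCursor(partial: string): string {
  return fenceSafe(partial) + " ▌";
}
```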
User model extractor and store: adaptive behaviour based on learned user preferences, communication style, and working patterns.
Context composer: intelligent prompt assembly that optimises token budget across identity, memory, and conversation context.
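One plausible shape for this kind of budgeting is a priority-ordered allocator: identity is granted first, then memory and conversation fill whatever remains. A minimal sketch (section names and the greedy strategy are assumptions):

```typescript
interface Section {
  name: string;
  tokens: number; // tokens this section would like to occupy
}

// Grant tokens in priority order, truncating the section that overflows
// the budget and leaving nothing for those after it.
function allocate(sections: Section[], budget: number): Map<string, number> {
  const granted = new Map<string, number>();
  let remaining = budget;
  for (const s of sections) {
    const take = Math.min(s.tokens, remaining);
    granted.set(s.name, take);
    remaining -= take;
  }
  return granted;
}
```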
Active thread state: tracks the current conversation thread for coherent multi-turn interactions.
Coaching clarification: when requests are ambiguous, the agent asks targeted questions instead of guessing.
Trace-based learning loop: mines turn traces to identify improvement opportunities and skill gaps.
Browse tool: agents can fetch and extract content from web pages directly.
Skill management tool: agents can list, inspect, and manage installed skills at runtime.
11 new bundled skills: agent-review, business-advisor, frontend-design, implementation-safety, lemonsqueezy, product-autopilot, reddit-engage, release-handoff, ux-flow-audit, verification-evidence, webapp-testing.
New operator commands: /think, /focus, /model, /recall for quick mode switches and history search.
Removed OpenClaw delivery adapters — fully standalone runtime with no legacy dependencies.
2,451 tests passing across unit, reliability, and dogfood suites.
Official Anthropic SDK: replaced hand-rolled HTTP/SSE with @anthropic-ai/sdk. Automatic client caching, typed error classification, and native streaming.
Unified auth-profiles: Anthropic API keys, setup tokens, and OAuth tokens all stored in a single auth-profiles.json with file-lock safety.
Model selector drill-down: tapping a provider group now correctly shows individual models instead of silently dropping the inline keyboard.
Sticky OAuth mode: #isOAuthMode() no longer reads stale disk profiles — only activates for bearer-style tokens, preventing unnecessary Claude Code identity prefix injection.
Auxiliary lane auth: sleep-cycle, turn evaluator, and session compactor now use the correct NEMORIS_ANTHROPIC_API_KEY environment variable.
Persisted model override cleared: a stale chat_sessions.model_override was silently routing all turns to Haiku instead of the configured Sonnet primary.
SDK error classification maps 401, 429, 5xx, and connection errors to Nemoris recovery categories for smarter circuit-breaker behaviour.
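The status-to-recovery mapping can be sketched like this (the category names are illustrative assumptions, not Nemoris's actual labels):

```typescript
type Recovery = "reauth" | "backoff" | "retry" | "reconnect" | "fail";

// null status models a connection error with no HTTP response at all.
function classify(status: number | null): Recovery {
  if (status === null) return "reconnect"; // socket/DNS failure: reconnect
  if (status === 401) return "reauth";     // bad or expired credentials
  if (status === 429) return "backoff";    // rate limited: wait, then retry
  if (status >= 500) return "retry";       // transient server fault
  return "fail";                           // other 4xx: don't retry
}
```

A circuit breaker can then count only the retryable categories toward tripping, so auth failures surface immediately instead of burning retries.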
Tiered Cognitive Memory: Implementation of CoALA-inspired Working, Semantic, Episodic, and Procedural memory layers.
Autonomous Sleep Cycles: Runtime-led memory consolidation, reflection synthesis, and daily planning during quiet hours.
Temporal Pattern Learning: Active mining of episodic streams to detect and anticipate recurring user needs.
Commitment Ledger: First-class tracking of user promises, pending obligations, and proactive follow-up triggers.
Reflective Learning: Synthetic insight generation that promotes raw observations into durable semantic facts.
Procedural Store: Verified skill library with lifecycle tracking (generate → test → verify → store).
Cross-agent memory isolation and protected config paths hardened for stable multi-agent environments.
Circadian Adaptation: Context-aware behaviour modulation based on time-of-day and user communication norms.
Idle-Time Maintenance: Curiosity runtime performs non-blocking session compaction and skill discovery when the operator is away.
Anticipatory Intelligence: CommitmentLedger, TemporalPatternDetector, and InitiativeEngine surface overdue and upcoming tasks proactively.
Procedural Learning: ProceduralStore, ReflexionMemory, and EvalRubric for verified skill generation and self-critique.
23 bundled starter skills covering code review, deployment, research, and common operator workflows.
Context7 MCP server wired as default for all agents — live documentation lookup in every turn.
New operator commands: /mind, /goals, /commitments, /sleep, /reflections, and /cost.
35 built-in tools (up from 30) and multi-bot Telegram support.
Streaming fixes: tool_use block sanitisation and input_json_delta accumulation for partial tool calls.
Cross-agent error routing: failures now route to the correct agent session instead of leaking across contexts.
Turn Evaluator: lightweight Haiku quality gate after heavy turns, with /eval command to toggle per-agent.
Cross-agent context isolation: session IDs scoped per agent, preventing memory bleed in single-bot mode.
Soft cap with wrap-up nudge: agents summarise at 30 tool calls instead of hard-stopping.
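The soft-cap logic amounts to a small state check per tool call. A sketch (the hard-stop point at double the cap is an assumption; only the 30-call soft cap comes from the entry above):

```typescript
const SOFT_CAP = 30; // tool calls before the wrap-up nudge

type CapAction = "continue" | "nudge" | "stop";

// Nudge exactly once at the soft cap; hard-stop only at double the cap
// (assumed backstop, not confirmed behaviour).
function capAction(toolCalls: number, alreadyNudged: boolean): CapAction {
  if (toolCalls >= SOFT_CAP * 2) return "stop";
  if (toolCalls >= SOFT_CAP && !alreadyNudged) return "nudge";
  return "continue";
}
```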
Sprint contracts: acceptance criteria on commitments for structured task completion.
Context reset on task switch: fresh session when switching between agents.
Shell exec timeout raised from 5s to 30s default, configurable up to 5 minutes.
Eval pattern narrowed: Python scripts no longer blocked by security policy.
Ollama token counting fixed: prompt_eval_count and eval_count now tracked correctly.
Compaction thresholds raised to 0.80/0.92 for Sonnet 4.6; persona continuity preserved across summaries.
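The two thresholds naturally split compaction into a lazy and an urgent path. A minimal sketch using the 0.80/0.92 values from the entry above (the action names are assumptions):

```typescript
const SOFT_THRESHOLD = 0.80; // schedule compaction during idle time
const HARD_THRESHOLD = 0.92; // compact before the next turn

type Compaction = "none" | "schedule" | "force";

function compactionNeeded(usedTokens: number, windowTokens: number): Compaction {
  const ratio = usedTokens / windowTokens;
  if (ratio >= HARD_THRESHOLD) return "force";
  if (ratio >= SOFT_THRESHOLD) return "schedule";
  return "none";
}
```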
Ollama reasoning trace filter: <think> blocks stripped from user-facing responses.
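The filter itself is a small transform: drop every <think>…</think> span (as emitted by some Ollama reasoning models) before the text reaches the user. A sketch, assuming blocks are well-formed:

```typescript
// Remove <think>…</think> reasoning blocks, including multi-line ones,
// then trim the whitespace they leave behind.
function stripThink(text: string): string {
  return text.replace(/<think>[\s\S]*?<\/think>/g, "").trim();
}
```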
Context replay sanitisation: tool_use blocks stripped from replayed turns.
Identity Interview: 5-archetype onboarding that generates personalised SOUL.md + USER.md + OPERATING.md — no LLM needed.
agentskills.io Importer: import/export skills from the 69K+ skill ecosystem via nemoris skill import.
2,397 tests passing across unit, reliability, and dogfood suites — zero regressions.
Organic intelligence v0.2: trust progression system with 5 trust levels and proactive triggers wired into the runtime.
4-suite reliability harness: 340 deterministic dogfood tests covering durability, self-healing, delivery, transport, and provider failover.
Dogfood lifecycle harness: 10-phase end-to-end smoke test for full runtime verification.
Proactive triggers: the runtime initiates agent turns based on time, memory, and trust context without user prompting.
Reliability tests are now the required CI gate; unit tests are informational.
SQLite handle cleanup in tests to prevent EBUSY on Windows.
Replaced vendored smol-toml tarball with registry dependency — fixes broken npm install for consumers.
Sanitised test paths and hardened .gitignore. Removed internal planning docs from tracking.
Windows support: Ollama install detection and cross-platform path handling.
48h raw context window with point-in-time snapshots and rollback.
Curiosity Engine: idle-time memory deduplication, session compaction, and skill proposals.
Frustration detection: agents halt on error loops (3+ same error) and ask for help.
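The "3+ same error" rule reduces to a check on the tail of the recent error history. A sketch (exact-match comparison is an assumption; a real detector might fuzzy-match messages):

```typescript
const LOOP_THRESHOLD = 3; // consecutive identical errors before halting

function isErrorLoop(recentErrors: string[]): boolean {
  if (recentErrors.length < LOOP_THRESHOLD) return false;
  const tail = recentErrors.slice(-LOOP_THRESHOLD);
  return tail.every((e) => e === tail[0]);
}
```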
Interrupt responsiveness: /stop halts mid-flight operations within 2-3 seconds.
Preference learning: approve once, gate skips next time, persists across restarts.
Scope escalation simplified: /approve /path grants read/write access persistently.
Workflow engine: TOML pipelines with approval gates, resume-on-restart, and sandboxed interpolation.
Flight recorder: TurnTrace SQLite logging with /trace for turn replay and search.
Patch generation: apply_patch and generate_patch tools for atomic file updates.
Structured config: swap providers and patch live settings without restarting the runtime.
JIT tool loading: tools load on demand instead of pre-compiling at runtime start for faster startup.
Local-first doctor diagnoses Full Disk Access permissions, port conflicts, and system health.
Six wiring fixes across the approval gate, tool context, turn traces, auto-resume, OpenRouter, and inline approvals.
Telegram inline approval buttons for quick approve and deny without context switching.
MessageQueue delivery modes: debounce, immediate, and batch cadence per chat.
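The flush decision for the three cadences can be sketched as a single predicate (the default debounce window and batch size here are assumptions, not Nemoris's shipped values):

```typescript
type Mode = "immediate" | "debounce" | "batch";

interface FlushOpts {
  debounceMs: number; // quiet period before a debounced flush
  batchSize: number;  // queued messages before a batch flush
}

function shouldFlush(
  mode: Mode,
  queued: number,
  msSinceLastMessage: number,
  opts: FlushOpts = { debounceMs: 2000, batchSize: 5 }, // assumed defaults
): boolean {
  if (queued === 0) return false;
  switch (mode) {
    case "immediate":
      return true;
    case "debounce":
      return msSinceLastMessage >= opts.debounceMs;
    case "batch":
      return queued >= opts.batchSize;
  }
}
```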
SKILL.md open standard: compatible with Claude Code, Cursor, and Codex CLI. Skills in ~/.claude/skills/ work across tools.
Bundled browser skill: agent-browser CLI for web automation, screenshots, form filling, and content extraction.
nemoris dogfood: 49-check runtime verification CLI. Zero API calls, zero tokens. JSON mode for CI.
Learning loop: SelfCritic scores every turn, PatternLedger detects recurring requests, SkillProposer generates skills with operator approval.
35 built-in tools including rollback, show_changes, request_tool (JIT discovery), create_agent, create_skill, and create_mcp.
Core architecture: Active Memory, Delivery Guarantees, Task Contracts.
Providers: Anthropic (direct + prompt caching), OpenRouter (100+ models), Ollama (local).
Telegram integration: slash commands, reactions, vision, inline keyboards.
Self-healing Nurse system: health probes, automatic repair, rule promotion.
Exec approval gate: human-in-the-loop for shell commands.
MCP consumer: connect external MCP servers as native tools via config/mcp.toml.
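A config/mcp.toml entry might look like the following (the table name, key names, and package are illustrative assumptions; check the shipped config for the actual schema):

```toml
# Hypothetical example: wire an external MCP server as native tools.
[servers.context7]
command = "npx"
args = ["-y", "@upstash/context7-mcp"]
```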
Session search: FTS5 full-text search across conversation history.
Context compaction: DAG-based session summarisation.
Active recall: semantic memory with salience scoring and embeddings.
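One plausible salience formula combines embedding similarity with recency decay. A sketch (the half-life constant is an assumption, chosen here to echo the 48h raw context window; the real scoring may differ):

```typescript
// Cosine similarity between two embedding vectors of equal length.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

const HALF_LIFE_H = 48; // assumed: a memory loses half its weight in 48h

// Relevance to the query, decayed exponentially by the memory's age.
function salience(queryVec: number[], memVec: number[], ageHours: number): number {
  const decay = Math.pow(0.5, ageHours / HALF_LIFE_H);
  return cosine(queryVec, memVec) * decay;
}
```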
Multi-agent: task contract triggers and completion pings.
Scheduled jobs: cron-triggered and ad-hoc in unified queue.
Cross-platform: macOS (launchd), Linux (systemd), Windows (PM2).
Interactive setup wizard with provider OAuth, Telegram wiring, and model selection.
Migration CLI: nemoris migrate imports agents, jobs, and memory from prior runtimes.
SSRF protection on all URL-intake surfaces.
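The core of such a check is rejecting loopback, link-local, and RFC 1918 address literals before fetching. A deliberately minimal sketch (a real defence must also resolve DNS and re-validate the final IP, since a public hostname can resolve to a private address):

```typescript
// Reject URLs whose hostname is a private or loopback literal.
function isPrivateHost(url: string): boolean {
  const host = new URL(url).hostname;
  if (host === "localhost" || host === "[::1]") return true;
  return (
    /^127\./.test(host) ||                    // loopback
    /^10\./.test(host) ||                     // RFC 1918
    /^192\.168\./.test(host) ||               // RFC 1918
    /^169\.254\./.test(host) ||               // link-local
    /^172\.(1[6-9]|2\d|3[01])\./.test(host)   // RFC 1918: 172.16–172.31
  );
}
```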
Input sanitisation with injection detection and boundary tagging.
Build in progress by the Nemoris team