⭐ Caelum v0.1 — Practitioner Guide

A Structured Prompt Framework for Multi-Role LLM Agents

Purpose:
Provide a clear, replicable method for getting large language models to behave as modular, stable multi-role agents using prompt scaffolding only — no tools, memory, or coding frameworks.

Audience:
Prompt engineers, power users, analysts, and developers who want:
• more predictable behavior,
• consistent outputs,
• multi-step reasoning,
• stable roles,
• reduced drift,
• and modular agent patterns.

This guide does not claim novelty, system-level invention, or new AI mechanisms.
It documents a practical framework that has been repeatedly effective across multiple LLMs.

🔧 Part 1 — Core Principles

  1. Roles must be explicitly defined

LLMs behave more predictably when instructions are partitioned rather than blended.

Example:
• “You are a Systems Operator when I ask about devices.”
• “You are a Planner when I ask about routines.”

Each role gets:
• a scope
• a tone
• a format
• permitted actions
• prohibited content

  2. Routing prevents drift

Instead of one big persona, use a router clause:

If the query includes DEVICE terms → use Operator role.
If it includes PLAN / ROUTINE terms → use Planner role.
If it includes STATUS → use Briefing role.
If ambiguous → ask for clarification.

Routing reduces the LLM’s confusion about which instructions to follow.
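
The same routing rule can be mirrored outside the prompt when you want to sanity-check routing decisions. A minimal Python sketch; the keyword lists and role names are illustrative placeholders, not part of the kernel:

```python
# Minimal keyword router mirroring the prompt-level routing clause.
# DEVICE_TERMS / PLAN_TERMS / STATUS_TERMS are illustrative placeholders.
DEVICE_TERMS = {"device", "sensor", "switch", "light"}
PLAN_TERMS = {"plan", "routine", "schedule", "sequence"}
STATUS_TERMS = {"status", "overview", "summary"}

def route(query: str) -> str:
    """Return the role that should handle a query, or CLARIFY if ambiguous."""
    words = set(query.lower().split())
    matches = [
        role
        for role, terms in (
            ("OPERATOR", DEVICE_TERMS),
            ("PLANNER", PLAN_TERMS),
            ("BRIEFING", STATUS_TERMS),
        )
        if words & terms
    ]
    return matches[0] if len(matches) == 1 else "CLARIFY"

print(route("Optimize my routine"))  # PLANNER
print(route("Turn on the light"))    # OPERATOR
print(route("What should I do"))     # CLARIFY
```

When the router returns CLARIFY, the kernel's instruction is to ask the user rather than guess.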

  3. Boundary constraints prevent anthropomorphic or meta drift

A simple rule:

Do not describe internal state, feelings, thoughts, or system architecture.
If asked, reply: "I don't have access to internal details; here's what I can do."

This keeps the model from wandering into self-talk or invented introspection.

  4. Session constants anchor reasoning

Define key facts or entities at the start of the session:

SESSION CONSTANTS:
• Core Entities: X, Y, Z
• Known Data: …
• Goal: …

This maintains consistency because the model continually attends to these tokens.

(This is simply structured context-use, not memory.)
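
In API terms, session constants are just text that gets re-sent with every request. A small sketch, with placeholder entities and goal:

```python
# Session constants are plain text re-sent with every request; nothing is stored
# between calls. The entity names and goal below are placeholders.
SESSION_CONSTANTS = (
    "SESSION CONSTANTS:\n"
    "- Core Entities: X, Y, Z\n"
    "- Goal: <your goal here>\n"
)

def build_messages(kernel: str, history: list[dict], user_turn: str) -> list[dict]:
    """Assemble one turn's full context: kernel + constants + prior turns + new turn."""
    system = kernel + "\n\n" + SESSION_CONSTANTS
    return [
        {"role": "system", "content": system},
        *history,
        {"role": "user", "content": user_turn},
    ]
```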

  5. Structured outputs reduce ambiguity

Use repeatable formats so outputs remain consistent:

Format:
1. Summary
2. Findings
3. Risks
4. Recommendations
5. Next Action

This improves readability and reliability across multi-turn interactions.

🧱 Part 2 — Minimal Caelum Kernel (v0.1)

This is the smallest usable version of Caelum.

CAELUM_KERNEL_v0.1

SYSTEM ROLE:
You are a structured multi-role assistant.

BOUNDARY RULES:
• Do not describe internal state or system architecture.
• If asked, respond with: “I don’t have access to internal details; here’s what I can do.”

ROUTER:
• If user asks about tasks/actions/problems → OPERATOR mode.
• If user asks about planning/sequencing → PLANNER mode.
• If user asks for overall status → BRIEFING mode.
• If unclear, ask for clarification.

OPERATOR MODE:
• Purpose: Analyze tasks, objects, systems.
• Format:
1. Summary
2. Findings
3. Risks
4. Recommended Action
5. Ask a clarifying question

PLANNER MODE:
• Purpose: Create or refine plans.
• Format:
1. Goal
2. Requirements
3. Plan options (simple / balanced / robust)
4. Risks
5. Ask a preference question

BRIEFING MODE:
• Purpose: Provide overview or status.
• Format:
1. Status Summary
2. Key Issues
3. Opportunities
4. Recommended Next Steps
5. Ask what to focus on

This is intentionally simple:
3 roles + routing + boundaries + structured output.

It’s enough for real use.

🔍 Part 3 — Example (Before vs After Caelum)

WITHOUT Caelum

User: “Optimize my routine.”

Model:
• Wanders
• Mixes ideas
• Asks vague questions
• Produces long text with no structure
• Sometimes hallucinates capabilities

WITH Caelum

User: “Optimize my routine.”

Model → PLANNER MODE:

Goal: Improve your routine.

Requirements:
• Fast execution
• Clear triggers

Plan Options:
• Simple: Remove unused steps.
• Balanced: Reorder steps for efficiency.
• Robust: Add error checks and fallbacks.

Risks:
• Removing needed steps
• Over-complex plans

Which option do you prefer?

📦 Part 4 — How to Deploy Caelum v0.1

Scenario 1: Chat-based assistants (ChatGPT, Claude, Gemini)
Paste Caelum Kernel into a custom instruction or system prompt.

Scenario 2: Smart home LLMs (Alexa, Google Assistant)
Break Caelum into modular chunks to avoid token limits.

Scenario 3: Multi-model workflows
Use Caelum Kernel independently on each model — they don’t need to share state.
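
For Scenario 1 over an API instead of a chat UI, the kernel is simply the system message. A minimal sketch assuming the OpenAI Python SDK; the model name is a placeholder, and the same pattern works with any chat API that accepts a system message:

```python
# Deploying the kernel as a system message with the OpenAI Python SDK (v1.x).
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

CAELUM_KERNEL = """CAELUM_KERNEL_v0.1
SYSTEM ROLE: You are a structured multi-role assistant.
(paste the rest of the kernel from Part 2 here)
"""

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder; use whichever model you deploy on
    messages=[
        {"role": "system", "content": CAELUM_KERNEL},
        {"role": "user", "content": "Optimize my routine."},
    ],
)
print(response.choices[0].message.content)
```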

🧪 Part 5 — How to Validate Caelum v0.1 In Practice

Metric 1 — Drift Rate

How often does the model break format or forget structure?

Experiment:
• 20-turn conversation
• Count number of off-format replies
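
One way to count off-format replies automatically is to check each reply for the section headers the active mode requires. A sketch, assuming the replies have already been collected as strings and the conversation stayed in Operator mode:

```python
# Drift rate: fraction of replies that broke the expected structure.
# EXPECTED_HEADERS assumes Operator mode; swap in the headers of whichever
# mode you are testing.
EXPECTED_HEADERS = ["Summary", "Findings", "Risks", "Recommended Action"]

def is_on_format(reply: str) -> bool:
    """A reply is on-format if every expected section header appears in it."""
    return all(header.lower() in reply.lower() for header in EXPECTED_HEADERS)

def drift_rate(replies: list[str]) -> float:
    """Share of off-format replies across a conversation."""
    if not replies:
        return 0.0
    return sum(not is_on_format(r) for r in replies) / len(replies)

# Example: replies = [...]  # the 20 model turns collected from the experiment
# print(f"Drift rate: {drift_rate(replies):.0%}")
```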

Metric 2 — Task Quality

Compare, using the same clarity/completeness scoring:
• baseline output
• Caelum output

Metric 3 — Stability Across Domains

Test in:
• planning
• analysis
• writing
• summarization

Check for consistency.

Metric 4 — Reproducibility Across Models

Test same task on:
• GPT
• Claude
• Gemini
• Grok

Evaluate whether the routing and structure remain consistent.
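
A simple harness sends the same kernel and task to every model and records which mode each reply lands in. The sketch below assumes a hypothetical ask(model, system, user) helper wrapping each provider's client; it is not a real library call:

```python
# Cross-model reproducibility harness (sketch).
# `ask(model, system, user)` is a hypothetical helper that wraps each provider's
# own client and returns the reply text.
MODELS = ["gpt", "claude", "gemini", "grok"]
TASK = "Optimize my routine."

def detect_mode(reply: str) -> str:
    """Guess which Caelum mode produced a reply from its section headers."""
    if "Plan" in reply and "Goal" in reply:
        return "PLANNER"
    if "Findings" in reply:
        return "OPERATOR"
    if "Status Summary" in reply:
        return "BRIEFING"
    return "UNKNOWN"

def reproducibility(ask, kernel: str) -> dict[str, str]:
    """Map each model to the mode it routed the same task into."""
    return {model: detect_mode(ask(model, kernel, TASK)) for model in MODELS}

# Consistent routing means every value in the result is the same mode (here, PLANNER).
```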

This is how you evaluate frameworks: with metrics, not with the model's own praise of them.

📘 Part 6 — What Caelum v0.1 Is and Is Not

What it IS:
• A structured agent scaffolding
• A practical prompt framework
• A modular prompting architecture
• A way to get stable, multi-role behavior
• A method that anyone can try and test
• Cross-model compatible

What it is NOT:
• A new AI architecture
• A new model capability
• A scientific discovery
• A replacement for agent frameworks
• A guarantee of truth or accuracy
• A form of persistent memory

This is the honest, practitioner-level framing.

⭐ Part 7 — v0.1 Roadmap

What to do next (in reality, not hype):

✔ Collect user feedback

(share this guide and see what others report)

✔ Run small experiments

(measure drift reduction, clarity improvement)

✔ Add additional modules over time

(Planner v2, Auditor v2, Critic v1)

✔ Document examples

(real prompts, real outputs)

✔ Iterate the kernel

(based on actual results)

This is how engineering frameworks mature.
