How much “core instruction” do you keep in the system prompt before it becomes counterproductive?

I’m experimenting with large system-level instruction blocks for business automation GPTs (director-style agents).

The tricky part is finding the right density of instructions.

When the system prompt is:

• too small → drift, tone inconsistency, weak reasoning

• too large → model becomes rigid, ignores the user, or hallucinates structure

My tests show the sweet spot is around:

3–5 core principles (tone, reasoning philosophy, behavior)

3–7 structured modes (/content_mode, /analysis_mode, etc.)

light but persistent “identity kernel”

no more than ~8–10 KB total

I’d love to hear from people who design multi-role prompts:

• do you rely on a single dense instruction block?

• do you extend with modular prompt-injection?

• how do you balance flexibility vs stability?

Any examples or architectures welcome.

Leave a Reply