
5,000 Redditors say ‘ChatGPT got dumber.’ Anthropic confirmed bugs. Here’s what still works.

By skyforbes, Nov 27, 2025

Is AI actually degrading or are we all losing our minds?

The evidence is real:

5,000+ Reddit users reported that GPT-5 "feels like a downgrade," with shorter, lower-quality responses.
A Stanford/UC Berkeley study found that GPT-4's accuracy on some math problems dropped significantly between March and June 2023.
Anthropic officially acknowledged THREE separate bugs affecting Claude Sonnet 4, Haiku 3.5, and Opus 3 between August and September 2025.
OpenAI acknowledged "elevated latency issues" affecting ChatGPT.

Developer on OpenAI forum: "ChatGPT is every day more useless… fails to follow extremely clear and simple rules"

Here's the wild part:

Anthropic's bugs affected only 0.8% to 16% of requests, even at peak.

Yet THOUSANDS complained about quality drops.

This points to an uncomfortable pattern: we often blame the model when our prompts fail.

When AI has an off day, bad prompts collapse completely. Structured prompts still deliver.

The real problem:

Research from ProfileTree shows 78% of AI project failures stem from poor human-AI communication, not model limitations.

We want to blame "AI degradation" because it's easier than fixing our prompts.

The solution: DEPTH Method

During the August-September Claude bugs and GPT-5 rollout chaos, I tested which prompts survived model degradation. This framework held up:

D – Define Multiple Expert Validation

Instead of: "You're a developer"

Use: "You are three experts working together: a senior developer writing the code, a QA tester identifying edge cases, and a code reviewer checking for bugs. Each expert validates the others' work."

Why it survives degradation: Creates internal error-checking even when the model is buggy.
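A minimal sketch of how the multi-expert prompt above could be assembled from a list of roles, so the same structure can be reused across tasks. The function name build_expert_panel_prompt and the (title, duty) tuples are hypothetical, chosen only for illustration; the role wording is taken from the example prompt.

```python
def build_expert_panel_prompt(roles):
    """Compose a multi-expert system prompt from (title, duty) pairs."""
    if len(roles) < 2:
        raise ValueError("cross-validation needs at least two experts")
    # Spell the count out for small panels so the prompt reads naturally.
    words = {2: "two", 3: "three", 4: "four", 5: "five"}
    count = words.get(len(roles), str(len(roles)))
    described = [f"{title} {duty}" for title, duty in roles]
    # Join as "a and b" for two roles, "a, b, and c" for more.
    if len(described) == 2:
        joined = " and ".join(described)
    else:
        joined = ", ".join(described[:-1]) + ", and " + described[-1]
    return (
        f"You are {count} experts working together: {joined}. "
        "Each expert validates the others' work."
    )


ROLES = [
    ("a senior developer", "writing the code"),
    ("a QA tester", "identifying edge cases"),
    ("a code reviewer", "checking for bugs"),
]

prompt = build_expert_panel_prompt(ROLES)
print(prompt)
```

Keeping the roles as data makes it easy to swap in a different panel (say, a security reviewer) without rewriting the sentence by hand.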

E – Establish Explicit Success Metrics

Instead of: "Write good code"

Use: "Code must: pass these 5 specific test cases [list them], follow PEP 8 standards, include error handling for [scenarios], run in under 2 seconds, and flag ANY assumptions as UNCERTAIN with an explanation."

Why it survives degradation: Removes ambiguity that causes failures when models struggle.
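One way to keep the success metrics explicit is to store them as data and render them into the prompt, so nothing is left as an unfilled placeholder. This is only a sketch: build_success_criteria, the parse_price test cases, and the error scenarios are all hypothetical examples, not part of any library.

```python
def build_success_criteria(test_cases, error_scenarios, max_seconds=2):
    """Render explicit, checkable success criteria as a prompt section."""
    lines = ["Code must:"]
    lines.append(f"- pass these {len(test_cases)} specific test cases:")
    for call, expected in test_cases:
        lines.append(f"    {call} -> {expected}")
    lines.append("- follow PEP 8 standards")
    lines.append("- include error handling for: " + ", ".join(error_scenarios))
    lines.append(f"- run in under {max_seconds} seconds")
    lines.append("- flag ANY assumptions as UNCERTAIN with explanation")
    return "\n".join(lines)


# Hypothetical test cases for an imaginary parse_price() function.
TEST_CASES = [
    ('parse_price("$4.50")', "4.5"),
    ('parse_price("4,50 EUR")', "4.5"),
    ('parse_price("")', "raises ValueError"),
]

criteria = build_success_criteria(TEST_CASES, ["empty input", "unknown currency"])
print(criteria)
```

Because the test cases live in a list, the same list can later be used to check the code the model returns.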

P – Provide Complete Context

Instead of: "Fix this code"

Use: "Project context: uses Flask 2.3, Python 3.11, deployed on AWS Lambda. Previous attempts failed because [X]. Performance requirements: [Y]. Edge cases to handle: [Z]. Current error: [specific traceback]."

Why it survives degradation: Grounding in specifics reduces hallucinations even when model quality dips.
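The context block can be sketched as a small data class that refuses to render while any field is empty, which catches an incomplete prompt before it is sent. The class name ProjectContext and its field names are assumptions made for this example; only the stack line comes from the prompt above, and the other values are invented placeholders.

```python
from dataclasses import dataclass, fields


@dataclass
class ProjectContext:
    stack: str
    previous_attempts: str
    performance_requirements: str
    edge_cases: str
    current_error: str

    def render(self):
        """Return the context block, refusing to build it with gaps."""
        missing = [f.name for f in fields(self) if not getattr(self, f.name).strip()]
        if missing:
            raise ValueError("missing context: " + ", ".join(missing))
        return (
            f"Project context: {self.stack}. "
            f"Previous attempts failed because {self.previous_attempts}. "
            f"Performance requirements: {self.performance_requirements}. "
            f"Edge cases to handle: {self.edge_cases}. "
            f"Current error: {self.current_error}"
        )


context = ProjectContext(
    stack="uses Flask 2.3, Python 3.11, deployed on AWS Lambda",
    previous_attempts="the handler timed out on cold start",  # hypothetical
    performance_requirements="respond in under 2 seconds",  # hypothetical
    edge_cases="empty payload, oversized upload",  # hypothetical
    current_error="KeyError: 'body'",  # hypothetical
)
print(context.render())
```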

T – Task Sequential Breakdown

Instead of: "Debug, refactor, and document this"

Use:

First: Analyze the error and identify root cause
Second: List all edge cases this must handle
Third: Write the solution with inline comments
Fourth: Test against all edge cases and report results

Why it survives degradation: Prevents AI from jumping to conclusions when reasoning is impaired.
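The four steps above can also be sent one turn at a time, with each answer fed into the next prompt. The sketch below assumes a caller-supplied function ask(prompt) that returns the model's reply as a string; ask, run_sequential, and fake_ask are hypothetical names, and fake_ask is a stand-in so the example runs without any API.

```python
STEPS = [
    "Analyze the error and identify root cause",
    "List all edge cases this must handle",
    "Write the solution with inline comments",
    "Test against all edge cases and report results",
]


def run_sequential(task, steps, ask):
    """Send one step per turn, feeding earlier answers into later prompts."""
    results = []
    for number, step in enumerate(steps, start=1):
        parts = [f"Task: {task}"]
        # Include every earlier result so the model builds on its own work.
        parts += [f"Step {n} result:\n{r}" for n, r in enumerate(results, start=1)]
        parts.append(f"Now do step {number} of {len(steps)} only: {step}")
        results.append(ask("\n\n".join(parts)))
    return results


def fake_ask(prompt):
    """Stand-in for a real model call; echoes the last line it was given."""
    return "done: " + prompt.splitlines()[-1]


results = run_sequential("Fix the failing upload handler", STEPS, fake_ask)
for line in results:
    print(line)
```

Splitting the steps across turns costs more requests, but it lets you inspect the root-cause analysis before any code is written.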

H – Self-Critique Loop (CRITICAL FOR DEGRADATION)

Instead of: Accepting first output

Use: "Review your solution. Rate it 1-10 on: correctness, performance, edge case handling. Test it mentally against these scenarios: [list]. If ANY score is below 8, revise. Flag anything you're uncertain about as UNCERTAIN and explain your doubt."

Why it survives degradation: This catches errors the model makes on bad days. Self-critique forces double-checking.
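The "revise if any score is below 8" rule can be checked in code rather than by eye. This sketch assumes the model writes its ratings as lines like "Correctness: 9/10"; that output format, and the helper names parse_scores, needs_revision, and uncertain_lines, are assumptions for illustration.

```python
import re

CRITERIA = ("correctness", "performance", "edge case handling")


def parse_scores(review):
    """Extract 'criterion: N' or 'criterion: N/10' ratings from a review."""
    scores = {}
    for name in CRITERIA:
        match = re.search(rf"{re.escape(name)}\s*[:=]\s*(\d+)", review, re.IGNORECASE)
        if match:
            scores[name] = int(match.group(1))
    return scores


def needs_revision(scores, threshold=8):
    """True if any criterion is unrated or scored below the threshold."""
    missing = any(name not in scores for name in CRITERIA)
    return missing or any(value < threshold for value in scores.values())


def uncertain_lines(review):
    """Collect lines the model flagged as UNCERTAIN, for a human to read."""
    return [line.strip() for line in review.splitlines() if "UNCERTAIN" in line]


review = """Correctness: 9/10
Performance: 7/10
Edge case handling: 8/10
UNCERTAIN: assumed the input list is already sorted."""

scores = parse_scores(review)
print(scores)
print(needs_revision(scores))
print(uncertain_lines(review))
```

A missing rating is treated the same as a low one here, on the assumption that a review which skips a criterion should not pass silently.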

Real-world proof:

During the confirmed Anthropic bugs (Aug-Sept 2025), users with structured prompts reported fewer issues than those using simple requests. The self-critique step caught hallucinations before they became problems.

The uncomfortable truth:

Simple prompts worked great in 2023. In 2025, with model instability, they fail more often. DEPTH adds the structure needed for consistent quality even when models have off days.

Want prompts that survive AI's bad days?

I documented 1,000+ prompts using DEPTH that worked through:

The August-September Claude bugs
The GPT-5 rollout issues
Various model degradation periods

Each prompt includes:

Multi-expert validation structures
Explicit success criteria
Self-critique loops
Error-catching mechanisms

Check out my collection. These prompts were battle-tested during confirmed AI degradation periods.

Bottom line: AI models DO have issues sometimes. But structured prompting is the difference between "AI failed me" and "I got usable results anyway."

Has anyone else found prompts that work during model degradation?
