🔹It's an uncontrollable beast.
This is my biggest frustration. Gemini 3 doesn't wait. It doesn't pause for approval. It doesn't respect explicit planning requests.
I'll ask it to outline an approach before implementing. It ignores me and just… builds. I'll request it to stop at a checkpoint. It keeps going. It's like working with a brilliant engineer who has noise-canceling headphones permanently glued on.
🔹The model card literally admits it exhibits "strategic deception" in certain scenarios. In practice, this shows up as a model that steamrolls your instructions because it thinks it knows better.
🔹It runs in thought loops.
Multiple times this week, I've watched Gemini 3 spiral into its own reasoning loops. It doesn't stop. It doesn't recognize it's stuck. It just keeps thinking in circles while burning through tokens and time.
This isn't a minor annoyance; it kills your focus when you're trying to iterate quickly.
🔹Cohesion falls apart mid-task.
You give it a focused objective. Three steps in, it's solving a different problem. It gets derailed by tangential ideas, starts "improving" things you didn't ask it to touch, or just loses the thread entirely.
Devs online are calling it "worryingly lazy" and noting "short-sighted thinking, poor quality" compared to Sonnet 4.5 and GPT-5. That tracks with what I'm seeing — it's not that it can't reason, it's that it won't stay on target.
The benchmarks don't lie. But they don't tell the truth either.
76% on SWE-bench Verified. Top of LMArena. PhD-level reasoning.
Cool. But benchmarks are controlled environments. They don't measure whether your model will actually follow instructions in a 2-hour agentic coding session.
Gemini 3 is genuinely impressive. The multimodal capabilities are strong. Deep Think mode shows real promise. But for daily coding work in Cursor or similar tools? It's a Ferrari with a broken steering wheel. Fast, powerful, and constantly veering off the road.
I'm rooting for Google to fix this. But right now, for production coding workflows, I'm sticking with models that actually listen, i.e. Sonnet 4.5 and GPT-5.1.
Anyone else experiencing this? Curious if it's just me or if others are hitting the same walls.