
Models tested: Gemini 3 Pro Preview, Claude Sonnet 4.5, Grok 4.1, Grok Code Fast 1, GPT-5.1-Codex, MiniMax M2, Gemini 2.5 Pro.
The Experiment
We gave all 7 models the same task: build an analytics dashboard for an AI code editor. We provided 4 sample metrics and chart data showing model usage distribution over 7 days, then told them to "Use your creativity to make this beautiful and functional."
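To make the setup concrete, here is a sketch of what the prompt data could look like: four headline metrics plus a 7-day usage series. The type names and all values below are illustrative placeholders, not the actual numbers from the experiment.

```typescript
// Hypothetical shape of the prompt data: 4 headline metrics plus
// a 7-day series of model usage. All values are placeholders.
interface Metric {
  label: string;
  value: string;
  change: string; // week-over-week delta
}

interface UsagePoint {
  day: string;
  completions: number; // code completions served that day
}

const metrics: Metric[] = [
  { label: "Completions Accepted", value: "12.4k", change: "+8%" },
  { label: "Active Users", value: "3,210", change: "+3%" },
  { label: "Avg. Latency", value: "420ms", change: "-5%" },
  { label: "Languages Used", value: "14", change: "+1" },
];

const usage: UsagePoint[] = [
  { day: "Mon", completions: 1800 },
  { day: "Tue", completions: 2100 },
  { day: "Wed", completions: 1950 },
  { day: "Thu", completions: 2400 },
  { day: "Fri", completions: 2250 },
  { day: "Sat", completions: 900 },
  { day: "Sun", completions: 1000 },
];

// Logs the counts: 4 metrics across a 7-day usage series.
console.log(metrics.length, usage.length);
```

Each model received the same data and the same open-ended instruction, so the differences below come entirely from how the models chose to present it.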
All models used Next.js 15, React 19, and Tailwind CSS v4. Same stack, same data, yet the results diverged sharply.
The Results
2 of the 7 models (a 29% failure rate) failed because their knowledge cutoffs predate Tailwind CSS v4: they generated outdated Tailwind v3 syntax, which produced completely unstyled dashboards. A third model (MiniMax M2) partially failed, with broken padding but working colors and charts.
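The failure mode is concrete. Tailwind v4 changed how the framework is wired into a project: a v3-era setup registers Tailwind via `@tailwind` directives in the global stylesheet (alongside a `tailwind.config.js`), while v4 replaces those directives with a single CSS import. A model trained before the v4 release emits the old directives, which v4 no longer processes, so every utility class silently resolves to nothing. A minimal illustration:

```css
/* Tailwind v3 entry point (what the failing models generated) */
@tailwind base;
@tailwind components;
@tailwind utilities;

/* Tailwind v4 entry point (what the project actually expects) */
@import "tailwindcss";
```

Because the broken output is still valid CSS, the build succeeds and the page renders: it just renders unstyled, which is exactly what we saw.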
Winner: Gemini 3 Pro Preview
Google's latest model layered useful context on top of the features we asked for. The standout was a "Recent System Events" table showing live activity such as Code Completion, Refactor Request, and Unit Test Gen events, each with the model that processed it, its latency, and a status indicator.
Gemini 3 also got creative with product branding, naming our dashboard "SynthCode v2.4.0" instead of something generic, and added a "systems operational" status indicator. Code efficiency: 285 lines total. Not the shortest, but every line serves a purpose.
Second Place: Claude Sonnet 4.5
Claude demonstrated restraint: it knew what to add and what to skip. It added a "Live" animated pulse indicator, three helpful insight cards (Peak Hours, Most Used Language, Weekly Growth), and a footer stats bar with relevant metrics such as Projects Active, Code Acceptance %, and Uptime %.
Code length: ~200 lines. Clean component structure, full-width charts.
Third Place: Grok 4.1
xAI's latest model proved that less is more, delivering a functional analytics dashboard in only 100 lines of code. No buzzwords, no irrelevant features, no overengineering. Just:
- 4 metric cards with icons
- Area chart (code generation over 7 days)
- Donut chart (model usage distribution)
- "Last updated: just now" timestamp
This is enough for an MVP version of a dashboard.
GPT-5.1-Codex Over-Engineered
OpenAI's GPT-5.1-Codex added the most features (341 lines), but most of them were irrelevant. It included things like "Trigger safe-mode deploy" buttons (this was an analytics dashboard, not a CI/CD panel) and invented metrics that weren't in our prompt data, such as a "Success Funnel" with made-up percentages.
The pattern: GPT-5.1-Codex copy-pasted concepts from ops/SRE/infrastructure dashboards without considering whether they fit this dashboard's purpose. It optimized for "sounding impressive" over "being accurate."
Key Takeaways
- More features ≠ better. GPT-5.1-Codex's 341 lines lost to Gemini 3's 285.
- Training recency matters. Gemini 2.5 Pro (8 months old) failed completely on Tailwind v4. Gemini 3 Pro Preview (released yesterday) won 1st place.
- Thoughtful additions > overengineering. Every feature should serve a purpose.
- Sometimes minimal is best. Grok 4.1's 100 lines show you don't need complexity to be effective.
Full breakdown with screenshots -> link.
