So I ran some tests, and here are the Gemini 3 Pro Preview benchmark results:
– First, two benchmarks you have already seen on this subreddit when we were discussing whether Polish is a better language for prompting: Logical Puzzles – English and Logical Puzzles – Polish. Gemini 3 Pro Preview scores 92% on the Polish puzzles, tied for first place with Grok 4. On the English puzzles the new Gemini model also takes first place, tied with Gemini-2.5-pro, with a perfect 100% score.
– Next, the AIME25 Mathematical Reasoning Benchmark. Gemini 3 Pro Preview once again ties for first place with Grok 4. Cherry on top: Gemini's latency is significantly lower than Grok's.
– Finally, a linguistic challenge: Semantic and Emotional Exceptions in Brazilian Portuguese. Here the model placed only sixth, behind glm-4.6, deepseek-chat, qwen3-235b-a22b-2507, llama-4-maverick, and grok-4.
Full results are in the comments! (They're not super easy to read since I can't attach a screenshot, so it's better to click the corresponding benchmark links.)
Let me know if there are specific benchmarks you'd like me to run Gemini 3 on, and which other models to compare it against.
P.S. looking at the leaderboard for Brazilian Portuguese I wonder if there is a correlation between geopolitics and model performance 🤔 A question for next week…
Links to benchmarks:
- Logical Puzzles – English: https://www.peerbench.ai/benchmarks/view/95
- Logical Puzzles – Polish: https://www.peerbench.ai/benchmarks/view/89
- AIME25 Mathematical Reasoning: https://www.peerbench.ai/benchmarks/view/100
- Semantic and Emotional Exceptions in Brazilian Portuguese: https://www.peerbench.ai/benchmarks/view/161