A few days ago I ran a quick, slightly lazy prompt: “For OpenAI, and Google Gemini… how much does it cost to run a Deep Research?”
What I wanted was a complete answer covering both API pricing and subscription prices/limits.
What I didn’t expect was how consistently Gemini returned outdated or incorrect information. I’ve re-prompted it several times and it still doesn’t reliably correct itself.
When I ask the same question of ChatGPT 5.1/5.2 or Sonnet 4.5, I get complete answers with citations to official sources. The big difference seems to be sourcing behaviour: ChatGPT/Sonnet will typically pull from primary documentation, whereas Gemini always cites blog/news-type sources (Medium, Wise, TechRound, ZDNET, etc.), even when official docs are available.
I’m surprised Google’s own model doesn’t consistently prioritise Google’s documentation (or the relevant primary sources) for pricing and product capability questions. I even added instructions telling Gemini to validate against official sources, but it still doesn’t reliably do it.
That said, there’s a flip side. Gemini is the only model that consistently gets one of my edge-case prompts right: “For the purposes of the Fire Safety Act, how tall would you estimate a four-storey block of flats to be (ground, first, second and third floor)?”
In the Fire Safety Act context, the relevant height is measured from ground level to the floor level of the top storey, not from the ground to the ceiling/roofline. ChatGPT and Sonnet often drift into calculating ground-to-ceiling height (Sonnet almost always, ChatGPT sometimes), even when they reference the correct definition. I did try it with Opus 4.5 once, and it failed as well.
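To show why the two readings diverge, here’s a rough back-of-the-envelope sketch. The ~3 m storey height is an assumption I’ve picked purely for illustration, not a figure from the Act:

```python
# Illustrative only: the storey height is an assumed nominal value, not taken from the Act.
STOREY_HEIGHT_M = 3.0   # assumed typical floor-to-floor height
storeys = 4             # ground, first, second, third

# Measurement the prompt is asking about: ground level to the FLOOR level of the top storey
height_to_top_storey_floor = (storeys - 1) * STOREY_HEIGHT_M   # ~9 m

# What the other models tend to compute instead: ground level to the ceiling/roofline
height_to_roofline = storeys * STOREY_HEIGHT_M                 # ~12 m

print(f"To top-storey floor level: ~{height_to_top_storey_floor:.0f} m")
print(f"To ceiling/roofline:       ~{height_to_roofline:.0f} m")
```

Under those assumed numbers the two readings differ by a full storey, which is exactly the drift I see in the answers.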
Has anyone else noticed Gemini doing a poor job of gathering information from official sources?