This applies to using the Gemini app either in the browser at gemini.google.com or through the mobile app.
I use Gemini for detailed thought leadership. Part of my prompt (which often exceeds 1,000 words) is to check all stats and claims at the end, verify that each URL is live, and ensure it is not simply a Google Search URL with a hallucinated source URL as the search term.
I have noticed that since last week, 2.5 Pro has seemed to work faster but hallucinates more and ignores the instruction to check every link. For clarity, I use bold text and Markdown headers to stress that links need to be checked.
It also no longer asks follow-up questions when something is missing (e.g. an attachment I thought I had uploaded was never included); instead, it jumps straight into the task on false assumptions.
With Gemini 3, the problem is even worse than it was with 2.5 Pro last week.
The problem doesn't occur when using Deep Research – which I tend to use only for initial background info or longer content (think >50 pages and >100 sources).
My hypothesis is that the model is overweighting past conversations (apparently "user_context") and underweighting the most recent set of instructions.
My personal remedy for now is to make my initial prompts even lengthier (say, over 2,000 words) and repeat the critical elements multiple times.
I've given feedback in the app.