Tbh at first I assumed there was a guardrail preventing web access and file uploads in the same context window/chat. However, after some frustrating and creative prompting, I eventually got it to *actually* use the tool (or it had so many attempts that it figured out how to generate those "source" citation buttons… nah, the info was actually specific to the source this time).
On top of that, the arrogance is so annoying. In the Chain of Thought (CoT), it blatantly second-guesses me, explicitly deciding to "simulate" a search or file read and hallucinating a bunch of obviously generic info.
I’m fairly capable when it comes to AI/LLMs, but is there something I’m doing wrong here? I occasionally have this problem with ChatGPT, but still… not to this extent. I can provide samples of my prompting if necessary.
What use is a model that hits such high benchmark numbers if it refuses to follow instructions, or if every second response is bullshit? I hate that I'm now one of those people whingeing about a model that is, in most other ways, very impressive, but this has been very annoying so far.
Thanks for any advice/help 🙂