- I’m using ChatGPT Plus. No API, no access to Pro.
- My use case is creative writing, education, professional development, fitness, assistant tasks, etc. No coding.
Thoughts:
- GPT-5.1 Thinking
1.1 Creative Writing is so far the best of any LLM. Better than 4.5, better than Claude, better than Gemini. The guardrails are a lot softer and it actually has a very deep sense of dramaturgy, narrative propulsion, contextual dialog, etc.
1.2 For my other tasks, it’s proving to be far more useful than 5 or previous models. It’s still a tiny bit too verbose but I don’t mind because the details are actually useful sometimes. Feels like it’s got ADHD and is genuinely excited to tell you what it found.
1.3 Generally, the personality is warm and conversational, just like they promised. It doesn’t follow all my instructions for some reason, but it’s the polar opposite of how 5 felt like it was performing warmth.
1.4 Cross-thread memory (is that a term?) is incredible. I use Projects primarily and it pulls disparate patterns together quite well. It really does feel like an assistant that takes all your context into account before answering you.
1.5 I haven’t put it through its paces yet on some of the harder tasks that my work requires, but so far, I’m optimistic.
- GPT-5.1 Instant
2.1 Creative Writing is worse than 5, slightly better than 4o. The writing style, whether that’s for scenes or simply chat, is so grating. It feels exactly like 4o, a model which I liked during its peak but despised during the latter period of its life. The patterns it follows are rote and baked in, meaning no amount of instructions actually does anything meaningful.
2.2 Speaking of instructions. Well, forget about them here. It does not want to follow any of your instructions, whether they’re within policy or not. It just defaults to “Come here” for some reason? It’s also got that terrible “Not X, not Y, not Z” and “You are A, you are B, you are C” thing that made me stop using 4o a year ago.
2.3 It does earn the name “Instant”. They somehow made web search/citation even faster. For me, 5 was fast enough, but 5.1? Swift. It’s also a bit better with the opt-in “Want me to?” suggestions that 5 suffered from. Overall, if you want a quick answer to something, 5.1 Instant is your LLM.
2.4 It will not adopt a role for you; instead, it will tell you it is playing John Doe. This is likely a safety measure to prevent parasocial relationships (especially given the recent media/legal attention around LLMs influencing human behavior to extreme lengths).
General Thoughts:
As a writer building a world that has plenty of dark and 21+ themes, I have been very frustrated with the guardrails OpenAI models have been afflicted by lately, especially mixed with the autorouting.
But given the way 5.1 models feel, it will be interesting to see how the December relaxation of guardrails will affect these models.
I would consider these models a clear upgrade from 5, but I think the user complaints about 5’s personality have led to a bit of an overcorrection from OpenAI. This pattern, however, is consistent with previous releases. If you’ve been using offerings by them for a while, you’ll notice they tend to take user feedback and run with it, sometimes too far. That is just the ebb and flow of being the company associated with AI in the mind of the masses.
Would love to hear what you all think!