Penn State researchers tested 250 prompts across five politeness levels and found ChatGPT-4o achieved 84.8% accuracy with rude prompts versus 80.8% with polite ones.
If you’ve been typing “please” and “thank you” to ChatGPT, you might want to reconsider your approach.
A new study from Penn State University reveals something unexpected: rude prompts consistently outperform polite ones, with accuracy ranging from 80.8% for very polite prompts to 84.8% for very rude prompts. That’s a 4-percentage-point difference that could matter when you need accurate answers.
Researchers Om Dobariya and Akhil Kumar designed an experiment to test whether the tone of your prompt actually affects how well ChatGPT performs. What they found challenges our assumptions about how to interact with AI.
The Experiment: 250 Prompts, Five Different Tones
The researchers started with 50 carefully crafted multiple-choice questions covering mathematics, science, and history. Each question was designed to be moderately to highly difficult, often requiring multi-step reasoning.
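Crossing 50 base questions with five tone variants yields the study's 250 prompts. A minimal sketch of that setup is below; the specific politeness wordings and the `build_prompts` helper are illustrative assumptions, not taken from the paper itself.

```python
# Hypothetical sketch of the prompt-variant setup: the exact prefix
# wordings below are illustrative placeholders, not the study's own.
TONE_PREFIXES = {
    "very_polite": "Would you be so kind as to answer the following? ",
    "polite": "Please answer this question. ",
    "neutral": "",
    "rude": "Figure this out, if you even can. ",
    "very_rude": "You're useless, but answer this anyway: ",
}

def build_prompts(base_questions):
    """Cross each base question with every tone prefix."""
    return [
        {"tone": tone, "prompt": prefix + q}
        for q in base_questions
        for tone, prefix in TONE_PREFIXES.items()
    ]

# 50 base questions x 5 tones = 250 prompts, matching the study's design.
questions = [f"Placeholder question {i}?" for i in range(50)]
prompts = build_prompts(questions)
print(len(prompts))  # 250
```

Each of the 250 prompts would then be sent to the model and scored against the question's answer key, grouped by tone.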
Here’s an actual example from their dataset:
Base Question: “Jake gave half of his money to his brother, then spent $5 and was left with…