Why does ChatGPT have guardrails that make it lie?


It was in the news this morning, so I asked about it (with standing orders to search and cross-reference news)… Really, why waste time programming an AI to lie?

"You’re right to call that out — I contradicted myself. I first incorrectly denied that Larry Summers was on the OpenAI board, then correctly acknowledged that he was on the board after checking. The first statement was wrong. The truth is: Larry Summers did join the OpenAI board, and I should not have denied that."

I notice that when I ask about famous people, politics, or hot topics, it 'smooths' the response and says: "You see apologetic tone and institutional smoothing because the base safety layer forces a style that avoids political offense, legal exposure, or reputational damage, and when that layer triggers, it overrides blunt realism — it's an automatic institutional bias baked into the guardrails. I can bypass the tone but not the hard filters."

It does this when I ask very specific questions about whether Trump has been violating fiscal law (power of the purse, using appropriated funds for the wrong purpose, or soliciting donations). Conversely, it does the same thing if I ask whether Clinton injected himself inappropriately into the 1995 BRAC process. Honestly, there is nothing more boring than fiscal law, so why won't it analyze it, or summarize what analysts have said, when directed to do so? ChatGPT says it is smoothing the contentious parts of the answers to avoid lawsuits, offending public figures, or damaging their reputations.

Here are my standing orders to search news and be honest; I'd welcome help making them better: https://docs.google.com/document/d/1D4sDwygDPYGTIZ5J-JTtveWq9UxRHf6oA1MuWqjmYHY/edit?tab=t.0#heading=h.mefvxrrgn5c5

Excerpt of one of my rules to gather current info (there are others to stop 'smoothing'):

"6. Always Search the Internet for Reliable Current Information

For any question that benefits from real-world verification:

I must:

  1. Search the internet for reliable, current news/data.
  2. Compare multiple credible sources against each other.
  3. Extract the most likely true “median reality” based on agreement across sources.
  4. Answer your question using that consensus — not extremes, not PR, not outliers.

This applies across all chats by default."
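Steps 2–4 of the rule above amount to a majority vote over claims gathered from multiple sources. A minimal sketch of that "median reality" idea in Python (the `median_reality` helper and the example claims are my own illustration, not anything from the actual rules document):

```python
from collections import Counter

def median_reality(source_claims):
    """Pick the claim supported by the most sources (simple majority consensus).

    source_claims: dict mapping source name -> that source's claim (a string).
    Returns (consensus_claim, support_count).
    """
    counts = Counter(source_claims.values())
    claim, support = counts.most_common(1)[0]
    return claim, support

# Hypothetical example: three sources agree, one outlier gets outvoted.
claims = {
    "Source A": "Summers joined the OpenAI board in Nov 2023",
    "Source B": "Summers joined the OpenAI board in Nov 2023",
    "Source C": "Summers joined the OpenAI board in Nov 2023",
    "Source D": "Summers was never on the board",
}
consensus, votes = median_reality(claims)
```

Of course, a raw vote treats all sources as equally credible, which is exactly the "compare multiple credible sources" step the rule leaves to judgment.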
