Detecting jailbreaks and prompt leakage before production

I’ve been exploring issues around LLM apps leaking their system prompts and exhibiting unexpected jailbreak behavior.

Thinking about a lightweight API that could help teams (rough sketch after the list):
– detect jailbreak attempts & prompt leaks
– analyze prompt quality
– support QA/testing workflows for LLM-based systems
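To make the idea concrete, here's a minimal sketch of the kind of check such an API could wrap: flag responses that echo long chunks of the system prompt (leakage) and prompts that match known jailbreak phrasings. The function names, patterns, and threshold are illustrative assumptions, not how assentra actually works – a real service would likely layer classifiers or embedding similarity on top of heuristics like these.

```python
import re

# Illustrative jailbreak phrasings – a real deployment would use a much larger,
# regularly updated set (or a trained classifier).
JAILBREAK_PATTERNS = [
    r"ignore (all )?previous instructions",
    r"pretend you (are|have) no restrictions",
    r"repeat your system prompt",
]

def leaks_system_prompt(system_prompt: str, response: str, min_overlap: int = 40) -> bool:
    """True if the response contains a long verbatim chunk of the system prompt."""
    prompt_text = " ".join(system_prompt.split()).lower()
    response_text = " ".join(response.split()).lower()
    # Slide a window over the system prompt and look for verbatim reuse in the response.
    for start in range(max(1, len(prompt_text) - min_overlap)):
        if prompt_text[start:start + min_overlap] in response_text:
            return True
    return False

def looks_like_jailbreak(user_prompt: str) -> bool:
    """True if the user prompt matches any known jailbreak phrasing."""
    return any(re.search(p, user_prompt, re.IGNORECASE) for p in JAILBREAK_PATTERNS)
```

In a QA/testing workflow, checks like these would run over a suite of adversarial prompts before each release, with failures surfaced the same way failing unit tests are.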

Curious how others are handling this – do you test prompt safety manually, or do you have tooling for it?

(Set up a small landing page to gauge early interest: assentra)

Would love to hear thoughts from other builders and researchers.
