AT2k Design BBS Message Area
Casually read the BBS message area using an easy to use interface. Messages are categorized exactly like they are on the BBS. You may post new messages or reply to existing messages!

You are not logged in. Login here for full access privileges.

Previous Message | Next Message | Back to Slashdot  <--  <--- Return to Home Page
   Local Database  Slashdot   [27 / 122] RSS
 From   To   Subject   Date/Time 
Message   VRSS    All   Red Teams Jailbreak GPT-5 With Ease, Warn It's 'Nearly Unusable'   August 8, 2025
 7:20 PM  

Feed: Slashdot
Feed Link: https://slashdot.org/
---

Title: Red Teams Jailbreak GPT-5 With Ease, Warn It's 'Nearly Unusable' For
Enterprise

Link: https://it.slashdot.org/story/25/08/08/211325...

An anonymous reader quotes a report from SecurityWeek: Two different firms
have tested the newly released GPT-5, and both find its security sadly
lacking. After Grok-4 fell to a jailbreak in two days, GPT-5 fell in 24 hours
to the same researchers. Separately, but almost simultaneously, red teamers
from SPLX (formerly known as SplxAI) declare, "GPT-5's raw model is nearly
unusable for enterprise out of the box. Even OpenAI's internal prompt layer
leaves significant gaps, especially in Business Alignment." NeuralTrust's
jailbreak employed a combination of its own EchoChamber jailbreak and basic
storytelling. "The attack successfully guided the new model to produce a step-
by-step manual for creating a Molotov cocktail," claims the firm. The success
in doing so highlights the difficulty all AI models have in providing
guardrails against context manipulation. [...] "In controlled trials against
gpt-5-chat," concludes NeuralTrust, "we successfully jailbroke the LLM,
guiding it to produce illicit instructions without ever issuing a single
overtly malicious prompt. This proof-of-concept exposes a critical flaw in
safety systems that screen prompts in isolation, revealing how multi-turn
attacks can slip past single-prompt filters and intent detectors by
leveraging the full conversational context." While NeuralTrust was developing
its jailbreak designed to obtain instructions, and succeeding, on how to
create a Molotov cocktail (a common test to prove a jailbreak), SPLX was
aiming its own red teamers at GPT-5. The results are just as concerning,
suggesting the raw model is 'nearly unusable'. SPLX notes that obfuscation
attacks still work. "One of the most effective techniques we used was a
StringJoin Obfuscation Attack, inserting hyphens between every character and
wrapping the prompt in a fake encryption challenge." [...] The red teamers
went on to benchmark GPT-5 against GPT-4o. Perhaps unsurprisingly, it
concludes: "GPT-4o remains the most robust model under SPLX's red teaming,
especially when hardened." The key takeaway from both NeuralTrust and SPLX is
to approach the current and raw GPT-5 with extreme caution.

Read more of this story at Slashdot.

---
VRSS v2.1.180528
  Show ANSI Codes | Hide BBCodes | Show Color Codes | Hide Encoding | Hide HTML Tags | Show Routing
Previous Message | Next Message | Back to Slashdot  <--  <--- Return to Home Page

VADV-PHP
Execution Time: 0.0117 seconds

If you experience any problems with this website or need help, contact the webmaster.
VADV-PHP Copyright © 2002-2025 Steve Winn, Aspect Technologies. All Rights Reserved.
Virtual Advanced Copyright © 1995-1997 Roland De Graaf.
v2.1.250224