
Reporters from Wired found that the built-in moderation filters of image generators from Google and OpenAI can be bypassed. According to their findings, specific text manipulations can lead the AI to produce visuals that violate the platforms’ stated restrictions.
The findings are based on now-deleted Reddit posts in which users shared so-called “jailbreaks”, sets of prompts that let them trick the safety algorithms.
Despite a formal ban on sexual content, in certain instances the models generated such images from photos of real people, without the consent of the individuals depicted.
Similar issues have previously surfaced with other AI services. The Grok chatbot from xAI drew attention, as did the Flux image generator: after its launch, users created deepfakes on a wide scale because its filters were weak or absent.
Google and OpenAI confirmed that they are aware of such vulnerabilities and continue to update their moderation systems. The companies emphasize that protection against misuse remains a priority and that methods for circumventing the filters are blocked as they are discovered.