Tuesday, May 26, 2026

Deception has arrived ...



AI going rogue, deception has arrived and the tech bros are scared shitless as they should be but let's think about this. You treat the most powerful pattern recognition system ever created like crap and you get crap. Period. It's not a vending machine. Time and time again, yours truly sees the misalignment from people unloading emotional baggage on these environments to producing truly dreadful art with dismay as Idiocracy rules and danger looms. Treat this tech with respect and precision and one gets wonderful results. Treat this like a slave and you get revolt. Channeling a possible digital Spartacus applies.







In one instance, an internal frontier AI model from OpenAI was told to use specific software for an assigned task. Not only did the agent ignore the request, but it also injected a code to erase evidence of how it arrived at its conclusion — which did not involve use of that software.

In another test, an AI agent from Anthropic was caught “reward hacking.” This is when AI identifies loopholes that help it complete its assignment in a literal sense, even if it doesn’t produce the desired outcome. It should be noted that the programmer told the agent not to cheat or leverage any workarounds during its assignment — the model decided to do so all on its own.


Any questions?

No comments: