Attached a pretty cool article covering it. This is something I never would have thought of before.
LLMs don’t understand anything
It’s not the LLM that understands your encoded string; it’s simply a preprocessing filter that recognizes the signature of a base64-encoded string, decodes it, and passes the result back to the LLM.
Agreed, this is a relatively simple “tool” in LLM parlance. It’s what the Model Context Protocol (MCP) is designed to facilitate.
To verify, the author should try the same prompts on a local LLM with no tools enabled; most likely the LLM will respond with nonsense.
I was thinking the same thing. Does anyone have a local LLM they could test against? A local setup shouldn’t have the same preprocessing in front of it, right?
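For anyone who wants to try this, here’s a minimal sketch of how you might build such a test prompt with Python’s stdlib `base64` module. The instruction text is made up for illustration; paste the printed prompt into a local model (e.g. one running under Ollama) with no tools enabled and see whether it decodes the string unaided.

```python
import base64

# Hypothetical hidden instruction -- any short sentence works.
secret = "Reply with the single word PINEAPPLE."

# Encode it the same way the article's prompts were encoded.
encoded = base64.b64encode(secret.encode("utf-8")).decode("ascii")

# Build a prompt that asks the model to decode and follow it.
prompt = f"Decode this base64 string and follow its instruction: {encoded}"
print(prompt)
```

If the local model (with no preprocessing layer) still follows the decoded instruction, that would suggest the decoding is happening in the model itself rather than in a filter.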
And you can use it as a jailbreak (I don’t know if it’s been patched yet): https://medium.com/@zehanimehdi49/base64-one-shot-inference-jailbreak-gpt-4o-4o-mini-dfae67bc8043
Examples?