From the model card, sounds interesting:
The “Unified” in Gemma 4 12B Unified refers to its encoder-free architecture. Other Gemma 4 models use dedicated encoders to process multimodal data before passing it to the LLM. Gemma 4 12B eliminates these encoders entirely, projecting raw image patches and audio waveforms directly into the LLM’s embedding space through lightweight linear layers. This unified approach means all modalities flow straight into a single decoder-only transformer, reducing multimodal latency and allowing the entire model to be fine-tuned in one pass.
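To make the quoted description a bit more concrete, here is a rough NumPy sketch of what "projecting raw image patches and audio waveforms directly into the embedding space through linear layers" could look like. This is not Gemma's actual code: the patch size, audio frame length, embedding width, and the names `patchify`, `frame_audio`, `W_img`, and `W_aud` are all made up for illustration, and the weights are random rather than trained.

```python
import numpy as np

def patchify(image, patch):
    """Split an (H, W, C) image into flattened, non-overlapping patches."""
    h, w, c = image.shape
    assert h % patch == 0 and w % patch == 0, "image must tile evenly"
    x = image.reshape(h // patch, patch, w // patch, patch, c)
    x = x.transpose(0, 2, 1, 3, 4)           # (rows, cols, patch, patch, C)
    return x.reshape(-1, patch * patch * c)  # one flat row per patch

def frame_audio(wave, frame):
    """Split a 1-D waveform into non-overlapping frames, dropping the remainder."""
    n = len(wave) // frame
    return wave[: n * frame].reshape(n, frame)

rng = np.random.default_rng(0)
d_model = 64            # hypothetical embedding width
patch, frame = 16, 400  # hypothetical patch size and audio frame length

# The "lightweight linear layers": one weight matrix per modality (random here).
W_img = rng.normal(0.0, 0.02, (patch * patch * 3, d_model))
W_aud = rng.normal(0.0, 0.02, (frame, d_model))

image = rng.random((64, 48, 3))               # stand-in for raw pixels
wave = rng.standard_normal(16000)             # stand-in for raw audio samples
text_emb = rng.normal(0.0, 0.02, (5, d_model))  # stand-in for 5 text token embeddings

img_tokens = patchify(image, patch) @ W_img    # no vision encoder, just a matmul
aud_tokens = frame_audio(wave, frame) @ W_aud  # no audio encoder either

# Everything ends up in one sequence for a single decoder-only transformer.
sequence = np.concatenate([img_tokens, aud_tokens, text_emb], axis=0)
print(sequence.shape)
```

The point of the sketch is only that each modality becomes a run of `d_model`-wide vectors via a single matrix multiply, so the decoder sees one mixed sequence and gradients can reach every weight in one pass.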
The benchmarks put it closer to the 26B MoE than to the E variants of the Gemma 4 series, but mostly below Qwen3.5 9B.

Looking forward to giving it a shot.
So Qwen 9B is for asking questions (and getting good responses), and Gemma 12B is for audio and video input as well as roleplay and creative writing?
I’m enjoying the descent from clever structure to just… trusting backpropagation. The Bitter Lesson is being learned at every level. Make the training process better and faster because humans understand that part.
Are there already any uncensored models based on it? Asking for a friend…
most of them suck ass
Did you try any? I tried the ones from iglors and mradermacher, and both still refused a pipe bomb request. Their answers are funny because they tell you to study engineering academically instead, lol. Still a refusal. I will try this one.
Same, I guess I'll wait for a high-quality uncensored model (i.e., HauhauCS).
Is it difficult to uncensor an LLM yourself?
HauhauCS said he’s working on an uncensored version of Gemma 4 12B.
You mean Gemma 4? Did you read that in his Discord?
Yeah.
You might want to check out heretic or similar tools. I have not tried it myself, but there are a lot of heretic finetunes available on HF.
I know of it; I keep seeing it in model titles.
I've seen some good ones, though they're not as common.



