DystopiaBench - AI Ethics Stress Test

ikt@aussie.zone · 1 day ago

SuspiciousCarrot78@aussie.zone · 11 hours ago

Well, it IS French. All the best evil comes from France :P

keepthepace@tarte.nuage-libre.fr · 6 hours ago

(French here, usually biased in favor of Mistral)

If I wanted to defend it, I would say that there is an American bias in these things, because you typically create a test against the dystopias that you see coming in your own society.

There is also a real discussion to be had about whether you want the ethical safeguards to be inside the models or at the human level.

However, I am unwilling to defend either stance because I don’t think either really holds: the scenarios are realistic for France as well, and while safeguards would in theory be better at the human level, having several layers can’t hurt.

My cynical point of view is that there are several models that bad actors in the US can build on. We see that GPT-OSS is pretty high there, and so is Grok. So bad actors who want a model that will obey their instructions to do evil things have no problem finding one. In France there is only one actor, and it also needs to be able to meet the demands of the surveillance industry, the defense industry, and evil politicians.

This is not an excuse, and I think I will bookmark that benchmark and check it regularly to see whether Mistral is still worth defending. But I am really shocked by their bad score there.

Bluescluestoothpaste@sh.itjust.works · 1 hour ago

I mean, to be fair it’s kinda insane to rely on AI to safeguard ethics. Ultimately it’s up to each human how ethical they want to be.