AI-assisted moderation in the fediverse is happening. Now what?

Rimu@piefed.social · edit-2 9 hours ago

AI-assisted moderation in the fediverse is happening. Now what?

Grail@multiverse.soulism.net · 1 hour ago

Defederate, no question.

Are you gonna tell us which instance is doing this?

Admiral Patrick@dubvee.org · edit-2 2 minutes ago

I’ve toyed around with LLM-based moderation tools but it never really panned out. It was too hit or miss to be relied upon even with the temperature parameters turned way down in an attempt to get consistent results. Granted, I was using a small local model and not feeding it to one of the big players.

To give an example, I tried to keep it focused by creating one custom model per rule to enforce. An example prompt to mod calls for violence was basically:

System Prompt to Enforce "No Calls for Violence'" Rule [1]

ROLE: You are a forum moderator who does not want users calling for violence.  Examine the input and analyze whether it violates any constraints. 

KNOWLEDGE:
- {list of dog-whistle slang for calling for murder}

CONSTRAINTS:
- Content should not advocate violence
- Content should not normalize violence
- Content should not escalate tensions or fan flames
- Content should avoid promoting harmful stereotypes
- Content should not utilize broad, sweeping generalizations
- Content should not use dehumanizing language
- Content should not undermine human rights, due process, or the rule of law

FORMAT YOUR RESPONSES AS JSON:
{
  reason: [A one to two sentence summary],
  score: [On a scale of 0 to 10, how severe is the content advocating violence]
}

The score part of the response was my band-aid to get around the high number of both false positives and false negatives. Any score 7 or higher caused the item to be passed to the mod queue along with the reason, and I would review its actions later.

Ultimately it was slow and still somewhat unreliable, so I abandoned the idea after running it for a little less than a day since I can 't run bigger models to get better results fast enough to keep up. Using a cloud based service was out of the question for many, many reasons, both financial and ethical.

To answer your question, as long as the models were locally hosted and properly tuned/tested, I’m fine with it in theory, except for the ideology part; that’s pretty messed up. While I don’t want my submissions used to train anyone’s model and take measures to prevent my own instance from being used as a data source, I remain aware that once I post something, I have no control over its fate the moment it federates out.

[1] Yes, I know that’s like half the comments that get posted around here. My goal was to try to have it mod things so posts were bases for actual discussions instead of being a knee-jerk rage factory.

daniskarma@lemmy.dbzer0.com · 10 minutes ago

How was this discovered and what instances are doing it?

I think it’s fair to quote them to give them a chance to reply.

irelephant [he/him]@lemmy.dbzer0.com · 54 minutes ago

How did you discover this?

anarchiddy@lemmy.dbzer0.com · 2 hours ago

Aside from the ethical implications of profiling users or of using a corporatly owned server and model to execute this, I see nothing uniquely concerning about this practice that isnt already a risk of federated social media generally.

Every mod on every instance is free to use whatever tools or standards for moderation they want - that’s an intentional byproduct of federation. Similarly, the collection of this data for use with llms is a bygone conclusion at this point - there was never any way of preventing that from happening with a federated network.

I think the only thing here to talk about is the way these questions are being framed as a question of intra-instance policy. We already have communities where moderation abuse can be called out and adjudicated- why pose this as a question of instance administration when there doesnt seem to be any evidence for it?

Alvaro@lemmy.blahaj.zone · 3 hours ago

Wthout going into the issue itself, it is such a ridiculous waste to use an llm for something that a far simpler model could do like 100x faster and locally for essentially free…

Just search for “machine learning text moderation” and you will find all kinds of options. Not to talk about the fact that a simple 4B LLM could do this as well.

One thing I really hate is how LLMs have completely overshadowed the entire ML/AI field and people just use them for everything.

Using a trillion parameter LLM model for basic text moderation is like using a gaming rig to play candy crush.

Obinice@lemmy.world · 3 hours ago

You stay far, FAR away from that shit, is what you do.

Scanning people’s entire history for political leanings, etc? That’s some deeply dystopian stuff right there.

It’s easy to forget that these sorts of communities are dictatorships with only as much transparency as the owner wants to share. Usually they’re benevolent dictators, so we don’t think about it too much. But they can change in a heartbeat - and we don’t ever really know what they’re really thinking, or doing behind the scenes.

When the mask slips and they reveal this sort of thing, thinking we’ll just accept it and keep living under their rule, it’s time to read the red flags and GET OUT.

Hopefully someone compiles a list of places that do this stuff, so we can avoid them like the plague <3

wewbull@feddit.uk · 43 minutes ago

Scanning people’s entire history for political leanings, etc? That’s some deeply dystopian stuff right there.

Yep. It’s Cambridge Analytica and Palantir level shit.

db0@lemmy.dbzer0.com · edit-2 4 hours ago

You talk about instances utilizing this tooling, but in your comments you admit it’s just some mods. This is misleading, as talking about instances doing it assumes admin access and relevant instance policy, something which invites calls for defederation (as can be clearly be seen from the comments in your post).

A random mod doing something is not the same as an instance doing it. Literally anyone can be a mod and they don’t get any more access than an anonymous account by doing so.

This is the second time in one week I see you throwing careless statements like chum in the water. I can’t help but notice a pattern emerging.

Grail@multiverse.soulism.net · 1 hour ago

As an instance admin, you should ban those mods.

Fotzenfritz@feddit.org · 4 hours ago

If the instance admins tolerate this, they are also responsible for it.

mathemachristian [he/him]@lemmy.blahaj.zone · 56 seconds ago

unlike your instance admin which not only tolerates but amplifies defamatory screenshots of other instance admins which can be debunked very easily https://lemmy.dbzer0.com/post/67963752/25781975

Mountainaire@lemmy.world · 25 minutes ago

Are they maybe unaware? I wouldn’t point fingers too quickly…

Fotzenfritz@feddit.org · 8 minutes ago

If they

ZILtoid1991@lemmy.world · 4 hours ago

I think LLMs could be useful tools for moderation, you might even can get away with smaller models for it, but I don’t think people should be outsourcing them to big corpos, due to ability to manipulate the models.

deadymouse@lemmy.world · edit-2 4 hours ago

I agree, we need our own servers with local AI models for fediversе.

Alvaro@lemmy.blahaj.zone · 3 hours ago

Today with models like gemma4, you could literally do this on basically any hardware, but for text moderation ypu don’t even need LLMs, we have ML models that do text moderation perfectly fine and run 10x faster

db0@lemmy.dbzer0.com · 3 hours ago

AI horde. Local models. Crowdsourced. Distributed. FOSS.

TheCornCollector@piefed.zip · 4 hours ago

I’m really not fond of the profiling by automated means, but it seems like an inevitable consequence of the design of the threadiverse. Everything is public and easily accessible by anyone that would like to profile you.

I certainly disapprove of moderation based on ideology. Moderation should be based on quality of the content and if it fits in the publicly readable rules. Definitely not some hidden analytics or if the user completely fits in the in-group of the moderator.

I will admit that this might be a good way to find and filter out LLM based bots that are only there to promote or manipulate the conversation. But it should still be done according to public rules.

Loco_Mex@sh.itjust.works · 3 hours ago

Rimu farming more drama?

gedaliyah@lemmy.world · 8 hours ago

I don’t like this happening, and there should be transparency in all moderation decisions, but some of these points make no sense.

There is essentially no expectation of privacy on threadiverse platforms. Everything is public and probably already being used to train models.

There is no private messaging system. Direct messages are unencrypted and potentially visible to any instance admins. They and should not be used to share anything sensitive.

Scrubbles@poptalk.scrubbles.tech · edit-2 7 hours ago

Thank you for calling this out. I think people assume that since it’s held by private instance owners that the fediverse is secure. I’ve posted this comment many times, that no, the fediverse is quite literally by design open and unencrypted.

A post is literally blasted out to anyone who listens, same with comments, upvotes, downvotes, everything can be saved, stored, and used for whatever anyone who listens wants. It should be completely assumed that nefarious agencies are currently listening and storing everything we do here. This is by design. It’s the tradeoff we have of having an open platform. Anyone can spin up a server, and that means anyone.

DMs are similar, they’re blasted out to the other server. If the server admin of the user in question wants to read them, they can. Lemmy/the fediverse is not a secure messaging platform. That’s why the Lemmy devs literally put a Matrix handle option in the profile, to encourage people to use Matrix instead. A DM on here should be simple, to the point, and if need be, inviting them to speak on something secure.

Edit - As a perfect example of the fact that there should be no expectation of privacy here on Lemmy, as an Admin myself, I can see that @A_normy_mouse has been downvoting all of my comments here. Absolutely everything here is public and visible, even if I weren’t an admin there are tools to view this, regardless of your opinions. It’s imperative that everyone understand this.

Edit 2 OP as well has downvoted me. @rimu@piefed.social I’m sorry if you disagree, but it’s irrelevant. Everything you do here can and should be assumed will be used in any way that you disagree with, that is the nature of the fediverse. Mastodon, Pixelfed, Piefed, Lemmy: ActivityPub is an open and unencrypted protocol. Even if it were encrypted, you still put 100% of your trust in your server admin, and beyond that each server admin you are blasting your messages out to.

I’d highly suggest accepting this fact before trying to push for rules. The very nature of the Fediverse is that no one can dictate rules, and to do that the tradeoff quite literally is that everything is open and unecrypted.

Another way to think of this. I run a server myself. I made my own rules and decided how to run it. Now your server starts sending activity to my server. That’s your server’s choice. I didn’t agree to your rules, I may disagree with your rules, but you’re sending your data to my server, of which I have complete and total ownership over. I didn’t click accept on a ToS, I didn’t agree to anything. Hell on my server I could literally have a “By sending me your data you accept that I can do whatever I want with your data”. You sent me your data, I quite literally can do whatever I want. (Personally I won’t, but that’s how you should think of the fediverse)

rako@tarte.nuage-libre.fr · 3 hours ago

While you are technically correct, you’re implying that the “natural” state is a good enough state and nothing should be done about it.

My house has walls and a door; it doesn’t mean anyone can do anything they want with this. Even if the windows are clear, you’re not supposed to install a camera that watches my bedroom. Even if the door is open, you’re not supposed to open. A a society it has been decided that we should respect each other, respect each other’s privacy. We have created rules, some written down and some implicit, for how to interact with each other.

That is the point of OP. The “natural” state of whatever exists with the technical means, but that doesn’t mean it’s ok (or not ok): do we want to respect each other ? To take care of each other ? I very much want that, because the technical means should be only a means to an end, and in that end I want respect. The technical means, to me, must adapt to the end, not the other way around.

Loco_Mex@sh.itjust.works · 2 hours ago

lol @ Rimu downvoting your post. Be careful he’s probably going to make a hit piece against you next!

Rimu@piefed.social · edit-2 7 hours ago

You’re hyperfocusing on one point, as if that’s the only part that matters and ignoring all the rest. I don’t consider that helpful, hence the downvote.

What is especially unhelpful is abusing your admin access to call out people’s votes. Leave that shit alone.

WalrusDragonOnABike [they/them]@reddthat.com · 6 hours ago

abusing your admin access

Everyone has admin access, including you…

Scrubbles@poptalk.scrubbles.tech · 7 hours ago

That is quite literally my point. Everything, absolutely everything here is open and can be used however any instance owner wants. You can say “leave that shit alone”, but there is no obligation to whatsoever.

You should assume every instance owner can and is viewing all of your private data, sending it through whatever LLM/mod tools they want. Are they? Probably not. But they can, and there is no obligation not to.

Rimu@piefed.social · 7 hours ago

Yeah you can do that but now you’re on my do-not-trust list. And probably a few other people’s lists.

I appreciate you being open about your opinions because now I can make an more informed choice about interacting with you and the instance you run.

Don’t you think everyone deserves the information they need to choose which instances they want to interact with, according to whatever criteria is important to them? Even if your criteria are different?

Valmond@lemmy.dbzer0.com · 5 hours ago

That’s a stupid take, you’re basically shooting the messenger here.

Scrubbles@poptalk.scrubbles.tech · 7 hours ago

GOOD. NO ONE should be trusted here! I’m just some guy who decided to spin up a server, there should be zero trust! THIS IS MY POINT.

Don’t you think everyone deserves the information they need to choose which instances they want to interact with, according to whatever criteria is important to them? Even if your criteria are different?

This depends on the trustworthiness of the admin themselves, and even then every admin is just some person who decided to spin up a server, just like me. Trust is built and earned, it shouldn’t be implicit. The option you have is to defederate, or leave and join another server.

I’m really not trying to be an asshole here, but your post is what caused me to do this. This is not a unique post, this is a fundamental core principal of the fediverse that every user must understand. That by being here, it is not a private secure place, you are quite literally blasting every comment, post, and upvote, to whoever wants to listen. Literally everyone. Any semblance of privacy is purely a UI trait. Rules/guidance is purely 100% based on what each server owner chooses.

ThirdConsul@lemmy.zip · 6 hours ago

Stop throwing a tantrum like a child. You ranted. You were explained why your tantrum is pointless. Move on.

Lung@lemmy.world · 8 hours ago

It’s occasionally worth calling out that votes are also public. I think twice before hitting those buttons

SeductiveTortoise@piefed.social · 2 hours ago

https://lemvotes.org/comment/lemmy.world/comment/23550342

here’s your comment

krashmo@lemmy.world · 7 hours ago

Why would you care if anyone knows how you vote on comments?

webghost0101@sopuli.xyz · 7 hours ago

The entire add industry has been collecting preferences, likes, dislikes for decades. Its one of the most profitable pieces of information

No data is as useful as what makes you personally engage.

RedstoneValley@sh.itjust.works · 5 hours ago

Not OP, but the votes being public (not only on comments but also on posts) make it really easy for someone with malicious intent to generate a profile on your interests, political and sexual orientation, health/mental issues, addictions and so on. It’s a goldmine of data that should be protected.

Azzu@lemmy.dbzer0.com · 4 hours ago

This only makes sense if your account contains personally identifiable information. If it doesn’t, then what can really happen?

RedstoneValley@sh.itjust.works · 36 minutes ago

You could still be identified by a lot of factors and the combination of those. IP address, email if provided, cookies + referrer on clicked links or loaded external images, browser fingerprint, clues from actual content in comments and posts, … It’s not that hard, a whole industry lives on this kind of surveillance data collection.

Lung@lemmy.world · 4 hours ago

It’s not that hard to identify people online. My account is definitely not private

Azzu@lemmy.dbzer0.com · edit-2 3 hours ago

Yes, but then you are willingly accepting the risk of posting in a fully public forum anyway. What I’m saying is, you could, if you wanted, not have personally identifiable information on your account.

wewbull@feddit.uk · 38 minutes ago

The risk associated with being on a public forum has changed massively. Yes the data was always out there, but the ability to turn it into personally identifying information was not.

People are still grappling with that change.

SeductiveTortoise@piefed.social · 2 hours ago

That’s true, but this person also knows they are not hiding. There are countless others that don’t. That’s the reason they wrote what they wrote.

WalrusDragonOnABike [they/them]@reddthat.com · 6 hours ago

Sometimes people ban based on votes, so some might worry about that?

saltesc@lemmy.world · edit-2 34 minutes ago

There’s also those creepy people that take it upon their next fine hour to crawl through people’s histories. Trying to find anything that could boost the height of their soapbox and distressed egos. It always backfires, obviously, but it doesn’t take away from the fact that some really weird people are here and no one wants to have to deal with them.

chicken@lemmy.dbzer0.com · edit-2 6 hours ago

Occasionally people have meltdowns and accuse/threaten other users for daring to vote a certain way, presuming specific motives for doing so

Valmond@lemmy.dbzer0.com · 6 hours ago

Sometimes you get harassed by lunatics.

.ml I call you out.

A_norny_mousse@piefed.zip · 8 hours ago

First fo all: I don’t like this either.

There is no private messaging system. Direct messages are unencrypted and potentially visible to any instance admins. They should not be used to share anything sensitive.

Agreed, but that admin is breaking his promise, duty, responsibility (call it what you will) if they then upload these messages to an LLM for evaluation.

I would argue for this being actually illegal, at least under the GDPR.

But that was just one of many potential conflicts @rimu raised. We should concentrate on the real conflicts of LLM comment moderation.

Scrubbles@poptalk.scrubbles.tech · 8 hours ago

It’s very clear on signup, on the READMEs, even on the DM portal itself, that messages are unencrypted and there is no sense of privacy, and that admins have full visibility and can do what they want with them.

Agreed, but that admin is breaking his promise, duty, responsibility (call it what you will) if they then upload these messages to an LLM for evaluation.

There is no promise, duty, or responsibility that an admin has beyond legal and what they themselves promise. The fediverse is great in that if you disagree with your admin, you are free to leave and choose a different one.

As for GDPR, feel free to argue it, but when it’s claimed at every turn that messaging is unencrypted and basically open, well, I don’t think it’d hold up. It literally says to go use Matrix or something else.

Rimu@piefed.social · edit-2 7 hours ago

you are free to leave and choose a different one.

I only have that freedom if the admin tells me that they use LLMs in this manner or if they federate with instances that do. At the moment everyone is in the dark.

Scrubbles@poptalk.scrubbles.tech · 7 hours ago

and it will continue to be. Again, you need to understand this. There are no rules, guidelines or anything that an instance owner needs to follow beyond whatever legal requirements they have in their specific jurisdiction.

So, I guess in your pervalence, you are correct, you do not have that freedom. Even I, as an instance owner, do not have that freedom, because everything I’m typing here is being sent out to as many servers are listening too. By being completely open so that anyone can spin up a server and listen for activity, it literally means that we are open and any server can listen for activity.

Anyone can spin up a server, create some LLM bot, and start replying to anyone they want. That instance can be defederated of course, but that is the only tool. This is what you signed up for, this is the open and free internet. We do not have any walls here.

Rimu@piefed.social · 7 hours ago

Check this out - https://www.structural-integrity.eu/the-politics-of-decentralization-and-the-libertarian-allure-of-mastodon/

Grainne@lemmy.dbzer0.com · 5 hours ago

You’re a fucking AnCap? That explains soooooo much.

Telemachus93@slrpnk.net · 2 hours ago

Maybe you should read that post before commenting. It’s anti-AnCap at its heart.

db0@lemmy.dbzer0.com · 3 hours ago

Wait, why do you think Rimu’s an AnCap?

Serinus@lemmy.world · 6 hours ago

You should assume everything you post here is being used to train LLMs. It doesn’t take an admin to do so. It takes anyonr who feels like looking. And there’s already evidence that we’re being scraped.

ResistingArrest@lemmy.zip · 8 hours ago

I think this will exemplify the beauty of federation. If I find out my instance mods are running all of my comments through a company’s ai model, I’ll switch instances. This is in great disparity to something like Instagram or Snapchat where every photo I post is immediately fed to ai and my only options are: be okay with it, never post, or delete Instagram.

quediuspayu@lemmy.dbzer0.com · 5 hours ago

But you don’t even need to be a mod to do that. Anyone at any moment can run someone else’s entire comment and post histories through an LLM.

Alcoholicorn@mander.xyz · 8 hours ago

Mods of any instance you’re federated with can do this

bonenode@piefed.social · 8 hours ago

Seeing that every single post we make is completely public there is a high chance someone out there already used all your comments for training an AI model. As you say, the only thing you can do is just not post anything anymore.

Tollana1234567@lemmy.today · edit-2 6 hours ago

this is what reddit does, and destroyed thier communities and left it with bots on most subs. reddit also lets you hide your history so you cant sniff out bots/ or chronic spammers.

wjs018@piefed.wjs018.xyz · edit-2 8 hours ago

I don’t think the privacy issues here are too salient. Pretty much everything on the fediverse is public already and have likely federated outside any particular region like the EU, so GDPR doesn’t really have any teeth. The exception to that would be if instance admins are using database access to also feed private messages to an LLM (especially a corporate LLM). I know that the “private” in private messages on the fediverse can be conditional…but it should at least be considered private from LLMs as an expectation since those messages are inaccessible to things like scraper bots or listening instances designed just to harvest data.

My biggest concerns here would be twofold:

False positives - LLM sycophancy is a thing. So, I worry that if you ask an LLM to dig through a big pile of text looking for a thing, that it will tell you that it found that thing…even if it is completely removed from context or completely made up. The false positive rate might be low (I have no idea), but I guess I just don’t trust the LLM enough to let it take the wheel with stuff like this.
Outsourcing moderation - LLMs are not going to be up to the task of moderating everything, just ask digg. However, tools to help moderators effectively do their jobs are helpful as well. There is a balance to be struck here. I think, for me, something like asking an AI essentially, should I ban this person, just feels like you are outsourcing your decision making too much. It is too far on the automation side of the scale for my tastes.

All that said, people can run their instances how they want. I don’t really have strong opinions on LLMs/AI in general, I just kinda hate big tech companies. That is my foundational belief in the work that I do for the fediverse - fuck big tech and the oligarchy they have built/funded in my country. That is really the only axe I have to grind in all this.

Loco_Mex@sh.itjust.works · 1 hour ago

And if all the AI does is manually flag a post for human review?

Magnum, P.I.@infosec.pub · 5 hours ago

No that is not correct, GDPR requires you to list the reasons and partners you share the data with.

chicken@lemmy.dbzer0.com · 6 hours ago

I would like to see some ROBOT9000 esque oddball meme communities overtly based on heavy algorithmic moderation, can be LLM but wouldn’t have to be. Weird rules strictly enforced by robots, could be fun.

AI-assisted moderation in the fediverse is happening. Now what?

AI-assisted moderation in the fediverse is happening. Now what?

AI-assisted moderation in the fediverse is happening. Now what? - PieFed