AI’s become so invasively popular and I’ve seen more evidence of its ineffectiveness than otherwise, but what I dislike most about it is that many run on datasets of stolen data for the sake of profitability à la OpenAI and Deepseek
https://mashable.com/article/openai-chatgpt-class-action-lawsuit https://petapixel.com/2025/01/30/openai-claims-deepseek-took-all-of-its-data-without-consent/
Are there any AI services that run on ethically obtained datasets, like stuff people explicitly consented to submitting (not as some side clause of a T&C), data bought by properly compensating the data’s original owners, or datasets contributed by the service providers themselves?
Getty Images has an image generator trained exclusively on licensed images. I’m not aware of any text generators that do the same.
Oh, interesting! I’ll also take a look at that one