• 0 Posts
  • 41 Comments
Joined 2 years ago
Cake day: June 25th, 2023

  • Apertus was developed with due consideration to Swiss data protection laws, Swiss copyright laws, and the transparency obligations under the EU AI Act. Particular attention has been paid to data integrity and ethical standards: the training corpus builds only on data which is publicly available. It is filtered to respect machine-readable opt-out requests from websites, even retroactively, and to remove personal data, and other undesired content before training begins.

    We probably won’t get anything better, but it sounds like it’s still being trained on scraped data unless you explicitly opt out, including anything that gets mirrored by third parties that don’t opt out. Also, they can remove data from the training material retroactively, but presumably they won’t be retraining the model from scratch. That means the removed data will still be reflected in the weights, and the official weights will still have a potential advantage over models trained later on the published training data.

    From the license:

    SNAI will regularly provide a file with hash values for download which you can apply as an output filter to your use of our Apertus LLM. The file reflects data protection deletion requests which have been addressed to SNAI as the developer of the Apertus LLM. It allows you to remove Personal Data contained in the model output.

    Oof, so they’re basically passing on data protection deletion requests to the users and telling them all to respectfully account for them.

    They also claim “open data”, but I’m having trouble finding the actual training data, only the “Training data reconstruction scripts”…
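For what it's worth, here is a rough sketch of how the hash-based output filter from the license excerpt might look on the user's side. The excerpt doesn't say what exactly gets hashed or how the file is laid out, so everything here is an assumption: SHA-256 over lowercased phrases, matching on whitespace-separated words, and the names `sha256_hex` and `filter_output` are made up for illustration.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Hash a phrase after simple normalisation (assumed, not from the license)."""
    return hashlib.sha256(text.strip().lower().encode("utf-8")).hexdigest()


def filter_output(output: str, blocked_hashes: set[str], max_words: int = 3) -> str:
    """Replace any run of up to max_words words whose hash is on the blocklist."""
    words = output.split()
    result = []
    i = 0
    while i < len(words):
        # Try the longest candidate phrase first, then shorter ones.
        for n in range(min(max_words, len(words) - i), 0, -1):
            candidate = " ".join(words[i:i + n])
            if sha256_hex(candidate) in blocked_hashes:
                result.append("[removed]")
                i += n
                break
        else:
            # No phrase starting at this word is blocked, keep the word.
            result.append(words[i])
            i += 1
    return " ".join(result)


# Hypothetical blocklist entry standing in for a line of the downloaded file.
blocked = {sha256_hex("Jane Doe")}
filtered = filter_output("Contact Jane Doe today", blocked)
```

Even in this toy version the weak spot is obvious: the filter only catches strings that hash to exactly the same value, so punctuation or a different spelling slips through, and the data stays in the weights either way.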




  • I don’t know about making fun of a dialect, but it’s not quite utter nonsense: “oven” sounds like “of in”, so it can be interpreted to mean that it shouldn’t be called an oven, because the food is cold when you put it in, and you only eat it after taking it out, when it’s hot.

    The sentence structure is so absurdly wrong it makes me wonder if somebody was genuinely trying to make a pun and ended up with that, or if it was intentionally butchered.






  • Literally the last two RSS items right now are about how splitting packages will require intervention for some users (plasma and Linux firmware).

    Maybe a nitpick, but the linux-firmware situation is different: it’s not about needing to install extra packages (they turned the existing package into a meta package, or whatever it’s called), but about the split coinciding with some changes that can break the upgrade process and require you to force-uninstall a package before proceeding.

    But yeah, good point about plasma, the only differences I can even think of are that plasma is probably more popular, and definitely more important to have working.







  • Overproduce to cover everybody’s needs, and if you want to use that overproduction to cover somebody else’s problems, make that the new target and produce over it to keep a safety margin. Otherwise you’re just going to hide the problem and run into trouble when production dips.

    Not saying this is the right approach, but this is the idea I’m getting from the thread. I feel like it might not work with the economics of supply and demand combined with capitalistic greed, but if a margin exists as safety, allocating it removes that safety.




  • I think the trick might be that nothing is stopping you from using more than one 32-bit integer to represent an address, and the kernel maps memory for processes in the first place. So as long as each process individually can work within a 32-bit address space, the kernel can allocate the extra memory across different processes.

    I do suppose that on some level the architecture, as in the CPU and/or motherboard, needs to support retrieving memory using more than 32 bits of physical address, which is also what somebody else replied, and that seems to have been available since 1999 on both AMD and Intel.
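To make the idea a bit more concrete, here is a toy model of it, not how any real kernel does it. It assumes 4 KiB pages, 32-bit virtual addresses and 36-bit physical addresses; the `Process` class and its single flat `page_table` dict are invented for illustration (real hardware uses multi-level page tables).

```python
PAGE_SIZE = 4096   # assumed page size in bytes
VIRT_BITS = 32     # each process only ever sees 32-bit addresses
PHYS_BITS = 36     # assumed width of physical addresses


class Process:
    def __init__(self):
        # virtual page number -> physical frame number, filled in by the "kernel"
        self.page_table = {}

    def translate(self, vaddr: int) -> int:
        """Turn a 32-bit virtual address into a (possibly wider) physical one."""
        if not 0 <= vaddr < 2**VIRT_BITS:
            raise ValueError("virtual address does not fit in 32 bits")
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        frame = self.page_table[vpn]
        paddr = frame * PAGE_SIZE + offset
        if paddr >= 2**PHYS_BITS:
            raise ValueError("frame lies outside physical memory")
        return paddr


a, b = Process(), Process()
a.page_table[1] = 0x00010    # frame below the 4 GiB mark
b.page_table[1] = 0x100000   # frame exactly at the 4 GiB mark

low = a.translate(0x1234)    # same virtual address in both processes...
high = b.translate(0x1234)   # ...but different physical memory behind it
```

Both processes use the same 32-bit virtual address, yet the second one ends up in memory that a 32-bit physical address couldn't reach. The process never notices, since only the page table entries need the extra bits.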



  • Doesn’t change the voting situation. Since your votes need to be seen by other instances, Lemmy needs a mechanism for federating votes. Since instances are untrusted, there needs to be some way of preventing manipulation. Thus, AFAIK, Lemmy simply shares your votes across instances, letting each one tally them up. As a side effect, any server admin of an instance you can interact with can also get a list of all your votes.
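A minimal sketch of what that means in practice, with the caveat that this is not Lemmy's actual code or data model. The `Instance` class, `federate_vote`, and the account names are all hypothetical; the only point is that if every instance receives individual votes and tallies them itself, every instance also holds a per-user vote list.

```python
class Instance:
    def __init__(self, name: str):
        self.name = name
        # (actor, post_id) -> +1 or -1; one vote per actor per post
        self.votes = {}

    def receive_vote(self, actor: str, post_id: int, value: int) -> None:
        if value not in (1, -1):
            raise ValueError("vote must be +1 or -1")
        # Storing by (actor, post) is what stops one account voting twice.
        self.votes[(actor, post_id)] = value

    def score(self, post_id: int) -> int:
        """Each instance tallies the score on its own."""
        return sum(v for (_, p), v in self.votes.items() if p == post_id)

    def votes_by(self, actor: str) -> dict:
        """The side effect: any admin can list everything one actor voted on."""
        return {p: v for (a, p), v in self.votes.items() if a == actor}


def federate_vote(instances, actor: str, post_id: int, value: int) -> None:
    # Stand-in for federation: deliver the same vote to every instance.
    for inst in instances:
        inst.receive_vote(actor, post_id, value)


home, remote = Instance("home.example"), Instance("remote.example")
federate_vote([home, remote], "alice@home.example", 1, 1)
federate_vote([home, remote], "bob@remote.example", 1, -1)
federate_vote([home, remote], "alice@home.example", 1, 1)  # duplicate, ignored

remote_score = remote.score(1)
alice_votes = remote.votes_by("alice@home.example")
```

Deduplicating by account is what makes the tally hard to inflate, and it's the same thing that makes the votes attributable on every server that receives them.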