• Kissaki@programming.dev
    5 hours ago

    R1dacted: Investigating Local Censorship in DeepSeek’s R1 Language Model

    Quoting from the abstract:

    While existing LLMs often implement safeguards to avoid generating harmful or offensive outputs, R1 represents a notable shift—exhibiting censorship-like behavior on politically charged queries. […]

    Our findings reveal possible additional censorship integration likely shaped by design choices during training or alignment, raising concerns about transparency, bias, and governance in language model deployment.