• Kissaki@programming.dev
      11 hours ago

      R1dacted: Investigating Local Censorship in DeepSeek’s R1 Language Model

      Quoting from the abstract:

      While existing LLMs often implement safeguards to avoid generating harmful or offensive outputs, R1 represents a notable shift—exhibiting censorship-like behavior on politically charged queries. […]

      Our findings reveal possible additional censorship integration likely shaped by design choices during training or alignment, raising concerns about transparency, bias, and governance in language model deployment.