Recent DeepSeek, Qwen, and GLM models have posted impressive benchmark results. Do you use them through their own chatbots? Do you have any concerns about what happens to the data you put in there? If so, what do you do about it?

I am not trying to start a flame war around the China subject. It just so happens that these models are developed in China. My concerns with using frontends that are also developed in China stem from:

  • A pattern of Chinese apps in the past being found to have minimal security
  • I don’t think any of the three chatbots listed above let you opt out of having your prompts used for model training

I am also not claiming that non-China-based chatbots are free of privacy concerns, or that simply opting out of training gets you much on the privacy front.

  • MTK@lemmy.world · 2 days ago

    Generally, the VRAM needed is slightly larger than the model’s file size, so file size is an easy way to estimate VRAM requirements.
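
    A minimal sketch of that rule of thumb, assuming the weights dominate VRAM use; the ~20% headroom factor is my own illustrative assumption, not a figure from this thread:

    ```python
    # Back-of-envelope VRAM estimate: the weights dominate, so a model's
    # file size is a good first approximation of the memory it needs.
    def estimate_weight_gb(params_billion: float, bits_per_weight: int) -> float:
        """Approximate size of the quantized weights in GB."""
        return params_billion * 1e9 * bits_per_weight / 8 / 1e9

    # Example: a 32B-parameter model at 4-bit quantization
    weights = estimate_weight_gb(32, 4)  # ~16 GB on disk and in VRAM
    budget = weights * 1.2               # assumed ~20% headroom for runtime overhead
    print(f"weights ~ {weights:.1f} GB, plan for ~ {budget:.1f} GB VRAM")
    ```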

      • xodoh74984@lemmy.world · 16 hours ago

        Sorry for the slow reply, but I’ll piggyback on this thread to say that I tend to target models a little bit smaller than my total VRAM to leave room for a larger context window – without any offloading to RAM.

        As an example, with 24 GB of VRAM (Nvidia 4090) I can typically run a 32B-parameter model with 4-bit quantization and 40,000 tokens of context entirely on the GPU at around 40 tokens/sec.
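
        A rough way to sanity-check those numbers; the layer/head/dimension values below are illustrative GQA-style assumptions for a ~32B model, not specs quoted from this thread:

        ```python
        # Rough KV-cache size for a transformer using grouped-query attention (GQA).
        def kv_cache_gb(layers: int, kv_heads: int, head_dim: int,
                        context_len: int, bytes_per_elem: int) -> float:
            # 2x for keys and values, one cache entry per layer per token
            return 2 * layers * kv_heads * head_dim * context_len * bytes_per_elem / 1e9

        weights_gb = 32e9 * 4 / 8 / 1e9                # 32B params at 4-bit ~ 16 GB
        kv_fp16 = kv_cache_gb(64, 8, 128, 40_000, 2)   # ~10.5 GB at 16-bit
        kv_q8 = kv_cache_gb(64, 8, 128, 40_000, 1)     # ~5.2 GB with an 8-bit KV cache
        # ~16 GB of weights plus ~5 GB of 8-bit KV cache fits in 24 GB of VRAM
        print(f"weights ~ {weights_gb:.0f} GB, KV fp16 ~ {kv_fp16:.1f} GB, "
              f"KV 8-bit ~ {kv_q8:.1f} GB")
        ```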