Recent DeepSeek, Qwen, and GLM models have posted impressive benchmark results. Do you use them through their own chatbots? Do you have any concerns about what happens to the data you put in there? If so, what do you do about it?
I am not trying to start a flame war around the China subject; it just so happens that these models are developed in China. My concerns with using the frontends, which are also developed in China, stem from:
- A pattern of many Chinese apps having been found, in the past, to have minimal security
- The fact that, as far as I can tell, none of the three chatbots listed above let you opt out of having your prompts used for model training
I am also not claiming that non-China-based chatbots don’t have privacy concerns, or that simply opting out of training gets you much on the privacy front.
I believe the full-size DeepSeek-R1 requires about 1,200 GB of VRAM, but there are many configurations that need much less: quantization, exploiting the MoE structure, and other tricks. I don't have much experience with MoE, but I find that quantization tends to degrade quality noticeably, at least with models from Mistral.
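For rough intuition, weight memory scales linearly with bits per parameter. Below is a minimal back-of-the-envelope sketch, assuming DeepSeek-R1's published 671B total parameter count and counting weights only (no KV cache, activations, or framework overhead):

```python
def weight_vram_gb(num_params: float, bits_per_param: float) -> float:
    """Approximate GB needed just to hold the weights at a given precision."""
    return num_params * bits_per_param / 8 / 1e9

PARAMS = 671e9  # DeepSeek-R1 total parameters (MoE; only ~37B active per token)

for label, bits in [("FP16", 16), ("FP8", 8), ("4-bit", 4)]:
    print(f"{label:>5}: ~{weight_vram_gb(PARAMS, bits):,.0f} GB")
```

That gives roughly 1,340 GB at FP16 (in line with the ~1,200 GB figure), ~670 GB at FP8, and ~340 GB at 4-bit, which is why heavily quantized builds fit on far smaller setups, at some cost in quality.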