cross-posted to:
- hackernews
Nobody needs to use AI to bug our phones, or to build a sprawling nervous system to track our vitals, because our phones are already bugged. Everything we do on them is recorded a dozen times over: by our wireless carriers, by the websites we visit and the apps we use, by the vendors and ad networks those companies send their data to, and in the marketplaces that sell that data. We built the eyes of the Greco decades ago.
But that data has remained relatively secure—or maybe more precisely, its potential energy has remained relatively buried—largely because it’s tedious to work with. It’s messy; it’s scattered across different sources and formats; combining it is a pain; and most of us are simply not interesting enough to investigate. Data analysts who work at shadowy government agencies have lives too, and they do not want to write 595-line SQL queries either.
But AI doesn’t mind. And that’s the boring danger of what happens next: not AI becoming a superintelligent Sherlock Holmes finding impossible patterns in its enormous mind palace, but AI being a million monkeys at a million typewriters, doing the grunt work no person wanted to do. Because when prying questions are a prompt away—rather than 24 hours of work away—who wouldn’t be tempted to pry?
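To make the “tedious to work with” point concrete: below is a minimal, purely hypothetical sketch of the stitching an analyst would have to hand-write before asking any prying question. Every source, identifier, and value is invented for illustration; the point is the normalization and joining grunt work, not any real schema.

```python
# Hypothetical sketch: stitching three invented data sources.
# Every table, column, and value here is made up for illustration.
import pandas as pd

# Carrier records: phone-number keyed, epoch-second timestamps.
carrier = pd.DataFrame({
    "msisdn": ["+15551234567", "+15551234567"],
    "ts": [1735689600, 1735693200],  # epoch seconds
    "cell_tower": ["T-0412", "T-0413"],
})

# App analytics: device-ID keyed, ISO-8601 timestamp strings.
app_events = pd.DataFrame({
    "device_id": ["ad-9f3c", "ad-9f3c"],
    "event_time": ["2025-01-01T00:05:00Z", "2025-01-01T01:10:00Z"],
    "event": ["open_maps", "search_route"],
})

# A broker's crosswalk linking the two identifiers (the messy glue).
crosswalk = pd.DataFrame({
    "msisdn": ["+15551234567"],
    "device_id": ["ad-9f3c"],
})

# Normalize timestamps to one type before anything can be joined.
carrier["when"] = pd.to_datetime(carrier["ts"], unit="s", utc=True)
app_events["when"] = pd.to_datetime(app_events["event_time"], utc=True)

# Join everything onto one identity, then align each app event to the
# nearest tower ping, one tiny slice of the real grunt work.
linked = app_events.merge(crosswalk, on="device_id").merge(
    carrier[["msisdn", "when", "cell_tower"]],
    on="msisdn", suffixes=("_event", "_tower"),
)
linked["gap"] = (linked["when_event"] - linked["when_tower"]).abs()
nearest = linked.sort_values("gap").groupby("event", as_index=False).first()

print(nearest[["event", "when_event", "cell_tower"]])
```

Multiply this by dozens of sources and formats and you arrive at the 595-line queries the article jokes about. None of it is hard; all of it is dull, which is exactly the kind of work a prompt can now delegate.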



I think the article fails to take several critical factors into consideration:
- The complexity of dealing with such large amounts of information will keep increasing as the amount of information itself grows.
- AI struggles with conflicting information and mistakes, which happen a lot, especially when humans are involved, so eventually you get lots of “garbage in, garbage out” problems.
- The data one might be able to track will continually be challenged or removed on legal/compliance grounds, reducing its availability over time.
For example: yes, the NSA might want our chatbot logs, but once enough people realize they might be (or already are) getting them, people will stop feeding it as much, or will introduce noise on purpose. It’s not a perfect vacuum of constant, reliable information forever. We are already seeing AI models that learn from web results getting caught up in their own slop and making themselves dumber. And the sheer volume of information, relative to the computing power needed to process it all, will also become a problem if they keep trying to process every single thing.
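One concrete (and hypothetical) version of “introduce noise on purpose” is the classic randomized-response scheme: each record is randomly flipped, so no individual entry can be trusted. A minimal sketch, with an invented population:

```python
# Sketch of randomized response: deliberate per-record noise.
# Each person answers truthfully with some probability, otherwise
# at random. Any single logged answer is now deniable.
import random

def randomized_response(truth: bool, p_truth: float = 0.5) -> bool:
    """Report the truth with probability p_truth, else a coin flip."""
    if random.random() < p_truth:
        return truth
    return random.random() < 0.5

# A watcher sees only the noisy answers; individual rows are
# unreliable, the "garbage in" this comment predicts.
population = [False] * 900 + [True] * 100  # 10% true rate, invented
reports = [randomized_response(t) for t in population]

# The aggregate can still be de-biased:
# observed = p_truth * true_rate + (1 - p_truth) * 0.5
observed = sum(reports) / len(reports)
estimated_true_rate = (observed - 0.25) / 0.5
print(f"observed {observed:.2f}, estimated true rate {estimated_true_rate:.2f}")
```

Worth noting, since it cuts both ways in this thread: individuals gain deniability, but de-biased aggregates survive the noise.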
I think this comment is based on an extremely optimistic – bordering on fantastical – outlook.
The capacity and capability to handle the data will grow too.
This is what data analysis is, though: extracting patterns from noisy data, ignoring outliers. I don’t think anybody is suggesting they’ll just dump a CSV of your web history into ChatGPT and ask it whether you’re probably going to a protest this weekend (although does it sound so far-fetched that that might actually work?). It’ll be used in combination with existing and constantly improving data mining techniques.
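For a sense of what those existing techniques look like with no LLM anywhere, here is a minimal sketch: a robust (MAD-based) z-score that flags unusual visit counts while shrugging off the noise and outliers mentioned above. Users and counts are all invented for illustration:

```python
# Sketch: conventional outlier-tolerant scoring over invented web history.
# No LLM involved, just the boring statistics the comment is pointing at.
import pandas as pd

visits = pd.DataFrame({
    "user":  ["a", "b", "c", "d", "e"],
    # Weekly visit counts to some hypothetical watched category.
    "count": [2, 3, 1, 2, 40],
})

median = visits["count"].median()
mad = (visits["count"] - median).abs().median()

# Robust z-score: outliers barely move the median/MAD baseline,
# which is how noisy inputs get handled without "garbage out".
visits["score"] = 0.6745 * (visits["count"] - median) / mad
flagged = visits[visits["score"] > 3.5]
print(flagged)
```

In the article’s scenario, the model’s role is mostly to write and run this sort of boring query on demand, not to replace it.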
Are you implying data protection laws will not only not be inexorably eroded year after year by increasingly surveillance-hungry governments, but will actually get significantly better than their current milquetoast state? I’ve gotta say, that’s seeming increasingly unlikely to me; right now we’re seeing mandatory identity verification being legislated onto more and more things by more and more governments.
This has to be a sarcastic reference to Snowden, right? The thing where the entire world found out how the NSA absolutely is – not “might be” – monitoring your internet traffic and conversation logs, and basically nobody did a fucking thing about it? That was 12 years ago.
Good thing they’re not doing anything crazy to get more computing power, like buying up practically the entire global supply of RAM or building data centres at an exponentially increasing rate.