Federated services have always had privacy issues, but I expected Lemmy to have the fewest. Instead, it’s visibly worse for privacy than even Reddit:
- Deleted comments remain on the server, hidden from non-admins, and the username remains visible
- Deleted account usernames remain visible too
- Anything remains visible on federated servers!
- When you delete your account, media does not get deleted on any server
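To make the first two bullets concrete, here is a minimal sketch of what a "soft delete" looks like. This is an assumption about the general pattern, not Lemmy's actual schema or code; `Comment` and `render` are hypothetical names made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Comment:
    author: str
    body: str
    deleted: bool = False  # soft-delete flag; the stored row itself is kept


def render(comment: Comment, viewer_is_admin: bool) -> str:
    """Return what a given viewer would see for this comment."""
    if comment.deleted and not viewer_is_admin:
        # The body is hidden, but the username is still shown.
        return f"{comment.author}: [deleted]"
    return f"{comment.author}: {comment.body}"


c = Comment(author="alice", body="something I regret")
c.deleted = True  # "deleting" only flips the flag, nothing is erased

print(render(c, viewer_is_admin=False))
print(render(c, viewer_is_admin=True))
```

The point is only that the data is still there after the "delete"; whether a given server behaves this way depends on its implementation.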
In my opinion it’s unreasonable to think anything can truly be deleted in a federated system. Even if the official codebase is updated to do complete deletion & overwrite, it’s impossible to prevent some bad actor from federating in a fork that just ignores deletion requests.
Seems sensible to just not post anything that you don’t want to be available for the lifetime of the internet.
Just as it’s impossible to stop scrapers from archiving data on traditional websites. “Deleted” data is probably in a database somewhere, being sold by someone. As you said, you lose some degree of control over your data as soon as you post it. Data is valuable, and if there is a will there is a way.
In my opinion it’s unreasonable to think anything can truly be deleted in a federated system.
yeah like. this is just a byproduct of how federation works currently. i don’t even know how you’d begin to design a federated system where some of these critiques can’t be levied
Anything that is visible to another party can be hijacked - even a 1:1 communication does not guarantee that the other party doesn’t capture the data and then spread it. The only things that are private are thoughts that you have which are not shared with others in any fashion. As soon as information is shared in any fashion, it is not private.
Past this point it’s a matter of how private you think is reasonably private. You could design a system where users control their own data through public and private keys, so that a key must be active to view content. But as stated above, even then, a user revoking keys does not stop other people from having made copies of that data. This is akin to screenshotting an NFT. For all intents and purposes, a copy of the data as it existed at the time of copying is now publicly available.
Quibbling over who “truly owns” the data on something like social media feels mostly pointless, because the outcome (data is available for others to view/consume/read/etc) is the same regardless of who “owns” it. Copyright law will apply to anything you produce if it comes to legal problems (someone copies your artwork and sells it, for example), and having a system to prove you own it is primarily a formality that makes ownership easier to prove. Generally people aren’t arguing through this lens, however, and are instead arguing through the privacy/security lens: they don’t want people stealing/selling their data, which lol, good luck. AI models are proof that no one actually cares about this ownership if they reasonably think they can get away with using your data and have no real incentive not to. Interestingly, copyright law and models being trained on corporate data such as movies are a vector by which the law might actually stop or slow AI development and protect end users’ data.
I don’t expect my data to be fully deleted in a centralized system either. even if it was deleted from the central server someone might have made an archive of it
and reddit is definitely guilty of this since they were bringing back people’s deleted comments and accounts
This is how I treated Reddit too. And Twitter. And everything else. I have two modes; public and private. And private is private; strong encryption and local storage. Having some middle ground is a recipe for disaster.
Exactly. Even if a server just goes down one day, theoretically it still holds a snapshot in time.
Yeah, I was thinking about jfs.
@ffmike @elbowmacaroni advanced ignoring-deletion-request technology, like copy-paste
So, I was born in the late 90’s - I don’t know if they still have “computer literacy” as a core course in schools these days, but they did when I was going through K-12 (or, well K-9… once you were in high school they assumed you knew the basics of how to use a computer, and had more advanced courses).
One of the very first things we learned about the internet is that once you put something on the internet, there is no way to take it back. At the time, uploading pictures to the “cloud” and such wasn’t really a thing so we learnt this by using email: Once you’ve sent an email to someone, you cannot “unsend” it. You can kindly ask the other party to delete the copy of the email without opening it, but you cannot guarantee that the email wasn’t saved on another computer, or saved somewhere else along the route between your computer and the receiver’s computer. Clicking the send button was taught to us as “etching your letter into stone”.
Because of this, I’ve always (or at least, as far as I can remember) made sure that anything I put on the internet, or even “put into digital form”, is something that I’m okay with being tied to me forever. (Even writing something in a file on your computer counts: you can recover deleted files from a hard drive unless you really put in the effort to actually erase them. There is a huge difference between erasing a file and marking it as “deleted”.) I’m sure if you looked hard enough, you could find me participating on message boards as a young teenager, and to that I just say “Oh well”. Is some of it probably very cringe-inducing and embarrassing? I have no doubt.
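For anyone curious what that difference between erasing and marking as “deleted” looks like in practice, here is a rough sketch. The function names are made up for illustration. Note the big caveat: overwriting in place is only best-effort, since SSDs, copy-on-write filesystems, and backups can all keep the old bytes around.

```python
import os
import tempfile


def mark_deleted(path: str) -> None:
    """Ordinary delete: removes the directory entry; the bytes may remain on disk."""
    os.remove(path)


def overwrite_then_delete(path: str) -> None:
    """Best-effort erase: overwrite the contents in place, then remove the file."""
    size = os.path.getsize(path)
    with open(path, "r+b") as f:
        f.write(os.urandom(size))  # replace the old contents with random bytes
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the overwrite to the device
    os.remove(path)


# Demo on a throwaway temp file
fd, demo_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"embarrassing teenage forum post")

overwrite_then_delete(demo_path)
print(os.path.exists(demo_path))
```

Even this only covers your own disk. It does nothing about copies that already left your machine.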
(This is also why you should take extreme caution when talking about say, your friend, on the internet - if you post something about them on the internet, you’re condemning them to this same exact thing)
Now funnily enough, as far as I understand the ActivityPub protocol, it is for all intents and purposes the exact same as email in this regard. Once you’ve sent something, there are no “take backs”. All you can do is kindly ask others to delete their copy, and that comes with zero guarantees. If I had a mastodon server, and someone deletes their toot - I could take down my server and my server would never receive that delete request. Or, just simply change the source code of the Mastodon instance on my server to straight up ignore deletion requests.
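Here is a toy simulation of that point, as far as I understand it. It is not real ActivityPub code: there is no HTTP or signing, the `Instance` class and `federate` helper are hypothetical, and the activities are simplified dicts that only borrow the `Create`/`Delete` type names from the ActivityStreams vocabulary.

```python
class Instance:
    """A toy federated server holding copies of remote posts (in memory only)."""

    def __init__(self, name, honors_deletes=True, online=True):
        self.name = name
        self.honors_deletes = honors_deletes
        self.online = online
        self.posts = {}

    def receive(self, activity):
        if not self.online:
            return  # the request never arrives
        if activity["type"] == "Create":
            obj = activity["object"]
            self.posts[obj["id"]] = obj["content"]
        elif activity["type"] == "Delete" and self.honors_deletes:
            self.posts.pop(activity["object"], None)


def federate(activity, instances):
    """Deliver an activity to every instance; each one decides what to do with it."""
    for inst in instances:
        inst.receive(activity)


post_id = "https://example.social/notes/1"  # made-up ID
create = {"type": "Create", "object": {"id": post_id, "content": "my toot"}}
delete = {"type": "Delete", "object": post_id}

good = Instance("good")
fork = Instance("fork", honors_deletes=False)  # patched to ignore deletes
flaky = Instance("flaky")
peers = [good, fork, flaky]

federate(create, peers)
flaky.online = False  # taken down before the delete goes out
federate(delete, peers)

print({inst.name: post_id in inst.posts for inst in peers})
```

Only the well-behaved instance drops its copy. The sender has no way to tell the difference, which is why the delete is a request and not a guarantee.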
Would it be nice for Lemmy to have a way to actually delete your content? Sure. But that’s not technically feasible, and personally (as controversial as it may seem) I would rather Lemmy not try to give you the false sense that everything was completely gone forever. I’m not saying that you shouldn’t be able to delete your account off a Lemmy instance, but it shouldn’t come with an option that says “Check here to remove your data/media from all federated instances” because Lemmy/no one can promise that, and I really hate it when software (or really anyone/anything) attempts to make a promise in bad-faith knowing that they can’t possibly ever uphold it.
Anyone who thinks Reddit is “better” than Lemmy in this regard probably doesn’t realize that Reddit is making a claim they can’t keep. The most obvious example of this is all of these subreddits that have gone dark? You can bring up most of their posts on the Wayback Machine or Google Cache. That would be the case regardless of whether they were set to private, or even if they were just straight up “deleted”.
We really should not be setting the belief for people that there exists a way to completely nuke a piece of data off the internet, because you cannot make a guarantee of that being the case.
First - we’re all using alpha/beta software (Lemmy is 0.17.4, Kbin is 0.10.). None of these services are “production quality” software yet, so let’s keep that in our minds - we’re all early adopters.
The points mentioned in the OP are a bad look, naturally. Users should be able to expect their data to be deleted on request, especially since the request might be a regulatory privacy request (GDPR related). It’s a clear failure of the software and should be improved and iterated upon.
The expectation shouldn’t be “oh well it’s on the Internet, live with it”. While Facebook might keep mining your data after deletion request, our software shouldn’t behave like that, we should strive to be better with this stuff.
And finally, ensuring privacy in a federated system is hard. Mastodon suffers from the same problems. We shouldn’t give up on the idea though.
It is early-stage software and such things can be worked out, you’re right. But on the other hand, such basic elements should be based on a thorough concept before a single line is coded, and implementing something like a delete button with “Let’s just make it delete the most visible stuff for now, we can always improve that later when there is time” is a recipe for disaster.
Agree, it’s a little late to change core architecture. But this is the philosophy the devs ran with, and it has the advantage of longevity: when an instance goes offline, its content is still visible to everyone else.
The more important part for privacy: Mail address is optional, and IP addresses are not stored in the database. A correctly configured instance (at least for EU legislation) also will not log IP addresses in the web server - with that you can have profiles that can’t be tied to an actual human, and you don’t have location and movement data.
The data deletion is pretty much a nice to have - it’s on the level of the Exchange feature to recall Emails: Sure, you can ask nicely, but outside of your own server pretty much nobody will care. Lemmy is federated over multiple jurisdictions, so even with full deletion implemented there’ll almost certainly be instances which will ignore the deletion request - and it will be completely legal for them to do so. More important is education about what you publish, and a basic understanding of the technical and legal realities you’ll have to deal with if you later decide you want that information gone.
I already had that discussion with my 6 year old when she wanted to publish some videos - and she understood the problems quite well.
but outside of your own server pretty much nobody will care. Lemmy is federated over multiple jurisdictions, so even with full deletion implemented there’ll almost certainly be instances which will ignore the deletion request - and it will be completely legal for them to do so
Lemmy also seems to federate your matrix_user_id, that is clear personal data. It does not matter how the data gets to the federated server, this is still user data within the scope of the GDPR. It does not matter that that server does not have an agreement with the user, the instance that would ignore a GDPR related deletion request would be in direct violation of the GDPR. Maybe it can do that without consequences, though.
I completely understand that making Lemmy fully GDPR compliant will probably be impossible; however, I don’t like the approach of “we will not succeed, so we don’t make any attempt”. Instances should actually delete data when that is requested, or instance hosts can get fined. For now, Lemmy has bigger issues to solve, but eventually they should do at least a best effort attempt to respect user data.
Lemmy also seems to federate your matrix_user_id, that is clear personal data.
Just like specifying an email address when signing up, adding a matrix identifier is your personal choice. Lemmy is perfectly usable without either.
It does not matter how the data gets to the federated server, this is still user data within the scope of the GDPR. It does not matter that that server does not have an agreement with the user, the instance that would ignore a GDPR related deletion request would be in direct violation of the GDPR.
Not a lawyer, but I’d say an instance outside of the EU that is not targeting EU users would not be in violation - though EU instances transmitting data there might.
Instances should actually delete data when that is requested, or instance hosts can get fined.
With that part I agree - but it should be made clear when deleting something that this is a local deletion, which may or may not propagate to other instances, and will almost certainly not remove the data from the internet.
EU instances transmitting data there might.
This is an interesting thought, as data transfer between the US and EU has been an issue with other social networks. Federation between an EU instance and a US instance could be seen as the same thing - data for EU users is being transferred to non-EU servers.
I had a look at the wording of the GDPR (more specifically the Data Protection Act, as it is implemented in the UK), and it seems to refer to organisations. I think most, if not all, instances are not hosted by organisations (just some group or individual hosting it on personal or rented hardware). Laws such as this are designed with centralisation in mind, and kind of don’t make sense in the context of decentralisation.
Did anyone use reddit thinking it was private? With stuff like Pushshift and the Wayback Machine, people shouldn’t be posting stuff they aren’t comfortable sharing anyway on a wide open message board.
Always weirded me out the people who’d treat their reddit accounts like Facebook.
With stuff like Pushshift and the Wayback Machine
So much this. I don’t get why people don’t remember this first thing when it comes to data storage.
Yes. “The internet never forgets” is actually a thing.
What does this have to do with Mastodon?
The same privacy issues also exist with Mastodon and all distributed systems.
Anything put on the internet is forever. No one should be publicly posting anything with the expectation that they have any control of it after it goes out. If it’s not held by the server, there’s the Wayback Machine or even just folks taking screenshots.
Anything put on the internet is forever.
If only. Alas, it’s more “Expect anything put on the internet to be forever”. I’ve already spent a significant amount of time looking for treasures from the early 2000s, and even from something as recent as 2009, without any luck. I’ve also uploaded songs to YouTube that for all I know have no other sources left, neither illegal nor legal.
It’s the Internet Corollary to Murphy’s Law: your embarrassing posts will be available online forever, but any useful information you want to find later will have been deleted when you next look for it.
I completely agree. I just don’t see how there can be any realistic expectation of privacy when publishing something publicly.
I appreciate the idea of laws establishing a right to be forgotten and I think there’s still some value in being able to take your data away from certain companies, but there’s no guarantee it wasn’t copied many times before the original location is taken down.
The Fediverse works like email. Once somebody hits send, there’s no real way to claw that back.
There’s a difference between “there’s no way to guarantee total privacy” and “the system is designed to guarantee no privacy”, though. Even the best of us fuck up and say something they shouldn’t on occasion, and plenty of people online were never given proper lessons or are too young to understand how serious revealing information is.
Whether it’s Lemmy, federated, corporate owned, or even your own private site - nothing you put on the internet is ever truly private. If you have a public profile someone can access it and copy it.
The only things I have an expectation of privacy for are health related; everything else I fully expect someone else to read, copy, and multiply.
I think there should be, but I never expect there to be. Did people’s parents not teach them about putting things on the internet they didn’t want shared?
Did people’s parents not teach them about putting things on the internet they didn’t want shared?
They used to, then social media became a thing and they stopped. Suddenly, it was normal to put your entire life up online for other people to see, and if you didn’t feel comfortable doing that you were the weird one.
My rule is, never post anything you wouldn’t mind the media tracing back to you IRL and then making the top story of the day in your country. Because, while rare, that does occasionally happen!
My rule is, never post anything you wouldn’t mind the media tracing back to you IRL and then making the top story of the day in your country.
So don’t live, basically.
Or you can just maintain anonymity as best as you reasonably can and hope no one goes out of their way to identify you or the account(s). Making a new account after a while is a safe practice. The goal is to decrease the likelihood of undesirable things, not make them impossible.
Odd response, you can still “live” without documenting your activities. Were people not living pre-Facebook/Instagram?
…Are we talking posting things anonymously or posting things with your irl name and photo?
Probably because it became very profitable to let everyone do that 😔
Exactly, when you put it out there it’s out there on every single platform there is. It doesn’t matter if you “delete it”, the moment you share it you have lost control over it entirely.
For the same reasons I never understood why people post on Facebook with their own full name and life story out there in the open either.
True, but you should still be able to delete your account and have your comments and username leave the service. Online privacy isn’t about completely disappearing, but about making yourself so hard to track that the average person won’t bother digging.
Which in turn decreases the likelihood of something happening. Like locking a door.
The saying “If somebody wants to get in they will.” is a terrible one when left as is.
I mean yes but it’s still bad practice to keep deleted content. It’ll be a bad look to people interested in switching to lemmy and more people is really what it needs right now
This is generally true, but at the same time, the Internet archive doesn’t archive every single page ever.
https://github.com/LemmyNet/lemmy/issues/2977
It’s not like they’re doing it on purpose, there’s a lot of things being worked on, and this is one of them.
BTW, the OP on Raddle was spamming that message around Reddit last week and directing people to Raddle. I think he has a bone to pick with the developers’ politics more than anything.
It is reasonable that people should be able to delete their posts / comments. However I don’t see how this is related to “privacy”. How can something you post on a public forum be private?
its the principle behind the ‘right to be forgotten’
if you posted something to a public forum and changed your mind, deciding it shouldnt be public after all, you should have that option
That is generally true, with exceptions like leaking someone else’s private information.
But it implicates the adjacent “right to be forgotten” rather than narrowly defined “privacy”. This could be a real legal issue in the EU.
It is. GDPR in the EU dictates that every user who requests their information has to get it within 30 days, and every user who wants their information removed has to be able to get it removed (I think the time span for that is even shorter, so more pressure for the server admins)
I’m also not sure how it’s enforceable in a distributed system.
Blockchains have the property of being append-only, so a blockchain is precisely what makes it impossible to delete transactions. That being said, in a distributed system, once the message leaves trusted servers, it is obviously also impossible to delete it.
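To illustrate the append-only property, here is a stripped-down hash chain. This is only a sketch: a real blockchain adds consensus, signatures, and replication on top, and `block_hash`, `build_chain`, and `is_valid` are names I made up for this example.

```python
import hashlib

GENESIS = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(prev_hash: str, data: str) -> str:
    """Hash a block's data together with the hash of the block before it."""
    return hashlib.sha256((prev_hash + data).encode()).hexdigest()


def build_chain(entries):
    chain, prev = [], GENESIS
    for data in entries:
        h = block_hash(prev, data)
        chain.append({"prev": prev, "data": data, "hash": h})
        prev = h
    return chain


def is_valid(chain) -> bool:
    """Check that every block links to, and hashes correctly against, its predecessor."""
    prev = GENESIS
    for block in chain:
        if block["prev"] != prev or block["hash"] != block_hash(prev, block["data"]):
            return False
        prev = block["hash"]
    return True


chain = build_chain(["post A", "post B", "post C"])
tampered = chain[:1] + chain[2:]  # try to "delete" the middle entry

print(is_valid(chain))
print(is_valid(tampered))
```

Dropping an entry breaks every link after it, so everyone holding a copy can tell. That is the opposite of what you want if the goal is deletion.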
Probably in the sense that if it’s not me that posted it, then I don’t have any way of truly removing it (which I think is against the EU’s laws).
What I can think of right off the top of my head is revenge porn and doxxing. Furthermore there’s also the right to be forgotten.
You can’t delete a mail you sent me, nor put your hand written letter to me in the bin. I can keep both and I can keep your name and addresses in my little black book. So there isn’t even that level of privacy in the real old fashioned communication.
And communication over the Internet has always been subject to storage. Your mail may be on the backup tape of a mail server. Your Usenet posting is in an archive.
So the assumption that the fediverse can forget….
The same is true for Raddle. They’re kidding themselves if they think no one can record everything in there forever.
Anyway it’s also inaccurate. Deleted accounts are purged from the DB, so they’re definitely not visible anymore
Likewise, if you edit your comment, it’s edited in the DB.
This is assuming your local server is still federated. If your local gets defederated you currently have no control over any previously federated copies of your posts / comments / votes.
And it also assumes, no one made a screenshot or used the web archive, crawled it and stored it in their own DB or any other way of copying stuff. Of course!
If you post any thing publicly on the internet, there is no way to be 100% sure it can be ever deleted again.
That isn’t what I am speaking to, and the fact someone could make a copy or it is archived somewhere doesn’t make the statement that you can always remove your data from the platform true. And there is a difference between a potential copy and an original federated, distributed, and indexed version.
People need to be aware of the persistence of data, but people also have to understand the technology they are using to make their own informed decisions on how they engage.
People need to be aware of the persistence of data, but people also have to understand the technology they are using to make their own informed decisions on how they engage.
Exactly. Federation as well as the internet has restrictions in whether you can delete your data. This should be known. Non federated data has the same problem, but the other way around. Someone running the site wants your stuff gone? It is now.
I know, what you are talking about, but there are things one has to accept, this being one of them.
the fact someone could make a copy or it is archived somewhere doesn’t make the statement that you can always remove your data from the platform true.
Why would someone think that?
And there is a difference between a potential copy and an original federated, distributed, and indexed version.
What is this difference? What do you think happens more often, screenshotting weird/compromising stuff someone said or defederation?
But there can be a way around all that: deleting all content from defederated sources. Maybe someone could open an issue or implement it themselves…
Why would someone think that?
Because the comment I replied to, the actual thing I am addressing, makes an assertion that isn’t entirely true and could lead someone uninformed into believing they can have their information removed platform wide.
What is the difference?
Not everyone is concerned with someone digging up dirt or wildly compromising material. Most people aren’t special enough to be worried about that.
Most archives won’t be globally search indexed. An archive won’t show up on a federated search. There is more legitimacy to a federated version over someone reposting a screenshot (at least in perception, how federated could be altered or forged is another topic).
I also mention there are other reasons one might want to remove content. Just look at reddit right now, some may simply want to revoke support for a platform sometime in the future.
Sure, there could be a future where this is addressed. It isn’t right now.
I don’t disagree with you in the larger discussion on persistence of data. I am adding context to a scoped subtopic of it.
I’m behind Lemmy, but I’ve made an informed decision on what that means for my data.
You are also kidding yourself if you think that defederation will not become more common. The community we are commenting on has already defederated 2 very large instances.
So what you’re saying is that it’s just like Reddit in that respect.
Yeah, I can live with that, as long as everyone knows that if they really want something deleted, edit over it first.
For a humbling experience just search for your Reddit and Lemmy IDs on a search engine. You will get a list of everything you have posted. Also some account info. It is all public. What happens when something is deleted depends on who has scraped the data and their retention. This is just how public forums are and that goes all the way back to Usenet and listservs.
It’s no different than me sending an email to someone and then sending a request to delete it. There likely is still a copy on the email provider’s server and the recipient could have potentially backed up their emails to something outside of the email ecosystem.
Unfortunately the only way to be absolutely sure that there isn’t information you don’t want on the internet is to not share it at all. There will always be an issue of making sure every system actually deletes content when you request it. Like I said, that doesn’t stop anyone from backing up the data to another system. (E.g. Reddit archives from 2005 to now are available to download, even content that has already been deleted)
Honestly, I kinda question how good of a time investment it is to try and allow deletion from the public facing parts of the internet, given the numerous places where your content will be cached or otherwise stored.
There is certainly some value in simply making it as hard as possible to find things you want to delete. Why let perfect be the enemy of good, after all. There’s plenty of types of content we certainly want to do our best at deleting even if we can’t be perfect. Eg, do you wanna be the one to tell a revenge porn victim, “sorry, we can’t make it harder to find the content that harms you because we can’t delete all of it anyway”?
But at the same time, development time is limited. Everything is a trade off. We do have to decide what is most important, because we can’t do it all immediately. The fact we can’t actually delete everything does have to be a factor in this prioritization, too.
There is something to be said about ensuring people know and understand that nothing can truly be 100% deleted once it’s posted on the internet. Not that Lemmy is doing good about that, either (especially since deleted comments apparently lie about being deleted).
All this said, I do think federated, reliable deletion is critical for illegal content. Such content needs to be removed quickly and easily from as many places as possible. Without this, instance owners are put at considerable legal risk. This risk poses a threat to the scalability of the Fediverse.
Oh I wish we had the ability to fully delete our content that we’ve posted or that someone has posted of us. Illegal content is a huge concern with federation. As soon as someone pushes something like that, it gets sent to all the federated instances so they have a copy as well. That is a huge concern for instance owners (and honestly the fediverse as a whole).
I run a kbin instance and I’m a software developer for my day job. I honestly don’t have a great answer for “how do we ensure the data we request be deleted on the fediverse is actually deleted.” My best solution would be for us to have several federated master databases that we maintain our federated content with. If there is a big delete flag for some content, then the child instances will follow suit.
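A very rough sketch of how that "delete flag" idea could look, just to make it concrete. Everything here is hypothetical (`MasterRegistry`, `ChildInstance`, the in-memory sets); it is not how kbin or Lemmy work today, and it still depends on every child instance choosing to sync.

```python
class MasterRegistry:
    """Toy shared registry that records which content IDs are flagged as deleted."""

    def __init__(self):
        self.deleted_ids = set()

    def flag_deleted(self, content_id):
        self.deleted_ids.add(content_id)


class ChildInstance:
    """Toy instance that holds local copies and follows the registry's flags."""

    def __init__(self, registry):
        self.registry = registry
        self.content = {}

    def sync(self):
        """Drop every local copy whose ID is flagged in the registry."""
        for content_id in list(self.content):
            if content_id in self.registry.deleted_ids:
                del self.content[content_id]


registry = MasterRegistry()
child = ChildInstance(registry)
child.content = {"post-1": "keep me", "post-2": "please remove"}

registry.flag_deleted("post-2")
child.sync()

print(sorted(child.content))
```

An instance that never calls `sync` keeps everything, so this narrows the problem to cooperating servers without solving it for the rest.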
It is all public just as most forums on Reddit. No real difference. No difference with Usenet either. Relax.
Mastodon’s privacy issues are just the same as the rest of the fediverse/threadiverse.
With federation there is more openness, transparency and accountability. Take care of your privacy, use alts.
Use a pseudonym that you don’t use anywhere else and don’t dox yourself in your posts or comments
“Average user.” Think Reddit, Facebook, having communities. I’m old enough that I was a first gen internet user. Like slow-ass 56k, and bbs in terminal and Apple with floppy floppies and point/click before Gates did his hoodoo.
a good habit is also regularly abandoning/deleting an account and starting from scratch. I went thru 6 reddit accounts over my 13 years there
Same here. I had used reddit since 2010 and must have had close to a dozen accounts. I didn’t like too much info piling up under any one account. And I used a local city subreddit a lot.
same. it also helped to separate interests. each hobby/interest would get a different account, local stuff another account, maybe an “engage in politics” account or three (so I can log off and not get hateful replies at random hours of the day)
If I stick around I figure I’ll do the same with lemmy. So far local content, angry debate, and niche hobbies haven’t been a ‘problem’.
That’s a great idea
Anyone who has open discussions on the Internet and thinks they’re somehow private is a fool. Short of end to end encrypted chat I’m not sure what they expect.

































