EDIT
TO EVERYONE ASKING TO OPEN AN ISSUE ON GITHUB, IT HAS BEEN OPEN SINCE JULY 6: https://github.com/LemmyNet/lemmy/issues/3504
June 24 - https://github.com/LemmyNet/lemmy/issues/3236
TO EVERYONE SAYING THAT THIS IS NOT A CONCERN: Everybody has different laws in their countries (in other words, not everyone is American), and whether or not an admin is liable for such content residing in their servers without their knowledge, don’t you think it’s still an issue anyway? Are you not bothered by the fact that somebody could be sharing illegal images from your server without you ever knowing? Is that okay with you? OR are you only saying this because you’re NOT an admin? Different admins have already responded in the comments and have suggested ways to solve the problem because they are genuinely concerned about this problem as much as I am. Thank you to all the hard working admins. I appreciate and love you all.
ORIGINAL POST
You can upload images to a Lemmy instance without anyone knowing that the image is there if the admins are not regularly checking their pictrs database.
To do this, you create a post on any Lemmy instance, upload an image, and never click the “Create” button. The post is never created but the image is uploaded. Because the post isn’t created, nobody knows that the image is uploaded.
You can also go to any post, upload a picture in the comment, copy the URL and never post the comment. You can also upload an image as your avatar or banner and just close the tab. The image will still reside in the server.
You can (possibly) do the same with community icons and banners.
Why does this matter?
Because anyone can upload illegal images without the admin knowing and the admin will be liable for it. With everything that has been going on lately, I wanted to remind all of you about this. Don’t think that disabling cache is enough. Bad actors can secretly stash illegal images on your Lemmy instance if you aren’t checking!
These bad actors can then share these links around and you would never know! They can report it to the FBI and if you haven’t taken it down (because you did not know) for a certain period, say goodbye to your instance and see you in court.
Only your backend admins who have access to the database (or object storage or whatever) can check this, meaning non-backend admins and moderators WILL NOT BE ABLE TO MONITOR THESE, and regular users WILL NOT BE ABLE TO REPORT THESE.
Aren’t these images deleted if they aren’t used for the post/comment/banner/avatar/icon?
NOPE! The image actually stays uploaded! Lemmy doesn’t check if the images are used! Try it out yourself. Just make sure to copy the link by copying the link text or copying it by clicking the image then “copy image link”.
How come this hasn’t been addressed before?
I don’t know. I am fairly certain that this has been brought up before. Nobody paid attention but I’m bringing it up again after all the shit that happened in the past week. I can’t even find it on the GitHub issue tracker.
I’m an instance administrator, what the fuck do I do?
Check your pictrs images (good luck) or nuke it. Disable pictrs, restrict sign ups, or watch your database like a hawk. You can also delete your instance.
Good luck.
FYI to all admins: with the next release of pict-rs, it should be much easier to detect orphaned images, as the pict-rs database will be moved to postgresql. I am planning to build a hash table of “in-use” images by iterating through all posts and comments by lemm.ee users (+ avatars and banners of course), and then I will iterate through all images in the pict-rs database, and if they are not in the “in-use” hash table, I will purge them.
Of course, Lemmy can be improved to handle this case better as well!
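A minimal sketch of the purge plan described above, assuming the "in-use" aliases and the stored aliases have already been read out of the Lemmy and pict-rs databases. The function name and the sample aliases are hypothetical; nothing here calls a real Lemmy or pict-rs API.

```python
def find_orphans(stored_aliases, in_use_aliases):
    """Return stored image aliases that nothing references, sorted for stable output."""
    in_use = set(in_use_aliases)  # the "in-use" hash table
    return sorted(alias for alias in set(stored_aliases) if alias not in in_use)


# Hypothetical data standing in for rows read from the two databases.
stored = ["a1.webp", "b2.png", "c3.jpg", "d4.webp"]
in_use = ["a1.webp", "c3.jpg"]  # referenced by posts, comments, avatars, banners

orphans = find_orphans(stored, in_use)
print(orphans)
```

Everything returned by find_orphans would be a candidate for purging; actually removing it is left to whatever purge mechanism the admin trusts.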
Perhaps someone should create a script to purge orphan images
Seems like the logical fix
Or, just tighten up the api such that uploaded pictures have a relatively short TTL unless they become attached to a post or otherwise linked somewhere.
A script is a fine stopgap measure, but we should try to treat the cause wherever possible, instead of simply addressing the symptom.
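A rough sketch of the TTL idea, assuming each upload is recorded with a timestamp and an optional link to the post or comment that uses it. The Upload record, the one-hour TTL, and the field names are all assumptions for illustration, not Lemmy or pict-rs settings.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

UPLOAD_TTL = timedelta(hours=1)  # assumed grace period, not a Lemmy setting


@dataclass
class Upload:
    alias: str
    uploaded_at: datetime
    attached_to: Optional[str] = None  # e.g. "post:123"; None while still unused


def expired_uploads(uploads, now, ttl=UPLOAD_TTL):
    """Return aliases of uploads that are past the TTL and still unattached."""
    return [
        u.alias
        for u in uploads
        if u.attached_to is None and now - u.uploaded_at > ttl
    ]


now = datetime(2023, 9, 1, 12, 0, tzinfo=timezone.utc)
uploads = [
    Upload("old-draft.webp", now - timedelta(hours=3)),
    Upload("old-posted.webp", now - timedelta(hours=3), attached_to="post:123"),
    Upload("fresh-draft.webp", now - timedelta(minutes=5)),
]
print(expired_uploads(uploads, now))
```

Only the old, unattached upload is selected; the fresh draft survives until its own TTL runs out.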
What’s the practical difference? In both cases you’re culling images based on whether they’re orphaned or not.
If you’re suggesting that the implementation be based on setting individual timers instead of simply validating the whole database at regular intervals, consider whether or not the complexity of such a system is actually worth the tradeoff.
“Complexity comshmexity”, you might say. “Surely it’s not a big deal!”. Well… what about an image that used to belong to a valid post that later got deleted? Guess you have to take that edge case into account and add a deletion trigger there as well! But what if there were other comments/posts on the same instance hotlinking the same image? Guess you have to scan the whole DB every time before running the deletion trigger to be safe! Wait… wasn’t the whole purpose of setting this up with individual jobs to avoid doing a scripted DB scan?
There are mechanisms that exist in a LOT of services for handling TTL expiry and any relevant purging that needs to be done.
That said, a cursory look at the pict-rs project doesn’t appear to have any provision for TTL, so it’s probably going to have to be done as a cron job anyways - or at least triggered by the lemmy service when an image upload isn’t used in an instance-local lemmy post within some reasonable interval.
Note that I’m specifically including “in an instance-local post” because I am assuming admins don’t want to provide free cloud image hosting to random internet people for arbitrary non-lemmy use.
Note that I’m specifically including “in an instance-local post” because I am assuming admins don’t want to provide free cloud image hosting to random internet people for arbitrary non-lemmy use.
Note that I at no point allude to hotlinking from outside of the instance. Unless you want it to be possible to create an image post, delete the post, and then have an orphaned image forever (thereby creating an attack vector), you do need to solve that problem. If you solve that problem without considering crossposts and comment hotlinks within the scope of your own instance, you’re going to cause breakage. If you’re forced to consider these things before triggering the deletion regardless, then you’re not saving much on performance.
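To make the crosspost and hotlink concern concrete, here is a small sketch of a reference check that would have to run before an image is purged. It assumes local image URLs look like https://example.instance/pictrs/image/<alias>; the host name, the URL shape, and both function names are assumptions for illustration.

```python
import re
from collections import Counter

# Assumed URL shape for locally hosted images; adjust to the real instance.
LOCAL_IMAGE = re.compile(r"https://example\.instance/pictrs/image/([\w.-]+)")


def reference_counts(texts):
    """Count how many posts/comments mention each local image alias."""
    counts = Counter()
    for text in texts:
        # set() so one body linking the same image twice counts once
        counts.update(set(LOCAL_IMAGE.findall(text)))
    return counts


def safe_to_purge(alias, remaining_texts):
    """True only if no remaining post or comment still links the image."""
    return reference_counts(remaining_texts)[alias] == 0


# Hypothetical bodies left on the instance after the original image post was deleted.
remaining = [
    "crosspost: https://example.instance/pictrs/image/cat.webp",
    "unrelated comment with https://other.site/pictrs/image/dog.png",
]
print(safe_to_purge("cat.webp", remaining))
print(safe_to_purge("old.webp", remaining))
```

Note that safe_to_purge still walks every remaining body, which is the cost being argued about here: the per-image trigger does not avoid the scan.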
Very much needed.
Wasn’t Facebook also found to store images that were uploaded but not posted? This is just a resource leak. I can’t believe no one has mentioned this phrase yet. I’m more concerned about DoS attacks that fill up the instance’s storage with unused images. I think the issue of illegal content is being blown out of proportion. As long as it’s removed promptly (I believe the standard is 1 hour) when the mods/admins learn about it, there should be no liabilities. Otherwise every site that allows users to post media would be dead by now.
I’m a pentester and security consultant. From my point of view, this vulnerability has more impact than just a resource leak or DoS. We all know how often CSAM or other illegal material is uploaded to communities here as actual posts (where hundreds of viewers run into it and can report it). Now imagine them uploading it and spreading it like this, where only the admin can catch it, and only if they go out of their way to check?
I wouldn’t call this a high risk issue for sure. But a significant security risk regardless.
Whether it’s illegal content or storage-filling DoS attacks, the issue needs to be addressed.
I’m an instance administrator, what the fuck do I do?
There’s one more option. The awesome @db0@lemmy.dbzer0.com has made this tool to detect and automatically remove CSAM content from a pict-rs object storage.
This is a nice tool but orphaned images still need to be purged. Mentioned on the other thread that bad actors can upload spam to fill up object storage space.
That is also very true. I think better tooling for that might come with the next pict-rs version, which will move the storage to a database (right now it’s in an internal key-value storage). Hopefully that will make it easier to identify orphaned images.
You need a GPU for that. Most $5 VPSs don’t have that.
Yeah I know. It’s supposed to be run from your computer, not the VPS.
Pedo trolls will be the death of Lemmy, you heard it here first!
Which is why we need to act now.
Part of the problem with having an illegal series of bits. Of course people are going to use that as a weapon.
I don’t think those images should be made fully legal, but maybe we should calm the fuck down about two notches. We should keep in mind that the real crime is creating the pictures. Being effectively legal bombed by them is kind of ridiculous. As is having to keep the detection tools secret.
If you’re on a grand jury for csam, maybe you should actually see the evidence (with limited censorship) before you indict someone.
Maybe I’m wrong, but I don’t think seeing a small number of pictures is going to scar you for life. I’ve seen goatse. I’ve seen people decapitated. It’s not pleasant, and I avoid those things, but it’s not scarring.
The Station Nightclub Fire is scarring. I’ve recommended that video to people because it’s scarring in a way that can save lives. Seeing that stuff every day would absolutely be scarring.
I don’t want to see that kind of stuff to become common, but I am disturbed that people are afraid of unused images hiding on their Lemmy server.
Regardless of the debate of whether admins should be legally liable for not deleting unknown child abuse digital files,
Maybe I’m wrong, but I don’t think seeing a small number of pictures is going to scar you for life. I’ve seen goatse. I’ve seen people decapitated. It’s not pleasant, and I avoid those things, but it’s not scarring.
You shouldn’t use your own experiences to make this generalisation, given that people working at agencies prosecuting pederasts often have to receive therapy or even leave the job after continued exposure.
I am disturbed that people are afraid of unused images hiding on their Lemmy server.
Don’t you think it’s logical for someone to be worried about being vulnerable to being accused of what likely is, in many legal systems, a crime?
Yeah, I think continued exposure is different than a one off thing. It’s why I used the Grand Jury example.
And I do think it’s logical. That’s the problem. My entire point is that csam shouldn’t be so easy to weaponize.
Maybe seeking, selling, or intentionally distributing should be the crime.
Fuck you, pedo.
FYI this requires a JWT so if registrations are closed on your instance you don’t have to worry
This is for public instances.
It seems like self-hosting your own Lemmy instance with registrations, communities, and pretty much anything else turned off is still very safe to do. I still want to end up self-hosting my own Lemmy instance when I have more time, though I’d rather wait for things to be more stable first. There are bugs I’d like to see ironed out before doing that. For example, I still find it annoying that upvoting a comment in a thread deletes whatever comment you’re currently typing.
Because anyone can upload illegal images without the admin knowing and the admin will be liable for it.
The admin/company isn’t liable until it is reported to them and they don’t do anything about it… That’s how all social media sites work, Google isn’t immediately liable if you upload illegal materials to GDrive and share it anonymously.
Doesn’t change the fact that this is an issue. Besides, do you think American law applies everywhere?
deleted by creator
Good point but also consider disabling pictrs until they fix the caching problem!
Is there a description on how to disable pictrs?
Remove it from docker compose.
deleted by creator
It didn’t work that easily for me. I had to redirect all pictrs traffic to 404 in my nginx config.
I’m an instance administrator, what the fuck do I do?
Check your pictrs images (good luck) or nuke it. Disable pictrs, restrict sign ups, or watch your database like a hawk. You can also delete your instance.
How? I have checked, and there doesn’t seem to be any way to see the photos on my server.
I actually shut down pictrs entirely on my instance. Running pictrs in its current state is criminally negligent imo.
They are stored in the pictrs folder. They don’t have file extensions but are viewable with many image programs.
Oh, I see. I only use command line on my server, so I didn’t realize they were actual photos. Thanks!
This can be solved very easily by a cron job to clean out the folder periodically, if you’re worried about it.
Very easily you say? Maybe tell us what this cron job is so we can all add it?
Just make a cron that runs the rm command every day or whatever to clean out the files. Then run a SQL query at the same time to truncate any draft posts in the database. There’s no logic to this method, it just clears out the files and records related to draft posts, but it’s fast and effective. There’s a small chance it might fuck somebody up if they were writing a post at that exact moment, but you can schedule the cron for when your instance is the quietest.
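For anyone who wants something they could schedule from cron, here is a more cautious variant of that idea, sketched in Python rather than a bare rm. It only selects files older than a cutoff, skips any file name listed in a keep set, and deletes nothing unless dry_run is turned off. The function names, the keep set, and the demo directory are hypothetical. Age alone says nothing about whether an image is used by a published post, so the keep set has to come from a real in-use check, and removing files behind pict-rs’s back may leave its own records pointing at missing files.

```python
import os
import tempfile
import time
from pathlib import Path


def stale_files(root, max_age_seconds, keep=frozenset(), now=None):
    """List files under root older than max_age_seconds whose names are not in keep."""
    now = time.time() if now is None else now
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_file()
        and p.name not in keep
        and now - p.stat().st_mtime > max_age_seconds
    )


def clean(root, max_age_seconds, keep=frozenset(), dry_run=True):
    """Delete the selected files unless dry_run is set; return the paths selected."""
    selected = stale_files(root, max_age_seconds, keep)
    if not dry_run:
        for path in selected:
            path.unlink()
    return selected


# Demo against a throwaway directory, not a real pict-rs volume.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("old-orphan", "old-in-use", "new-upload"):
        Path(tmp, name).write_bytes(b"x")
    day_ago = time.time() - 86400
    for name in ("old-orphan", "old-in-use"):
        os.utime(Path(tmp, name), (day_ago, day_ago))  # pretend these are a day old
    selected = clean(tmp, 3600, keep={"old-in-use"}, dry_run=False)
    removed = [p.name for p in selected]
    left = sorted(p.name for p in Path(tmp).iterdir())
print(removed, left)
```

Running it with dry_run left at its default first shows what would be removed without touching anything.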
Oh wow. I always assumed the images are deleted if you don’t submit the post.
😬
Sadly not the case
Other than filling up storage, what is the actual issue? If the image is orphaned then surely nobody can actually access the content? Sure you could be blind hosting things but if nobody can get the content back out then the abuse is surely minimal apart from say a complex cyber and physical targeted campaign or simply filling up storage…
The issue is that you can share the image link to other people. People CAN get the content back out and admins or moderators WILL NOT KNOW about it.
So if someone uploads an illegal image in the comments, copies the link and does not post the comment, then they have a link of an illegal image hosted on someone’s Lemmy instance. They can share this image to other people or report it to the FBI. Admins won’t know about this UNLESS they look at their pictrs database. Nobody else can see it so nobody can report it.
wouldn’t it be just as easy to whitelist DNS?
Yes it would, if the problem had anything to do with the DNS.
The problem is people and people have DNS.
deleted by creator
Yes. I am well aware and that would be by design.
remember - if someone on a major mobile network is uploading child photography, that device is radioactive and an instance admin is going to have options they may not have in other situations.
The idea is give instance admins control over who uploads content. Perhaps they don’t want mobile users to upload content, or perhaps they do but only major carriers, by their own definition of major.
Somewhere between “everyone” and “nobody” is an answer.
giving the instance administrator tools to help quarantine bad actors only helps, which will require layers. Reverse DNS is a cost, however; perhaps the tax is worth it when hosting images, where there is already a pause point in the end user experience, and the ramifications so severe.
Larger instances may dilligaf but a smaller instance may need to be very careful…
Just sayin…
deleted by creator
why do you say that, knowing full well DNS whitelists rely on wildcards?
deleted by creator
Why does it matter? Read some of my other posts.
deleted by creator
Explain.
He probably means whitelisting domains when posting already uploaded images, clearly not having read the post
That’s another issue. Also a necessary feature.
No I mean the user’s DNS should be whitelisted to permit uploads. If DNS not on whitelist then no upload, period.
What do you mean by “the user’s DNS” exactly??
deleted by creator
An option to prevent users from uploading unless their DNS has been whitelisted. It would require explicit permission to upload, which could be handy for smaller instances.
deleted by creator
Not IP. DNS whitelist. This way if a geography or subnet is responsible for illegal material they are only allowed in if an instance granted +w.
deleted by creator
Every person on the internet has a DNS record that loops back to them. The DNS has a topography so that various elements of a domain could be whitelisted, or not.
It would be trivial to queue a request to white list, where an administrator could decide if it is worth it, having it auto expire over time.
Instance admins could share sources of bad actors.
heuristics could help determine the risk of an approval action.
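One way to read this proposal is as a reverse-DNS lookup on the uploader’s IP address followed by a wildcard match against an admin-maintained list. The sketch below takes that reading; the patterns and function names are hypothetical, and it is not an existing Lemmy feature. Many addresses have no PTR record at all, and a PTR record is set by whoever controls the address block, so this sketch simply denies uploads when the lookup fails and does not attempt forward confirmation.

```python
import socket
from fnmatch import fnmatch

# Hypothetical allowlist patterns an admin might maintain.
ALLOWED_PATTERNS = ["*.example-isp.net", "*.campus.example.edu"]


def hostname_allowed(hostname, patterns=ALLOWED_PATTERNS):
    """Check a reverse-DNS hostname against wildcard patterns, case-insensitively."""
    host = hostname.lower().rstrip(".")
    return any(fnmatch(host, pattern.lower()) for pattern in patterns)


def upload_allowed(ip_address, patterns=ALLOWED_PATTERNS):
    """Resolve the client's PTR record and apply the allowlist; deny if none exists."""
    try:
        hostname, _aliases, _addresses = socket.gethostbyaddr(ip_address)
    except OSError:  # no PTR record or lookup failure
        return False
    return hostname_allowed(hostname, patterns)


print(hostname_allowed("cpe-203-0-113-7.example-isp.net"))
print(hostname_allowed("host.unknown.example.org"))
```

upload_allowed performs a live DNS query per call, which is the lookup cost mentioned earlier in this thread.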
deleted by creator
A lot of web software does this (Github and Gmail for example). I like it but always thought it could be abused.
They probably have the tools to deal with it. Lemmy certainly doesn’t.
@bmygsbvur Pleroma is exactly the same and no one cared in six years.
Doesn’t change the fact that this is an issue.