hardening mastodon against scraping
fellow masto instance admins of the fediverse:
by default, mastodon is leaky as fuck and there are a bunch of ways that data can be scraped and indexed from a mastodon instance
there are a few steps you can take to harden your instance against this; since there's an ongoing harassment campaign against trans masto users, now is a good time to review this
the following is not exhaustive, but it's a good start
1. Enable 'Secure Mode' on your instance. Without secure mode turned on, any of the activitypub endpoints of your instance can be fetched by anyone without an http signature -- this includes user profiles and users' public posts. This makes it ***absolutely trivial*** for someone with a script to scrape all of the profiles of your instance denizens and search them for keywords.
From the mastodon docs: 'When secure mode is enabled, all GET requests require HTTP signatures as well.'
It's insane to me that this isn't enabled by default. To enable it, set the 'AUTHORIZED_FETCH' parameter to true in your instance's environment config; it's documented here: https://docs.joinmastodon.org/admin/config/#basic
This doesn't make scraping impossible, but it does make it more complicated: every request now has to carry an http signature tied to an account on some identifiable server, so it can't just come anonymously from some random asshole's computer.
2. Toggle some config options in preferences => administration => site settings. Here you can turn off the profile directory, disallow unauthenticated access to public pages, etc. See the screenshot below this post for the settings I use. You can make up your own mind about how strict you want to be here, but I think turning off the profile directory and the public timeline is a great idea.
3. Recommend your users disable DMs from people they don't follow. This is under preferences => notifications.
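If you want to check that step 1 actually took effect, one way is to request one of your own users' activitypub profiles without signing the request and see what comes back. The script below is a rough sketch, not an official tool: it assumes profiles are served at https://<domain>/users/<username> (the usual mastodon layout), and it assumes a 401 or 403 response means unsigned fetches are being refused. The function names, plus the example.social and alice values in the usage note, are made up for illustration.

```python
import sys
import urllib.error
import urllib.request

# Media type used when asking for the ActivityPub version of a profile.
ACTIVITY_JSON = "application/activity+json"


def actor_url(domain: str, username: str) -> str:
    """Build the profile (actor) URL. Assumption: the usual mastodon /users/<name> layout."""
    return f"https://{domain}/users/{username}"


def interpret_status(status: int) -> str:
    """Map an HTTP status to a rough verdict (assumed mapping, not an official one)."""
    if status == 200:
        return "open"        # unsigned request was served: secure mode looks off
    if status in (401, 403):
        return "protected"   # unsigned request was refused
    return "unknown"         # redirect, 404, 5xx, etc.: check by hand


def unsigned_fetch_status(url: str, timeout: float = 10.0) -> int:
    """Send one unsigned GET for the ActivityPub document and return the status code.

    Network-level failures (DNS, timeouts) are not caught and will raise.
    """
    request = urllib.request.Request(url, headers={"Accept": ACTIVITY_JSON})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; the status code is what we are after.
        return err.code


def main(argv: list) -> int:
    if len(argv) != 3:
        print("usage: check_fetch.py <domain> <username>")
        return 2
    url = actor_url(argv[1], argv[2])
    status = unsigned_fetch_status(url)
    print(f"{url} -> HTTP {status} ({interpret_status(status)})")
    return 0


# Only touch the network when called with exactly a domain and a username.
if __name__ == "__main__" and len(sys.argv) == 3:
    sys.exit(main(sys.argv))
```

Saved as, say, check_fetch.py, you'd run it as `python check_fetch.py example.social alice`, swapping in your own domain and a real local account. An "open" result means anyone can pull that profile with a plain GET; "protected" means the unsigned request was turned away.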
Any stuff I've missed, stuff you'd like to add, feel free to reply to this post.
Thanks for reading!