Scaling Mastodon – Part 1

It’s been nearly 2 months since E-Day, when the mass migration of people from Twitter to the fediverse started in earnest. At the time, infosec.exchange was humming along well on an AX101 (AMD 5950, 128GB RAM, 4TB NVMe SSD) server from Hetzner in Germany.

I spent a lot of time tuning, tweaking, and making mistakes. I am going to spare you the details of the many experiments that led to the current setup. The instance currently has close to 43,000 accounts and about 6,000 peak active connections during the day, with about 20,000 accounts active within the past week. Make no mistake: this is crazy overkill. I set this up at a time when infosec.exchange was adding 1,500-3,000 new accounts per day, and I had no idea when it would stop. So consolidation and simplification lie ahead if things slow down.

This diagram is roughly what infosec.exchange looks like today.

All these systems run Ubuntu 22.04 LTS. For those that are not redundant, I have purchased an Ubuntu Pro subscription, which enables live patching.

I’ll go from top to bottom with the customizations I’ve made and a bit of the why.

Load balancer

The load balancer is nginx. The first modification I needed to make was to increase the number of open files available to nginx. Once I got above a thousand or so concurrent connections, nginx started running out of open files. Do not waste your time editing /etc/security/limits.conf or sysctl.conf. They don’t actually fix the problem for a service started by systemd. The fix is adding this line to the /etc/systemd/system/multi-user.target.wants/nginx.service file, under the [Service] section:

LimitNOFILE=1000000

Note: there is no science behind 1000000. I grew tired of running out of open files and so picked a big number. One million seemed nice and round.
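To confirm the new limit actually reached the running process (after a systemctl daemon-reload and an nginx restart), one option is to read the master process's limits from /proc. This is a minimal Python sketch, not part of the original setup: it assumes Linux and the pid file path /run/nginx.pid from the nginx.conf in this post, and the function names are hypothetical.

```python
from pathlib import Path


def parse_max_open_files(limits_text: str) -> tuple[str, str]:
    """Return the (soft, hard) 'Max open files' values from /proc/<pid>/limits text."""
    prefix = "Max open files"
    for line in limits_text.splitlines():
        if line.startswith(prefix):
            # Remaining columns are: soft limit, hard limit, units.
            fields = line[len(prefix):].split()
            return fields[0], fields[1]
    raise ValueError("no 'Max open files' line found")


def nginx_open_file_limit(pid_file: str = "/run/nginx.pid") -> tuple[str, str]:
    """Read the open-files limit of the running nginx master process (Linux only)."""
    pid = Path(pid_file).read_text().strip()
    return parse_max_open_files(Path(f"/proc/{pid}/limits").read_text())
```

Calling nginx_open_file_limit() on the load balancer should return the value set in LimitNOFILE for both the soft and hard limit if the change took effect. Values are returned as strings because the kernel may report "unlimited".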

Next, we have to increase the number of connections that nginx can handle. We do this in /etc/nginx/nginx.conf. This is what works well for me:

user www-data;
worker_processes 24;
pid /run/nginx.pid;
include /etc/nginx/modules-enabled/*.conf;
events {
worker_connections 16384;
multi_accept on;
use epoll;
epoll_events 512;
}

http {

…..
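For a rough sense of what these two settings allow: nginx caps simultaneous connections at worker_processes times worker_connections, and that count includes connections to upstream servers as well as to clients, so a proxied client occupies two. The following back-of-the-envelope Python sketch works through the arithmetic; the function name is illustrative only.

```python
def nginx_connection_budget(worker_processes: int, worker_connections: int) -> dict[str, int]:
    """Estimate connection capacity from the two nginx.conf settings."""
    total = worker_processes * worker_connections
    return {
        "total_connections": total,
        # A proxied client uses one client-side and one upstream-side connection.
        "proxied_clients": total // 2,
    }


budget = nginx_connection_budget(worker_processes=24, worker_connections=16384)
print(budget)
```

With the values above this gives 393,216 total connections, or roughly 196,608 proxied clients, far above the roughly 6,000 peak connections mentioned earlier.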

Next is the nginx site file (/etc/nginx/sites-enabled/mastodon in my case):

map $http_upgrade $connection_upgrade {
default upgrade;
'' close;
}

upstream backend {
ip_hash;
server 192.168.100.1:3000;
server 192.168.100.2:3000;

}

upstream streaming {
ip_hash;
server 192.168.100.20:4000;
server 192.168.100.21:4000;
server 192.168.100.22:4000;
}

proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=CACHE:10m inactive=7d max_size=1g;

server {
if ($host = infosec.exchange) {
return 301 https://$host$request_uri;
} # managed by Certbot

listen 80;
listen [::]:80;
server_name infosec.exchange;
root /home/mastodon/live/public;
location /.well-known/acme-challenge/ { allow all; }
location / { return 301 https://$host$request_uri; }

}
server {
listen localhost:81;
location /metrics {
stub_status on;
}
}

server {
listen 443 ssl http2;
listen [::]:443 ssl http2;
server_name infosec.exchange;

ssl_protocols TLSv1.2 TLSv1.3;
ssl_ciphers HIGH:!MEDIUM:!LOW:!aNULL:!NULL:!SHA;
ssl_prefer_server_ciphers on;
ssl_session_cache shared:SSL:10m;
ssl_session_tickets off;

# Uncomment these lines once you acquire a certificate:
ssl_certificate /etc/letsencrypt/live/infosec.exchange/fullchain.pem; # managed by Certbot
ssl_certificate_key /etc/letsencrypt/live/infosec.exchange/privkey.pem; # managed by Certbot

keepalive_timeout 70;
sendfile on;
client_max_body_size 80m;

root /home/mastodon/live/public;

gzip on;
gzip_disable "msie6";
gzip_vary on;
gzip_proxied any;
gzip_comp_level 6;
gzip_buffers 16 8k;
gzip_http_version 1.1;
gzip_types text/plain text/css application/json application/javascript text/xml application/xml application/xml+rss text/javascript image/svg+xml image/x-icon;

location / {
try_files $uri @proxy;
}

# If Docker is used for deployment and Rails serves static files,
# then replace the line try_files $uri =404; with try_files $uri @proxy;.
location = /sw.js {
add_header Cache-Control "public, max-age=604800, must-revalidate";
add_header Strict-Transport-Security "max-age=63072000; includeSubDomains";
# try_files $uri =404;
try_files $uri @proxy;
}

location ~ ^/assets/ {
add_header Cache-Control "public, max-age=2419200, must-revalidate";
add_header Strict-Transport-Security "max-age=63072000; includeSubDomains";
#try_files $uri =404;
try_files $uri @proxy;
}

location ~ ^/avatars/ {
add_header Cache-Control "public, max-age=2419200, must-revalidate";
add_header Strict-Transport-Security "max-age=63072000; includeSubDomains";
#try_files $uri =404;
try_files $uri @proxy;
}

location ~ ^/emoji/ {
add_header Cache-Control "public, max-age=2419200, must-revalidate";
add_header Strict-Transport-Security "max-age=63072000; includeSubDomains";
#try_files $uri =404;
try_files $uri @proxy;
}

location ~ ^/headers/ {
add_header Cache-Control "public, max-age=2419200, must-revalidate";
add_header Strict-Transport-Security "max-age=63072000; includeSubDomains";
#try_files $uri =404;
try_files $uri @proxy;
}

location ~ ^/packs/ {
add_header Cache-Control "public, max-age=2419200, must-revalidate";
add_header Strict-Transport-Security "max-age=63072000; includeSubDomains";
#try_files $uri =404;
try_files $uri @proxy;
}

location ~ ^/shortcuts/ {
add_header Cache-Control "public, max-age=2419200, must-revalidate";
add_header Strict-Transport-Security "max-age=63072000; includeSubDomains";
#try_files $uri =404;
try_files $uri @proxy;
}

location ~ ^/sounds/ {
add_header Cache-Control "public, max-age=2419200, must-revalidate";
add_header Strict-Transport-Security "max-age=63072000; includeSubDomains";
#try_files $uri =404;
try_files $uri @proxy;
}

location ~ ^/system/ {
add_header Cache-Control "public, max-age=2419200, immutable";
add_header Strict-Transport-Security "max-age=63072000; includeSubDomains";
#try_files $uri =404;
try_files $uri @proxy;
}

location ^~ /api/v1/streaming {
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
proxy_set_header Proxy "";

proxy_pass http://streaming;
proxy_buffering off;
proxy_redirect off;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection $connection_upgrade;

add_header Strict-Transport-Security "max-age=63072000; includeSubDomains";

tcp_nodelay on;

}

location @proxy {
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
proxy_set_header Proxy "";
proxy_pass_header Server;

proxy_pass http://backend;
proxy_buffering on;
proxy_buffer_size 8k;
proxy_buffers 32 8k;

proxy_redirect off;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection $connection_upgrade;

proxy_cache CACHE;
proxy_cache_valid 200 7d;
proxy_cache_valid 410 24h;
proxy_cache_use_stale error timeout updating http_500 http_502 http_503 http_504;
add_header X-Cached $upstream_cache_status;

tcp_nodelay on;
}

error_page 404 500 501 502 503 504 /500.html;
}

It’s largely the same as the sample nginx config file that comes with Mastodon. Note the multiple IP addresses in the upstream blocks, which are the PUMA nodes and the streaming nodes. Also, each location uses “try_files $uri @proxy;” rather than the default “try_files $uri =404;”, so a request for a file that is not present on the load balancer is passed to the PUMA nodes instead of returning a 404.

Finally, we have to create a copy of /home/mastodon/live/public from a system running PUMA and place it on the load balancer. This copy is important for certain system files, emojis, js, and so on. Do not include /home/mastodon/live/public/system in the copy, or remove it afterward if you did. After I run a precompile on the PUMA nodes, I have to sync this again.
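The sync step could be scripted along these lines. This is only a sketch under assumptions, not the method used on infosec.exchange: it assumes rsync over SSH is available on the load balancer, that a mastodon user can log in to the PUMA node (the user name is hypothetical), and the function names are made up for illustration.

```python
import subprocess

PUBLIC_DIR = "/home/mastodon/live/public/"


def build_sync_command(puma_host: str, public_dir: str = PUBLIC_DIR) -> list[str]:
    """Build an rsync command that pulls public/ from a PUMA node, skipping public/system."""
    return [
        "rsync",
        "-a",                  # archive mode: recurse and preserve permissions and times
        "--exclude=/system/",  # leading slash anchors the pattern at the top of public/
        f"{puma_host}:{public_dir}",
        public_dir,
    ]


def sync_public(puma_host: str) -> None:
    """Run the sync on the load balancer; raises CalledProcessError if rsync fails."""
    subprocess.run(build_sync_command(puma_host), check=True)
```

For example, sync_public("mastodon@192.168.100.1") would pull from the first PUMA node listed in the upstream block. It would need to be run again after each precompile on the PUMA nodes.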

To make this more manageable, I’m breaking this up into parts:

Part 2: PUMA configuration

Part 3: Streaming configuration

Part 4: Redis and Elasticsearch configuration

Part 5: Sidekiq configuration

Part 6: Postgres configuration



Comments

One response to “Scaling Mastodon – Part 1”

  1. Adam

    Do you manage these configurations with something like ansible?

  2. Pauliehedron ✅ :donor:

    @jerry used every bit of the 11k characters Glitch gives ya eh? 😂​

  3. Dennis Faucher

    I see one load balancer. What happens if that one load balancer fails? TIA.
