
Okay here's what I'm going to do, or at least try!

I will set up another 1 or 2 servers for Sidekiq. (Sidekiq is the part of Mastodon that handles the background jobs; think of all the actions in the network.)

Currently we are about 100k jobs behind and I'm afraid one server just won't do...

If anyone has any experience with this, could someone give me a few tips? I know I'm going to fix it, but with a few tips it could go a lot faster!

Thank you!

@stux I am gonna ask a few Indian fediverse admin folks to come help you

@stux That should help with the spike. In case you are seeing a high error rate on the jobs, it may make sense to limit retries to 3 from the default, and then replay the jobs off the dead queue at a time when there are fewer sign-ups?
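
For reference, a minimal sketch of replaying the dead queue from the shell; the install path and timing are assumptions based on a standard Mastodon setup, not something stated in this thread:

    # run during a quiet period to push every job in the Sidekiq dead set back onto its queue
    cd /home/mastodon/live
    RAILS_ENV=production bundle exec rails runner 'Sidekiq::DeadSet.new.retry_all'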

@awenindo That's what I thought indeed. I just increased the number of jobs to 150 but that doesn't seem to help... Still 121k jobs behind...

@awenindo But the weird thing is, we have almost no failed jobs; almost all get through, it's just a matter of time. But I think you have a good idea indeed with setting the retry limit to a max of 3

@stux Oh, but watch the failure rate; if that is high, it means the same jobs are being retried multiple times, which eats up worker resources. You need to find out why the jobs are failing. Also, you might want to check your Sidekiq concurrency settings for Redis and the max connections (maxclients) on Redis (assuming Redis)
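
A quick sketch of how those limits could be checked, assuming Redis runs on localhost and the usual Mastodon .env.production file (variable names here are the standard Mastodon ones, not quoted from this thread):

    # Redis: configured connection limit vs. what is currently in use
    redis-cli CONFIG GET maxclients
    redis-cli INFO clients | grep connected_clients

    # Mastodon: DB_POOL should be at least the Sidekiq thread count per process
    grep -E 'DB_POOL|MAX_THREADS' .env.production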

@awenindo I will thanks 🙂

I just increased the Postgres and PgBouncer connections to 500

Sidekiq has 150 now
threads = 10
procs = 40

What do you think?
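
For context, a rough sketch of where those numbers usually live on a Mastodon box; the file locations and systemd layout are assumptions based on a standard install, with the values above filled in:

    # /etc/pgbouncer/pgbouncer.ini (assumed location)
    max_client_conn = 500

    # .env.production: each Sidekiq process opens DB_POOL connections, so with 10 threads per process
    DB_POOL=10

    # one systemd unit per Sidekiq process (x40); -c sets the threads per process
    ExecStart=/home/mastodon/.rbenv/shims/bundle exec sidekiq -c 10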

@stux Usually for a worker the default concurrency of 25 (threads) is good, if the job isn't doing too much. By procs do you mean 40 workers deployed? That means at any time up to 400 jobs are being run (40 processes × 10 threads). Curious, what is the execution time for these jobs on the dashboard?

@stux An extremely noob question - what exactly are these jobs? Is something being triggered on the server every time traffic comes in?

@Psyborg There are no noob questions, it's good you ask!

Every action on the platform, like a follow, a toot, or a boost for example, is a 'job' for Sidekiq, the software that handles these things. 🙂

@stux Like via an API? I know there are no noob questions and thank you so much for not making fun! I'm a C dev myself and always on the lookout for new things to absorb. Is there a sidekiq manual I can look at and not bug you?

@Psyborg Check out this awesome documentation! I think it has all you want to know 🙂
docs.joinmastodon.org/

@stux Thanks much! Another question - do you own this instance?

@Psyborg I do not own the instance that you are on; that would be @Gargron

@stux @Gargron Ahh yes. So then how is it that you seem to be maintaining the jobs? Oh, is that for a client you designed?

@Psyborg That's only the jobs on our side, our server 🙂

The server that you are on is a little bigger, with more Sidekiq servers that handle the jobs

@stux A little clearer now. Thanks and sorry for being a pest.

@Psyborg You're very welcome! Don't worry, my friend, I'm happy I can help and teach people a little! Keep up the good questions :mastodance:

@Psyborg @stux Sidekiq jobs = background task processing. Can’t do slow things in the HTTP request/response cycle.

@Gargron @stux Makes sense. I'm going to rtfm a bit and then try and get more hands-on once I'm next to my laptop. This interests me a lot.

@Gargron @Psyborg

Is there any guide on how to set up Sidekiq, streaming and Puma on different servers? I get most things but I want to be sure

@stux @Gargron @Psyborg I can help you with this; I have a lot of knowledge from my consulting work, where I work in a dual development and devops role.

@stux @Psyborg Well, it’s the same setup as usual but you only enable one type of service, and instead of installing new databases you just make it connect to the existing ones.

@Gargron @Psyborg Thanks for the clarification! So, for example, create a new Sidekiq server, copy the .env but point the hosts to the ‘main’ server for Postgres and Redis? What about opening ports on both servers? Would this be secure?
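
For illustration, the relevant part of .env.production on the new Sidekiq-only box might look like this; the 10.0.0.x addresses are placeholders for a private network, not values from this thread:

    # .env.production on the extra Sidekiq server, copied from the main box
    DB_HOST=10.0.0.1     # private IP of the main server (Postgres/PgBouncer)
    DB_PORT=5432
    REDIS_HOST=10.0.0.1  # assuming Redis also runs on the main server
    REDIS_PORT=6379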

@stux @Gargron @Psyborg You should do this on an internal network and should not expose those ports to the public internet. DigitalOcean provides "private IPs", which form a virtual LAN. Adjust your iptables rules to allow traffic from the virtual LAN, and if Postgres is bound to localhost, tell it to bind to the private IP instead. Do NOT under ANY CIRCUMSTANCE expose Elasticsearch to the public internet.
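
A rough sketch of that setup, with placeholder addresses (here 10.0.0.1 is the database box and 10.0.0.2 the new Sidekiq box; adapt to your own private network):

    # postgresql.conf on the main box: listen on localhost plus the private IP
    listen_addresses = 'localhost, 10.0.0.1'

    # pg_hba.conf: allow the Sidekiq box to authenticate over the private LAN
    host  mastodon  mastodon  10.0.0.2/32  md5

    # iptables on the main box: accept Postgres/Redis only from the Sidekiq box, drop the rest
    iptables -A INPUT -p tcp -s 10.0.0.2 --dport 5432 -j ACCEPT
    iptables -A INPUT -p tcp -s 10.0.0.2 --dport 6379 -j ACCEPT
    iptables -A INPUT -p tcp --dport 5432 -j DROP
    iptables -A INPUT -p tcp --dport 6379 -j DROP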

@stux @Gargron @Psyborg Before scaling to multiple servers, make sure that you have properly configured the number of workers and threads to handle the load. A different server will obviously add some latency

@Psyborg Also, fetching content from the fediverse (other servers) happens in jobs too!

@stux @Psyborg You may have to hit up Eugen on this one; that exact cocktail of multi-server functionality that also doesn't run into actual problems isn't well documented...
