Okay here's what I'm going to do, or at least try!
I will set up another 1 or 2 servers for sidekiq. (sidekiq is the part of Mastodon that handles the jobs; think of every action in the network.)
Currently we are about 100k jobs behind and I'm afraid one server just won't do..
If anyone has any experience with this, could someone give me a few tips? I know I'm going to fix it but it could be a lot faster!
@stux i am gonna ask a few Indian fediverse admin folks to come help you
@digitaldutta Thank you, it's very much appreciated!
@shirishag75 It is yes!
@stux That should help with the spike. In case you are seeing a high error rate on the jobs, it may make sense to limit retries to 3 instead of the default, and then replay the jobs off the dead queue at a time when there are fewer sign-ups?
@awenindo That's what I thought indeed, I just increased the number of jobs to 150 but that doesn't seem to help.. Still 121k jobs behind..
@awenindo But the weird thing is, we have almost no failed jobs. Almost all get through, it's just a matter of time. But I think you have a good idea indeed with setting the retry limit at max 3
@stux oh, but watch the failure rate: if that is high, it means the same jobs are being retried multiple times, which takes up worker resources. You need to find out why the jobs are failing. Also might want to check your sidekiq concurrency settings for redis and maxconn on redis (assuming redis)
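To put rough numbers on the retry point: if each attempt fails independently with probability p and a job gets at most r retries, the expected number of executions per job is 1 + p + ... + p^r. Here is a minimal Python sketch under that simplified model. It is a toy calculation, not how Sidekiq schedules retries, and the function name and sample failure rates are made up for illustration.

```python
def expected_attempts(failure_rate: float, max_retries: int) -> float:
    """Expected executions per job when each attempt fails independently.

    Toy model, not Sidekiq's scheduler: a job runs once and is re-run
    after each failure until it succeeds or max_retries is used up.
    """
    if not 0.0 <= failure_rate <= 1.0:
        raise ValueError("failure_rate must be between 0 and 1")
    if max_retries < 0:
        raise ValueError("max_retries must be non-negative")
    # Attempt k (0-based) only happens if the previous k attempts all failed.
    return sum(failure_rate ** k for k in range(max_retries + 1))


if __name__ == "__main__":
    for rate in (0.01, 0.5, 0.9):  # hypothetical failure rates
        # 3 is the limit suggested above; 25 is a much larger limit for comparison.
        for retries in (3, 25):
            runs = expected_attempts(rate, retries)
            print(f"failure={rate:.2f} retries={retries:2d} -> {runs:.2f} runs per job")
```

With almost no failed jobs the retry limit barely changes the load, which fits the observation that nearly everything gets through; it only starts to matter when the failure rate is high.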
@awenindo I will thanks 🙂
I just increased the postgres and pgbouncer conns to 500
sidekiq has 150 now
threads = 10
procc = 40
What do you think?
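A quick sanity check on those numbers, as a sketch. It assumes every sidekiq thread may hold one database connection at a time, which is an assumption about the setup and not something confirmed here. Under that assumption, 40 processes with 10 threads each need up to 400 connections, so a limit of 500 leaves about 100 for everything else. The `web_connections` figure below is made up.

```python
def connection_headroom(processes: int, threads_per_process: int,
                        max_connections: int, other_connections: int = 0) -> int:
    """Connections left over once every worker thread holds one.

    Assumes one DB connection per sidekiq thread; a negative result means
    some threads would have to wait for (or be refused) a connection.
    """
    needed = processes * threads_per_process + other_connections
    return max_connections - needed


if __name__ == "__main__":
    # 40 processes x 10 threads against the 500-connection limit from the post.
    # web_connections is a hypothetical figure for the web/streaming side.
    web_connections = 50
    print(connection_headroom(40, 10, 500, web_connections))
```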
@stux usually for a worker the default concurrency of 25 (threads) is good, if the job isn't doing too much. By procs, do you mean 40 workers deployed? That would mean 400 jobs are being run at any time. Curious, what is the execution time for these jobs on the dashboard?
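The execution time matters because it sets the throughput: with 400 jobs in flight and an average job time of t seconds, the workers clear roughly 400 / t jobs per second, and the backlog only shrinks if that is more than the rate at which new jobs arrive. Below is a rough Python estimate. The average job time and arrival rate are placeholder values, not measurements from this server.

```python
import math


def drain_time_seconds(backlog: int, concurrency: int,
                       avg_job_seconds: float, arrivals_per_second: float) -> float:
    """Estimate how long a queue backlog takes to clear.

    Steady-state approximation: throughput = concurrency / avg_job_seconds.
    Returns math.inf when new jobs arrive at least as fast as they finish.
    """
    if avg_job_seconds <= 0:
        raise ValueError("avg_job_seconds must be positive")
    throughput = concurrency / avg_job_seconds
    net_rate = throughput - arrivals_per_second
    if net_rate <= 0:
        return math.inf
    return backlog / net_rate


if __name__ == "__main__":
    # The 121k backlog and 40 x 10 = 400 concurrency come from the thread;
    # 2.0 s per job and 150 new jobs per second are hypothetical.
    seconds = drain_time_seconds(121_000, 40 * 10, 2.0, 150.0)
    print(f"roughly {seconds / 60:.0f} minutes to drain")
```

If the estimate comes back as infinite, adding workers or shortening the jobs is the only way the queue ever empties.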
@stux An extremely noob question - what exactly are these jobs? Is something being triggered on the server every time traffic comes in?
@Psyborg There are no noob questions, it's good you ask!
Every action on the platform, like a follow, a toot, or a boost for example, is a 'job' for sidekiq, the software that handles these things. 🙂
@stux Like via an API? I know there are no noob questions and thank you so much for not making fun! I'm a C dev myself and always on the lookout for new things to absorb. Is there a sidekiq manual I can look at and not bug you?
@stux Thanks much! Another question - do you own this instance?
@Psyborg That's only the jobs on our side, our server 🙂
The server that you are on is a little bigger with more sidekiq servers that handle jobs
@stux A little clearer now. Thanks and sorry for being a pest.
@Psyborg You're very welcome! Don't worry my friend, I'm happy I can help and teach people a little! Keep it up with the good questions
@stux @Gargron @Psyborg You should do this on an internal network and should not expose those ports to the public internet. DigitalOcean provides "private IPs", which are a virtual LAN. Adjust your iptables rules to allow traffic from the virtual LAN and, if postgres is bound to localhost, tell it to bind to the private IP instead. Do NOT under ANY CIRCUMSTANCE expose ElasticSearch to the public internet.
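One way to double check the bind addresses before restarting anything is a small script like the following sketch. It only uses Python's standard `ipaddress` module; the function name and the sample addresses are hypothetical, and passing this check does not replace the firewall rules, since it says nothing about what the network actually routes.

```python
import ipaddress


def is_safe_bind_address(address: str) -> bool:
    """True if the address is loopback or a private (LAN) address.

    Wildcard addresses such as 0.0.0.0 or :: listen on every interface,
    including the public one, so they are rejected.
    """
    ip = ipaddress.ip_address(address)
    if ip.is_unspecified:
        return False
    return ip.is_loopback or ip.is_private


if __name__ == "__main__":
    # Hypothetical listen addresses for postgres, redis and ElasticSearch.
    for candidate in ("127.0.0.1", "10.132.0.2", "0.0.0.0", "8.8.8.8"):
        verdict = "ok" if is_safe_bind_address(candidate) else "DO NOT EXPOSE"
        print(f"{candidate:>12}  {verdict}")
```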
@Psyborg Also, getting content from the fediverse (other servers) is done with jobs!