Michał "rysiek" Woźniak · 🇺🇦

My former employer, @OCCRP, just went live with their new website, and it's pretty slick!
occrp.org/en

This is bitter-sweet for me.

On one hand, glad to see them have a new site, finally! The old one was a mess.

OTOH: I had designed and built the infra that hosted their site through Panama Papers (arguably OCCRP's big break). It did not rely on external CDNs or "DDoS-protection" providers.

That infra is no longer in use as of today. Replaced by Google. 🥲

🧵👇

OCCRP: Organized Crime and Corruption Reporting Project

For those curious, the infra was mostly:

- a pair of back-end servers (the main site was an ancient Joomla install…), in a production / warm standby configuration;

- a couple dozen very thin VPSes acting as (micro-)caching reverse proxies; we called them "fasadas" (from the Bosnian word for façade);

- a bunch of scripts that tied it all together.
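
Very roughly, each fasada boiled down to a thin nginx server block: terminate TLS, cache, proxy to the back-end. A simplified, illustrative sketch – paths, names, and sizes here are placeholders, and the real, fuller config is linked in the next post:

```nginx
# Illustrative sketch of a fasada: a very thin caching reverse proxy.
# Paths, names, and sizes are placeholders, not the production values.
proxy_cache_path /var/cache/nginx/fasada levels=1:2
                 keys_zone=fasada:64m max_size=1g inactive=10m;

server {
    listen 443 ssl;
    server_name example.com;

    ssl_certificate     /etc/nginx/ssl/example.com.crt;
    ssl_certificate_key /etc/nginx/ssl/example.com.key;

    location / {
        proxy_cache fasada;                      # shared microcache zone
        proxy_set_header Host            $host;
        proxy_set_header X-Forwarded-For $remote_addr;
        proxy_pass http://backend;               # the back-end pair (upstream defined separately)
    }
}
```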

The stripped down and simplified nginx config for the fasadas lives as a FLOSS project here:
0xacab.org/rysiek/fasada

🧵

GitLab: Michał "rysiek" Woźniak / fasada – an nginx-based front-end caching config for WordPress sites, handling high traffic and improving security.

The production / warm standby back-end servers were automagically synced every hour. Yes, including the database.

This meant that:

1. we had a close-to-production testing server always available;

2. we had a way of quickly switching to an almost completely up-to-date backup back-end server in case anything went down on the production one.

The set-up on these back-ends included *two* nginx instances running in parallel on different ports, but with the same config, serving the same content.

Yes, on each.

🧵

Each fasada (i.e. reverse proxy on the edge) was configured to use *both* of these nginx instances on the currently-production back-end server.

Because everything was in docker containers, we could upgrade each nginx instance separately.

Whenever we were deploying nginx config changes or were upgrading nginx itself, we would do that one instance at a time. If it got b0rked, fasadas would just stop using the b0rked back-end nginx instance and switch to the other one.

No downtime. No stress.
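
In nginx terms, the fasada side of that looked roughly like the sketch below; the addresses, ports, and failover tuning are illustrative assumptions, not the production values:

```nginx
# Sketch: each fasada uses *both* nginx instances running on the
# current production back-end; if one is being upgraded or is b0rked,
# requests simply go to the other one.
upstream backend {
    server 203.0.113.10:8080 max_fails=3 fail_timeout=30s;  # back-end nginx instance A
    server 203.0.113.10:8081 max_fails=3 fail_timeout=30s;  # back-end nginx instance B
}

server {
    # ...
    location / {
        # On errors or timeouts from one instance, retry on the other.
        proxy_next_upstream error timeout http_502 http_503 http_504;
        proxy_pass http://backend;
    }
}
```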

🧵

IP addresses of active fasadas (that is, ones that were supposed to handle production traffic) were simply added as A records for `occrp.org`.

This was Good Enough™, as browsers were already smart about selecting an endpoint IP address and sticking to it across requests related to the same domain.

This also meant that if an active fasada went under for whatever reason, visitors would mostly not notice – their browsers would retry against one of the remaining IPs.

🧵

We had about 2 dozen fasadas configured, deployed, and ready to serve production traffic at any given time.

But we only kept 4-6 actually active for `occrp.org` (and some others for other sites we hosted).

The other ones were an "emergency stash".

If an active fasada did go under, we'd swap its IP address out of occrp.org A records, and add one of the currently healthy standbys instead.

If we started getting way more traffic than the current active fasada set could handle, we'd add more.

🧵

In my experience, what brings a site down is really rarely an *actual* DDoS. Most of the time it is an organic traffic spike hitting a slow back-end.

Hence:
1. microcaching
2. my exasperation with CloudFlare calling everything a DDoS 🙄

But I digress!

We did get honest-to-Dog DDoSes, some pretty substantial. When that happened we just… swapped out *all* active fasadas.

DDoS would happily continue against the 4 to 6 old IP addresses… While new visitors would get served from other nodes. 😸

🧵

See, when you're DDoSing someone, you don't want to waste your bandwidth on checking DNS records, now do you? You want to put everything you've got into these malicious packets.

And when you do, and the target just moves on to a different set of IP addresses, you're DDoSing something that does not matter. Have at it! :blobcatcoffee:

Now, I am not saying *all* DDoSes work this way.

I *am* saying that all the DDoSes I have seen against OCCRP's infra when I was there worked this way.

🧵

The time we really went down hard was when our dedi provider (which was otherwise great!) overeagerly blackholed DDoS traffic…

…blackholing also our production back-end server.

Took us 45min to deal with this, mainly because I was out at lunch and for *once* I did not take my phone with me, while a certain @smari happened to be on vacation literally on the other side of the globe.

Dealing with this meant pushing a quick config change to the fasadas to switch to the warm spare back-end.
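
In terms of the upstream sketch above, that config change was essentially just repointing the fasadas at the warm standby (addresses again illustrative):

```nginx
upstream backend {
    # production back-end blackholed along with the DDoS traffic;
    # point the fasadas at the warm standby instead
    # server 203.0.113.10:8080;
    # server 203.0.113.10:8081;
    server 203.0.113.20:8080;
    server 203.0.113.20:8081;
}
```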

🧵

:oof: What a blast from the past!

I should probably write this all up in a blog post, with some more lessons learned (for example: remember to microcache your 4xx/5xx errors and 3xx redirects as well).

Thanks for joining me for this ride down memory lane!

I will now take your questions. :blobcatcoffee:

🧵/end

@oblomov eh, it's okay. I am proud it was still being used until now, over 4y after I had moved on from @OCCRP. 🙂

I wouldn't have made that decision, but, well, that is a decision a lot of orgs make. It has pros and cons. I feel the cons outweigh the pros, but it's far from clear-cut.

@rysiek @oblomov Including a party fully capable of denying service at any point is a rather strong negative point in a setup intended to be resilient.

@rysiek what's the point in having redundancy on the web server if they are both on the same machine? Thanks in advance for the answer.

@rysiek what order of magnitude were the TTLs on those records, how soon would clients notice the changes?

@viq TTL was 900 seconds, so ~30min for ~full propagation.

@rysiek I worked my way back up the thread because I read "microcache" as "microfiche" ... ahem!

Interesting stuff even if you stuck to current-century tech, thank you!

@rysiek Q: what TTL did you typically have on your A records?

@DamonHD 900, meaning in ~30min we could expect ~full propagation.

@rysiek OK, thanks!

Many many years ago when the BBC was hosting live UK General Election results for the first time I think that they used 5 minutes, and broke a lot of things.

(Ofc everything is faster and better tuned these days, from the DNS servers to the browsers...)

@rysiek @DamonHD

Does this mean that your way of mitigating ddoses would kick in on that timescale?

@robryk @DamonHD for some visitors it would be instantaneous, if their recursive resolvers had not cached the occrp.org A records yet.

For those whose resolvers did, the worst-case scenario is roughly 2×TTL, if the request happens *just* before we push DNS changes.

There are nuances and caveats, but that's an effective enough way of thinking about it.

@rysiek @DamonHD wouldn't it be ~1xTTL+epsilon? (The recursive resolver would reask once TTL has expired, would promptly get the new answer, with only requests it was serving before it got the new answer getting old responses, no?)

@robryk @DamonHD there are all sorts of small random delays that can push it over the edge and mean that a recursive resolver still serves the cached response even though technically the TTL should have *just* expired.

Or, a recursive resolver gets a request from user A *just* before DNS changes are pushed, and caches that. Then user B issues a request *just* before the TTL expires and gets the cached response from the recursive resolver.

@rysiek @DamonHD ah, I just realized that I don't know how DNS TTL works in _clients_. Thanks, I will need to look it up.

@rysiek Thanks for sharing! I don't think I've heard about micro caching before, can you share how low you set the time in this case?

@Herover it's all in the code.

Default: 10s for dynamic content
0xacab.org/rysiek/fasada/-/blo

And then depending on context:
0xacab.org/rysiek/fasada/-/blo
0xacab.org/rysiek/fasada/-/blo

Basically the idea is that content that is "dynamic" (i.e. generated by the CMS on the back-end) is cached just long enough to dramatically limit the number of requests hitting the back-end, but not long enough to be annoying for the people who write/edit/modify that content in the CMS.

10s–20s is pretty good for that.
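
In nginx config this mostly comes down to a handful of proxy_cache directives. A simplified sketch – only the 10s default comes from fasada, the other values are illustrative:

```nginx
# Sketch: microcache dynamic responses for a handful of seconds.
location / {
    proxy_cache fasada;
    proxy_cache_valid 200 10s;      # dynamic content lives in the cache only briefly
    proxy_cache_lock on;            # coalesce concurrent misses into one back-end request
    proxy_cache_lock_timeout 5s;
    proxy_cache_use_stale error timeout updating;  # serve stale while refreshing
    proxy_pass http://backend;
}
```

proxy_cache_lock is also relevant to the request-coalescing question below: with it, concurrent cache misses for the same URL wait for a single back-end fetch instead of all hitting the CMS at once.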

GitLab: services/etc/nginx/sites/example.com.conf · master · Michał "rysiek" Woźniak / fasada

@rysiek @Herover

Did you choose not to *only* coalesce concurrent requests because that's more fiddly, because the load difference between 1/10 Bq (or should I say Hz) and 1/[the time it takes the backend to serve a request] is meaningful, or for some other reason?

@rysiek why do you recommend caching 4xx/5xx responses?

@slimhazard I recommend *microcaching* these responses just like all other dynamic responses, if the back-end is dynamic.

Why? Because if you're running a database-backed back-end, generating a 404 page or a 301 redirect might take close to as much time and resources as generating a 200 response.

And if your back-end is overwhelmed and throwing 504 Gateway Timeouts after a long wait, throwing more requests its way is also a bad idea.

@slimhazard learned that the hard way when Apple's Retina Macs became a thing.

On one of the sites we started getting "@2x" requests for *every* image:
kylejlarson.com/blog/creating-

Those "@2x" images did not exist, so 404s were being returned. But these 404s were not being (micro)cached at the edge at the time, and that drove the load on the back-end through the roof.

So yeah, microcache your 3xx/4xx/5xx responses if you're doing microcaching at all. :blobcatcoffee:
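
In nginx config terms that is just a few more proxy_cache_valid lines; the durations here are illustrative:

```nginx
# Sketch: give redirects and error responses the same tiny cache
# lifetime as regular pages, so a flood of requests for a missing
# "@2x" image (or an erroring back-end) does not hammer the CMS.
proxy_cache_valid 200 301 302 10s;
proxy_cache_valid 404             10s;
proxy_cache_valid 500 502 503 504 5s;
```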

Kyle J Larson: Creating Retina Images for Your Website

@rysiek got it. I work with Varnish and see things through that lens, hence my question. We would do similar things, but differently.

In that world: 301 and 404 responses are cacheable by default, so the load for that can be taken from the backends in any case. And yes, a caching proxy has to react appropriately to distressed backends. For example with health checks, taking unhealthy backends out of rotation for a time to give them a chance to recover.

Sounds like your way worked.

@slimhazard yup, worked fine for us.

We looked at Varnish back when we were setting this up, but it did not support HTTPS (which was a necessity for us). The official way of adding HTTPS was "put nginx in front of it". So we shrugged and just used nginx.

@rysiek Yes. This should definitely be a blog post. I learned stuff!

@rysiek reminds me of when I was running a server for a popular shooting game. A great way to stop a ddos attack was to figure out which previous player was doing it (easy, they always loudly told everyone about it), and then block the person's IP. The ddos would almost always stop within minutes, because most people thought they had permanently ddos'd the server out of existence, somehow 😁

@rysiek: I use parts of your NGINX configs at Peekr. Alongside using Redis, blocking AI scrapers’ user agents server-side, and optimising static resources, the instance is blazing fast and can return results in as little as a few milliseconds. Fastly (thanks to its Fast Forward programme) also helps us with static resources such as CSS/JS/WOFF2 fonts (i.e. it is used as an auxiliary caching method).

Also, can you please explain how the $redirect_fbclid map works? I would be interested in implementing this.
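
A rough, illustrative sketch of how such a map can work (not the actual fasada code) – it covers the common case of an fbclid parameter tacked onto the end of the query string:

```nginx
# Illustrative sketch, not the actual fasada code: detect a trailing
# fbclid tracking parameter and build a redirect target without it.
map $args $redirect_fbclid {
    default                               "";
    # fbclid is the only query parameter: redirect to the bare URI
    "~^fbclid=[^&]*$"                     "$uri";
    # fbclid is the last parameter: keep everything before it
    "~^(?<clean_args>.+)&fbclid=[^&]*$"   "$uri?$clean_args";
}

server {
    # ...
    # If the map produced a target, 301 to the fbclid-free URL so the
    # tracking parameter never pollutes the cache key.
    if ($redirect_fbclid) {
        return 301 $redirect_fbclid;
    }
}
```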

@rysiek: thank you! I’ll implement this shortly, test it out on other commercial social media platforms, and report back with my findings.

@rysiek: so far, Facebook seems to be the only culprit. Haven’t checked the messaging services such as Messenger/WhatsApp since I do not have an account on either.

Instagram and Twitter seem to include nothing alongside the destination link.

@rysiek: additionally, I've looked at the daily search stats file and noticed a giant spike in searches again (8 days ago to yesterday; each row is one day):

336
491
1939
10113
57959
40733
36847
44672

...yet throughout that period I haven't noticed **any** significant slowdowns thanks to recently introduced measures such as adding Redis and reinforcing the config based upon Fasada. Results were still returned in sub-second times.