
So, what happened yesterday?

1d 22h ago by piefed.social/u/kobra in piefed_meta@piefed.social

Site was down for ~8 hours or so yesterday and I didn't know where to look for updates or any kind of status page.

Most of my instances have had those but I can't seem to find any kind of status page or Mastodon account for updates with this instance. Do they exist?

I'm not 100% sure.

There's a regular background task that runs every minute. For the last couple of days it has been crashing about once per hour, so roughly one in 60 runs ends badly. It's not a normal crash where I receive a nice tidy error report saying which line of code it happened in. It's an ugly crash where all I get is an email from the background task runner saying "864 Segmentation fault (core dumped)", which is very unhelpful.
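For anyone chasing a similar crash, here is a minimal sketch of how a task runner could get a bit more out of a segfault. It is not PieFed's actual runner: `describe_exit` and `run_task` are hypothetical names, and it assumes the task is a Python script on a POSIX system. CPython's `-X faulthandler` option dumps the Python traceback to stderr when the process receives SIGSEGV, which at least shows which Python call was active at the time.

```python
import signal
import subprocess
import sys


def describe_exit(returncode: int) -> str:
    """Turn a subprocess return code into a readable status (hypothetical helper)."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # On POSIX, subprocess reports "killed by signal N" as return code -N.
        try:
            return f"killed by {signal.Signals(-returncode).name}"
        except ValueError:
            return f"killed by signal {-returncode}"
    return f"exited with status {returncode}"


def run_task(argv: list[str]) -> tuple[str, str]:
    """Run a Python task with faulthandler enabled; return (status, stderr)."""
    proc = subprocess.run(
        # -X faulthandler makes CPython print the Python traceback on fatal signals.
        [sys.executable, "-X", "faulthandler", *argv],
        capture_output=True,
        text=True,
    )
    return describe_exit(proc.returncode), proc.stderr
```

Calling something like `run_task(["my_task.py"])` (the script name is made up) and including the returned stderr in the alert email would give the runner more to report than the bare "Segmentation fault" line.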

This went on for a couple of days. I ignored it because it didn't seem to be doing any harm and I have better things to do.

Then yesterday morning I wake up and the whole server is unresponsive; I can't even SSH in. The HDD light is stuck on and the fans are very loud. I turn it off and on again, and it boots up normally.

Nothing interesting in the server logs.

At this point I suspect some experiments with my LAN's internal DNS server that I've been doing recently might have caused it to go offline, as the DNS server is looking flaky and PieFed has been making way more requests than I expected it would. But that turns out to be a dead end. I revert all the DNS experiments to how things were before, just to be sure. Then I start looking into the segfaults. It's possible that one of the segfaults eventually hit something serious enough that it caused PieFed to get overloaded.

After trying all kinds of things it turns out that I have 3 different Python packages that do SSL (database connection, HTTP connections and encrypting passwords) and they each import a different version of the same C library. They clash and very occasionally cause a random segfault. Switching off SSL on the database connection stops the segfaults. This is fine because the connection happens over a private LAN anyway.
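One way to spot this kind of clash, sketched under some assumptions: on Linux, `/proc/self/maps` lists every shared object mapped into the current process, so after importing the suspect packages you can look for more than one copy of `libssl` or `libcrypto`. The function names here are hypothetical, and this is not necessarily how the problem above was actually tracked down.

```python
from pathlib import Path


def ssl_libs_from_maps(maps_text: str) -> list[str]:
    """Return the distinct libssl/libcrypto paths found in /proc/<pid>/maps text."""
    found = set()
    for line in maps_text.splitlines():
        # Format: address perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping, no file behind it
        path = fields[5].strip()
        name = path.rsplit("/", 1)[-1]
        if name.startswith(("libssl", "libcrypto")):
            found.add(path)
    return sorted(found)


def loaded_ssl_libs() -> list[str]:
    """Linux only: inspect the current process after importing the packages."""
    return ssl_libs_from_maps(Path("/proc/self/maps").read_text())
```

If `loaded_ssl_libs()` returns several different `libssl` files after the database driver, the HTTP client and the password hashing package have all been imported, that suggests (but doesn't prove) that bundled copies of the library are sharing one process.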

Having nailed down the only two broken bits that were flapping in the wind (DNS and random segfaults), I think it'll be ok again. Probably.

Oof! Thanks for all your hard work as usual

I'm glad it's back! Ty for the hard work! What's the easiest way to support the project financially?

I'm going to guess it's here: https://piefed.social/donate

For me, using my PayPal acct, I was able to donate via Ko-fi without needing to create an acct or do anything remotely bureaucratic. Easy-peasy.