Just upgraded HHVM from 3.8.0 to 3.8.1. Let's see if it's more stable. 3.8.0 crashed again today.
Looks like my monitoring script works, but HHVM keeps crashing. At least there are no outages...
Just upgraded HHVM to 3.9.0, WordPress to 4.3, Linux to 3.19.0-26, Thank/Like plugin to 1.9.6, etc 😁
The 503 errors are related to HHVM crashes. I have now set it up to restart automatically if it crashes (it was not set up this way two weeks ago, so I had to reboot the system manually while I was on vacation).
I suspect that the HHVM crashes are related to memory usage. I have now enabled HHVM garbage collection (PHP Zend garbage collection is enabled by default), let's see if it helps.
New WordPress uses more disk I/O. I retuned the database (which reduced the number of temporary tables created on disk), but the disk usage is still high.
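For anyone curious, an auto-restart watchdog of this kind could look roughly like the sketch below. This is only an illustration, not the actual monitoring script: the names `ensure_running`, `watch`, `is_alive`, and `restart` are hypothetical, and the real "is it alive" check and restart command depend on how HHVM is installed on the machine.

```python
import time
from typing import Callable, Optional


def ensure_running(is_alive: Callable[[], bool],
                   restart: Callable[[], None]) -> bool:
    """Restart the service if it is down.

    Returns True if a restart was triggered, False if the service was up.
    Both callables are placeholders (assumptions) for the real check and
    restart commands.
    """
    if is_alive():
        return False
    restart()
    return True


def watch(is_alive: Callable[[], bool],
          restart: Callable[[], None],
          interval_seconds: float = 10.0,
          max_checks: Optional[int] = None,
          sleep: Callable[[float], None] = time.sleep) -> int:
    """Poll the service periodically and return the number of restarts.

    max_checks=None polls forever; a finite value is handy for testing.
    """
    restarts = 0
    checks = 0
    while max_checks is None or checks < max_checks:
        if ensure_running(is_alive, restart):
            restarts += 1
        checks += 1
        sleep(interval_seconds)
    return restarts
```

Passing the check and restart actions in as functions keeps the loop independent of any particular init system, so it can be tried out without touching the live server.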
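One way to judge whether such tuning helps is to compare how many temporary tables end up on disk against the total created. The sketch below assumes the database is MySQL (or MariaDB) and that the counters come from `SHOW GLOBAL STATUS LIKE 'Created_tmp%'`; the function name `disk_tmp_table_ratio` is made up for illustration.

```python
def disk_tmp_table_ratio(status: dict) -> float:
    """Fraction of internal temporary tables that were created on disk.

    `status` maps status variable names to values, e.g. the rows returned
    by SHOW GLOBAL STATUS LIKE 'Created_tmp%' (assumed MySQL/MariaDB).
    Values may be strings, as most database drivers return them.
    """
    created = int(status.get("Created_tmp_tables", 0))
    on_disk = int(status.get("Created_tmp_disk_tables", 0))
    if created == 0:
        return 0.0  # nothing created yet, avoid division by zero
    return on_disk / created
```

Sampling this ratio before and after a configuration change gives a rough idea of whether fewer temporary tables are spilling to disk.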
Never ending story 😁
It was not easy... Took me 24 hours to get it fixed. New WordPress is like new Windows - something will definitely be broken 😉
So, the list:
o- Went through all tables (again) checking for "text" fields, which force temporary tables to be created on disk instead of in memory. Discovered a couple of MyBB ones that greatly affected the performance.
o- Discovered a nasty bug in WordPress that caused the "Cron" DB table entry to grow very quickly. Some users reported 1GB in one day. I thought it was a plugin and tried to disable all of them one by one, but it did not help. The workaround was posted today in the WP bug tracker.
o- Disabled the automatic WP-Cron which, by default, was being spawned on every page view.
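The "text" field check from the first item can be sketched as below. It assumes a MySQL/MariaDB server of that era, where TEXT/BLOB columns in an internal temporary table force it onto disk. The names `TEXT_COLUMNS_SQL`, `DISK_FORCING_TYPES`, and `text_columns_by_table` are hypothetical, and the query would be run with whatever database driver is at hand (the `%s` placeholder stands for the schema name).

```python
# Column types assumed to force on-disk temporary tables (MySQL/MariaDB).
DISK_FORCING_TYPES = {
    "tinytext", "text", "mediumtext", "longtext",
    "tinyblob", "blob", "mediumblob", "longblob",
}

# Lists every column with its type for one schema; filtering is done in
# Python below so the same helper also works on rows from other sources.
TEXT_COLUMNS_SQL = """
SELECT TABLE_NAME, COLUMN_NAME, DATA_TYPE
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = %s
ORDER BY TABLE_NAME, COLUMN_NAME
"""


def text_columns_by_table(rows):
    """Group TEXT/BLOB columns by table.

    `rows` is an iterable of (table_name, column_name, data_type) tuples,
    such as the result of TEXT_COLUMNS_SQL.
    """
    result = {}
    for table, column, data_type in rows:
        if data_type.lower() in DISK_FORCING_TYPES:
            result.setdefault(table, []).append(column)
    return result
```

The resulting mapping shows which tables are worth a closer look, for example queries that select or sort on those columns.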
The disk/CPU/memory performance looks very good, but I still see strange forum latencies. I'm afraid I will have to debug the database queries...
I just tried to post something today.
Took me 3 tries to get to the page.
And 2 tries to post it.
I know that running this site takes a lot of time.
But this happened today. Everything else seems OK.
Most likely it was bad timing - I was "playing" with HHVM settings, and it turned out that some of them prevent HHVM from starting, some lead to frequent crashes, and some are simply ignored. I still do not understand why some (database?) queries take so much time....