Hi @v-namac, your pm.max_children value seems appropriate.
Currently we have kept max_children at 300 and didn't change any other values. Please check and suggest the best values based on the screenshot below.
The other values (pm.start_servers, pm.min_spare_servers and pm.max_spare_servers) could be increased, although you would most likely still face the same issues.
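As a rough sanity check on pm.max_children, a common rule of thumb is to divide the memory you can spare for PHP-FPM by the average memory used by one PHP-FPM worker. The sketch below only illustrates that arithmetic; the function name and all the numbers are hypothetical, so please replace them with values measured on your own server.

```python
def estimate_max_children(total_ram_mb, reserved_mb, avg_worker_mb):
    """Rule-of-thumb upper bound for pm.max_children.

    total_ram_mb:  total RAM of the instance (assumed value, measure your own)
    reserved_mb:   RAM kept aside for the OS, web server, database, etc.
    avg_worker_mb: average resident memory of one PHP-FPM worker process
    """
    if avg_worker_mb <= 0:
        raise ValueError("avg_worker_mb must be positive")
    available_mb = total_ram_mb - reserved_mb
    if available_mb <= 0:
        # Nothing left for PHP-FPM after the reservation.
        return 0
    # Round down so the estimate never exceeds the available memory.
    return int(available_mb // avg_worker_mb)


if __name__ == "__main__":
    # Hypothetical figures: 32 GB instance, 8 GB reserved, 80 MB per worker.
    print(estimate_max_children(32768, 8192, 80))
```

If the estimate comes out well below the configured value, the limit itself is probably not the problem and memory pressure is the more likely culprit.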
We also need to know whether there is any script or command to restart the server automatically when we face these max_children and "error establishing database" issues (due to high spikes), i.e. 5XX errors?
It seems you have two different issues here:
- Error establishing database: Probably due to memory exhaustion or CPU spikes.
- max_children limit reached (memory related)
Even though restarting the services when these issues occur may mitigate them, we don't recommend this route, as users would still be unable to access the server for a short period of time.
Your server is definitely big, so we understand there are a lot of active users (you said 1.7K active users?). Therefore we would recommend the following:
1. Move the database to another instance
Bitnami offers a WordPress stack with a managed MariaDB database, in case you want to take a look at it. It uses Apache with mod-event + PHP-FPM instead of NGINX + PHP-FPM (both have similar performance).
Note that this solution does not include Let's Encrypt by default, but you can install it with this guide.
If you prefer to keep the current installation, you can move the database to a new server: either a MySQL or MariaDB stack instance, or Azure's managed DB service.
2. Move assets to a CDN
We see your site has a lot of assets. We ran a webpagetest.org test and got these results: https://www.webpagetest.org/result/190122_5C_af4ac031d145af1e1c586c318792ee20/2/details/#waterfall_view_step1
There are a lot of assets being loaded through Apache. If you move these assets to a CDN, you may be able to improve the performance of your site.
3. Inspect MySQL performance issues
Your PHP-FPM issues might be due to a bottleneck in the MySQL queries: if these are too slow, requests pile up and a lot of users end up being active at the same time.
You may want to check just in case, either by executing "SHOW STATUS" inside the MySQL CLI or running this command:
sudo mysqladmin status -u root -p
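If you want to track these numbers over time, the one-line output of that command can be parsed with a few lines of Python. This is only a sketch: it assumes the usual "Name: value" layout of mysqladmin status (Uptime, Threads, Questions, Slow queries, ...), and the helper names and the sample line are made up for illustration.

```python
import re


def parse_mysqladmin_status(line):
    """Turn one line of `mysqladmin status` output into a dict of floats."""
    # Each field looks like "Slow queries: 0"; names may contain spaces.
    pairs = re.findall(r"([A-Za-z][A-Za-z ]*?):\s*([0-9.]+)", line)
    return {name.strip(): float(value) for name, value in pairs}


def slow_query_ratio(status):
    """Fraction of all questions that were logged as slow queries."""
    questions = status.get("Questions", 0.0)
    if questions == 0:
        return 0.0
    return status.get("Slow queries", 0.0) / questions


if __name__ == "__main__":
    # Made-up sample line in the assumed output format.
    sample = ("Uptime: 3600  Threads: 12  Questions: 200000  "
              "Slow queries: 500  Opens: 90  Flush tables: 1  "
              "Open tables: 64  Queries per second avg: 55.555")
    status = parse_mysqladmin_status(sample)
    print(status["Threads"], slow_query_ratio(status))
```

A steadily growing "Threads" value or a high slow query ratio during the spikes would point towards the database as the bottleneck.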
4. Enable cache
You might already have done this, but it is worth making sure static assets are being cached.
In particular, it would be useful for you to take a look at the W3 Total Cache plugin settings (we have a small guide for optimizing WordPress here).
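To verify that static assets are actually served with caching enabled, you can inspect the Cache-Control header of a few responses (for example, the ones shown in the webpagetest.org waterfall). The helper below is a simplified heuristic, not a full implementation of HTTP caching rules; its name and the sample headers are hypothetical.

```python
def looks_cacheable(headers):
    """Heuristic: True if Cache-Control allows caching with a positive max-age.

    `headers` is a plain dict of response headers. Header names are matched
    case-insensitively. Expires, ETag and other mechanisms are ignored here.
    """
    cache_control = ""
    for name, value in headers.items():
        if name.lower() == "cache-control":
            cache_control = value.lower()
    directives = [d.strip() for d in cache_control.split(",") if d.strip()]
    # Treat explicit opt-outs as "not cached" for this simple check.
    if "no-store" in directives or "no-cache" in directives:
        return False
    for directive in directives:
        if directive.startswith("max-age="):
            try:
                return int(directive.split("=", 1)[1]) > 0
            except ValueError:
                return False
    return False


if __name__ == "__main__":
    # Hypothetical headers of a static asset response.
    print(looks_cacheable({"Cache-Control": "public, max-age=31536000"}))
```

Assets for which this returns False are good candidates to review in the cache plugin or web server configuration.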