100% CPU usage by httpd.bin

Keywords: Moodle - Google Cloud Platform - Technical issue - Other
bnsupport ID: 2d6cd73e-a16f-d1b8-c788-52512f50e121
Description:
Good morning. I have been using Moodle 3.8.3 for some time, and now the server has started to saturate because httpd reaches 100% CPU. I would like to know whether anyone has had this problem and how to solve it.

Server details

Instance on GCP: Standard-8, 8 cores, 30 GB of RAM

top output

top - 17:21:05 up 4:43, 2 users, load average: 1.13, 1.16, 1.04
Tasks: 178 total, 1 running, 166 sleeping, 11 stopped, 0 zombie
%Cpu(s): 12.8 us, 0.2 sy, 0.0 ni, 86.8 id, 0.3 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 30887016 total, 26306536 free, 2546976 used, 2033504 buff/cache
KiB Swap: 0 total, 0 free, 0 used. 27869864 avail Mem

Hi @John_Gutierrez,

I can see you have a lot of requests to your server

-----------------------------------
Check performance issues: Count number of requests for the 10 most active IP addresses in the last 100.000 requests
-----------------------------------
Running: tail -n 100000 access_log | awk '{print $1}' | sort | uniq -c | sort -nr | head -n 10 | awk '{print $1}'
In: /opt/bitnami/apache2/logs/

Output:

6668
6564
6533
6488
6451
6411
6388
6338
6318
6247

Can you take a look at this guide to see if there is any bot/attacker trying to access your site?

https://docs.bitnami.com/google/apps/moodle/troubleshooting/deny-connections-bots-apache/

Hello @jota and @John_Gutierrez,

I ran into a similar problem and was browsing the forum when I saw your post.

I executed Jota’s command and the output wasn’t that strange. The IP with the most connections belongs to our help-desk crew (1.5k a day).

Just to add more information: whenever I stop php-fpm with the /opt/bitnami/ctlscript.sh script, CPU usage drops to 1% (with SQL and Apache still running). But if I start php-fpm, CPU utilization goes up to 50% even when only 1 user is logged in to Moodle (100% at prime time).

I already DMed my log file to you @jota as it failed to upload automatically.

Regards

Hi @mobcer,

It seems your server is receiving more requests than expected

[27-Sep-2020 06:59:20] WARNING: [pool moodle] server reached pm.max_children setting (50), consider raising it
[27-Sep-2020 07:09:52] WARNING: [pool moodle] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 8 children, there are 4 idle, and 40 total children
[27-Sep-2020 07:09:53] WARNING: [pool moodle] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 16 children, there are 3 idle, and 41 total children

You can try to increase/modify those values in the /opt/bitnami/php/etc/bitnami/common.conf file. You will need to restart the PHP-FPM service afterwards.
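
Before and after changing those values, it can help to count how often PHP-FPM logs these warnings. The following is only a sketch: the function name `fpm_pressure` is made up for illustration, and the log file is passed as an argument because its location depends on the installation.

```shell
# fpm_pressure LOGFILE
# Counts two kinds of PHP-FPM warnings in LOGFILE:
#   - "server reached pm.max_children setting"
#   - "seems busy"
# The function name and output format are illustrative, not part of PHP-FPM.
fpm_pressure() {
    logfile=$1
    # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true".
    max=$(grep -c 'reached pm.max_children' "$logfile" || true)
    busy=$(grep -c 'seems busy' "$logfile" || true)
    echo "max_children_reached=$max seems_busy=$busy"
}
```

Usage would look like `fpm_pressure /path/to/php-fpm.log`. If the counts keep growing after raising the limits, the pool size is probably not the only bottleneck.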

Regarding the list of IPs, you mentioned that the IP with most connections belongs to your people, but what about the rest of IPs?

Hi @jota . Thanks for the reply.

I will apply your suggested modifications tonight after prime time. It is worth mentioning that even at midnight, when the server has no online users, CPU averages 30%. Also, the same user count and configuration/plug-ins worked just fine a week ago. This only started around 7 days ago.

Here is the output on peak time:

2660
2532
1672
1626
1577
1511
1365
1352
1323
1311

Hi @mobcer,

That output only contains the number of requests. Have you investigated the IPs that are requesting your site that many times?

tail -n 100000 access_log | awk '{print $1}' | sort | uniq -c | sort -nr | head -n 10

You will then need to open the access_log file and check the requests they are making.

https://docs.bitnami.com/google/apps/moodle/troubleshooting/deny-connections-bots-apache/
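
As a rough sketch of that second step, the function below lists what a single IP has been requesting most often. It assumes the access_log uses Apache's common/combined log format, so that the client IP is the 1st whitespace-separated field, the quoted method the 6th and the path the 7th. The function name `top_requests_for_ip` is hypothetical.

```shell
# top_requests_for_ip LOGFILE IP [N]
# Prints the N (default 10) most frequent "METHOD PATH" pairs requested by IP.
# Assumes the Apache common/combined log format:
#   field 1 = client IP, field 6 = "METHOD (with a leading quote), field 7 = path.
top_requests_for_ip() {
    logfile=$1
    ip=$2
    n=${3:-10}
    awk -v ip="$ip" '$1 == ip { gsub(/"/, "", $6); print $6, $7 }' "$logfile" \
        | sort | uniq -c | sort -nr | head -n "$n"
}
```

For example, `top_requests_for_ip /opt/bitnami/apache2/logs/access_log 203.0.113.7 20`, where the IP address is a placeholder. Thousands of hits on one login or search URL would look quite different from a person browsing courses.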

Today’s most active IP belongs to our course creator, with 2.5k hits so far; she is browsing courses and uploading content. Nothing strange in the logs. The rest are below 1.9k. I don’t know what number is considered high or risky here.

I also grepped the 2 most active IPs. There were no errors in the output. All lines were either “POST” or “GET” requests.

Then I modified /opt/bitnami/php/etc/bitnami/common.conf to:

; Bitnami PHP-FPM Configuration
; Copyright 2020 Bitnami.com All Rights Reserved
;
; Note: This file will be modified on server size changes
;
pm.max_children=100
pm.start_servers=5
pm.min_spare_servers=5
pm.max_spare_servers=30
pm.max_requests=15000

and then restarted php-fpm.
Result:
CPU between 80% and 100% for 20 online users…

I checked the connections from the different IPs and none appears to be a bot; they view courses and grades.

I found a message in the Apache log indicating that MaxRequestWorkers was insufficient, so I changed httpd.conf as follows:

StartServers 10
MinSpareServers 10
MaxSpareServers 30
ServerLimit 1032
MaxRequestWorkers 1024
MaxConnectionsPerChild 5000

ServerLimit 48
StartServers 32
MinSpareThreads 1024
MaxSpareThreads 1536
ThreadsPerChild 64
MaxRequestWorkers 3072
MaxConnectionsPerChild 5000
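
A quick sanity check on the thread-based values quoted above (ServerLimit 48, ThreadsPerChild 64, MaxRequestWorkers 3072): in Apache's threaded MPMs (worker and event), MaxRequestWorkers cannot exceed ServerLimit multiplied by ThreadsPerChild. The helper below is only an illustrative sketch with a made-up name; it does the arithmetic and does not read httpd.conf.

```shell
# check_threaded_mpm SERVER_LIMIT THREADS_PER_CHILD MAX_REQUEST_WORKERS
# Reports whether MAX_REQUEST_WORKERS fits within
# SERVER_LIMIT * THREADS_PER_CHILD. Returns 1 when it does not.
check_threaded_mpm() {
    capacity=$(( $1 * $2 ))
    if [ "$3" -le "$capacity" ]; then
        echo "ok: $3 workers fit in $capacity threads"
    else
        echo "too high: $3 workers exceed $capacity threads"
        return 1
    fi
}
```

With the values from this post, `check_threaded_mpm 48 64 3072` reports that the setting fits exactly (48 × 64 = 3072).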

So far it has worked. I am still monitoring and will keep you posted.

Hello again @John_Gutierrez

Good to know everything is back to normal for you.

I saw that Bitnami stacks come with preconfigured settings for different scales. Is it possible to change them on self-hosted stacks?

For instance, from Micro to Medium.

Hi @mobcer,

As you mentioned, we have some preconfigured values but you can modify them without problems. You will only need to remember that if you modify the “micro” configuration and modify the instance type later, it will probably start using a different file (let’s say medium) and you will need to apply the same changes to that file if you want that specific configuration.

Let me know if you have any questions regarding this


Hello @jota.

The problems have returned; Apache is climbing to 100% again.

Attached: image of the top command output

Hi @John_Gutierrez

My issue seems to be solved for now. I am listing the tasks I have done here:

1 - I applied your changes to httpd.conf.
2 - I enabled a cloud firewall offered by my hosting provider so that unwanted traffic never reaches Apache at all. Then I blocked every IP range that was not in my trusted list (by country, etc.)
3 - I optimized any image that is loading with the dashboard for all users ( course images )

Now CPU usage averages 40%.

About the processes in htop: I noticed that when I stop php-fpm and mysql, CPU utilization goes down. With mysql and Apache running but php-fpm stopped, the CPU is fine until php-fpm is turned on. Even if you turn php-fpm off after that, mysql still hogs the CPU. As far as I can tell, php-fpm hands heavy work to mysql that keeps running even after php-fpm is stopped.

Hi @mobcer,

I’m glad to hear that the performance of your instance is much better now.

@John_Gutierrez, I suggest you take a look at the /opt/bitnami/apache2/logs/access_log file again and see if there is someone accessing your site using weird requests when you got that performance issue. I think that could be the cause of the issue and as you can see in the previous message from @mobcer, he blocked some untrusted IPs as well.

Hi.

I added the IPs with the most connections to httpd-app.conf as “deny from” entries, since they are not from my country. This has caused 403 errors on several computers. What can I do?

Hi @John_Gutierrez,

You will need to remove those IPs from the Apache file and restart the service again.

You don’t need to block the IPs just because they have the most connections. You first need to evaluate whether they are making malicious requests to your site; that’s why I mentioned that you need to take a look at the access_log file.
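
One way to start that evaluation is to look at the HTTP status codes returned to a given IP: a client that mostly receives 200 responses on course pages looks different from one producing long runs of 404 or 403. This is a sketch only, assuming Apache's common/combined log format (status code in the 9th whitespace-separated field); the function name `status_counts_for_ip` is made up for illustration.

```shell
# status_counts_for_ip LOGFILE IP
# Prints how many responses of each HTTP status code were sent to IP,
# most frequent first. Assumes the Apache common/combined log format,
# where field 1 is the client IP and field 9 is the status code.
status_counts_for_ip() {
    awk -v ip="$2" '$1 == ip { print $9 }' "$1" | sort | uniq -c | sort -nr
}
```

For example, `status_counts_for_ip /opt/bitnami/apache2/logs/access_log 203.0.113.7`, where the IP address is a placeholder.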

Hello @jota

I removed the IP addresses from the file and the 403 error went away. I checked them and they are not making malicious requests to the site.

But we continue to have the same problems. I don’t know what else to try; please help.

Reviewing the logs, I found this error; I don’t know whether it is related to the problem:

Init: this version of mod_ssl was compiled with a newer library (OpenSSL 1.1.1g Apr 21, 2020, currently loaded version is OpenSSL 1.1.1d 10 September 2019) - may result in undefined or erroneous behavior

Now Apache’s CPU consumption has risen excessively.