m4.large at 50% CPU load

Keywords: CouchDB - AWS - Technical issue - Other
bnsupport ID: a38e75e7-f426-e829-717d-688c41e9b8eb
Description:
We upgraded our CouchDB AWS server instance from t2.medium to m4.large because of performance issues. Our server now has a CPU load of about 50% almost constantly, which is obviously not good.
Is m4.large the wrong type?
Is our instance or db somehow corrupt/invalid?
Any help is very much appreciated!

Hi @wbison,

Our documentation has a section that explains several commands for troubleshooting server performance issues. Could you run the commands in that guide?

https://docs.bitnami.com/aws/faq/troubleshooting/troubleshoot-server-performance/

Thanks

Hi,

Here is some output that may be helpful.

== top ==
Tasks: 123 total, 1 running, 122 sleeping, 0 stopped, 0 zombie
%Cpu(s): 65.6 us, 1.0 sy, 0.0 ni, 33.3 id, 0.1 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 8173816 total, 5477304 free, 1660588 used, 1035924 buff/cache
KiB Swap: 0 total, 0 free, 0 used. 6195316 avail Mem

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1384 couchdb 20 0 4331676 1.508g 10192 S 200.0 19.3 4216:06 beam.smp
1 root 20 0 119580 5712 4032 S 0.0 0.1 0:04.77 systemd

== sar ==
$ sar -r 2 10
Linux 4.4.0-1070-aws (ip-172-31-20-151) 11/14/2018 x86_64 (2 CPU)

12:47:47 PM kbmemfree kbmemused %memused kbbuffers kbcached kbcommit %commit kbactive kbinact kbdirty
12:47:49 PM 5449040 2724776 33.34 285036 606428 2354664 28.81 2349532 179316 4
12:47:51 PM 5450144 2723672 33.32 285036 606428 2147524 26.27 2348052 179316 8
12:47:53 PM 5578280 2595536 31.75 285036 606428 1937812 23.71 2220020 179316 8
12:47:55 PM 5451632 2722184 33.30 285036 606428 2148560 26.29 2344176 179316 8
12:47:57 PM 5300536 2873280 35.15 285036 606428 2320036 28.38 2492892 179312 64
12:47:59 PM 5407068 2766748 33.85 285036 606436 2201864 26.94 2388840 179308 88
12:48:01 PM 5536008 2637808 32.27 285036 606596 1968380 24.08 2256520 179168 136
12:48:03 PM 5554024 2619792 32.05 285036 607348 1969128 24.09 2243936 179180 196
12:48:05 PM 5412904 2760912 33.78 285036 607516 2201464 26.93 2385384 179172 228
12:48:07 PM 5469284 2704532 33.09 285036 607436 2178812 26.66 2329192 179300 288
Average: 5460892 2712924 33.19 285036 606747 2142824 26.22 2335854 179270 103

== uptime ==
$ uptime
12:48:43 up 2 days, 4:51, 1 user, load average: 2.25, 2.12, 2.09
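
As a rough rule of thumb, a load average is easier to read once it is divided by the number of CPUs. The sketch below does that arithmetic with the figures from this thread (load 2.25, 2 CPUs); the function name `load_per_core` is made up for illustration, and a value near or above 1 only hints that the cores are fully busy, it does not prove it.

```shell
# Hypothetical helper: divide a load average by the CPU count.
# $1 = load average (e.g. the 1-minute value), $2 = number of CPUs
load_per_core() {
  awk -v load="$1" -v cpus="$2" 'BEGIN { printf "%.2f\n", load / cpus }'
}

# Values taken from the uptime and sar output in this post.
load_per_core 2.25 2
```

On a live Linux host the same function can be fed from the system, for example `load_per_core "$(cut -d ' ' -f 1 /proc/loadavg)" "$(nproc)"`.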

bitnami@ip-172-31-20-151:~$ df -h
Filesystem Size Used Avail Use% Mounted on
udev 3.9G 0 3.9G 0% /dev
tmpfs 799M 8.6M 790M 2% /run
/dev/xvda1 15G 3.3G 12G 23% /
tmpfs 3.9G 0 3.9G 0% /dev/shm
tmpfs 5.0M 0 5.0M 0% /run/lock
tmpfs 3.9G 0 3.9G 0% /sys/fs/cgroup
tmpfs 799M 0 799M 0% /run/user/999
tmpfs 799M 0 799M 0% /run/user/1000
bitnami@ip-172-31-20-151:~$ df -ih
Filesystem Inodes IUsed IFree IUse% Mounted on
udev 996K 322 996K 1% /dev
tmpfs 998K 427 998K 1% /run
/dev/xvda1 1.9M 292K 1.6M 16% /
tmpfs 998K 1 998K 1% /dev/shm
tmpfs 998K 3 998K 1% /run/lock
tmpfs 998K 16 998K 1% /sys/fs/cgroup
tmpfs 998K 4 998K 1% /run/user/999
tmpfs 998K 4 998K 1% /run/user/1000

===
bitnami@ip-172-31-20-151:/opt/bitnami$ sudo find . -type f | cut -d "/" -f 2 | sort | uniq -c | sort -n
1 bnsupport
1 bnsupport-regex.ini
1 changelog.txt
1 ctlscript.sh
1 img
1 properties.ini
1 README.txt
1 use_couchdb
4 config
6 stats
7 licenses
14 var
24 scripts
2203 couchdb
2972 common
4273 erlang

I attached the bnsupport ID to the original message. Was that helpful?

The server is at 2.1.1. I guess I could upgrade to 2.2? Can I use the bnsupport tool for that? It asks something about upgrading at the start of the script.

Would it help if I sent the CouchDB log files?

I appreciate your help.

Regards,
Willem

Hi @wbison,
When you notice this load, is CouchDB actively doing work (clients connected and using the DB), or is it idle? I would also like to know how big your database is.
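
Both questions can usually be answered from the server itself through CouchDB's HTTP API. This is a minimal sketch, assuming CouchDB listens on the default port 5984 on localhost and that `admin:password` is replaced with real admin credentials (both are placeholders, not values from this thread). The helper names are hypothetical.

```shell
# Assumptions: default CouchDB port on localhost, placeholder credentials.
COUCH_URL="${COUCH_URL:-http://127.0.0.1:5984}"
COUCH_AUTH="${COUCH_AUTH:-admin:password}"

# Hypothetical wrapper: GET a path from the CouchDB HTTP API.
couch_get() {
  curl -sS -u "$COUCH_AUTH" "$COUCH_URL$1"
}

# Running background jobs (compaction, view indexing, replication).
couch_active_tasks() { couch_get "/_active_tasks"; }

# Names of all databases on this node.
couch_all_dbs() { couch_get "/_all_dbs"; }

# Metadata for one database, including document count and sizes.
# $1 = database name (must already be URL-safe)
couch_db_info() { couch_get "/$1"; }
```

Running `couch_active_tasks` while the CPU is busy shows whether compaction or indexing is in progress, and `couch_db_info mydb` (with `mydb` standing in for a real database name) returns a JSON document whose `sizes` field reports how much space the database uses.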
An m4.large should be enough, although this depends on how the DB is used.
The CouchDB log files are already sent by bnsupport. From what I can see in the logs, I cannot tell whether the DB is corrupted. It does not seem to be, but I cannot be sure.
To upgrade the database, I would recommend making a backup, deploying the new version, and then restoring the backup. The documentation for backup/restore is here.
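
For the backup step only, a minimal sketch could look like the following. It assumes the usual Bitnami layout seen earlier in this thread (`/opt/bitnami` with `ctlscript.sh`) and a full-stack archive made while the services are stopped. The function name `backup_stack` and the default archive name are illustrative, and the restore procedure is not shown, so the linked documentation remains the authority.

```shell
# Assumption: the stack lives in /opt/bitnami, as in the listing above.
BITNAMI_ROOT="${BITNAMI_ROOT:-/opt/bitnami}"

# Hypothetical helper: stop the services, archive the stack, start again.
# $1 = archive path (optional)
backup_stack() {
  archive="${1:-application-backup.tar.gz}"
  # Only create the archive if the services stopped cleanly.
  sudo "$BITNAMI_ROOT/ctlscript.sh" stop &&
    sudo tar -pczf "$archive" "$BITNAMI_ROOT"
  status=$?
  # Always try to bring the services back up, even if the backup failed.
  sudo "$BITNAMI_ROOT/ctlscript.sh" start
  return "$status"
}
```

Calling `backup_stack /home/bitnami/couchdb-backup.tar.gz` (the path is just an example) leaves an archive that should be copied off the instance before deploying the new version.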

Best regards,
Rafael Rios.
