Lightsail instance created from snapshot of live working site fails - error connecting to database

Keywords: WordPress - AWS - Technical issue - Services (Apache, MariaDB, MySQL…)

bnsupport ID: 58b1710f-79dc-be06-72f2-e2758a434cf4

bndiagnostic output:

? Apache: Found possible issues
https://docs.bitnami.com/general/apps/wordpress/troubleshooting/debug-errors-apache/

bndiagnostic failure reason: I do not know how to perform the changes explained in the documentation

Description:
We have a live site running on a Lightsail Bitnami WordPress instance that we wish to copy to other availability zones to create redundant servers. When we use a snapshot to create the new instance, the new server responds with “Error establishing a database connection”. We have tried all the documented steps to test the MariaDB server and reset the password, but nothing helps. The Bitnami diagnostic tool suggests there is an Apache error. Surely an exact image of another server should just work?

Hi @msau,

I can see all services are up and running

-----------------------------------
Get the ctlscript status
-----------------------------------
Running: ./ctlscript.sh status
In: /opt/bitnami

Output:

apache already running
mariadb already running
php-fpm already running

Could you please get the database credentials from the /opt/bitnami/wordpress/wp-config.php file and check if you can access the database using them?

mysql -u USER -pPASSWORD DATABASE_NAME
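
In case it helps, here is a quick way to print those values straight from the config file (just a sketch, assuming the default Bitnami path; the define names are the standard WordPress ones):

# Print the database name, user, password and host that WordPress is using
$ sudo grep -E "DB_NAME|DB_USER|DB_PASSWORD|DB_HOST" /opt/bitnami/wordpress/wp-config.php

You can then plug those values into the mysql command above.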

Hi Jota, I get a similar response to the ones I got when trying to access the DB with the other troubleshooting steps from the documentation that I have already tried.
From the mysql command I get:

ERROR 2013 (HY000): Lost connection to MySQL server at 'handshake: reading initial communication packet', system error: 104

From the mysqld_safe I get:

ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/opt/bitnami/mariadb/tmp/mysql.sock' (111)

From the phpmyadmin login screen I get:

mysqli::real_connect(): Error while reading greeting packet. PID=5881
mysqli::real_connect(): (HY000/2006): MySQL server has gone away

R.

Hi @msau

It seems that the error 111 you are experiencing when using mysqld_safe can be traced back to the server not listening on that socket (Connection refused).
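
A quick way to double-check that (a sketch, using the socket path from your error message) is to confirm whether any process is actually listening on that UNIX socket:

# Show UNIX sockets in listening state together with the owning process
$ sudo ss -lxp | grep mysql.sock

# Confirm the mysqld process is running at all
$ ps aux | grep [m]ysqld

If nothing shows up, the server is not attached to the socket even though the control scripts report it as running.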

The bndiagnostic tool did not retrieve your DB configuration files and it would be interesting to see them. Could you please run the following commands and share their output?

$ sudo cat /opt/bitnami/mariadb/conf/my.cnf
$ sudo cat /opt/bitnami/mariadb/conf/bitnami/memory.conf
$ sudo tail -n 30 /opt/bitnami/mariadb/logs/mysqld.log
$ sudo gonit status
$ sudo ls -la /opt/bitnami/mariadb/tmp/

Best regards,
Jose Antonio Carmona


Was my answer helpful? Click on :heart:

Hi Jose,

sudo cat /opt/bitnami/mariadb/conf/my.cnf
[mysqladmin]
user=bn_wordpress

[mysqld]
skip_name_resolve
explicit_defaults_for_timestamp
basedir=/opt/bitnami/mariadb
port=3306
tmpdir=/opt/bitnami/mariadb/tmp
socket=/opt/bitnami/mariadb/tmp/mysql.sock
pid_file=/opt/bitnami/mariadb/tmp/mysqld.pid
max_allowed_packet=16M
bind_address=127.0.0.1
log_error=/opt/bitnami/mariadb/logs/mysqld.log
character_set_server=utf8
collation_server=utf8_general_ci
plugin_dir=/opt/bitnami/mariadb/lib/plugin

[client]
port=3306
socket=/opt/bitnami/mariadb/tmp/mysql.sock
default_character_set=UTF8
plugin_dir=/opt/bitnami/mariadb/lib/plugin

[manager]
port=3306
socket=/opt/bitnami/mariadb/tmp/mysql.sock
pid_file=/opt/bitnami/mariadb/tmp/mysqld.pid
!include /opt/bitnami/mariadb/conf/bitnami/my_custom.cnf
!include /opt/bitnami/mariadb/conf/bitnami/memory.conf


sudo cat /opt/bitnami/mariadb/conf/bitnami/memory.conf
[mysqld]
#wait_timeout = 120
long_query_time = 1
query_cache_limit=2M
query_cache_type=1
query_cache_size=128M
innodb_buffer_pool_size=256M
#innodb_log_file_size=128M
#innodb_flush_method=O_DIRECT
#tmp_table_size=64M
#max_connections = 2500
#max_user_connections = 2500
#key_buffer_size=64M


sudo tail -n 30 /opt/bitnami/mariadb/logs/mysqld.log
2021-11-12 13:37:49 0 [Note] /opt/bitnami/mariadb/sbin/mysqld (mysqld 10.3.30-MariaDB) starting as process 27285 ...
2021-11-12 13:37:49 0 [Note] InnoDB: Using Linux native AIO
2021-11-12 13:37:49 0 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2021-11-12 13:37:49 0 [Note] InnoDB: Uses event mutexes
2021-11-12 13:37:49 0 [Note] InnoDB: Compressed tables use zlib 1.2.11
2021-11-12 13:37:49 0 [Note] InnoDB: Number of pools: 1
2021-11-12 13:37:49 0 [Note] InnoDB: Using SSE2 crc32 instructions
2021-11-12 13:37:49 0 [Note] InnoDB: Initializing buffer pool, total size = 256M, instances = 1, chunk size = 128M
2021-11-12 13:37:49 0 [Note] InnoDB: Completed initialization of buffer pool
2021-11-12 13:37:49 0 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority().
2021-11-12 13:37:49 0 [Note] InnoDB: 128 out of 128 rollback segments are active.
2021-11-12 13:37:49 0 [Note] InnoDB: Removed temporary tablespace data file: "ibtmp1"
2021-11-12 13:37:49 0 [Note] InnoDB: Creating shared tablespace for temporary tables
2021-11-12 13:37:49 0 [Note] InnoDB: Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
2021-11-12 13:37:49 0 [Note] InnoDB: File './ibtmp1' size is now 12 MB.
2021-11-12 13:37:49 0 [Note] InnoDB: Waiting for purge to start
2021-11-12 13:37:49 0 [Note] InnoDB: 10.3.30 started; log sequence number 39136295; transaction id 82827
2021-11-12 13:37:49 0 [Note] InnoDB: Loading buffer pool(s) from /bitnami/mariadb/data/ib_buffer_pool
2021-11-12 13:37:49 0 [Note] InnoDB: Buffer pool(s) load completed at 211112 13:37:49
2021-11-12 13:37:49 0 [Note] Plugin 'FEEDBACK' is disabled.
2021-11-12 13:37:49 0 [Note] Recovering after a crash using tc.log
2021-11-12 13:37:49 0 [Note] Starting crash recovery...
2021-11-12 13:37:49 0 [Note] Crash recovery finished.
2021-11-12 13:37:49 0 [Note] Server socket created on IP: '127.0.0.1'.
2021-11-12 13:37:49 0 [Warning] 'proxies_priv' entry '@% root@ip-172-26-24-52' ignored in --skip-name-resolve mode.
2021-11-12 13:37:49 0 [Note] Reading of all Master_info entries succeeded
2021-11-12 13:37:49 0 [Note] Added new Master_info '' to hash table
2021-11-12 13:37:49 0 [ERROR] mysqld: Can't create/write to file 'mysql-init' (Errcode: 2 "No such file or directory")
2021-11-12 13:37:49 0 [ERROR] Aborting


sudo gonit status

Uptime 37h42m52s
Last Check 2021-11-12 13:37:50.78451717 +0000 UTC m=+135720.527456577
Next Check 2021-11-12 13:39:50.78451717 +0000 UTC m=+135840.527456577
Pid 1209
Pid File /var/run/gonit.pid
Control File /etc/gonit/gonitrc
Socket File /var/run/gonit.sock
Log File /var/log/gonit.log
Process 'apache'
status Running
pid 6080
uptime 36h1m0s
monitoring status monitored

Process 'mariadb'
status Running
pid 28973
uptime 10m46s
monitoring status monitored

Process 'php-fpm'
status Running
pid 5870
uptime 36h1m0s
monitoring status monitored


sudo ls -la /opt/bitnami/mariadb/tmp/
total 12
drwxrwxr-x 2 mysql mysql 4096 Nov 12 13:39 .
drwxr-xr-x 12 root root 4096 Jul 7 05:29 ..
-rw-rw---- 1 mysql mysql 6 Nov 12 13:39 mysqld.pid
srwxrwxrwx 1 mysql mysql 0 Nov 12 13:39 mysql.sock


Hi @msau,

This seems to be a permissions issue. Please ensure both VMs have the same permissions configuration in the /opt/bitnami/mariadb and /opt/bitnami/mariadb/data folders.

For example, this is the default permissions configuration of a fresh instance:

bitnami@bitnami-wordpress-d234:/opt/bitnami$ ls -la /opt/bitnami/mariadb/data/
total 122956
drwxrwxr-x 6 mysql mysql     4096 Nov 15 15:30 .
drwxrwxr-x 3 root  root      4096 Nov 10 20:47 ..
-rw-rw---- 1 mysql mysql    16384 Nov 15 15:30 aria_log.00000001
-rw-rw---- 1 mysql mysql       52 Nov 15 15:30 aria_log_control
drwx------ 2 mysql mysql     4096 Nov 15 15:30 bitnami_wordpress
-rw-rw---- 1 mysql mysql      976 Nov 15 15:30 ib_buffer_pool
-rw-rw---- 1 mysql mysql 12582912 Nov 15 15:31 ibdata1
-rw-rw---- 1 mysql mysql 50331648 Nov 15 15:31 ib_logfile0
-rw-rw---- 1 mysql mysql 50331648 Nov 15 15:30 ib_logfile1
-rw-rw---- 1 mysql mysql 12582912 Nov 15 15:30 ibtmp1
-rw-rw---- 1 mysql mysql        0 Nov 15 15:30 multi-master.info
drwx------ 2 mysql mysql     4096 Nov 15 15:30 mysql
-rw-r--r-- 1 root  root        16 Nov 15 15:30 mysql_upgrade_info
drwx------ 2 mysql mysql     4096 Nov 15 15:30 performance_schema
-rw-rw---- 1 mysql mysql    24576 Nov 15 15:30 tc.log
drwx------ 2 mysql mysql     4096 Nov 15 15:30 test
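
If the ownership or permissions on the problematic instance turn out to differ from the listing above, a rough way to bring them back in line (a sketch, assuming the stock mysql user and group shown above) would be:

# Stop MariaDB, reset ownership of the data directory, then start it again and re-check the log
$ sudo /opt/bitnami/ctlscript.sh stop mariadb
$ sudo chown -R mysql:mysql /opt/bitnami/mariadb/data/
$ sudo /opt/bitnami/ctlscript.sh start mariadb
$ sudo tail -n 20 /opt/bitnami/mariadb/logs/mysqld.log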

Hi Jota, I checked all the permissions in those two directories, and they look the same on the original instance and on the instance that was giving the database error. The other strange thing is that suddenly the website works, without us making any changes. Is there a job that runs periodically that might have fixed something? Permissions?

I made a snapshot of the instance that was giving us trouble and created a new instance from that, but once it was up and running it worked just fine and had no database connection issues.

R.

Hi @msau,

No, there is no job that changes the permissions of the solution unless you manually added one.

That’s really weird. Maybe there was a problem in the infrastructure and launching a new instance solved it. Can you confirm everything works as expected now?

Thanks

No, what I’m saying is that the old instance now works correctly, without us changing anything. It’s really weird. Are there any database optimisation jobs that run automatically that could have fixed a DB issue?

Hi @msau,

There are no automatic optimisation jobs configured.

I’m not sure I understand. Do you mean the old instance is working, but the issue continues when you try to make a snapshot?

Regards,
Michiel

Michiel,

No, the problem is now gone, but it has made us very nervous because we did not find the reason and no one changed anything. It also means that at any time, when we have a failure and we spin up a new instance, it might happen again. Luckily this time I was spinning up an additional instance, but if it had been to recover from a failure it could have been catastrophic. How can a problem disappear on its own after 3 days without any user intervention? I guess we have no choice now but to wait until it happens again.

This is really weird @msau. The only thing I can think of is a performance issue. If the new instance didn’t have enough resources and there was a bot or attacker accessing your site with malicious intent, it may break the application temporarily until the attack ends. You can find more info here:

https://docs.bitnami.com/aws/apps/wordpress/troubleshooting/deny-connections-bots-apache/
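
As a starting point (a sketch; the access log path can vary between Bitnami versions), you can check whether a handful of IPs are hammering the new instance:

# Count requests per client IP in the Apache access log and show the top offenders
$ sudo awk '{print $1}' /opt/bitnami/apache2/logs/access_log | sort | uniq -c | sort -rn | head -n 20

If a few IPs dominate the list, the guide above shows how to deny connections from them.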