The whole process of troubleshooting memory leaks and OOM issues under MySQL 5.7

The whole process of troubleshooting memory leaks and OOM issues under MySQL 5.7


My personal website uses MySQL version 5.7 in the backend. Some time ago, I was often killed by oom-kill. With the help of new features in 5.7, after some investigation, I finally caught this ghost.

1. Problem phenomenon

Some time ago, my website often got wind from time to time, prompting that the database could not be connected, prompting:

Error establishing database connection

I thought it was a personal website anyway, and it didn’t matter. It might also be because the VPS configuration is too low. If the traffic is too large, it’s easy to go wrong.

Later, Master Qi Rong said: Look at the rotten door made by the carpenter 😓(⊙﹏⊙)b

So I made up my mind to solve the problem and can no longer be despised. As a DBA, I cannot tolerate the database hanging up for no reason, even if it is a personal VPS.

2. Troubleshooting

First of all, a cron task is added to automatically check whether the mysqld process is alive every minute, if not, start it. This is also our guiding ideology for solving problems in the enterprise: find the problem as soon as possible, but before confirming the problem, give priority to ensuring service availability, no matter what method you use.

Next, look at the MySQL log to see if you can find any clues. However, nothing was found. It should have been killed abnormally, so only the mysqld startup process was recorded, and there was no abnormal shutdown process.

Check the system log again, and finally see that the mysqld process was killed because of OOM:

figure 1

As you can see, the mysqld process consumed the most memory and was selected by oom-killer and killed.

Since it is OOM, let's look at the overall memory consumption of the system at that time:

figure 2

Are there obvious signs of memory leaks? Students who are unclear can read the following article to popularize it first:

  • Find the culprit of SWAP on the MySQL server . Now we have basically made it clear that the mysqld process consumes a lot of memory due to a memory leak, and was eventually oom-killed. So, how do we know why the mysqld process consumes memory and where is it used? Fortunately, MySQL 5.7 P_S (short for performance_schema) integrates such a function, which can help us understand this information easily. Therefore, we can find out who is consuming more memory in MySQL by executing the following SQL: yejr@imysql> select event_name,SUM_NUMBER_OF_BYTES_ALLOC from memory_summary_global_by_event_name order by SUM_NUMBER_OF_BYTES_ALLOC desc LIMIT 10;
    In Figure 3 we noticed that " memory/innodb/mem0mem " consumes the most memory. Then execute the following SQL to see which internal threads consume more memory: yejr@imysql>select event_name, SUM_NUMBER_OF_BYTES_ALLOC from memory_summary_by_thread_by_event_name order by SUM_NUMBER_OF_BYTES_ALLOC desc limit 20;
    In Figure 4, we noticed that in the above results, many non-MySQL internal background threads (thread_id value is relatively large) use "memory/innodb/mem0mem", and the memory consumption is also relatively large. Seeing this phenomenon, my sixth sense tells me that it should be caused by too high concurrent threads or unreasonable thread allocation. Because the front end is a short PHP connection, not a long connection, there should not be so many threads, presumably caused by multi-threaded connections or thread pools. So I checked the parameter options and status related to the number of threads and connections. The basic confirmation should be that the thread pool was opened, which caused a memory leak and continued to consume memory. In the end, the mysqld process consumed too much memory, which was given by the system. oom-kill. The following is my thread pool related settings:
    Figure 5 When we turn off the thread pool function, the memory leak problem also disappears, the mysqld process is never oom-killed anymore, and it seems to be a solution. After several repeated tests, the following conclusions were finally observed:
  • Opening P_S and thread pool at the same time will cause memory leaks;
  • Open P_S and thread pool at the same time, but use the "one-thread-per-connection" mode instead of the "pool-of-threads" mode (only the extra port function is used), and memory leaks will not occur;
  • Only open P_S, do not open the thread pool, there will be no memory leaks;
  • Do not open P_S, only open the thread pool, and there will be no memory leaks;
  • Close P_S and thread pool at the same time, no memory leak will occur;

To explain, my MySQL version is:

  • 5.7.17-11-log Percona Server (GPL), Release 11, Revision f60191c earlier using the official version 5.7.13 is also problematic. 3. Conclusions and suggestions. In front-end applications often have short bursts of connections or similar scenarios, opening the thread pool is very helpful to alleviate the queue of user connection requests , and can avoid the problem of MySQL connections being full and bursting in an instant. But the thread pool is not suitable for all scenarios, as the problems encountered in this case are not suitable (I opened the thread pool more for the extra port function). As we mentioned above, starting from MySQL 5.7, a lot of useful views have been added to help us further understand some things happening inside MySQL. The following memory-related views have been added to P_S:
  • memory_summary_by_account_by_event_name
  • memory_summary_by_host_by_event_name
  • memory_summary_by_thread_by_event_name
  • memory_summary_by_user_by_event_name
  • memory_summary_global_by_event_name

Looking at the name of the view, we know that this can help us to view memory consumption statistics from multiple perspectives such as account (including authorized host information), host, thread, user (not including authorized host information), and the overall overall situation .

In addition to memory statistics, there are some views related to transactions and replication, and some of the original views have been further enhanced.

在MySQL 5.7中,还集成了sys schema,关于sys schema大家可以看本文下方的推荐链接。sys schema主要是对Information_schema以及Performance_Schema的视图进一步增强处理,提高结果的可读性。比如,我们可以查看当前实例总消耗的内存,以及内存主要由哪些部分给占用了,也可以透过sys schema来查看:


建议大家应该花些时间好好再深入理解下I_S、P_S、sys schema,对我们这些非底层代码的MySQL DBA而言,这些都是很好的辅助手段。

Reference: 在 MySQL 5.7下排查内存泄露和 OOM 问题全过程 - 云+社区 - 腾讯云