Server CPU remains high: the troubleshooting process

Basic overview

On a server cluster, CPU usage on the servers stayed high for long periods and response times were consistently slow. Even after the servers' CPU capacity was expanded, CPU usage did not drop noticeably.

The causes of high CPU usage can be summarized as follows:

  • Too many loops, or infinite loops
  • Too much data is loaded, which produces many large objects
  • Too many objects are created, so the GC runs too frequently (for example, because of string concatenation)
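To make the third cause concrete, here is a minimal sketch that counts how many generation 0 collections occur while strings are concatenated in a loop. The class and method names (GcPressureDemo, Gen0CollectionsDuringConcat) are invented for this illustration; only System.GC comes from the framework, and the exact count will vary by machine and runtime.

```csharp
using System;

public static class GcPressureDemo
{
    // Concatenates `count` short strings in a loop and returns how many
    // generation 0 collections were recorded while doing so.
    public static int Gen0CollectionsDuringConcat(int count)
    {
        int before = GC.CollectionCount(0);
        var s = "";
        for (int i = 0; i < count; i++)
            s += "'" + i + "',"; // each pass allocates a new, longer string
        GC.KeepAlive(s);
        return GC.CollectionCount(0) - before;
    }

    public static void Main()
    {
        Console.WriteLine("Gen 0 collections: " + Gen0CollectionsDuringConcat(10000));
    }
}
```

A count well above zero for a single call suggests how quickly this pattern adds up when many requests run it concurrently.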

In these situations, the hard part is not optimizing the code but locating the problem. Below, we capture a memory dump and analyze it to locate the problem. Before getting to that, we first review the basics of garbage collection in .NET and prepare the tools.

Basic knowledge

Garbage collection trigger conditions

  • Generation 0 is full (the most common trigger)
  • Code explicitly calls the static System.GC.Collect method
  • Windows reports a low-memory condition
  • The CLR is unloading an AppDomain
  • The CLR is shutting down
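As a rough illustration of the explicit trigger, the sketch below calls GC.Collect and compares GC.CollectionCount before and after. The names GcTriggerDemo and ForcedGen2Collections are made up for this example; GC.Collect and GC.CollectionCount are the framework calls being shown.

```csharp
using System;

public static class GcTriggerDemo
{
    // Forces one full collection and returns how many generation 2
    // collections were recorded while doing so.
    public static int ForcedGen2Collections()
    {
        int before = GC.CollectionCount(2);
        GC.Collect(); // explicit trigger: collects all generations
        return GC.CollectionCount(2) - before;
    }

    public static void Main()
    {
        Console.WriteLine("Gen 2 collections caused by GC.Collect(): " + ForcedGen2Collections());
    }
}
```

The same counters can be sampled around any suspicious code path to see whether it is causing collections.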

Large object garbage collection

The CLR divides objects into large objects and small objects. Objects of 85,000 bytes or more are considered large objects. The CLR treats large objects and small objects differently:

  • Large objects are not allocated in the same address space as small objects, but elsewhere in the process address space (the large object heap)
  • By default, the GC does not compact large objects, because moving them in memory is too expensive. This can fragment the address space and eventually cause an OutOfMemoryException to be thrown.
  • Large objects are always treated as generation 2, so they are collected only during generation 2 collections.
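The threshold and the generation rule can be observed with a short sketch like the one below. LargeObjectDemo and GenerationOfNewArray are hypothetical names used only here; GC.GetGeneration is the framework call that reports an object's generation.

```csharp
using System;

public static class LargeObjectDemo
{
    // Returns the generation the GC reports for a freshly allocated byte array.
    public static int GenerationOfNewArray(int length)
    {
        var buffer = new byte[length];
        return GC.GetGeneration(buffer);
    }

    public static void Main()
    {
        // 85,000 bytes of payload puts the array over the large-object threshold.
        Console.WriteLine("byte[85000] -> generation " + GenerationOfNewArray(85000));
        // A small array normally starts in generation 0.
        Console.WriteLine("byte[1000]  -> generation " + GenerationOfNewArray(1000));
    }
}
```

The large array should report generation 2 immediately after allocation, which is why code that keeps allocating large buffers can force expensive full collections.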

Tool preparation

  1. Download and install WinDbg
  2. Prepare the related DLLs, clr.dll and sos.dll (both are in the installation directory of the corresponding .NET version; on my machine that is C:\Windows\Microsoft.NET\Framework64\v4.0.30319)
  3. A dump file captured while CPU usage is high (how to obtain it is described below)
  4. Prepare the test code. For the convenience of the demonstration, here is a simple piece of code with a potential problem:
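using System.Collections.Generic;

public class Common
{
    public static List<string> GetList()
    {
        var list = new List<string>();
        for (int i = 0; i < 10000; i++)
            list.Add(i.ToString()); // loop body is missing in the source; assumed
        return list;
    }

    public static string GetString(List<string> list)
    {
        var str = "";
        foreach (var l in list)
            str += string.Format("'{0}',", l); // allocates a new string on every pass
        if (str.Length > 0)
            str = str.TrimEnd(','); // body is missing in the source; assumed to trim the trailing comma
        return str;
    }
}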
Strings are immutable, so every concatenation produces a new string object while the old one becomes garbage. The GetString method therefore causes a large number of GC operations. Next, we call this code and watch the CPU. To simulate concurrency, we open multiple browser tabs and refresh each tab once per second.
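For comparison, a lower-allocation variant of GetString might look like the sketch below. It is not part of the original test code: the class name CommonFixed is hypothetical, and it assumes the intended result is a comma-separated list of quoted values with no trailing comma.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class CommonFixed
{
    // Builds "'a','b','c'" in one growing buffer instead of
    // allocating a new string for every item.
    public static string GetString(List<string> list)
    {
        var sb = new StringBuilder();
        foreach (var l in list)
        {
            if (sb.Length > 0)
                sb.Append(',');
            sb.Append('\'').Append(l).Append('\'');
        }
        return sb.ToString();
    }

    public static void Main()
    {
        var list = new List<string> { "a", "b", "c" };
        Console.WriteLine(GetString(list)); // prints 'a','b','c'
    }
}
```

StringBuilder appends into a reusable buffer, so the number of temporary strings no longer grows with the length of the list.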

Capture the dump

In Task Manager, select the w3wp.exe process that corresponds to the application pool, right-click it, and choose Create dump file. When the dump has been created, Task Manager shows the path where the file was saved.

After the steps above, the files needed for the analysis (the dump file, sos.dll and clr.dll) are ready.

Analyze the dump

  • Open WinDbg and load the dump file
  • Configure the symbol path by adding "cache*c:\mysymbol;srv*"
  • Load sos.dll and clr.dll with the commands .load D:\windbg\sos.dll and .load D:\windbg\clr.dll
  • Run !threadpool, one of the SOS debugging extension commands, to display information about the managed thread pool
  • Run !runaway to list the threads that have consumed the most CPU time
  • Run ~22s to switch to thread 22, then kb to view its call stack
  • Run ~* kb to view the call stacks of all threads
  • Search that output for threads in which a GC or a large-object allocation appears (Ctrl+F for GarbageCollectGeneration and allocate_large_object)
  • The search shows that the thread that triggered the GC is thread 31
  • Run ~31s to switch to thread 31, then run !clrstack to view its managed call stack. This pinpoints the problematic code: the string concatenation creates a large number of objects, which triggers the GC.

Author: Old pay. Free to reprint for non-commercial use, without derivatives, and with the attribution kept (Creative Commons 3.0 License).

Reference: Server CPU remains high-problem solving process-cloud + community-Tencent Cloud