This third post of the ClrMD series focuses on how to retrieve the values of static and instance fields, taking timers as an example. The next post will dig into the details of figuring out which method gets called when a timer triggers. The associated code, which lists all timers in a dump, covers both articles.
Part 1: Bootstrapping ClrMD
Marshaling data from a dump
Beyond the heap navigation shown in the previous post, the big thing to understand about ClrMD is that the retrieved information is often an address. It is an address from another address space, because the dump is seen as another process, just as if you were debugging it live. Your code needs to access the other process's memory corresponding to this address, not directly with a pointer/reference indirection or with the raw Win32 ReadProcessMemory API function, but via ClrMD APIs such as GetObjectType or GetValue.
To illustrate how to navigate into the dump address space with ClrMD, we will show how to list the timers that have been started. This can be useful to investigate various issues, such as leaks or timers being stuck.
Know your framework
A naive implementation, like the string example of the previous post, would list all object instances in the CLR heap and keep only the Timer instances. However, as already mentioned, walking the whole heap is very slow, especially for 10+ GB dumps.
It is time to figure out what happens in the .NET runtime when your code creates a new timer. If the source code of the version of the CLR you are using is not available, start your favorite IL decompiler and look at the System.Threading.Timer implementation details. The parameters given to the constructors (such as the due time, the period, the callback method, and the optional state passed to the callback) are not stored in the class itself but in the TimerQueueTimer helper class.
The Timer constructor code, after a few sanity checks, calls the TimerSetup method to wrap a TimerQueueTimer in a TimerHolder that is stored in the Timer m_timer field.
This is where things start to become interesting: this TimerQueueTimer class adds each new instance into a linked list kept by a singleton object stored in the static s_queue field of the TimerQueue class. The following figure shows the relation between instances after three timers are created:
So… a fast way to list the timers would be to get the unique static instance of TimerQueue, look at its m_timers field and iterate on each TimerQueueTimer by following their m_next field until it contains null. The rest of the post details the following operations with ClrMD:
- quickly getting a ClrType
- reading a static field
- reading an instance field
to fill up a collection of our own TimerInfo used to easily create a summary:
This is wrapped inside a helper method described in the next few sections:
As explained in the previous post, you need to ensure that the process was not in the middle of a garbage collection when the dump was taken by checking the value of the ClrHeap.CanWalkHeap property.
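As a minimal sketch of that guard, assuming the pre-2.0 ClrMD API used in this series, the check can be isolated in a small method. The class and method names (`HeapGuard`, `EnsureHeapIsWalkable`) are hypothetical and only serve the illustration:

```csharp
using System;
using Microsoft.Diagnostics.Runtime;

public static class HeapGuard
{
    // Hypothetical helper: throws if the heap cannot be walked reliably,
    // for instance when the dump was taken in the middle of a garbage collection.
    public static void EnsureHeapIsWalkable(ClrHeap heap)
    {
        if (heap == null)
            throw new ArgumentNullException(nameof(heap));

        if (!heap.CanWalkHeap)
            throw new InvalidOperationException(
                "The heap is not in a walkable state; the results would not be reliable.");
    }
}
```

Throwing is only one option; a tool could just as well return an empty list of timers and warn the user.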
Standing on the shoulders of giants
I have found the different steps to get access to the static fields of classes in the ClrMD implementation from GitHub. In addition to the samples, I highly recommend that you take a look at the classes under Desktop:
These types are using optimized ways to access information from the CLR.
Let’s go back to our first goal: getting the value of the static s_queue field of the TimerQueue class. One very efficient optimization found in the ClrMD implementation is to get the ClrModule that defines a type and call its GetTypeByName method to obtain the ClrType directly, instead of iterating the heap until an instance of the type is found. In our case, we need to access TimerQueue, which is a type from mscorlib. Here is the code of the helper function from Desktop\threadpool.cs to get a ClrModule for mscorlib:
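The listing below is a sketch along the lines of that helper, not a verbatim copy of the ClrMD source. It assumes the pre-2.0 ClrMD API, where ClrRuntime exposes a Modules collection and each ClrModule has an AssemblyName; the `ModuleHelpers` class name is hypothetical:

```csharp
using System;
using Microsoft.Diagnostics.Runtime;

public static class ModuleHelpers
{
    // Returns the module corresponding to mscorlib, or null if it is not found.
    public static ClrModule GetMscorlib(ClrRuntime runtime)
    {
        foreach (ClrModule module in runtime.Modules)
        {
            // AssemblyName is usually a full path; guard against modules without a name.
            string name = module.AssemblyName;
            if (name != null &&
                name.IndexOf("mscorlib.dll", StringComparison.OrdinalIgnoreCase) >= 0)
            {
                return module;
            }
        }

        return null;
    }
}
```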
The following line sets timerQueueType with the ClrType corresponding to TimerQueue:
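A sketch of that lookup, assuming the pre-2.0 ClrMD API and a ClrModule for mscorlib already at hand; the wrapper class and method names are hypothetical:

```csharp
using Microsoft.Diagnostics.Runtime;

public static class TimerQueueTypeLookup
{
    // Returns the ClrType describing System.Threading.TimerQueue,
    // or null if the module does not define it.
    public static ClrType GetTimerQueueType(ClrModule mscorlib)
    {
        // The full name, including the namespace, is expected.
        ClrType timerQueueType = mscorlib.GetTypeByName("System.Threading.TimerQueue");
        return timerQueueType;
    }
}
```

A null result should be treated as "no timers can be listed" rather than as an error.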
Next, get the ClrStaticField corresponding to the static field s_queue:
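A sketch of this step, again assuming the pre-2.0 ClrMD API; the wrapper names are hypothetical:

```csharp
using Microsoft.Diagnostics.Runtime;

public static class TimerQueueStaticLookup
{
    // Returns the description of the static s_queue field,
    // or null if the type has no static field with that name.
    public static ClrStaticField GetQueueStaticField(ClrType timerQueueType)
    {
        ClrStaticField staticField = timerQueueType.GetStaticFieldByName("s_queue");
        return staticField;
    }
}
```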
The staticField variable is not the static instance but rather a way to access it… or them.
But where are my statics!
Let’s take some time to explain a “detail” of the .NET Framework to help you understand how to get the static TimerQueue instance. Unlike previous Windows frameworks, .NET allows a process to contain several running environments called application domains (a.k.a. AppDomains). For better isolation, each AppDomain has its own set of static variables: this is why you need to iterate over each AppDomain with ClrMD to access the static instances:
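A minimal sketch of that iteration, assuming the pre-2.0 ClrMD API in which ClrStaticField.GetValue takes a ClrAppDomain and returns the boxed address of the referenced object. The `StaticPerDomain` class and `GetTimerQueueAddresses` method are hypothetical names:

```csharp
using System.Collections.Generic;
using Microsoft.Diagnostics.Runtime;

public static class StaticPerDomain
{
    // Yields the address of the static TimerQueue instance of each AppDomain
    // in which the s_queue static field holds a non-null reference.
    public static IEnumerable<ulong> GetTimerQueueAddresses(
        ClrRuntime runtime, ClrStaticField staticField)
    {
        foreach (ClrAppDomain domain in runtime.AppDomains)
        {
            // GetValue returns an object: null or a boxed ulong address.
            ulong? timerQueue = (ulong?)staticField.GetValue(domain);
            if (!timerQueue.HasValue || timerQueue.Value == 0)
                continue;

            yield return timerQueue.Value;
        }
    }
}
```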
The address returned by ClrStaticField.GetValue is nullable because, in an AppDomain where no TimerQueue has ever been used, its fields won’t be initialized.
We don’t really need to map this address from the dump address space into something usable in the tool. Instead, we only need the value of its m_timers field to start iterating on the list of timers.
How to get the values of instance fields?
Now that we have an address in the dump and the ClrType describing the type of the corresponding object (TimerQueue here), it is easy to retrieve the value of one of its instance fields. Since this action is needed again and again to move from one TimerQueueTimer object to the next, it is valuable to create a helper method:
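Such a helper could look like the following sketch, which assumes the pre-2.0 ClrMD API (ClrHeap.GetObjectType, ClrType.GetFieldByName and ClrInstanceField.GetValue). The `FieldReader` class name is hypothetical:

```csharp
using Microsoft.Diagnostics.Runtime;

public static class FieldReader
{
    // Returns the value of the named instance field of the object stored
    // at the given address in the dump, or null if it cannot be resolved.
    public static object GetFieldValue(ClrHeap heap, ulong address, string fieldName)
    {
        // Ask the heap which type lives at this address.
        ClrType type = heap.GetObjectType(address);
        if (type == null)
            return null;

        ClrInstanceField field = type.GetFieldByName(fieldName);
        if (field == null)
            return null;

        // For a reference field, the returned object is the boxed address
        // of the referenced object in the dump.
        return field.GetValue(address);
    }
}
```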
The address of the object in the dump is used to get its ClrType. The ClrInstanceField (instead of a ClrStaticField as for the s_queue case) describing the field exposes the expected GetValue method. Note that the return value of GetValue is defined as System.Object, but you should understand it as the value stored in the dump (or the other process address space) at the given address. For simple value types such as booleans, numbers, and even ulong addresses, a cast is enough to transparently marshal the value into the tool with ClrMD.
Let’s go back to writing the code to access the head of the TimerQueueTimer list from the TimerQueue static instance:
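The walk over the list can be sketched as follows, assuming the pre-2.0 ClrMD API and the m_timers and m_next field names described earlier. The `TimerListWalker` class is hypothetical, and a small GetFieldValue helper is included so that the sketch compiles on its own:

```csharp
using System.Collections.Generic;
using Microsoft.Diagnostics.Runtime;

public static class TimerListWalker
{
    // Yields the address of each TimerQueueTimer linked from the given TimerQueue.
    public static IEnumerable<ulong> EnumerateTimerQueueTimers(
        ClrHeap heap, ulong timerQueueAddress)
    {
        // m_timers references the head of the linked list.
        object currentPointer = GetFieldValue(heap, timerQueueAddress, "m_timers");

        // A null object or a 0 address marks the end of the list.
        while (currentPointer != null && ((ulong)currentPointer) != 0)
        {
            ulong currentTimerQueueTimerRef = (ulong)currentPointer;
            yield return currentTimerQueueTimerRef;

            // Move to the next TimerQueueTimer in the list.
            currentPointer = GetFieldValue(heap, currentTimerQueueTimerRef, "m_next");
        }
    }

    // Reads an instance field of the object stored at the given address.
    private static object GetFieldValue(ClrHeap heap, ulong address, string fieldName)
    {
        ClrType type = heap.GetObjectType(address);
        ClrInstanceField field = type?.GetFieldByName(fieldName);
        return field?.GetValue(address);
    }
}
```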
currentPointer holds the address of each TimerQueueTimer in the list kept by the static TimerQueue.
Note the ((ulong)currentPointer) != 0 test in the while loop, which detects the end of the list when the m_next field is null.
After enumerating each timer, the next post will show how to extract details such as the due time, the period, and even which method is called when it ticks.