High CPU usage during backup

The technical support forum for Firestreamer-RM 2.x (the virtual tape drive for NTBackup).
chcpnf
Posts: 11
Joined: 29 Jul 2011, 14:21

Post by chcpnf »

So, here I am again, with some logs. I have plotted some performance indicators against time. The backup started on 8/12 at 21:00. Sorry for the German labels.
Attachments
Transactions involving the drive where the backup is written to.
Transactions_Dest.png (10.03 KiB)
CPU usage vs time.
CPU_Time.png (8.16 KiB)
Available RAM in megabytes. The steady decline is consistent with a memory leak.
Available_RAM.png (8.57 KiB)

Post by chcpnf »

And some more...
Attachments
Task manager after the backup has finished.
Taskman_After.png (7.62 KiB)
Task manager during backup.
Taskman_Backup.png (8.13 KiB)
Transactions involving the drive where the backup is stored.
Transactions_Dest.png (10.03 KiB)
jsf
Cristalink Support
Posts: 300
Joined: 29 Aug 2010, 09:03

Post by jsf »

In the Task Manager process list, check which process uses the most memory. If it's System, then the leak is in some driver (is a Silicon Image controller involved?); see the FAQ link in my previous post for how to troubleshoot it. If it's a user-mode process instead, you can simply disable it.
Best regards,
John Smith
Cristalink Support

Post by chcpnf »

I had a screenshot of the process list as well, but didn't save it. Bottom line, no process showed abnormal memory use. In the two Task Manager screenshots, however, the main difference is the amount of kernel memory ("Kernel-Speicher"), which points to a driver-related problem. The KB article dealing with storport.sys apparently does not apply to Windows Server 2003, but I'll look into the poolmon utility.
There is indeed a Silicon Image controller (3114) present in the system, but it is not involved in the backup process. The source drive is a RAID on a 3ware 9500s controller, the destination is the aforementioned iSCSI network drive.

Post by jsf »

You may want to search the Internet for more info on how to track down kernel memory leaks.

You can also try Driver Verifier Manager (verifier.exe). Select several drivers to verify at a time. Select "Special Pool" and "Pool Tracking" and nothing else. Verifier lets you see certain info about the "currently verified drivers", including the number and total size of their memory allocations.
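If you prefer the command line to the GUI, the same selection can be expressed with verifier.exe's /flags and /driver switches. The Python sketch below only assembles the command and does not run it. It assumes the documented option bits (bit 0 for Special Pool, bit 3 for Pool Tracking); NRDVDSTR.sys is taken from the stack traces in this thread and example.sys is a placeholder, so substitute the drivers you actually want to check.

```python
# Option bits for verifier.exe's /flags switch (assumed from the Driver Verifier docs).
SPECIAL_POOL = 0x1   # bit 0
POOL_TRACKING = 0x8  # bit 3


def build_verifier_command(drivers):
    """Return the argument list that enables only Special Pool and Pool Tracking."""
    if not drivers:
        raise ValueError("name at least one driver to verify")
    flags = SPECIAL_POOL | POOL_TRACKING
    return ["verifier", "/flags", hex(flags), "/driver", *drivers]


if __name__ == "__main__":
    # example.sys is a hypothetical placeholder, not a driver from this thread.
    print(" ".join(build_verifier_command(["NRDVDSTR.sys", "example.sys"])))
```

The settings take effect after a reboot. Running "verifier /query" afterwards should print the per-driver allocation counters, and "verifier /reset" turns verification off again.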
Best regards,
John Smith
Cristalink Support

Post by chcpnf »

Thanks, will do.

Post by chcpnf »

OK, one last try... I have taken some stack traces in Process Explorer during the backup; maybe they contain some hints. The CPU use is divided between Interrupts and the System process. Inside System, I captured the stacks of the two threads that were using the most CPU time at that moment.
I also had a look at the output of verifier.exe, but I'm unable to make much of it. However, there seem to be no failing pool memory allocations.
Attachments
Stack trace of the NRDVDSTR.sys thread consuming most CPU time.
NRDVDSTR.png (4.84 KiB)
Stack trace of the ntoskrnl.exe thread consuming most CPU time.
ntoskrnl.png (3.5 KiB)
System process in Process Explorer
System.png (8.88 KiB)

Post by jsf »

It doesn't really matter why the CPU usage increases when memory is low. There may be many different reasons. One of them, judging by your last screenshots, is that Windows is busy switching thread contexts.

In Verifier, look not for failed allocations but for the total size and number of allocations. The driver that leaks memory will show these values increasing constantly as the backup progresses. Eventually it will have tens or hundreds of megabytes allocated, compared with the low values at the beginning of the backup. Check the drivers in the storage stack first.
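If you note down the per-driver totals a few times during the backup, the comparison can be automated. The following Python sketch assumes you have typed the readings into a list of snapshots yourself (Verifier does not produce this format). The driver names, the numbers, and the 10 MB threshold are all made up for illustration.

```python
MB = 1024 * 1024


def find_growing_drivers(snapshots, min_growth_bytes=10 * MB):
    """Return (driver, growth_in_bytes) for drivers whose allocated total
    never shrinks between snapshots and grows by at least min_growth_bytes.

    snapshots: list of dicts mapping driver name -> bytes allocated,
    ordered from the start of the backup to the end.
    """
    if len(snapshots) < 2:
        return []
    suspects = []
    for driver in snapshots[0]:
        series = [snap[driver] for snap in snapshots if driver in snap]
        never_shrinks = all(later >= earlier
                            for earlier, later in zip(series, series[1:]))
        growth = series[-1] - series[0]
        if never_shrinks and growth >= min_growth_bytes:
            suspects.append((driver, growth))
    # Largest growth first: the most likely culprit heads the list.
    return sorted(suspects, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical readings taken at three points during a backup.
    readings = [
        {"a.sys": 1 * MB, "b.sys": 2 * MB},
        {"a.sys": 40 * MB, "b.sys": 2 * MB},
        {"a.sys": 120 * MB, "b.sys": 3 * MB},
    ]
    for name, growth in find_growing_drivers(readings):
        print(f"{name}: grew by {growth // MB} MB")
```

A driver whose total rises and falls again is ignored on purpose, since normal caching behaves that way; only steady growth is treated as suspicious.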
Best regards,
John Smith
Cristalink Support