[Insight-users] Multithreading registration and Condor
Sara Rolfe
smrolfe at u.washington.edu
Tue Sep 27 13:41:54 EDT 2011
Hello,
I'm attempting to run a multithreaded registration. I am using a
computing cluster with 13 nodes managed by Condor. Each node is a
dual quad-core with 32GB RAM, running the 64-bit version of RHEL.
I recompiled ITK with the CMake flags:
ITK_USE_REVIEW
ITK_USE_OPTIMIZED_REGISTRATION_METHODS
and added the line:
registration->SetNumberOfThreads( 8 );
to my registration code. I've run the registration with the number of
threads set to 1, 4, and 8, but I'm getting no improvement in the
run time. I would appreciate any advice on debugging this. I suspect
that I am not actually getting the additional threads: when I look at
the job details, it looks like only one CPU was requested.
Below are the results from "condor_history -long pid" for the three
jobs (1, 4, and 8 threads):
2076.0 (1 thread)    Run Time: 0+05:28:28
    LocalUserCpu   0
    LocalSysCpu    0
    RemoteUserCpu  23858
    RemoteSysCpu   10
    RequestCpus    1

2075.0 (4 threads)   Run Time: 0+05:27:41
    LocalUserCpu   0
    LocalSysCpu    0
    RemoteUserCpu  23680
    RemoteSysCpu   9
    RequestCpus    1

2077.0 (8 threads)   Run Time: 0+05:30:14
    LocalUserCpu   0
    LocalSysCpu    0
    RemoteUserCpu  24025
    RemoteSysCpu   10
    RequestCpus    1
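
One rough way to read these numbers is to compare the CPU seconds Condor recorded with the wall-clock run time. The following is only a minimal sketch of that arithmetic in plain C++: the helper names `toSeconds` and `cpuToWallRatio` are made up for this illustration, and it assumes that RemoteUserCpu and RemoteSysCpu are reported in seconds and that the run time has the form days+HH:MM:SS.

```cpp
#include <cassert>

// Convert a Condor-style "days+HH:MM:SS" run time into seconds.
// (Hypothetical helper, not part of Condor or ITK.)
long toSeconds(long days, long hours, long minutes, long seconds) {
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds;
}

// Ratio of accounted CPU time to wall-clock time.
// A value near 1 means roughly one core's worth of CPU time was
// recorded per second of run time; a value near N would be expected
// if N threads had been busy for the whole run.
// Assumption: userCpu and sysCpu are in seconds.
double cpuToWallRatio(long wallSeconds, long userCpu, long sysCpu) {
    assert(wallSeconds > 0);
    return static_cast<double>(userCpu + sysCpu) / wallSeconds;
}
```

For the 1-thread job, `cpuToWallRatio(toSeconds(0, 5, 28, 28), 23858, 10)` comes out at about 1.21, and the 4- and 8-thread jobs give values close to 1.2 as well, so by this measure the three runs look nearly identical.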
I also reran the job requesting 8 threads so I could look at the
allocation. The job was allocated one slot on one node. I then logged
onto that node and ran the "top" command; the results are below.
Here it looks to me like multiple CPUs are being used. I'd appreciate
any thoughts on interpreting this.
top - 09:53:43 up 14 days, 20:45,  1 user,  load average: 2.31, 0.84, 0.31
Tasks: 233 total,   3 running, 230 sleeping,   0 stopped,   0 zombie
Cpu0  :  0.0%us,  0.0%sy, 40.2%ni, 59.8%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu1  :  0.0%us,  0.0%sy, 85.7%ni, 14.3%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu2  :  0.0%us,  0.3%sy, 49.2%ni, 50.5%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu3  :  0.0%us,  0.0%sy, 84.9%ni, 15.1%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu4  :  0.0%us,  0.3%sy, 45.0%ni, 54.6%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu5  :  0.0%us,  0.0%sy, 84.4%ni, 15.6%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu6  :  0.0%us,  0.3%sy, 39.9%ni, 59.8%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu7  :  0.0%us,  0.3%sy, 84.4%ni, 15.2%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu8  :  0.0%us,  0.3%sy, 95.7%ni,  4.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu9  :  0.0%us,  0.0%sy, 93.0%ni,  7.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu10 :  0.0%us,  0.0%sy, 85.7%ni, 14.3%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu11 :  0.0%us,  0.3%sy, 95.0%ni,  4.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu12 :  0.0%us,  0.0%sy, 94.7%ni,  5.3%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu13 :  0.0%us,  0.0%sy, 99.3%ni,  0.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu14 :  0.0%us,  0.0%sy, 92.0%ni,  8.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu15 :  0.0%us,  0.0%sy, 96.7%ni,  3.3%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  32948672k total,  3668612k used, 29280060k free,   359324k buffers
Swap: 31262480k total,        0k used, 31262480k free,  1014788k cached

  PID USER      PR  NI  VIRT  RES  SHR S   %CPU %MEM    TIME+  COMMAND
 3333 smrolfe   35  10 3004m 2.0g 3700 R 1267.0  6.4  5:40.47 condor_exec.exe
 3406 smrolfe   15   0 12880 1216  832 R    0.3  0.0  0:00.22 top
    1 root      15   0 10368  632  536 S    0.0  0.0  0:01.79 init
    2 root      RT  -5     0    0    0 S    0.0  0.0  0:00.04 migration/0
    3 root      34  19     0    0    0 S    0.0  0.0  0:00.00 ksoftirqd/0
    4 root      RT  -5     0    0    0 S    0.0  0.0  0:00.00 watchdog/0
    5 root      RT  -5     0    0    0 S    0.0  0.0  0:00.03 migration/1
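
To put a single number on the "top" snapshot, the per-core percentages can be added up and compared with the %CPU reported for condor_exec.exe. The sketch below does that in plain C++; the function names are made up for this illustration. It assumes the job's CPU time shows up in the "ni" column (the process runs with NI 10) and that nothing else niced was running on the node at the time.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// "ni" percentages for Cpu0..Cpu15, copied from the top snapshot.
std::vector<double> snapshotNice() {
    return {40.2, 85.7, 49.2, 84.9, 45.0, 84.4, 39.9, 84.4,
            95.7, 93.0, 85.7, 95.0, 94.7, 99.3, 92.0, 96.7};
}

// Sum of the per-core percentages (100% corresponds to one busy core).
double totalPercent(const std::vector<double>& perCore) {
    return std::accumulate(perCore.begin(), perCore.end(), 0.0);
}

// Equivalent number of fully busy cores at the moment of the snapshot.
double busyCores(const std::vector<double>& perCore) {
    assert(!perCore.empty());
    return totalPercent(perCore) / 100.0;
}
```

The per-core values add up to roughly 1265.8%, which is close to the 1267.0 %CPU shown for condor_exec.exe, or about 12.7 busy cores at that instant. That is more than the 8 threads requested with SetNumberOfThreads, so it may be worth checking which part of the pipeline is creating the extra threads.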