[Beowulf] Performance degrading
Reuti
reuti at staff.uni-marburg.de
Mon Dec 21 16:01:57 PST 2009
Hi,
On 15.12.2009, at 23:22, Jörg Saßmannshausen wrote:
> Hi Gus,
>
> thanks for your comments. The problem is not that there are 5
> NWChem processes running.
> I am only starting 4 processes, and the additional one is the master,
> which does nothing more than coordinate the slaves.
isn't NWChem using Global Arrays (GA) internally, with Open MPI only
used as the communication layer? Which version of GA is included with
your current NWChem?
-- Reuti
> Other parts of the program behave more as you would expect (again
> running in parallel across nodes; the figures below are taken from
> one node):
> 14902 sassy 25 0 2161m 325m 124m R 100 2.8 14258:15 nwchem
> 14903 sassy 25 0 2169m 335m 128m R 100 2.9 14231:15 nwchem
> 14901 sassy 25 0 2177m 338m 133m R 100 2.9 14277:23 nwchem
> 14904 sassy 25 0 2161m 333m 132m R 97 2.9 14213:44 nwchem
> 14906 sassy 15 0 978m 71m 69m S 3 0.6 582:57.22 nwchem
>
> As you can see, there are 5 NWChem processes running, but the fifth
> one does very little.
> So to me it looks like the inter-node communication is the problem
> here, and I would like to pin that down.
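One way to pin this down, independent of NWChem, is a plain MPI
ping-pong timing between two ranks placed on different nodes. A minimal
sketch (assuming Open MPI's mpicc wrapper; the file name pingpong.c is
only a placeholder):

    /* pingpong.c - time round trips of a small message between ranks 0
       and 1. Run with exactly two ranks, one per node, via your usual
       mpirun. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, i, iters = 1000;
        char buf[1024] = {0};           /* 1 kB payload */
        double t0, t1;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        MPI_Barrier(MPI_COMM_WORLD);
        t0 = MPI_Wtime();
        for (i = 0; i < iters; i++) {
            if (rank == 0) {
                MPI_Send(buf, sizeof(buf), MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(buf, sizeof(buf), MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
            } else if (rank == 1) {
                MPI_Recv(buf, sizeof(buf), MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
                MPI_Send(buf, sizeof(buf), MPI_CHAR, 0, 0, MPI_COMM_WORLD);
            }
        }
        t1 = MPI_Wtime();

        if (rank == 0)
            printf("average round trip: %g us\n",
                   (t1 - t0) / iters * 1e6);

        MPI_Finalize();
        return 0;
    }

Run once with the two ranks on different nodes and once with both on
the same node; the difference in round-trip time shows what the
interconnect costs, independent of anything NWChem or GA does on top.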
>
> For example, on the new dual quad-core node I can get:
> 13555 sassy 20 0 2073m 212m 113m R 100 0.9 367:57.27 nwchem
> 13556 sassy 20 0 2074m 209m 109m R 100 0.9 369:11.21 nwchem
> 13557 sassy 20 0 2074m 206m 107m R 100 0.9 369:13.76 nwchem
> 13558 sassy 20 0 2072m 203m 103m R 100 0.8 368:18.53 nwchem
> 13559 sassy 20 0 2072m 178m 78m R 100 0.7 369:11.49 nwchem
> 13560 sassy 20 0 2072m 172m 73m R 100 0.7 369:14.35 nwchem
> 13561 sassy 20 0 2074m 171m 72m R 100 0.7 369:12.34 nwchem
> 13562 sassy 20 0 2072m 170m 72m R 100 0.7 368:56.30 nwchem
> So here there is no inter-node communication, and hence I get the
> performance I would expect.
>
> The main problem is that I am no longer the administrator of that
> cluster, so anything that requires root access is not possible
> for me :-(
>
> But thanks for your 2 cents! :-)
>
> All the best
>
> Jörg
>
> On Tuesday, 15 December 2009, beowulf-request at beowulf.org wrote:
>> Hi Jorg
>>
>> If you have single quad-core nodes, as you said,
>> then top shows that you are oversubscribing the cores:
>> there are five nwchem processes running.
>>
>> In my experience, oversubscription only works with relatively
>> light MPI programs (say, the example programs that come with
>> Open MPI or MPICH).
>> Real-world applications tend to become very inefficient,
>> and can even hang, on oversubscribed CPUs.
>>
>> What happens when you launch four or fewer processes
>> on a node instead of five?
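A quick way to see what actually lands where is to have every rank
print its host name; a minimal sketch (not NWChem-specific, assuming an
MPI C compiler wrapper such as mpicc; whereami.c is only a placeholder
name):

    /* whereami.c - print which host each MPI rank is running on */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, size, len;
        char host[MPI_MAX_PROCESSOR_NAME];

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        MPI_Get_processor_name(host, &len);

        printf("rank %d of %d on %s\n", rank, size, host);

        MPI_Finalize();
        return 0;
    }

If more ranks show up on a host than it has cores, that node is
oversubscribed, regardless of what the application itself reports.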
>>
>> My $0.02.
>> Gus Correa
>> ---------------------------------------------------------------------
>> Gustavo Correa
>> Lamont-Doherty Earth Observatory - Columbia University
>> Palisades, NY, 10964-8000 - USA
>> ---------------------------------------------------------------------