[bproc] MPI chokes
Arthur H. Edwards
edwards@icantbelieveimdoingthis.com
Thu, 15 Mar 2001 08:46:55 -0700
Erik Arjan Hendriks wrote:
> On Wed, Mar 14, 2001 at 04:44:29PM -0700, Art Edwards wrote:
>
>> I've installed Scyld on a small cluster and I'm trying to
>> run the test programs that come with beompi.
>>
>> The codes run on one node. However, when I try to run
>> on multiple nodes, I get the following error:
>>
>> jarrett/home/edwardsa>mpirun -np 2 pi3p
>> p0_28682: p4_error: net_create_slave: bproc_rfork: -1
>> p4_error: latest msg from perror: Invalid argument
>> jarrett/home/edwardsa>bm_list_28683: p4_error: interrupt SIGINT: 2
>>
>> I have asked about this in a previous message, so here
>> are two more specific questions.
>>
>> The master node has a hostname that is not node0. The first
>> slave node, as far as beosetup is concerned, is node0. Is this a problem?
>
> In BProc's terms, the nodes are numbered 0 through n-1. The front end
> is node -1.
>
>
>> When beompi assigns nodes, does it look at a machines file?
>> Should I install a HOSTNAME file on each slave?
>
> BProc doesn't use any host names anywhere, so nothing involving
> hostnames will affect whether or not an rfork works.
>
> There's some other MPI issue going on here.
>
> - Erik
>
Thanks for the reply. The program dies in the PMPI_INIT phase. What
should I be doing to figure this out?
Art Edwards
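
For reference, the node numbering Erik describes (slave nodes 0 through n-1, front end as node -1) can be sketched roughly as below. This is only an illustration in Python: FRONT_END and describe_node are hypothetical names made up for this sketch, not part of BProc or beompi, and the cluster size of 4 is an assumed example.

```python
# Illustration only: these names are hypothetical, not BProc's API.
FRONT_END = -1  # the front end (master) is node -1, per Erik's reply


def describe_node(node, num_nodes):
    """Label a node number under the 0..n-1 / -1 numbering convention."""
    if node == FRONT_END:
        return "front end"
    if 0 <= node < num_nodes:
        return "slave node %d" % node
    raise ValueError("node %d is outside -1..%d" % (node, num_nodes - 1))


# Assumed example: one front end plus 4 slave nodes.
for n in range(-1, 4):
    print(n, describe_node(n, 4))
```

Under this convention, a first slave listed as node0 while the master carries some other hostname is the expected layout, which matches Erik's point that hostnames play no part in it.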