[Beowulf] p4_error: net_recv read:  probable EOF on socket: 1
    Mark Hahn 
    hahn at physics.mcmaster.ca
       
    Mon May  8 11:04:34 PDT 2006
    
    
  
> p4_error:interrupt SIGSEGV: 11
well, some program tried to access memory it has no right to touch.
note that this _can_ be due to hardware problems (overheating,
bad memory, etc.).
> p4_error: net_recv read:  probable EOF on socket: 1
afaik, this is from a different node and just means that it noticed
its socket to the peer that SEGV'ed had been closed.
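
to illustrate the EOF part: a minimal sketch in Python (not mpich's p4 code) of what the surviving side sees. the socketpair stands in for the TCP connection between two ranks, and the explicit close() stands in for the kernel closing a crashed process's descriptors; both are assumptions made for the demo, and the names peer_closed and demo are made up.

```python
import socket


def peer_closed(sock):
    """Return True if a blocking read on sock hits EOF (peer closed its end)."""
    data = sock.recv(4096)
    # recv() returning zero bytes on a stream socket means end-of-file
    return data == b""


def demo():
    # socketpair() stands in for the connection between two MPI ranks (assumption)
    survivor, crashed = socket.socketpair()
    # when a process dies (e.g. from SIGSEGV) the kernel closes its descriptors;
    # close() here simulates that
    crashed.close()
    try:
        return peer_closed(survivor)
    finally:
        survivor.close()


if __name__ == "__main__":
    print("peer closed:", demo())
```

the point is only that the EOF is a symptom seen by the healthy node; the thing to debug is the rank that segfaulted.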
> This error occurs after running the code for several hours using all
> processors in my cluster.  I have seen several postings similar to this
> on the web, however, I have not seen any posted solutions.  My
for a good reason - the problem is probably particular to the cluster,
not general to the software...
> Mpich_1.2.1 compiled w/ Portland compilers
that said, it seems unwise to be running such an old version.
wow, that release actually dates from 09/05/2000, at least according to the
timestamps on the mpich ftp server...
    
    