[Beowulf] p4_error
    Mark Hahn 
    hahn at physics.mcmaster.ca
       
    Thu Dec 29 14:33:05 PST 2005
    
    
  
> The following are usual errors that we have to counter every day::
> 
> p11_2754:(1.148519)net_recv failed for fd=3.
> p11_22754 : p_4error net_recv read,errno=:104
> p16_2754 : p4_error : interrupt S1GSEGV:11

your program on p16 seg-faults (bad address, etc.; it could be your program,
some library, or marginal hardware) and dies.  p11 was trying to communicate
with it, and quite sensibly reports that the socket between them has
disappeared.  errno 104 is ECONNRESET:
/usr/include/asm/errno.h:#define        ECONNRESET      104     /* Connection reset by peer */
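
if you'd rather not grep headers, the same lookup can be done from a script.
this is only a minimal sketch using the Python standard library; the helper
names describe_errno and describe_signal are made up for illustration and are
not part of MPICH/p4.  the numeric values are platform-specific (104 is the
Linux value of ECONNRESET), so the sketch uses the symbolic constants instead
of hard-coding numbers.

```python
import errno
import os
import signal


def describe_errno(code: int) -> str:
    """Return the symbolic name and system message for an errno value."""
    # errno.errorcode maps numeric values to names on the current platform.
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name} ({code}): {os.strerror(code)}"


def describe_signal(signum: int) -> str:
    """Return the symbolic name for a signal number on this platform."""
    return f"{signal.Signals(signum).name} ({signum})"


if __name__ == "__main__":
    # The errno that p11 reported, and the signal that killed p16.
    print(describe_errno(errno.ECONNRESET))
    print(describe_signal(signal.SIGSEGV))
```

run it on one of the compute nodes, so the numbers are decoded with that
node's own tables rather than your workstation's.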