[Beowulf] Troubleshooting NFS stale file handles

Gavin W. Burris bug at wharton.upenn.edu
Thu Apr 20 06:17:32 PDT 2017


Hi, Prentice.

Have you checked that the MTU matches on all NICs and is honored by the router?
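
A rough sketch of that check, in case it helps. It assumes Linux hosts that expose the MTU under /sys/class/net, and the function names (local_mtus, find_mtu_mismatches, icmp_payload_for_mtu) are made up for illustration, not from any existing tool. Gathering the per-host values (ssh, pdsh, whatever you use) is left out.

```python
from collections import Counter
from pathlib import Path


def local_mtus(sysfs_root="/sys/class/net"):
    """Return {interface: mtu} for this host (assumes the Linux sysfs layout)."""
    mtus = {}
    for mtu_file in Path(sysfs_root).glob("*/mtu"):
        name = mtu_file.parent.name
        if name == "lo":
            # Loopback normally has its own large MTU; skip it.
            continue
        mtus[name] = int(mtu_file.read_text().strip())
    return mtus


def find_mtu_mismatches(mtus):
    """Return the entries whose MTU differs from the most common value."""
    if not mtus:
        return {}
    expected, _ = Counter(mtus.values()).most_common(1)[0]
    return {name: mtu for name, mtu in mtus.items() if mtu != expected}


def icmp_payload_for_mtu(mtu):
    """Largest ICMP echo payload that fits in one IPv4 packet of size mtu.

    Subtracts 20 bytes of IPv4 header (no options) and 8 bytes of ICMP header.
    """
    return mtu - 28
```

Feed find_mtu_mismatches a dict such as {"hostA:eth0": 9000, "hostB:eth0": 9000, "hostC:eth0": 1500} and it returns the odd one out. To see whether the router honors the size, the payload from icmp_payload_for_mtu can be used with a don't-fragment ping; on Linux iputils that is along the lines of 'ping -M do -s 8972 hostA' for a 9000-byte MTU.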

Cheers.

On Wed 04/19/17 02:34PM EDT, Prentice Bisbal wrote:
> 
> On 04/19/2017 02:17 PM, Ellis H. Wilson III wrote:
> >On 04/19/2017 02:11 PM, Prentice Bisbal wrote:
> >>Thanks for the suggestion(s). Just this morning I started considering
> >>the network as a possible source of error. My stale file handle errors
> >>are easily fixed by just restarting the nfs servers with 'service nfs
> >>restart', so they aren't as severe as you describe.
> >
> >If a restart on solely the /server-side/ gets you back into a good
> >state this is an interesting tidbit.
> That is correct, restarting NFS on the server-side is all it takes
> to fix the problem.
> >Do you have some form of HA setup for NFS?  Automatic failover
> >(sometimes set up with IP aliasing) in the face of network hiccups
> >can occasionally goof the clients if they aren't set up properly to
> >keep up with the change.  A restart of the server will likely
> >revert back to using the primary, resulting in the clients
> >thinking everything is back up and healthy again.  This situation
> >varies so much between vendors it's hard to say much more without
> >more details on your setup.
> >
> My setup isn't nearly that complicated. Every node in this cluster
> has a /local directory that is shared out to the other nodes in the
> cluster. The other nodes automount this remote directory as
> /l/hostname, where "hostname" is the name of the owner of the
> filesystem. For example, hostB will mount hostA:/local as /l/hostA.
> 
> No fancy fail-over or anything like that.
> >Best,
> >
> >ellis
> >
> >P.S., apologies for the top-post last time around.
> >
> No worries. I'm so used to people doing that in mailing lists that
> I've become numb to it.
> 
> Prentice
> 
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- 
Gavin W. Burris
Senior Project Leader for Research Computing
The Wharton School
University of Pennsylvania
Search our documentation: http://research-it.wharton.upenn.edu/about/
Subscribe to the Newsletter: http://whr.tn/ResearchNewsletterSubscribe

