[tulip-bug] ADMtek Comet bug

Donald Becker becker@scyld.com
Tue, 17 Oct 2000 21:36:44 -0400 (EDT)


On Tue, 17 Oct 2000, Dan Hollis wrote:

> Linksys v4.0 and v4.1 cards seem to randomly hang on transmit. The cards
> continue to receive packets but are completely unable to transmit.
> Unplugging and replugging the RJ45 cable has no effect. An ifdown/ifup
> will clear the hang for a while, until it locks up again some hours or
> days later.
> 
> This is a repeatable bug on 2 completely different machines in completely
> different locations connected to completely different networks with
> completely different switches:
> 
> Athlon K7/700 on KX133 chipset connected to Linksys EZXS55W
> Celeron 366 on 440BX chipset connected to SMC EZ-NET 24SW

We are using a bunch of Centaur/P cards on our clusters and are not seeing
any problems. 

> When the card hangs, the driver prints this over and over:
> 
> eth0: Transmit timed out, status fc664010, CSR12 00000000, resetting...

The status indicates no problems.
Run 'tulip-diag' when this occurs.
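
For anyone wanting to pick the reported status word apart by hand, here is a
minimal sketch. It assumes the status value is laid out like CSR5 on the DEC
21143 (summary bits at 15 and 16, receive state in bits 17-19, transmit state
in bits 20-22, bus error code in bits 23-25). The Comet may not follow that
layout exactly, and decode_csr5 is a hypothetical helper, not part of the
driver or of tulip-diag.

```python
def decode_csr5(status):
    """Split a status word into fields, ASSUMING the DEC 21143 CSR5 layout."""
    return {
        "abnormal_summary": (status >> 15) & 1,  # assumed bit 15
        "normal_summary": (status >> 16) & 1,    # assumed bit 16
        "rx_state": (status >> 17) & 7,          # assumed bits 17-19
        "tx_state": (status >> 20) & 7,          # assumed bits 20-22
        "bus_error": (status >> 23) & 7,         # assumed bits 23-25
    }


# Status value taken from the "Transmit timed out" message quoted above.
fields = decode_csr5(0xFC664010)
for name, value in fields.items():
    print(f"{name:17s} {value}")
```

What the state codes mean should be checked against the datasheet for the
actual chip rather than read off this sketch.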

> It looks like, despite the driver's claim of "resetting", it doesn't really
> reset the card. Only an ifdown/ifup will clear the hang.
> 
> The bug happens with either v0.92 4/17/2000 or v0.92l 8/19/2000 so it is
> not related to the driver version.

What is the start-up message?

Donald Becker				becker@scyld.com
Scyld Computing Corporation		http://www.scyld.com
410 Severn Ave. Suite 210		Second Generation Beowulf Clusters
Annapolis MD 21403			410-990-9993