Transmit timed out: which driver is best now?
Andrey Savochkin
saw@saw.sw.com.sg
Tue Mar 14 21:11:02 2000
Hello,
On Tue, Mar 14, 2000 at 03:30:58PM -0800, Michael J. Rensing wrote:
> I've got a 39 node cluster running, all with the Intel 82588 chip.
> Currently, the systems are running a 2.2.12-20 kernel with the 1.06
> eepro100 drivers. I'm getting the "Transmit timed out / Trying to
> restart the transmitter" problem so many people have been discussing.
>
> Can anyone tell me what the best (current) solution is?
> a) upgrade to 1.09l
> b) use Intel drivers
> c) use Andrei's drivers (where do I get them)
> d) other solution
I recommend (c) :-)
The necessary changes are incorporated into 2.2.15pre13 and later kernels,
available at ftp.kernel.org/pub/linux/kernel/people/alan/2.2.15pre/
For 2.3 kernels the driver is available at
ftp://ftp.sw.com.sg/pub/Linux/people/saw/kernel/v2.3/
>
> Also, will this really fix things, or could I be risking further
> problems?
My changes address the root causes of the problems. They implement:
- accurate tbusy management without race conditions (except the one forced
  by the hardware design)
- correct buffer ring refilling (incorrect refilling is the usual reason
  for card hangs, and thus timeouts, which happen under high load)
- correct multicast list setup (avoiding stray pointers in the TX ring)
- a thoroughly tested TX timeout handler that avoids looping timeouts
  caused by incomplete reset and reconfiguration.
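
To illustrate the buffer ring refilling point only: the following is a
minimal user-space sketch, not code from the eepro100 driver. The names
rx_ring, rx_refill, rx_consume and budget_alloc are hypothetical, and
malloc stands in for the kernel's buffer allocation. It assumes the idea
is that a failed allocation under load must leave the ring indices
consistent, so that a later refill resumes at the same slot instead of
leaving the card without buffers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define RX_RING_SIZE 8

/* Hypothetical model of a receive ring (not the driver's structures). */
struct rx_ring {
    void *buf[RX_RING_SIZE]; /* NULL means the slot holds no buffer */
    unsigned int filled;     /* total buffers ever placed in the ring */
    unsigned int used;       /* total buffers ever consumed by receive */
};

typedef void *(*alloc_fn)(size_t);

/* Stand-in allocator that fails once its budget is spent, to imitate
 * memory pressure under high load (assumption for the example only). */
static int alloc_budget = 0;

static void *budget_alloc(size_t len)
{
    if (alloc_budget <= 0)
        return NULL;
    alloc_budget--;
    return malloc(len);
}

/* Fill every empty slot. On allocation failure, stop without advancing
 * 'filled', so the next call retries the same slot.
 * Returns the number of buffers currently available to the card. */
static unsigned int rx_refill(struct rx_ring *r, alloc_fn alloc, size_t len)
{
    while (r->filled - r->used < RX_RING_SIZE) {
        unsigned int slot = r->filled % RX_RING_SIZE;
        void *p = alloc(len);

        if (p == NULL)
            break;           /* retry later; indices stay consistent */
        r->buf[slot] = p;
        r->filled++;
    }
    return r->filled - r->used;
}

/* Consume one buffer as if a packet had arrived. Returns false when the
 * ring is empty, the state in which a real card would stall. */
static bool rx_consume(struct rx_ring *r)
{
    unsigned int slot;

    if (r->filled == r->used)
        return false;
    slot = r->used % RX_RING_SIZE;
    assert(r->buf[slot] != NULL);
    free(r->buf[slot]);
    r->buf[slot] = NULL;
    r->used++;
    return true;
}
```

In this sketch a partial refill is harmless: the ring simply holds fewer
buffers until the next call to rx_refill succeeds in allocating the rest.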
I haven't heard of any further problems with my clone of the driver (apart
from some not yet understood problems with 82559ER cards and interrupt
acknowledgement). I personally use this version of the driver on rather
heavily loaded and critical servers.
Best regards,
Andrey V. Savochkin