[eepro100] Transmit timed out with high Tx load
Donald Becker
becker@scyld.com
Sun Feb 3 23:57:00 2002
On Thu, 31 Jan 2002, Andrew Pam wrote:
> I have a router with six Intel PCI EtherExpress Pro100 adapters,
> eth0 through eth5, interrupts as follows:
>
> eth0 IRQ5, eth1 IRQ12, eth2 IRQ10, eth3 IRQ11, eth4 IRQ5, eth5 IRQ12
...
> The system is RedHat 7.2 with kernels 2.4.9-7 and 2.4.17. eth4 is not
> presently in use, and IRQ5 is also shared with USB. eth1,2,3 and eth5
> have no problems whatsoever even under fairly heavy load. eth0 however
> constantly has transmit timeouts and errors, regardless of whether the
> usb driver module is loaded or not.
>
> With the stock eepro100 driver from kernels 2.4.9 and 2.4.17 (v1.09j-t)
> the following errors are logged:
>
> Jan 31 13:23:56 statistix kernel: NETDEV WATCHDOG: eth0: transmit timed out
> Jan 31 13:23:56 statistix kernel: eth0: Transmit timed out: status ffff ffff at 9179585/9179613 command 0001a000.
This status (0xffff) indicates that the eepro100 device is not
responding. It might be powered off, or the PCI address decoding isn't
working correctly.
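As a rough illustration of why an all-ones value is suspicious: a PCI read that no device claims typically comes back with every bit set, so 0xffff in a 16-bit status register is better read as "nothing answered" than as real status. The sketch below is not taken from the eepro100 driver. The helper name status_looks_dead is hypothetical, and it works on a value that has already been read from the register.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the eepro100 driver.
 * Returns true when a 16-bit status word is all ones, which is
 * what a read usually returns when the device does not respond.
 */
static bool status_looks_dead(uint16_t status)
{
    return status == 0xffff;
}
```

Called with the value from the log above (0xffff) this returns true; any status word with at least one bit clear returns false.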
> I compiled and installed the latest v1.19 drivers from www.scyld.com
> and now get the following errors:
>
> Jan 31 15:48:57 statistix kernel: Command 00ff was not immediately accepted, 10001 ticks!
This indicates a similar problem: the command value reads back as 0xff,
which again suggests the chip is not responding.
> Jan 31 15:49:01 statistix kernel: eth0: IRQ 5 is physically blocked! Failing back to low-rate polling.
This message is misleading. The status reads 0xffff, so every bit appears
set and it looks as if the chip is trying to raise an interrupt.
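To make that concrete: a driver usually tests the status word against a mask of interrupt-cause bits, and an all-ones read passes any such test. In the sketch below the mask 0xff00 (the upper byte of the status word) and both helper names are assumptions for illustration only; check the chip documentation and the driver source for the real bit layout.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed for illustration: interrupt-cause bits live in the upper byte. */
#define INTR_CAUSE_MASK 0xff00u

/* True if any (assumed) interrupt-cause bit is set in the status word. */
static bool interrupt_seems_pending(uint16_t status)
{
    return (status & INTR_CAUSE_MASK) != 0;
}

/*
 * True only if a cause bit is set AND the word is not all ones.
 * An all-ones word is treated as "device not responding" rather
 * than as a genuine interrupt request.
 */
static bool interrupt_is_believable(uint16_t status)
{
    return status != 0xffff && interrupt_seems_pending(status);
}
```

With status 0xffff the first check reports a pending interrupt while the second rejects it, which is the distinction the log message above fails to make.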
We will need more info to track this down.
Donald Becker                   becker@scyld.com
Scyld Computing Corporation     http://www.scyld.com
410 Severn Ave. Suite 210       Second Generation Beowulf Clusters
Annapolis MD 21403              410-990-9993