[tulip-bug] patch avoids lockups under high load
Josip Loncaric
josip@icase.edu
Fri, 02 Feb 2001 18:19:56 -0500
This is a multi-part message in MIME format.
--------------D96B7C3FC45C0A1E12BF396E
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
When subjected to very heavy load, our FA310TX (PNIC) cards are prone to
protracted (even indefinite) lockups with tulip.c, including the latest
v0.92t (1/15/2001).
The problem is that under heavy load the driver masks interrupting
sources and expects to receive TimerInt, which never arrives (the timer
is apparently nonfunctional on FA310TX). Sometimes so many sources are
masked that no new interrupts can arrive and the card is completely
locked until network restart. In my stress tests, this happens about
2-6 times per minute (i.e. with probability of about one in a million).
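The masking step described above can be sketched in isolation. This is a minimal illustration, not driver code: the helper name masked_csr7 is hypothetical, and the bit values are assumed to match the status bit definitions in tulip.c. The mask expression itself is copied from the attached patch context.

```c
#include <stdint.h>

/* Bit values assumed to match the status bits used by tulip.c. */
enum {
    TxIntr       = 0x00001,
    RxIntr       = 0x00040,
    TimerInt     = 0x00800,
    AbnormalIntr = 0x08000,
    NormalIntr   = 0x10000
};

/* Hypothetical helper: given the pending status bits read from CSR5,
 * return the interrupt-enable value the driver writes to CSR7 when the
 * work budget is exceeded. Every source that is currently pending is
 * disabled; AbnormalIntr and TimerInt are kept enabled so that the
 * timer can later re-enable the rest. */
static uint32_t masked_csr7(uint32_t csr5)
{
    return ((~csr5) & 0x0001ebefu) | AbnormalIntr | TimerInt;
}
```

With RxIntr, TxIntr and NormalIntr all pending, masked_csr7() returns 0x0000ebae: receive and transmit interrupts are both disabled, and only the timer (plus the remaining abnormal sources) could wake the driver. If TimerInt never fires, as appears to be the case on the FA310TX, nothing re-enables them.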
Attached is a simple patch which avoids this problem with high
probability. The idea is to acknowledge all interrupt sources as usual
but to avoid turning off interrupting sources unless the work budget was
exceeded a couple of times in a row. This still protects the system,
but drastically reduces the probability of a lockup.
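The decision rule can be modeled on its own, outside the driver. The sketch below is only an illustration under stated assumptions: struct load_tracker and should_mask_sources are hypothetical names, and the counter is kept in a struct rather than in a file-scope static as in the patch. The logic mirrors the patch's HiLoadCount test and reset.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the patch's HiLoadCount variable. */
struct load_tracker {
    int hi_load_count;  /* consecutive checks with the budget exceeded */
};

/* Returns true when interrupt sources should be masked.
 * The first time the budget is exceeded, nothing is masked; masking
 * starts only on the second consecutive occurrence. Any check that
 * finds the budget not exceeded resets the count. */
static bool should_mask_sources(struct load_tracker *t, bool budget_exceeded)
{
    if (!budget_exceeded) {
        t->hi_load_count = 0;       /* same as HiLoadCount=0 in the patch */
        return false;
    }
    return t->hi_load_count++ > 0;  /* same test as HiLoadCount++ > 0 */
}
```

Starting from a zeroed tracker, the first call with budget_exceeded set returns false and the second returns true. A single call with budget_exceeded clear puts the tracker back in its initial state.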
We are now using the patched tulip.c:v0.92t, and so far the lockup
problem seems to be gone.
Sincerely,
Josip
--
Dr. Josip Loncaric, Senior Staff Scientist mailto:josip@icase.edu
ICASE, Mail Stop 132C PGP key at http://www.icase.edu./~josip/
NASA Langley Research Center mailto:j.loncaric@larc.nasa.gov
Hampton, VA 23681-2199, USA Tel. +1 757 864-2192 Fax +1 757 864-6134
--------------D96B7C3FC45C0A1E12BF396E
Content-Type: text/plain; charset=us-ascii;
name="tulip.c-0.92t-patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
filename="tulip.c-0.92t-patch"
--- tulip.c-0.92t Mon Jan 15 02:29:36 2001
+++ tulip.c Fri Feb 2 17:39:50 2001
@@ -38,6 +38,7 @@
/* Maximum events (Rx packets, etc.) to handle at each interrupt. */
static int max_interrupt_work = 25;
+static int HiLoadCount = 0; /* number of times in a row when max_interrupt_work exceeded */
#define MAX_UNITS 8
/* Used to pass the full-duplex flag, etc. */
@@ -2828,12 +2829,16 @@
to develop a good interrupt mitigation setting.*/
outl(0x8b240000, ioaddr + CSR11);
} else {
+ if (HiLoadCount++ > 0) {
/* Mask all interrupting sources, set timer to re-enable. */
outl(((~csr5) & 0x0001ebef) | AbnormalIntr | TimerInt,
ioaddr + CSR7);
outl(0x0012, ioaddr + CSR11);
+ }
}
break;
+ } else {
+ HiLoadCount=0;
}
} while (1);
--------------D96B7C3FC45C0A1E12BF396E--