Fw: [eepro100] Kernel panic: Aiee, killing interrupt handler!
Kallol Biswas
kallol@efi.com
Fri Jan 11 13:03:03 2002
Have you tried putting in a different card? I guess you will see the panic
again.
We have been having this problem on a Celeron-based system. What I have
established so far is that, for some reason, the memory holding the kernel's
global variables is getting corrupted.
One of the panics:
Subject: Re: [eepro100] Kernel panic: Aiee, killing interrupt handler!
Date: Wed, 09 Jan 2002 18:27:19 -0800
From: Kallol Biswas <kallol@efi.com>
CC: gareth.baron@efi.com, frederic.roussel@efi.com
References: 1, 2
Here is a dmesg buffer from a crash:
# Oops: 0000
CPU: 0
EIP: 0010:[<3b3a3938>]
EFLAGS: 00010046
eax: 3b3a3938 ebx: 00000000 ecx: c01b9f24 edx: 00000000
esi: c01b8000 edi: 00003fb9 ebp: c01b9f1c esp: c01b9f08
ds: 0018 es: 0018 ss: 0018
Process swapper (pid: 0, process nr: 0, stackpage=c01b9000)
Stack: c01b9f24 00000000 c010a3c9 c01184a5 00000000 00001fda c0109fc8 00000000
       c01b8000 00000000 c01b8000 00003fb9 00001fda 00000000 00000018 ffff0018
       ffffff00 c0107811 00000010 00000246 c01b8000 0009b800 c0106000 0009b800
Call Trace: [<c010a3c9>] [<c01184a5>] [<c0109fc8>] [<c0107811>] [<c0106000>] [<c0107f9d>] [<c010783a>]
       [<c01090b0>] [<c0109068>] [<c0106000>] [<c0100018>] [<c0106088>] [<c0106000>] [<c0100175>]
Code: <1>Unable to handle kernel paging request at virtual address 3b3a3938
current->tss.cr3 = 00101000, %cr3 = 00101000
The trace is:
c010a3c9: do_IRQ (calls the routine do_bottom_half at this line)
c01184a5: do_bottom_half
c0109fc8: common_interrupt
......
The panic is due to the invalid EIP value 0010:3b3a3938 (note that the EAX
register holds the same value).
The do_bottom_half routine:
asmlinkage void do_bottom_half(void)
{
        int cpu = smp_processor_id();

        /* Run the bottom halves only if neither lock is already held. */
        if (softirq_trylock(cpu)) {
                if (hardirq_trylock(cpu)) {
                        __sti();        /* interrupts on while handlers run */
                        run_bottom_halves();
                        __cli();
                        hardirq_endlock(cpu);
                }
                softirq_endlock(cpu);
        }
}
The dump of the routine from object file:
00000000 <do_bottom_half>:
0: 83 ec 0c sub $0xc,%esp
3: 55 push %ebp
4: 57 push %edi
5: 56 push %esi
6: 53 push %ebx
7: 83 3d 00 00 00 00 00 cmpl $0x0,0x0
e: 75 54 jne 64 <do_bottom_half+0x64>
10: c7 05 00 00 00 00 01 movl $0x1,0x0
17: 00 00 00
1a: 31 ed xor %ebp,%ebp
1c: bf 00 00 00 00 mov $0x0,%edi
21: 83 3d 00 00 00 00 00 cmpl $0x0,0x0
28: 75 32 jne 5c <do_bottom_half+0x5c>
2a: fb sti
2b: 8b 1d 00 00 00 00 mov 0x0,%ebx
31: 23 1d 00 00 00 00 and 0x0,%ebx
37: 89 d8 mov %ebx,%eax
39: f7 d0 not %eax
3b: ba 00 00 00 00 mov $0x0,%edx
40: 21 02 and %eax,(%edx)
42: be 00 00 00 00 mov $0x0,%esi
47: 90 nop
48: f6 c3 01 test $0x1,%bl
4b: 74 04 je 51 <do_bottom_half+0x51>
4d: 8b 06 mov (%esi),%eax
4f: ff d0 call *%eax
51: 83 c6 04 add $0x4,%esi
54: c1 eb 01 shr $0x1,%ebx
57: 75 ef jne 48 <do_bottom_half+0x48>
59: fa cli
5a: 89 f6 mov %esi,%esi
5c: c7 44 3d 00 00 00 00 movl $0x0,0x0(%ebp,%edi,1)
63: 00
64: 5b pop %ebx
65: 5e pop %esi
66: 5f pop %edi
67: 5d pop %ebp
68: 83 c4 0c add $0xc,%esp
6b: c3 ret
The machine code from the Emulator after the crash is:
0010:C0118454 83 EC 0C 55 57 56 53 83 3D 60 67 1D C0 00 75 54
0010:C0118464 C7 05 60 67 1D C0 01 00 00 00 31 ED BF 60 67 1D
0010:C0118474 C0 83 3D 5C 67 1D C0 00 75 32 FB 8B 1D AC BD 1A
0010:C0118484 C0 23 1D A8 BD 1A C0 89 D8 F7 D0 BA A8 BD 1A C0
0010:C0118494 21 02 BE 80 67 1D C0 90 F6 C3 01 74 04 8B 06 FF
0010:C01184A4 D0 83 C6 04 C1 EB 01 75 EF FA 89 F6 C7 44 3D 00
0010:C01184B4 00 00 00 00 5B 5E 5F 5D 83 C4 0C C3
The machine code for do_bottom_half is not corrupted.
At offset 4f, the instruction call *%eax causes the IP to be loaded with
3b3a3938, which is the value of eax.
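The body of run_bottom_halves(), which the compiler has inlined into
do_bottom_half:
        active = get_active_bhs();
        clear_active_bhs(active);
        bh = bh_base;
        do {
                if (active & 1)
                        (*bh)();        /* call the handler for this bit */
                bh++;
                active >>= 1;
        } while (active);
}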
The call *%eax corresponds to (*bh)();
so the memory at bh_base has probably been corrupted.
sam wrote:
> Hello All,
> I did not get any solution so I am posting again.
> Sam
>
> ----- Original Message -----
> From: sam
> To: eepro100@scyld.com
> Sent: Thursday, January 10, 2002 4:35 AM
> Subject: [eepro100] Kernel panic: Aiee, killing interrupt handler!
>
> Hello All,
> I am getting this error when the Linux machine gets more traffic on the NIC:
> Kernel panic: Aiee, killing interrupt handler!
> In interrupt handler - not syncing
> System configuration is:
> P-III 900 MHz
> 512 MB RAM
> 2 100Mbs onboard eepro100 NIC
> Redhat-7.1
> kernel-2.4.13
> Thanks
> -Sam