[eepro100] eepro100 and Intel STL2
Tim Cutts
tim.cutts@incyte.com
Wed, 9 May 2001 10:40:57 +0100
On Wed, May 09, 2001 at 04:02:42AM -0400, Donald Becker wrote:
> On Tue, 8 May 2001, Will Francis wrote:
>
> > > On Wed, 2 May 2001 wfrancis@incyte.com wrote:
> >
> > > Our 27Bz-8 Scyld Beowulf version includes the corrections, and STL2
> > > systems are now in our testing lab.
> >
> > I can not locate the 27Bz-8 distribution on your
> > FTP server.
>
> Right now it's available only to our partners.
>
> > Has this driver been released somewhere else? If
> > not, any idea when it might be publicly available?
>
> In a week or two.
I've been seeing a problem similar to Will's, also on STL2
motherboards. I run a much smaller farm of machines at one of Incyte's
other locations, here in the UK.
The symptom for me is that jobs doing a lot of NFS reads from the
server wedge in an uninterruptible wait on disk.
The network interface is still alive, but the process remains hung.
The wedge is associated with a kernel log message:
kernel: eepro100: cmd_wait for(0xffffff80) timedout with(0xffffff80)!
and then huge numbers of:
kernel: nfs: task 291659 can't get a request slot
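For anyone wanting to check their own nodes for the same pair of messages, here is a minimal sketch in Python. It is not part of the driver or of any existing tool; the function name and the two patterns are my own, written only against the two log lines quoted above, so other kernel versions may word the messages differently.

```python
import re

# Patterns written against the two messages quoted above (assumption:
# other kernel versions may format these lines differently).
CMD_WAIT = re.compile(r"eepro100: cmd_wait for\(.*\) timedout")
NFS_SLOT = re.compile(r"nfs: task \d+ can't get a request slot")


def count_symptoms(lines):
    """Count eepro100 command timeouts and NFS request-slot failures."""
    counts = {"cmd_wait_timeout": 0, "nfs_no_slot": 0}
    for line in lines:
        if CMD_WAIT.search(line):
            counts["cmd_wait_timeout"] += 1
        elif NFS_SLOT.search(line):
            counts["nfs_no_slot"] += 1
    return counts


sample = [
    "kernel: eepro100: cmd_wait for(0xffffff80) timedout with(0xffffff80)!",
    "kernel: nfs: task 291659 can't get a request slot",
    "kernel: nfs: task 291659 can't get a request slot",
]
print(count_symptoms(sample))
```

The same function can be fed an open log file, since a file object iterates line by line; where the kernel log lives depends on the syslog configuration of the machine.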
Machines based on the Lancewood motherboard do not have the same
problem. All machines are identically configured, using kernel 2.2.19.
All machines are dual-processor.
This seems to be related to the discussions on this list back in
February, regarding the detection of the receiver lock-up bug.
The older machines enable the work-around:
  eth0: Intel PCI EtherExpress Pro100 82557, 00:90:27:F6:2C:37, IRQ 21.
    Receiver lock-up bug exists -- enabling work-around.
    Board assembly 000000-000, Physical connectors present: RJ45
    Primary interface chip i82555 PHY #1.
    General self-test: passed.
    Serial sub-system self-test: passed.
    Internal registers self-test: passed.
    ROM checksum self-test: passed (0x04f4518b).
    Receiver lock-up workaround activated.
The newer machines do not:
  eth0: OEM i82557/i82558 10/100 Ethernet, 00:D0:B7:B7:17:A1, IRQ 18.
    Board assembly 000000-000, Physical connectors present: RJ45
    Primary interface chip i82555 PHY #1.
    General self-test: passed.
    Serial sub-system self-test: passed.
    Internal registers self-test: passed.
    ROM checksum self-test: passed (0x04f4518b).
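The difference between the two boot logs can be checked mechanically across a farm. The following is only a sketch under my own assumptions: the function name is hypothetical, and it relies on the driver announcing each interface with a line starting "ethN: " and mentioning "Receiver lock-up" when the work-around is enabled, as in the two excerpts above.

```python
import re

# An interface block starts with a line such as "eth0: Intel PCI ...".
IFACE = re.compile(r"^(eth\d+): ")


def workaround_status(boot_log):
    """Map each interface name to True if the lock-up work-around was reported."""
    status = {}
    current = None
    for raw in boot_log.splitlines():
        line = raw.strip()
        match = IFACE.match(line)
        if match:
            current = match.group(1)
            status[current] = False
        elif current is not None and "Receiver lock-up" in line:
            status[current] = True
    return status


old_log = """eth0: Intel PCI EtherExpress Pro100 82557, 00:90:27:F6:2C:37, IRQ 21.
  Receiver lock-up bug exists -- enabling work-around.
  Receiver lock-up workaround activated."""

new_log = """eth0: OEM i82557/i82558 10/100 Ethernet, 00:D0:B7:B7:17:A1, IRQ 18.
  Board assembly 000000-000, Physical connectors present: RJ45"""

print(workaround_status(old_log))
print(workaround_status(new_log))
```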
I'm interested to note that the newer machines' eepro100 seems to be
detected as a much more generic card than the older machines. Is this
correct?
It's also interesting that lspci produces quite a lot of "Unknown device"
messages on the STL2 board, but not on the Lancewood board. For example,
compare the eepro100 entries from lspci -vv on the two machines above:
old:
00:0e.0 Ethernet controller: Intel Corporation 82557 [Ethernet Pro 100] (rev 08)
    Subsystem: Intel Corporation EtherExpress PRO/100+ Management Adapter
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR+ FastB2B-
    Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
    Latency: 8 min, 56 max, 64 set, cache line size 08
    Interrupt: pin A routed to IRQ 21
    Region 0: Memory at f4102000 (32-bit, non-prefetchable)
    Region 1: I/O ports at 2800
    Region 2: Memory at f4000000 (32-bit, non-prefetchable)
    Capabilities: [dc] Power Management version 2
        Flags: PMEClk- AuxPwr- DSI+ D1+ D2+ PME+
        Status: D0 PME-Enable- DSel=0 DScale=2 PME-
new:
00:03.0 Ethernet controller: Intel Corporation 82557 [Ethernet Pro 100] (rev 08)
    Subsystem: Intel Corporation: Unknown device 1229
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR+ FastB2B-
    Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
    Latency: 8 min, 56 max, 66 set, cache line size 08
    Interrupt: pin A routed to IRQ 18
    Region 0: Memory at fb101000 (32-bit, non-prefetchable)
    Region 1: I/O ports at 5400
    Region 2: Memory at fb000000 (32-bit, non-prefetchable)
    Capabilities: [dc] Power Management version 2
        Flags: PMEClk- AuxPwr- DSI+ D1+ D2+ PME+
        Status: D0 PME-Enable- DSel=0 DScale=2 PME-
Is this symptomatic of a more generic problem regarding PCI detection on
these motherboards?
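To see how widespread the "Unknown device" entries are on a given board, something like the following sketch could list them per PCI slot. It is an illustration only: the function name is mine, and it assumes lspci -vv output shaped like the excerpts above, where each device starts with a bus:device.function line.

```python
import re

# A device block starts with a slot such as "00:03.0 " (bus:device.function).
SLOT = re.compile(r"^([0-9a-f]{2}:[0-9a-f]{2}\.[0-9a-f]) ")


def unknown_devices(lspci_text):
    """Return (slot, line) pairs for every line mentioning "Unknown device"."""
    found = []
    slot = None
    for raw in lspci_text.splitlines():
        line = raw.strip()
        match = SLOT.match(line)
        if match:
            slot = match.group(1)
        if "Unknown device" in line:
            found.append((slot, line))
    return found


sample = """00:03.0 Ethernet controller: Intel Corporation 82557 [Ethernet Pro 100] (rev 08)
    Subsystem: Intel Corporation: Unknown device 1229
    Interrupt: pin A routed to IRQ 18"""

for slot, line in unknown_devices(sample):
    print(slot, line)
```

An "Unknown device" line only means that lspci's ID database has no name for that ID, so on its own it does not show that the hardware was misdetected.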
Tim.
--
Tim Cutts PhD Tel: +44 1223 454918
Incyte Genomics
Botanic House, 100 Hills Road, Cambridge, CB2 1FF, UK