[eepro100] About eepro "out of resources" bug
Alexander Gdalevich
gdalevich@hotmail.com
Tue, 25 Sep 2001 17:33:07 -0400
What driver are you using?
It does not look like eepro100.c from Scyld, nor fxp_if for FreeBSD, nor
e100 from Intel. BTW, is it written in C++ !? Also, how often do you see
this happening?
I am working on an embedded system project using the 82559 chip, and we did have
a PCI problem that caused what you are describing. However, we have a
custom PCI interface, and I don't believe any chipset on the market would
have such bugs in it :)
There is another possible problem. At first it seemed like hairsplitting, but
seeing your post I think it might actually be a real problem.
Here is the line from eepro100.c (Scyld driver):
sp->last_rxf->status &= cpu_to_le32(~0xC0000000);
I am not sure if it does the same thing as your:
last_rxf.status(last_rxf.status() & ~0xc0000000);
but the problem is as follows. Suppose you read the status field just as
the device is about to update it. When you AND it with (~0xC0000000) and
write it back, you might write the old value of the status bits, overwriting
whatever the device had set them to.
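To make the interleaving concrete, here is a minimal single-threaded sketch of the lost update. Everything in it is hypothetical modeling, not driver code: the helper device_completes_frame() stands in for the NIC, the 0xa020 value is simply the one visible in the RFA dump quoted below, and the word is treated as being in CPU byte order (the cpu_to_le32 conversion is left out).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RFD_EL_S_MASK 0xC0000000u /* the two top bits cleared by the driver line */
#define RFD_COMPLETE  0x00008000u /* rxComplete, bit 15 */

/* Hypothetical stand-in for the NIC completing a frame: it rewrites the
   low 16 bits of the word and leaves the upper 16 bits alone (assumption). */
static void device_completes_frame(volatile uint32_t *status)
{
    assert(status != NULL);
    *status = (*status & 0xFFFF0000u) | 0xA020u;
}

/* Replays the unlucky ordering: driver read, device update, driver write. */
uint32_t lost_update_demo(void)
{
    volatile uint32_t status = 0xC0000001u; /* RFD currently ends the list */

    uint32_t snapshot = status;             /* driver: read                */
    device_completes_frame(&status);        /* device update lands here    */
    status = snapshot & ~RFD_EL_S_MASK;     /* driver: modify + write back */

    return status;
}
```

With this ordering the function returns 0x00000001: the completion bits written by the modeled device are gone. Whether the real hardware can interleave this way is exactly the open question; the sketch only shows that a 32-bit read-modify-write would not survive it.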
In fact, something similar is done for the command ring, and there a
different method is used to clear the suspend bit:
#define clear_suspend(cmd) ((char *)(&(cmd)->cmd_status))[3] &= ~0x40
This way only a single byte is being modified.
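As a rough illustration of why the single-byte approach helps, the sketch below models the descriptor word as four little-endian bytes and clears bits only in byte 3. The names clear_top_byte_bits(), le32_load() and bytewise_clear_demo() are made up for this example, and the device update (0xa020 in the low two bytes) is an assumption borrowed from the dump quoted below. It also assumes a byte store reaches memory as a byte store, which depends on the platform and bus.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Clear bits in the most significant byte of a little-endian 32-bit
   descriptor word, touching only that one byte (index 3). */
static void clear_top_byte_bits(volatile uint8_t *word, uint8_t bits)
{
    assert(word != NULL);
    word[3] &= (uint8_t)~bits;
}

/* Assemble the four little-endian bytes into a CPU-order value. */
static uint32_t le32_load(const volatile uint8_t *b)
{
    return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
           ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

uint32_t bytewise_clear_demo(void)
{
    /* 0xC0000001 stored little-endian, as the device would see it. */
    volatile uint8_t word[4] = { 0x01, 0x00, 0x00, 0xC0 };

    /* Hypothetical device update to the low 16 bits, arriving at the
       worst possible moment for a 32-bit read-modify-write. */
    word[0] = 0x20;
    word[1] = 0xA0;

    /* Driver clears the two top bits (0xC0 within byte 3) only. */
    clear_top_byte_bits(word, 0xC0);

    return le32_load(word);
}
```

Here the result is 0x0000A020: the top bits are cleared and the modeled device's status bits in the low bytes are left untouched, because the driver never wrote them.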
P.S. I am very interested in the driver you are using. What is it?
>From: "Joe Kulig" <joek@websprocket.com>
>To: <eepro100@scyld.com>
>Subject: [eepro100] About eepro "out of resources" bug
>Date: Tue, 25 Sep 2001 17:04:57 -0400
>
>Let me share my observations on the "out of resources" bug. I too have
>come across this, and this is what I have seen:
>
>The Receive Frame Area (RFA) looks like this:
>
>RFD 0
>+0 0xc0000001
>+12 0xSSSS0000
>
>RFD 1
>+0 0x1
>+12 0xSSSSc040
>
>RFD 2
>+0 0xa020
>+12 0xSSSSc040
>
>...
>
>RFD N
>+0 0xa020
>+12 0xSSSSc040
>
>SSSS = data buffer size
>
>I have concluded (from dumping received packets in memory and snooping
>the TCP line) that the eepro100 did not update the first field in RFD 1
>when it received the packet. The EOF, F, and ACTUAL COUNT fields are
>updated in the fourth field. I have verified that the data has been
>received correctly. What happens is that the rx processing for the eepro100
>is looking for rxComplete (bit 15, 0x8000) to be set. This has not
>happened, but the EOF, F, and ACTUAL COUNT fields in the fourth RFD field
>have been updated. The interrupt does occur, but RFD 1 is not processed.
>So the eepro keeps dumping packets into the remaining RFDs until the last RFD
>is reached, and then it reports that it is out of resources.
>
>I don't know why this condition happens but I suspect it is a
>hardware/pci problem. I do not know if Intel is aware of this condition.
>
>I do have a fix for this. Instead of looking for rxComplete in the first
>RFD field, I look for EOF and F in the fourth field. This seems to work.
>At least I have not seen this type of "out of resources" error condition
>occur.
>
>The code for the relevant fix is below and is written in Java. This is
>because my driver is written for an object oriented OS that is written
>in Java. One should see how the fix could be applied to the eepro100.c
>Linux version.
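
For readers thinking in terms of eepro100.c, the check described above might look roughly like this in C. This is only a sketch under assumptions: struct rfd_words and rfd_received_len() are hypothetical names, both fields are taken to be already converted to CPU byte order, and the masks (0xc000 for EOF and F, 0x07ff for the length) are copied from the Java code below rather than from a datasheet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of the two RFD words discussed in this thread. */
struct rfd_words {
    uint32_t status; /* first field, offset +0   */
    uint32_t count;  /* fourth field, offset +12 */
};

#define RFD_STATUS_COMPLETE 0x8000u /* rxComplete, bit 15 of the first field */
#define RFD_COUNT_EOF_F     0xC000u /* EOF and F bits in the fourth field    */
#define RFD_COUNT_LEN       0x07FFu /* length mask used by the Java rx()     */

/* Returns the received length if EOF and F are both set, otherwise -1.
   The first field is deliberately not consulted. */
int rfd_received_len(const struct rfd_words *rfd)
{
    assert(rfd != NULL);
    if ((rfd->count & RFD_COUNT_EOF_F) != RFD_COUNT_EOF_F)
        return -1;
    return (int)(rfd->count & RFD_COUNT_LEN);
}
```

An RFD in the state shown for RFD 1 in the dump (first field 0x1, fourth field ending in c040) would be reported as received by this test, even though a test on rxComplete in the first field would skip it.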
>
>
>with best regards,
>
>Joe Kulig phone: 216-357-2580
>joek@websprocket.com fax: 216-357-2584
>2253 Professor Street Cleveland, OH 44113
>
>
> final static int PacketReceived = 0xc000;
>
> final int rx() {
> int entry = cur_rx & RX_RING_SIZE-1;
> int status;
> int rx_work_limit = dirty_rx + RX_RING_SIZE - cur_rx;
> RxFD rxf;
>
> if (debug > 4)
> System.out.println(" In rx().");
> rxRing[entry].flushHeader();
> int count;
> while (rxRing[entry] != null &&
> ((count = rxRing[entry].count()) & PacketReceived) == PacketReceived) {
> int pkt_len = count & 0x07ff;
>
> if (--rx_work_limit < 0)
> break;
> status = rxRing[entry].status();
> if (debug > 4)
> System.out.println(" rx() status " + Integer.toHexString(status) +
> " len " + pkt_len);
> if ((status & (RxErrTooBig|RxOK|0x0f90)) != RxOK) {
> if ((status & RxErrTooBig)!=0)
> System.out.println(name +": Ethernet frame overran the Rx buffer, status " + Integer.toHexString(status));
> else if ( ! ((status & RxOK)!=0)) {
> /* There was a fatal error. This *should* be impossible. */
> rx_errors++;
> sb.append(name).append(": Anomalous event in rx(), status ");
> sb.append(Integer.toHexString(status));
> System.out.println(sb.toString());
> sb.setLength(0);
> }
> } else {
> if ((drv_flags & HasChksum)!=0)
> pkt_len -= 2;
>
> /* Check if the packet is long enough to just accept without
> copying to a properly sized skbuff. */
> // if (pkt_len < rx_copybreak) {
> // /* Packet is in one chunk -- we can copy + cksum. */
> // // eth_copy_and_sum(skb, rx_skbuff[entry]->tail, pkt_len, 0);
>
> // } else {
> /* Pass up the already-filled skbuff. */
> addPacket(rxRing[entry]);
> rxRing[entry].size(pkt_len);
> rxRing[entry].flush();
> rxRing[entry] = null;
> if ((drv_flags & HasChksum)!=0) {
> // u16 csum = get_unaligned((u16*)(skb->head + pkt_len))
> // if (desc_count & 0x8000)
> // skb->ip_summed = CHECKSUM_UNNECESSARY;
> }
> rx_packets++;
> }
> entry = (++cur_rx) & RX_RING_SIZE-1;
> rxRing[entry].flushHeader();
> }
>
> for(; cur_rx-dirty_rx>0; dirty_rx++){
> entry = dirty_rx & RX_RING_SIZE-1;
>
> rxRing[entry] = rxPackets[rxPacketIndex];
> rxf = rxRing[entry];
> rxPacketIndex++;
> rxPacketIndex &= (rxPackets.length-1);
>
> rxf.status(0xc0000001);
> rxf.count(PKT_BUF_SZ<<16);
> rxf.link(0);
>
> last_rxf.link(rxf.bufferAddress);
> last_rxf.status(last_rxf.status() & ~0xc0000000);
> last_rxf.cleanHeader();
> last_rxf = rxf;
> rxf.cleanHeader();
> }
>
> last_rx_time = jiffies;
> return 0;
> }
>
>
>
>_______________________________________________
>eepro100 mailing list
>eepro100@scyld.com
>http://www.scyld.com/mailman/listinfo/eepro100