Scyld 27Z-8 Gig Net - HELP!
Art Edwards
edwardsa at plk.af.mil
Thu Sep 26 12:20:25 PDT 2002
I'm running the public, free version of bz8. Is it true that we can, with
a little toil, make it work with gigabit Ethernet? I'm buying 3Com cards and
a Catalyst 4K switch.
Art Edwards
On Thu, Sep 26, 2002 at 11:35:50AM -0400, Karen Keadle-Calvert wrote:
> Stanley,
>
> I know you said you modified all of the files, but just to review, under
> 27z-8, you need to modify the file /etc/beowulf/config.boot to add the
> device and vendor information for the newer e1000 card. So you'll need
> to add the following line:
>
> pci 0x8086 0x100E e1000
>
> In addition, make sure you have a 'bootmodule' entry for "e1000" near
> the beginning of the file. Next rebuild your node boot floppy and
> beoboot images and try rebooting.
>
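A quick way to confirm both entries before rebuilding the images is to scan the file for them. The following is only a minimal sketch, not a Scyld tool: has_entry is a hypothetical helper, and the script runs against a scratch file with sample contents so that nothing under /etc is touched. Point CONFIG at a copy of /etc/beowulf/config.boot to check a real configuration.

```shell
#!/bin/sh
# Scratch file with sample contents (assumption: stands in for config.boot).
CONFIG=$(mktemp)
trap 'rm -f "$CONFIG"' EXIT
printf '%s\n' 'bootmodule e1000' 'pci 0x8086 0x1000 e1000' > "$CONFIG"

# has_entry FILE "ENTRY": succeed if FILE contains ENTRY as a whole line,
# ignoring case (0x100E vs 0x100e) and differences in spacing.
has_entry() {
    awk -v want="$2" '
        {
            line = tolower($0)
            gsub(/[ \t]+/, " ", line)
            sub(/^ /, "", line)
            sub(/ $/, "", line)
        }
        line == tolower(want) { found = 1 }
        END { exit !found }
    ' "$1"
}

for entry in "bootmodule e1000" "pci 0x8086 0x100E e1000"; do
    if has_entry "$CONFIG" "$entry"; then
        echo "present: $entry"
    else
        echo "MISSING: $entry"
    fi
done
```

The sketch only reports whether each line exists. It does not check that the bootmodule entry sits near the beginning of the file, so that still needs a look by eye.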
> If you've already done all of that (which it sounds like you have), then
> attached are some directions for building an e1000 driver under Scyld.
>
> Hopefully, this solves your problem.
>
> Regards,
>
> Karen
>
>
> Stanley, Matthew D. wrote:
>
> >I have several clusters running the public release of 27Z-8. They have
> >been, up until now, exclusively via-rhine and 3c59x based 100 Mbit clusters.
> >We wanted to upgrade to gigabit ethernet and decided to upgrade our 4
> >machine cluster using Dlink DGE-500T cards (ns820/ns83820 based). I
> >compiled the latest netdrivers.tgz file and the ns820 driver appeared to
> >work fine as a link to the outside world but did not function on the
> >beoboot floppy even though I compiled for that kernel and even did a full
> >kernel set rebuild (rpm -bb) including the new netdrivers.tgz file. What
> >happened was that, right after it found the card, found the master server,
> >and assigned the IP address, it would just sit at the line where it requests
> >/var/beowulf/boot.img.
> >
> >OK, so I gave up on the Dlink cards and purchased 4 Intel PRO/1000MT cards,
> >the new version, which requires the new release of the drivers since its PCI
> >ID is 8086:100E and not 8086:1000. I again compiled the drivers and
> >tested the card on the internet side with 0 problems. I then created my
> >boot images and tried to boot. It gets a little farther than the Dlink: it
> >actually starts to boot the net boot image, then locks up and
> >never completes.
> >
> >Am I missing something here? I've modified all of the files, it finds the
> >cards, and it even works for days on the internet if I switch my card to
> >eth0 instead of eth1. It appears to be a driver issue, yet I have similar
> >problems with two completely different sets of cards. I have even tried
> >using a 100 Mbit hub instead of a gigabit switch, with identical results.
> >I can also just take out the cards and put in 3c59x cards and the problem
> >is fixed!
> >
> >We use our clusters for NAMD only. Is there a way to just install full
> >versions of Scyld and then execute bpslave? If so, what modifications
> >need to be made to node_up and the other scripts to make that work? I
> >realize this means more administration, but at this point I have spent
> >weeks trying to make this work, whereas I can install and update 4 machines
> >in a matter of a couple of hours.
> >
> >Are there settings in beoboot that change the way it gets
> >information from the master node, maybe making it more reliable, like
> >broadcast/multicast, etc.?
> >
> >Any help would be appreciated,
> >
> >Matt Stanley
> >Systems Administrator
> >Structural Biology Core
> >University of Missouri - Columbia
> >
> >
>
> HOW TO ADD DRIVERS - Example shown for Intel Pro/1000 series gigabit adapters
> ------------------
>
>
> => If available, get the prebuilt modules for the appropriate kernel from:
> ftp://www.scyld.com/pub/beowulf/<version>/updates
>
> For example, for the 2.2.19-12 kernel:
> ftp://www.scyld.com/pub/beowulf/27z-8/updates/e1000-3.6.8.1.tar.gz
>
> => If not available, download the source code for the driver. The Intel
> Pro/1000 series driver can be found at ftp://www.intel.com/df-support/2897/eng or
> http://downloadfinder.intel.com/scripts-df/Product_Filter.asp?ProductID=415 or
> http://support.intel.com/support/go/linux/e1000.htm
>
> NOTE: If the kernel source rpm was not installed, you'll have to do that
> first. It is installed by default under 27cz-9, but not under
> 28cz-8-beta2. The kernel source is available on the distribution
> CD under Scyld/RPMS/kernel-source-2.4.9-21.1.i386.rpm
>
> => Add this line to the beginning of the Makefile
> CFLAGS = $(KCFLAGS)
>
> => Make the beoboot, SMP, and UP modules for the version of the Scyld
> kernel that you are running under (27cz-9 shown here):
>
> > make KCFLAGS="-D__BOOT_KERNEL_H_ -D__module__beoboot"
> > mv e1000.o /lib/modules/2.2.19-14.beobeoboot/net
> > make KCFLAGS="-D__BOOT_KERNEL_H_ -D__BOOT_KERNEL_SMP=1"
> > mv e1000.o /lib/modules/2.2.19-14.beosmp/net
> > make KCFLAGS="-D__BOOT_KERNEL_H_ -D__BOOT_KERNEL_UP=1"
> > mv e1000.o /lib/modules/2.2.19-14.beo/net
>
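The three build-and-install pairs above differ only in the kernel flavour suffix and the KCFLAGS value, so they can be generated from a small table. This is a sketch under assumptions: plan_builds is a hypothetical helper, KVER is taken from the 2.2.19-14 example, and the script only prints the commands so they can be reviewed before being run by hand in the driver source directory.

```shell
#!/bin/sh
KVER=2.2.19-14   # kernel version from the example above; adjust to match yours

# One "flavour|KCFLAGS" pair per line, copied from the make lines above.
builds='beobeoboot|-D__BOOT_KERNEL_H_ -D__module__beoboot
beosmp|-D__BOOT_KERNEL_H_ -D__BOOT_KERNEL_SMP=1
beo|-D__BOOT_KERNEL_H_ -D__BOOT_KERNEL_UP=1'

# plan_builds: print (do not execute) the make and mv command for each flavour.
plan_builds() {
    printf '%s\n' "$builds" | while IFS='|' read -r flavour flags; do
        printf 'make KCFLAGS="%s"\n' "$flags"
        printf 'mv e1000.o /lib/modules/%s.%s/net\n' "$KVER" "$flavour"
    done
}

plan_builds
```

Whether intermediate object files need to be removed between the three builds depends on the driver's Makefile, which this sketch does not assume anything about; it is worth checking that each e1000.o is really rebuilt with the new flags.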
> => Add new entries for this module to the PCI table
>
> 1. Add, if necessary, the following bootmodule entry to the configuration
> file (in /etc/beowulf/config.boot for 27cz-9 and /etc/beowulf/config for
> 28cz-4):
> bootmodule e1000
>
> 2. Add entries to the device list for each device supported by this driver
> (in /etc/beowulf/config.boot for 27cz-9 and /usr/share/kudzu/pcitable for
> 28cz-1):
> pci 0x8086 0x1000 e1000
> pci 0x8086 0x1001 e1000
> pci 0x8086 0x1004 e1000
> pci 0x8086 0x1008 e1000
> pci 0x8086 0x1009 e1000
> pci 0x8086 0x100c e1000
>
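Note that the list above does not include 0x100E, the device ID that the message earlier in this thread gives for the newer PRO/1000 MT cards. PCI IDs are often written as a vendor:device pair such as 8086:100E (for example in the numeric output of lspci -n, assuming pciutils is installed). As a small illustration, the hypothetical helper below converts that notation into the config line format shown above:

```shell
#!/bin/sh
# pci_line ID DRIVER: turn a vendor:device pair such as 8086:100E into a
# "pci 0xVVVV 0xDDDD driver" line. Hypothetical helper, for illustration only.
pci_line() {
    vendor=${1%%:*}   # text before the colon
    device=${1#*:}    # text after the colon
    printf 'pci 0x%s 0x%s %s\n' "$vendor" "$device" "$2"
}

pci_line 8086:100E e1000
```

This prints the same line given earlier in the thread: pci 0x8086 0x100E e1000.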
> => Build the dependency file (for each kernel) used by modprobe to load the
> correct module:
>
> For single processor kernel:
> depmod -a -e -F /boot/System.map-2.2.19-14.beo 2.2.19-14.beo
>
> For the SMP (multiprocessor) kernel:
> depmod -a -e -F /boot/System.map-2.2.19-14.beosmp 2.2.19-14.beosmp
>
> For beoboot kernel (Stage 1 image):
> depmod -a -e -F /boot/System.map-2.2.19-14.beobeoboot 2.2.19-14.beobeoboot
>
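The three depmod invocations follow one pattern, so they can be derived from the kernel version and the flavour suffixes. The sketch below makes the same assumptions as the example above (KVER of 2.2.19-14 and the beo, beosmp, and beobeoboot suffixes); plan_depmod is a hypothetical helper that prints the commands rather than running them, and flags any System.map file it cannot find.

```shell
#!/bin/sh
KVER=2.2.19-14   # kernel version from the example above; adjust to match yours

# plan_depmod: print the depmod command for each kernel flavour, with a
# trailing note when the matching System.map is not present on this machine.
plan_depmod() {
    for flavour in beo beosmp beobeoboot; do
        release="$KVER.$flavour"
        map="/boot/System.map-$release"
        if [ -f "$map" ]; then
            printf 'depmod -a -e -F %s %s\n' "$map" "$release"
        else
            printf 'depmod -a -e -F %s %s   # %s not found\n' "$map" "$release" "$map"
        fi
    done
}

plan_depmod
```

A missing System.map for one flavour usually means that kernel is not installed on the master, which ties in with the note at the end of these directions about installing the SMP kernel RPM.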
>
> => Rebuild the Phase 1 and Phase 2 kernel images:
> /usr/bin/beoboot -1 -f -o /dev/fd0 -c "apm=power-off"
> /usr/bin/beoboot -2 -n -k /boot/vmlinuz-`uname -r` -o /var/beowulf/boot.img -c "apm=power-off"
>
>
> NOTE:
> ----
> If your master node is single processor and your compute nodes are SMP,
> and you don't have an SMP kernel installed, you'll have to get the RPM
> from the distribution CD and install it (using rpm -U). This happens
> when you install on a single processor machine because the installer
> selects the kernel to be installed based on the machine being installed
> on. You must run the same kernel on all of the machines in the cluster.
> The SMP kernel can run on both single processor and SMP machines.
>