[Beowulf] Kill zombies after a parallel run
Toon Knapen
toon.knapen at fft.be
Mon May 8 05:31:09 PDT 2006
Peter Jakobi kindly just gave me the following reply:
There's
- zap: the kill example in Larry Wall's Perl book
# interactive verification
# any regular expression matching any
# string in the output of ps -ef, ...
# I tend to keep hacking my ancient copy of this,
# so currently my copy can run non-interactively,
# kill children, kill per tty (craft your regex
# carefully; otherwise DO NOT use -y, or you will
# kill the wrong processes at random!!!), or
# list/nice processes instead of killing them.
# for a short while, I've put a copy here:
http://www.oa.shuttle.de/kefk/tmp/zap
non-interactive and a bit heavy-handed:
- killall: by name; can also kill according to PGID (process group)
- killproc: by name of executable; -G includes children in the
current process group or session (check whether these
are identical?); -g
also kills the other processes in the group.
# you can also get the list of processes
# that use a specific file via lsof, then pass the
# pids to kill. Do this quickly; pid reuse hopefully
# doesn't occur within a few seconds. You'd need to
# check the kernel to be certain that this is
# the case (any other kernel behaviour I'd consider
# a bug).
- skill/snice
# adds selection by tty, command, ... but still
# matches only the command's binary name, in the sense of killproc.
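To make the zap idea above concrete (match a regular expression against the output of ps -ef, then ask before killing), here is a minimal sketch in Python. It is not Peter's script and not the version from the Perl book; the function names match_processes and zap and the assume_yes parameter are made up for illustration, and the parsing assumes the PID is the second column of ps -ef.

```python
import os
import re
import signal
import subprocess


def match_processes(ps_output, pattern):
    """Return (pid, line) pairs for every ps line the regex matches."""
    regex = re.compile(pattern)
    matches = []
    for line in ps_output.splitlines()[1:]:  # skip the header line
        fields = line.split()
        # Assumption: ps -ef prints the PID in the second column.
        if len(fields) < 2 or not fields[1].isdigit():
            continue
        if regex.search(line):
            matches.append((int(fields[1]), line))
    return matches


def zap(pattern, sig=signal.SIGTERM, assume_yes=False):
    """Signal each matching process, asking first unless assume_yes is set."""
    ps_output = subprocess.run(
        ["ps", "-ef"], capture_output=True, text=True, check=True
    ).stdout
    for pid, line in match_processes(ps_output, pattern):
        if pid == os.getpid():
            continue  # never signal ourselves
        if assume_yes or input(f"{line}\nkill? [y/N] ").strip().lower() == "y":
            try:
                os.kill(pid, sig)
            except (ProcessLookupError, PermissionError):
                pass  # already gone, or not ours to kill
```

Because the regex is applied to the whole ps line, it can also match the user name or tty, which is exactly why the warning about careless patterns applies to the non-interactive mode.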
> I think what the OP is asking is how to kill (automagically) all
processes in a parallel run once one process has crashed (due to a
segmentation fault or something similar).
> Generally, if one process (in the whole bunch of processes) crashes,
all other processes will wait forever from the moment they try to
communicate with the crashed process, or at MPI_Finalize. So how can
one kill all remaining processes?
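One possible answer for processes that all run on a single host is a small watchdog launcher: start every worker in its own process group, poll them, and kill the remaining groups as soon as one worker exits with an error. The sketch below is only an illustration under that assumption; run_and_reap is a hypothetical name, and it does not cover workers on other nodes, where the MPI launcher would have to do the equivalent.

```python
import os
import signal
import subprocess
import sys
import time


def run_and_reap(commands, poll_interval=0.1):
    """Run all commands; if one fails, kill the others. Returns exit codes."""
    # start_new_session=True makes each worker the leader of its own
    # session and process group, so its children can be killed with it.
    procs = [subprocess.Popen(cmd, start_new_session=True) for cmd in commands]
    try:
        while True:
            codes = [p.poll() for p in procs]
            if any(c not in (None, 0) for c in codes):
                break  # one worker crashed or returned an error
            if all(c == 0 for c in codes):
                break  # every worker finished cleanly
            time.sleep(poll_interval)
    finally:
        for p in procs:
            if p.poll() is None:
                try:
                    # The worker's pid is also its process group id.
                    os.killpg(p.pid, signal.SIGKILL)
                except ProcessLookupError:
                    pass  # group vanished in the meantime
    return [p.wait() for p in procs]


if __name__ == "__main__":
    crash = [sys.executable, "-c", "import sys; sys.exit(3)"]
    hang = [sys.executable, "-c", "import time; time.sleep(60)"]
    print(run_and_reap([crash, hang]))
```

A worker killed this way reports a negative exit code (the signal number negated), so the caller can tell crashed workers from the ones that were cleaned up.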