NAME¶
killer - Background job killer
SYNOPSIS¶
killer [-h] [-V] [-n] [-d]
DESCRIPTION¶
killer is a perl script that gets rid of background jobs. Background jobs
are defined as processes that belong to users who are not currently logged
into the machine. Jobs can be run in the background (and are exempt from
killer's actions) if their scheduling priority has been reduced by
increasing their nice(1) value or if they are being run through condor. For
more details, see the PACKAGE main section of this document.
The following sections describe the perl(1) packages that make up the killer
program. I don't expect that the version that works for me will work for
everyone. I think that the ProcessTable and Terminals packages offer enough
flexibility that most modifications can be done in the main package.
Command line options
- -h
- Tell me how to get help
- -V
- Display version number
- -n
- Do not kill, just print what would be killed
- -d
- Enable debug output
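A minimal sketch of how these four flags could be parsed with the standard Getopt::Std module. The helper name parseOptions and the key names in the returned hash are assumptions made for illustration; they are not taken from killer itself.

```perl
use strict;
use warnings;
use Getopt::Std;

# Hypothetical helper: parse killer-style flags from a list of arguments.
# Returns a hash reference, or nothing if an unknown flag was given.
sub parseOptions {
    my @args = @_;
    local @ARGV = @args;    # getopts() reads from @ARGV
    my %opt;
    getopts('hVnd', \%opt) or return;
    return {
        help    => $opt{h} ? 1 : 0,
        version => $opt{V} ? 1 : 0,
        dryrun  => $opt{n} ? 1 : 0,
        debug   => $opt{d} ? 1 : 0,
    };
}

my $opts = parseOptions('-n', '-d');
print "dry run: nothing will be killed\n" if $opts->{dryrun};
print "debug output enabled\n"            if $opts->{debug};
```

Flags can also be bundled, so "-nd" is treated the same as "-n -d".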
PACKAGE ProcessTable¶
Each ProcessTable object contains hashes (or associative arrays) that map
various aspects of a job to the process ID (PID). The following hashes are
provided:
- pid2user
- Login name associated with the effective UID that the
process is running as.
- pid2ruser
- Login name associated with the real UID that the process is
running as.
- pid2uid
- Effective UID that the process is running as.
- pid2ruid
- Real UID that the process is running as.
- pid2tty
- Terminal associated with the process.
- pid2ppid
- PID of the parent of the process.
- pid2nice
- nice(1) value of the process.
- pid2comm
- Command name of the process.
Additionally, the %remainingprocs hash provides the list of processes that will
be killed.
The intended use of this package calls for readProcessTable to be called
to fill in all of the hashes defined above. Then, processes that meet specific
requirements are removed from the %remainingprocs hash. Those that are not
removed are considered to be background processes and may be killed.
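The workflow described above can be sketched with plain hashes. This is only an illustration: the helper name backgroundCandidates, the sample data, and the two exemption rules are assumptions chosen to keep the example short, not the rules killer applies.

```perl
use strict;
use warnings;

# Hypothetical helper: start with every PID as a kill candidate, then
# remove the ones that meet an exemption rule. The rules used here
# (owned by root, or attached to a terminal) are for illustration only.
sub backgroundCandidates {
    my ($pid2user, $pid2tty) = @_;
    my %remainingprocs = map { $_ => 1 } keys %$pid2user;
    for my $pid (keys %$pid2user) {
        delete $remainingprocs{$pid} if $pid2user->{$pid} eq 'root';
        delete $remainingprocs{$pid} if $pid2tty->{$pid} ne '?';
    }
    return sort { $a <=> $b } keys %remainingprocs;
}

# Illustrative data: per-PID hashes like pid2user and pid2tty above.
my %pid2user = (101 => 'root', 202 => 'alice', 303 => 'bob');
my %pid2tty  = (101 => '?',    202 => 'pts/1', 303 => '?');

my @toKill = backgroundCandidates(\%pid2user, \%pid2tty);
print "remaining candidates: @toKill\n";
```

Removing a PID from %remainingprocs leaves the other hashes intact, so information about an exempted process can still be printed later.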
new¶
This function creates a new ProcessTable object.
Example:
my $ptable = new ProcessTable;
initialize¶
This function (re)initializes arrays and any environment variables for external
commands. It generally will not need to be called, as it is invoked by new().
Example:
# Empty out the process table for reuse
$ptable->initialize();
readProcessTable¶
This function executes the ps(1) command to figure out which processes
are running. Note that it requires a SYSV style ps(1).
Example:
# Get a list of processes from the OS
$ptable->readProcessTable();
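A rough sketch of turning ps(1) output into the per-PID hashes. The column order, the header line, and the helper name parsePsOutput are assumptions for illustration; the text does not say which ps(1) options killer uses. The sample input is a literal string so that the sketch runs without calling ps(1).

```perl
use strict;
use warnings;

# Hypothetical parser. Assumed columns: PID PPID USER TT NI COMMAND.
sub parsePsOutput {
    my ($text) = @_;
    my %table;
    my @lines = split /\n/, $text;
    shift @lines;    # drop the header line
    for my $line (@lines) {
        $line =~ s/^\s+//;
        $line =~ s/\s+$//;
        # Limit the split to 6 fields so a command with spaces stays whole.
        my ($pid, $ppid, $user, $tty, $nice, $comm) = split /\s+/, $line, 6;
        next unless defined $comm;
        $table{pid2ppid}{$pid} = $ppid;
        $table{pid2user}{$pid} = $user;
        $table{pid2tty}{$pid}  = $tty;
        $table{pid2nice}{$pid} = $nice;
        $table{pid2comm}{$pid} = $comm;
    }
    return \%table;
}

my $sample = <<'END';
  PID  PPID     USER TT      NI COMMAND
    1     0     root ?       20 /sbin/init
 1234     1   gerdts pts/3   24 perl
END

my $t = parsePsOutput($sample);
printf "%s runs %s\n", $t->{pid2user}{1234}, $t->{pid2comm}{1234};
```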
cleanForkBombs¶
This function looks for a large number of processes owned by one user and
assumes that the owner is someone who is using fork(2) for the first time. An
effective way to clean up such a mess is to "kill -STOP" each
process, then "kill -KILL" each process. Stopping every process first keeps
the survivors from forking replacements while the rest are being killed.
Note that this function ignores such mistakes by root. If root were running a
fork(2) bomb, this script wouldn't run, right? Also, you should be sure
that the number of processes mentioned below (490) is less than (equal to would
be better, right?) the maximum number of processes per user. The OS should
also have a system-wide process limit at least a couple hundred higher than
any individual user's limit. Otherwise, you will have to use the power switch
to get rid of fork bombs.
Each time a process is sent a signal, it is logged via syslog(3C).
Example:
# Get rid of fork bombs. Keep track of who did it in @idiots.
my @idiots = $ptable->cleanForkBombs();
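The detection step can be sketched as a per-user process count. The helper name findForkBombers, the use of the real user name as the key, and the ">=" comparison are assumptions; only the root exemption and the idea of a fixed threshold (490 in the text) come from the description above. The sketch only identifies users and sends no signals.

```perl
use strict;
use warnings;

# Hypothetical helper: return the login names (never root) that own at
# least $limit processes, given a PID => login name hash.
sub findForkBombers {
    my ($pid2ruser, $limit) = @_;
    my %count;
    $count{ $pid2ruser->{$_} }++ for keys %$pid2ruser;
    my @suspects = grep { $_ ne 'root' && $count{$_} >= $limit } keys %count;
    return sort @suspects;
}

# Illustrative data: 'eve' owns three processes, 'alice' owns one.
my %pid2ruser = (10 => 'eve', 11 => 'eve', 12 => 'eve', 20 => 'alice');
my @suspects  = findForkBombers(\%pid2ruser, 3);
print "possible fork bomb by: @suspects\n";
```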
getUserProcessIds user¶
This returns the list of process IDs where the login associated with the real
UID of the process matches the argument to the function.
Example:
# Find all processes owned by httpd
my @webservers = $ptable->getUserProcessIds('httpd');
getUniqueTtys¶
This function returns a list of terminals in use. Note that the format will be
the same as given by ps(1), which will generally lack the leading "/dev/".
Example:
# Get a list of all terminals that processes are attached to
my @ttylist = $ptable->getUniqueTtys();
removeProcessId pid¶
This function removes pid from the list of processes to be killed. That is, it
gets rid of a process that should be allowed to run. Most likely this will
only be called by other functions in this package.
Example:
# For some reason I know that PID 1234 should be allowed to run
$ptable->removeProcessId(1234);
removeProcesses psfield, psvalue¶
This function removes processes that possess certain traits from the list of
processes to be killed. For example, if you want to spare all processes owned
by the user "lp" or all processes that have /dev/console as their controlling
terminal, this is the function for you.
psfield can be any of the following:
- pid
- Removes process id given in second argument.
- user
- Removes processes with effective UID associated with login
name given in second argument.
- ruser
- Removes processes with real UID associated with login name
given in second argument.
- uid
- Removes processes with effective UID given in second
argument.
- ruid
- Removes processes with real UID given in second
argument.
- tty
- Removes processes with controlling terminal given in second
argument. Note that it should NOT start with "/dev/".
- ppid
- Removes children of process with PID given in second
argument.
- nice
- Removes processes with a nice value equal to the second
argument.
- comm
- Removes processes with a command name that is the same as
the second argument.
Examples:
# Allow all imapd processes to run
$ptable->removeProcesses('comm', 'imapd');
# Be sure not to kill print jobs
$ptable->removeProcesses('ruser', 'lp');
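One way such a field-based removal could work is sketched here on a plain hash of hashes. The helper name removeByField and the layout of $table are assumptions for illustration, not the ProcessTable internals.

```perl
use strict;
use warnings;

# Hypothetical helper: drop every PID whose value for $field equals
# $value from the kill list. $table holds the pid2* hashes plus a
# remainingprocs hash.
sub removeByField {
    my ($table, $field, $value) = @_;
    if ($field eq 'pid') {
        delete $table->{remainingprocs}{$value};
        return;
    }
    my $map = $table->{"pid2$field"} or die "unknown field: $field\n";
    # keys() returns a copy of the key list, so deleting inside is safe.
    for my $pid (keys %{ $table->{remainingprocs} }) {
        delete $table->{remainingprocs}{$pid}
            if defined $map->{$pid} && $map->{$pid} eq $value;
    }
    return;
}

my $table = {
    pid2comm       => { 5 => 'imapd', 6 => 'crack', 7 => 'imapd' },
    pid2tty        => { 5 => '?',     6 => 'console', 7 => '?' },
    remainingprocs => { 5 => 1, 6 => 1, 7 => 1 },
};
removeByField($table, 'comm', 'imapd');
print join(',', sort keys %{ $table->{remainingprocs} }), "\n";
```

The sketch compares values as strings, which also works for numeric fields such as uid or nice as long as both sides are written the same way.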
removeChildren pid¶
This function removes all descendants of the given pid from the list of
processes to be killed. That is, if the pid argument is 1, it will ensure that
nothing is killed.
Example:
# Be sure not to kill off any mail deliveries (assumes you have
# written getSendmailPid()). (Sendmail changes uid when it does
# local delivery.)
$ptable->removeChildren(getSendmailPid());
removeCondorChildren¶
Condor is a batch job system that allows migration of jobs between machines (see
http://www.cs.wisc.edu/condor/). This ensures that condor jobs are left alone.
Example:
# Be nice to the people that are running their jobs through condor.
$ptable->removeCondorChildren();
findChildProcs pid¶
This function finds and returns a list of all of the processes that are
descendants of the PID given in the first argument.
Example:
# Find the processes that are descendants of PID 1234
my @procs = $ptable->findChildProcs(1234);
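Finding descendants amounts to walking the parent links stored in pid2ppid. The following is a sketch under the assumption of a simple PID => PPID hash; the helper name descendantsOf is hypothetical and the traversal shown is not necessarily the one killer uses.

```perl
use strict;
use warnings;

# Hypothetical helper: return all descendants of $root, given a
# PID => PPID hash, using a breadth-first walk.
sub descendantsOf {
    my ($pid2ppid, $root) = @_;
    # Invert the parent links into a parent => [children] index.
    my %children;
    while (my ($pid, $ppid) = each %$pid2ppid) {
        push @{ $children{$ppid} }, $pid;
    }
    my @found;
    my %seen  = ($root => 1);    # guards against a PID that is its own parent
    my @queue = ($root);
    while (defined(my $parent = shift @queue)) {
        for my $child (@{ $children{$parent} || [] }) {
            next if $seen{$child}++;
            push @found, $child;
            push @queue, $child;
        }
    }
    return sort { $a <=> $b } @found;
}

# Illustrative process tree: 1 -> 10 -> 11, and 1 -> 20.
my %pid2ppid = (1 => 0, 10 => 1, 11 => 10, 20 => 1);
my @procs    = descendantsOf(\%pid2ppid, 10);
print "descendants of 10: @procs\n";
```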
getTtys user¶
This function returns a list of ttys that are in use by processes owned by a
particular user.
Example:
# Find all ttys in use by gerdts.
my @ttylist = $ptable->getTtys('gerdts');
getUsers¶
This function lists all the users that have active processes.
Example:
# Get all users that have active processes
my @lusers = $ptable->getUsers();
removeNiceJobs¶
This function removes all jobs that have a nice value greater than 9. That is,
they have a lower scheduling priority than the default (0).
Example:
# Allow people to run background jobs so long as they yield to
# those with "foreground" jobs
$ptable->removeNiceJobs();
printProcess filehandle, pid¶
This function displays information about the process, kinda like
"ps | grep" would.
Example:
# Print info about init to STDERR
$ptable->printProcess(\*STDERR, 1);
printProcessTable¶
printProcessTable filehandle¶
This function prints info about all the processes discovered by
readProcessTable. If an argument is given, it should be a file handle
to which the output should be printed.
Examples:
# Print the process table to stdout
$ptable->printProcessTable();
# Mail the process table to someone
open MAIL, '|/usr/bin/mail someone';
$ptable->printProcessTable(\*MAIL);
close(MAIL);
printRemainingProcesses¶
printRemainingProcesses filehandle¶
This function prints info about all the processes discovered by
readProcessTable, but not removed from %remainingprocs. If an argument
is given, it should be a file handle to which the output should be printed.
Examples:
# Print the jobs to be killed to stdout
$ptable->printRemainingProcesses();
# Mail the jobs to be killed to someone
open MAIL, '|/usr/bin/mail someone';
$ptable->printRemainingProcesses(\*MAIL);
close(MAIL);
getRemainingProcesses¶
Returns a list of processes that are likely background jobs.
Example:
# Get a list of the processes that I plan to kill
my @procsToKill = $ptable->getRemainingProcesses();
killAll signalNumber¶
Sends the specified signal to all the processes listed. A syslog entry is made
for each signal sent.
Example:
# Send all of the remaining processes a TERM signal, then a
# KILL signal
$ptable->killAll(15);
sleep(10); # Give them a bit of a chance to clean up
$ptable->killAll(9);
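A sketch of a signalling loop that honours a dry-run switch like the -n option. The helper name signalAll and its argument order are assumptions, and the syslog entry that killAll makes is left out to keep the example short.

```perl
use strict;
use warnings;

# Hypothetical helper: send $signal to each PID, or only report what
# would be done when $dryrun is true. Returns the PIDs actually signalled.
sub signalAll {
    my ($signal, $dryrun, @pids) = @_;
    my @signalled;
    for my $pid (@pids) {
        if ($dryrun) {
            print "would send signal $signal to PID $pid\n";
            next;
        }
        # kill() returns the number of processes successfully signalled.
        push @signalled, $pid if kill $signal, $pid;
    }
    return @signalled;
}

# Signal 0 only checks that the process exists; $$ is this script's PID.
my @alive = signalAll(0, 0, $$);
print "PID $$ is alive\n" if @alive;
```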
PACKAGE Terminals¶
The Terminals package provides a means for figuring out how long various users
have been idle.
new¶
This function is used to instantiate a new Terminals object.
Example:
# Get a new Terminals object.
my $term = new Terminals;
initialize¶
This function figures out who is on the system and how long they have been
idle. It will generally only be called by new().
Example:
# Refresh the state of the terminals.
$term->initialize();
showConsoleUser¶
This function returns the login of the person that is physically sitting at the
machine.
Example:
# Print out the login of the person on the console
printf "%s is on the console\n", $term->showConsoleUser();
initializeTty terminal statparts¶
This initializes internal structures for the given terminal.
getIdleTime user¶
Figures out how long a user has been idle. This is accomplished by examining all
terminals that the user owns and returning the amount of time since the most
recently accessed one was used. Additionally, a user at the console may not be
typing into a terminal, yet be quite active with the mouse or with an
application that does not use a terminal.
Example:
# Figure out how long the user on the console has been idle
my $consoleIdle = $term->getIdleTime($term->showConsoleUser());
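A sketch of the "most recently accessed terminal" calculation, assuming that idle time is derived from the access time reported by stat() on each terminal device. The statparts argument of initializeTty hints at this, but the text does not confirm it, and the helper name idleSeconds is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: given the current time and a list of terminal
# device paths, return the smallest idle time in seconds, or undef if
# none of the paths could be examined.
sub idleSeconds {
    my ($now, @ttys) = @_;
    my $best;
    for my $tty (@ttys) {
        my @st = stat($tty);
        next unless @st;              # skip paths that do not exist
        my $idle = $now - $st[8];     # element 8 is the last access time
        $best = $idle if !defined $best || $idle < $best;
    }
    return $best;
}

# The paths are examples only; missing devices are skipped.
my $idle = idleSeconds(time, '/dev/console', '/dev/tty');
print defined $idle ? "idle for $idle seconds\n" : "no terminal found\n";
```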
printEverything¶
Prints to stdout who is on what terminal and how long they have been idle. Only
useful for debugging.
Example:
# Take a look at the contents of structures in my
# Terminals object
$term->printEverything();
PACKAGE main¶
The main package is the version used on the Unix workstations at the University
of Wisconsin's Computer-Aided Engineering Center (CAE). I suspect that folks at
places other than CAE will want to do things slightly differently. Feel free
to take this as an example of how you can make effective use of the
ProcessTable and Terminals packages.
Configuration options¶
- $forkadmin
- Email address to notify of fork bombs
- $killadmin
- Email address to notify of run-of-the-mill kills
- $fromaddr
- Who do email messages claim to be from?
- $stubbornadmin
- Email address to notify when jobs will not die
- @validusers
- These are the folks that you should never kill off
- $minuid
- Do not kill processes of users with uid lower than this
value.
- $maxidletime
- The maximum number of seconds that a user can be idle
without being classified as having "background" jobs.
If I were a user really trying to avoid a background job killer, I would likely
include a signal handler that waits for signal 15. When I saw it, I would
fork, letting the parent die while the child continued on to do my work.
Assuming that everyone thinks like me, I figure that I will need to make at
least two complete passes to clear up the bad users. The first pass is
relatively nice (sends a signal 15, followed a bit later by a signal 9). A
well-written program will take the signal 15 as a sign that it should clean up
and then shut down. When a process gets a signal 9, it has no choice but to
die.
The second pass is not so nice. It finds all background processes, sends them a
signal 23 (SIGSTOP), then a signal 9 (SIGKILL). This pretty much (but not
absolutely) guarantees that processes are unable to find a way around the
background job killer.
gatherInfo¶
This function gathers information from the Terminals and ProcessTable packages,
then based on that information decides which jobs should be allowed to run.
Specifically it does the following:
- Instantiates new ProcessTable and Terminals objects. Note
that Terminals::new fills in all the necessary structures to catch users
that have logged in between calls to gatherInfo.
- Reads the process table.
- Removes condor processes and condor jobs from the list of
processes to be killed.
- Removes all jobs belonging to all users in the
configuration array @validusers from the list of processes to be
killed.
- Removes all nice(1) jobs from the list of jobs to be
killed.
- Removes all jobs belonging to users where the user has less
than $maxidletime idle time on at least one terminal. Additionally, jobs
associated with ttys that are owned by users that have less than
$maxidletime idle time on at least one terminal are preserved. This makes
it so that if luser uses su(1) to gain the privileges of boozer,
processes owned by boozer will not be killed.
- Removes all processes of users with uid lower than the
$minuid value.
- Finally, the process table and terminal objects are
returned.
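The first few steps of the list can be sketched with the ProcessTable methods documented earlier. The function name exemptProtectedJobs is hypothetical, and matching @validusers against the 'ruser' field is an assumption; the idle-time and $minuid steps are omitted because the text does not describe how they are implemented.

```perl
use strict;
use warnings;

# Hypothetical helper: apply the exemptions that do not depend on idle
# time to an already constructed ProcessTable object.
sub exemptProtectedJobs {
    my ($ptable, $validusers) = @_;
    $ptable->readProcessTable();        # fill in the pid2* hashes
    $ptable->removeCondorChildren();    # leave condor jobs alone
    # Assumption: valid users are matched by real login name.
    $ptable->removeProcesses('ruser', $_) for @$validusers;
    $ptable->removeNiceJobs();          # niced jobs may keep running
    return $ptable;
}
```

A caller would pass in an object created with "new ProcessTable" and a reference to the @validusers array.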
BUGS¶
There is a small window of opportunity for a user that reaches $maxidletime in
the middle of this script to get unfair treatment. This could probably be
reconciled by shaving some time off of $maxidletime for the second call to
main::gatherInfo.
It is still possible to get around the background job killer by having a lot of
processes that watch each other to be sure that they are still responding (have
not yet gotten a signal 23). As soon as a stopped process is found, the still
running process could fork(), thus leaving a background process that is
not going to be killed.
Different operating systems have different notions of nice values. Some go from
-20 to +19. Some go from 0 to 39. Solaris and HP-UX (using System V ps
command) report nice values between 0 and 39.
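One way to cope with the two scales is to convert every value to the -20 to +19 range before comparing. This sketch assumes that the default priority sits at 20 on the 0 to 39 scale and at 0 on the -20 to +19 scale; the helper names and the scale labels are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: convert a nice value to the -20..19 range.
# Assumption: on a 0..39 scale the default (un-niced) value is 20.
sub normalizeNice {
    my ($nice, $scale) = @_;
    return $scale eq '0..39' ? $nice - 20 : $nice;
}

# A job counts as niced if it runs below the default priority.
sub isNiced {
    my ($nice, $scale) = @_;
    return normalizeNice($nice, $scale) > 0 ? 1 : 0;
}

printf "24 on the 0..39 scale is %d on the -20..19 scale\n",
    normalizeNice(24, '0..39');
```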
It is bad to assume that all systems that run this have the same number of
processes per user. The script should ask the OS how many processes normal
(non-root) users can run.
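A sketch of asking the OS for the per-user process limit through the POSIX module's sysconf() and _SC_CHILD_MAX. The helper name forkBombThreshold, the safety margin, and the fallback value are assumptions; only the idea of querying the OS comes from the paragraph above.

```perl
use strict;
use warnings;
use POSIX qw(sysconf _SC_CHILD_MAX);

# Hypothetical helper: derive a fork bomb threshold from the per-user
# process limit, staying $margin processes below it. Falls back to
# $fallback when the limit is unavailable or too small.
sub forkBombThreshold {
    my ($margin, $fallback) = @_;
    my $max = sysconf(_SC_CHILD_MAX);    # undef if the OS gives no limit
    return $fallback unless defined $max && $max > $margin;
    return $max - $margin;
}

my $threshold = forkBombThreshold(10, 490);
print "treating $threshold or more processes as a fork bomb\n";
```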
TODO¶
The configuration is quite minimalistic. It should be made possible to have
per-host configuration directives so that you can, for instance, allow certain
people to run background jobs on certain hosts.
People that really care about finding habitual offenders will probably want to
have a way to add entries to a database and flag those that pop up too often.
Thoroughly test on more operating systems. A very close relative of this code
has performed well on about 60 Solaris 2.5.1 machines. It has been lightly
tested on HP-UX 10.20 as well.
Make mailing to someone optional. If you have a lot of workstations killing off
boring stuff all the time, too much meaningless mail traffic is generated.
If you plan to run this on a machine that runs special processes like a POP or
IMAP server, it would be handy to be able to check multiple conditions easily.
Perhaps
$ptable->removeProcesses( { comm => 'imapd',
parentComm => 'inetd',
parentUser => 'root' } );
This would make it so that people don't rename the crack binary imapd to escape
the wrath of killer.
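A sketch of how such a multi-condition check might be evaluated for a single process. The helper name matchesAll and the table layout are assumptions; the keys comm, parentComm and parentUser are taken from the proposed call above.

```perl
use strict;
use warnings;

# Hypothetical helper: true only if the process meets every condition
# in %$want. The parent's attributes are looked up through pid2ppid.
sub matchesAll {
    my ($t, $pid, $want) = @_;
    my $ppid = $t->{pid2ppid}{$pid};
    my %have = (
        comm       => $t->{pid2comm}{$pid},
        parentComm => defined $ppid ? $t->{pid2comm}{$ppid} : undef,
        parentUser => defined $ppid ? $t->{pid2user}{$ppid} : undef,
    );
    for my $key (keys %$want) {
        return 0 unless defined $have{$key} && $have{$key} eq $want->{$key};
    }
    return 1;
}

# Illustrative data: PID 50 is a real imapd, PID 60 is a renamed binary.
my $t = {
    pid2ppid => { 50 => 5, 60 => 6 },
    pid2comm => { 5 => 'inetd', 6 => 'csh', 50 => 'imapd', 60 => 'imapd' },
    pid2user => { 5 => 'root',  6 => 'luser' },
};
my $want = { comm => 'imapd', parentComm => 'inetd', parentUser => 'root' };
print "PID $_: ", (matchesAll($t, $_, $want) ? 'spare' : 'kill'), "\n" for 50, 60;
```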
LICENSE¶
This program is released under the terms of the General Public License (GPL)
version 2. See the file COPYING with the distribution. If you have lost your
copy, you can get a new one at http://www.gnu.org/copyleft/gpl.html. In
particular, remember that this code is distributed for free without warranty.
If you make use of this code, please send me some email. While I am open to
suggestions for improvement, I by no means guarantee that I will implement
them.
SEE ALSO¶
nice(1) perl(1) ps(1) su(1) who(1)
fork(2) signal(5)
http://www.cs.wisc.edu/condor/
http://www.cae.wisc.edu/~gerdts/killer/
AUTHOR¶
killer was written by Mike Gerdts, gerdts@cae.wisc.edu.