Fri, 02 Oct 2009
Killing Processes by Name on Linux or Unix Systems
Finding and killing a process on a Unix or Linux system is typically done by sending it a signal using the kill command, specifying the process ID (PID), which we can grab using ps and grep (use ps -ef on Solaris):
dmaxwell@kaylee:~$ ps ax | grep gedit
11604 ? Sl 0:01 gedit
11609 pts/5 S+ 0:00 grep gedit
dmaxwell@kaylee:~$ kill 11604
But it’s sometimes convenient to kill a running process by name, or to kill a group of running processes that share a name. The most portable way to do this is with the pkill command, which is present on most Linux, Solaris and BSD systems. The simplest way to use pkill is just to specify the process name:
pkill gedit
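This sends a TERM signal to any process whose name matches ‘gedit’, terminating it. The pattern is a regular expression, so it also matches any process name that merely contains ‘gedit’. If you have a long-running command and can only remember part of the command string, no problem - use -f with pkill: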
dmaxwell@kaylee:~$ ps ax | grep name
11902 pts/5 S 5:23 find . -name foo*
11906 pts/5 S+ 0:00 grep name
dmaxwell@kaylee:~$ pkill -f name
This would kill the find process (and the grep if it were still running), since part of its full command string contained the substring ‘name’. Using pkill in this way will by default gracefully terminate processes, but for stubborn processes that refuse to die, you can specify a different signal. Here we specify a KILL signal, which immediately ends a process.
pkill -KILL name
You can use numeric signals in place of the signal name; for example, ‘-9’ is the KILL signal used in the last example.
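The TERM-then-KILL escalation described above can be wrapped in a small shell function. This is only a sketch: the function name stop_by_name and the five-second default grace period are hypothetical choices for illustration, and it uses pkill’s -x option to match the process name exactly rather than as a substring.

```shell
#!/bin/sh
# Hypothetical helper: ask processes to exit, then force any that ignore TERM.
stop_by_name() {
    name=$1
    grace=${2:-5}   # seconds to wait before escalating (assumed default)

    # -x matches the process name exactly; pkill exits non-zero if nothing matched
    pkill -TERM -x "$name" || return 0

    i=0
    while [ "$i" -lt "$grace" ]; do
        # Stop waiting as soon as no matching process is left
        pgrep -x "$name" >/dev/null || return 0
        sleep 1
        i=$((i + 1))
    done

    # Still running after the grace period: send KILL
    pkill -KILL -x "$name"
}
```

Calling stop_by_name gedit 10 would give gedit ten seconds to exit cleanly before it is killed outright.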
Killing a group of processes is just as easy. Sometimes this is necessary when system shutdown scripts fail, perhaps due to a missing lockfile. Here we kill all the Apache processes running on our server after the shutdown command fails:
root@kaylee:~# /etc/init.d/apache2 stop
* Stopping web server apache2 [ OK ]
root@kaylee:~# ps ax | grep apache
13124 ? Ss 0:00 /usr/sbin/apache2 -k start
13129 ? S 0:00 /usr/sbin/apache2 -k start
13130 ? S 0:00 /usr/sbin/apache2 -k start
13131 ? S 0:00 /usr/sbin/apache2 -k start
13132 ? S 0:00 /usr/sbin/apache2 -k start
13133 ? S 0:00 /usr/sbin/apache2 -k start
13162 pts/8 S+ 0:00 grep apache
root@kaylee:~# pkill apache
root@kaylee:~# ps ax | grep apache
13165 pts/8 S+ 0:00 grep apache
root@kaylee:~#
Using pkill -f start would also work here, since each of the Apache command lines contains the substring ‘start’. The pkill command has many more options, but one other that might be useful is -u, which will allow you to specify a username or ID. In this example we send a TERM signal to all the processes owned by the user ‘nobody’:
root@kaylee:~# pkill -u nobody
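Since pkill -u can hit a lot of processes at once, it may be worth previewing the matches first. The sketch below uses pgrep, which accepts the same -u option, together with -l to print each process name next to its PID. The function name preview_user_procs is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: list what "pkill -u USER [PATTERN]" would signal.
preview_user_procs() {
    user=$1
    pattern=${2:-}   # optional name pattern to narrow the match

    if [ -n "$pattern" ]; then
        # -l prints "PID name" instead of just the PID
        pgrep -l -u "$user" "$pattern"
    else
        pgrep -l -u "$user"
    fi
}
```

If the list printed by preview_user_procs nobody looks right, the same arguments can then be given to pkill.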
There is a sister command to pkill, pgrep, that takes most of the same options, but rather than sending a signal to one or a group of processes, it just displays the matching process IDs. This output can be passed to other commands as arguments. Here is an example:
dmaxwell@kaylee:~$ pgrep -d, apache2
14507,14512,14513,14514,14515,14516
dmaxwell@kaylee:~$ ps fvp $(pgrep -d, apache2)
PID TTY STAT TIME .. RSS %MEM COMMAND
14507 ? Ss 0:00 .. 11076 0.3 /usr/sbin/apache2 -k start
14512 ? S 0:00 .. 6032 0.2 \_ /usr/sbin/apache2 -k start
14513 ? S 0:00 .. 6028 0.2 \_ /usr/sbin/apache2 -k start
14514 ? S 0:00 .. 6028 0.2 \_ /usr/sbin/apache2 -k start
14515 ? S 0:00 .. 6028 0.2 \_ /usr/sbin/apache2 -k start
14516 ? S 0:00 .. 6028 0.2 \_ /usr/sbin/apache2 -k start
This is particularly useful, since it preserves the header line output by ps, as opposed to something like ps avx | grep apache, which displays the data, but not the column headers. Both pkill and pgrep are documented in the same manual page, so search for either in the FreeBSD or Debian man pages for more info.
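One thing to watch with the pgrep-into-ps approach: if nothing matches, ps is handed an empty PID list and complains. A minimal sketch that guards against this is shown here. The function name show_procs and the chosen ps columns are assumptions for illustration; the column names are the ones Linux ps accepts and may differ elsewhere.

```shell
#!/bin/sh
# Hypothetical helper: show ps output (with headers) for an exact process name.
show_procs() {
    name=$1
    # -d, joins the PIDs with commas, the list format that ps -p accepts
    pids=$(pgrep -d, -x "$name")

    if [ -z "$pids" ]; then
        echo "no processes match $name"
        return 1
    fi

    # The header line is kept because ps itself selects the rows
    ps -o pid,ppid,stat,rss,args -p "$pids"
}
```

Running show_procs apache2 prints the same kind of table as above, or a short message when Apache is not running.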
posted at: 00:04 | tags: Linux Unix Sysadmin Tips Processes Servers
Wed, 02 Sep 2009
Troubleshooting SSH Connections
I’ve helped a few people recently who have had trouble getting OpenSSH working properly; I’ve also had my share of issues over the years. Generally problems with SSH connections fall into two groups - network related and server related. Most of these problems can be fixed fairly quickly if you know what to look for.
Network Related
These will typically be caused by improper routing or firewall configurations. Here are some things to check.
1. If your SSH server sits behind a firewall or router, make sure the default route of your internal SSH server points back to that firewall or router. Seems obvious, but it’s common to forget about the return trip the packets need to make. This will display your default gateway:
netstat -rn | grep '^0'
Sometimes the default gateway is just one of your server interfaces. This is OK as long as that interface is directly connected to something that knows how to get back to your client.
2. While you’re at it, make sure the incoming SSH packets are actually getting to your SSH server. Tcpdump works very nicely for this, though you’ll need to be root to run it on the server:
tcpdump -n -i eth0 tcp port 22 and host [IP address of client]
Just replace eth0 with your client-facing interface name. If you don’t see incoming SSH packets during connection attempts, it’s probably due to a firewall or router access list.
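For the default-route check in step 1, it can be handy to pull out just the gateway address, for example to ping it from a script. Here is a sketch that reads netstat -rn output on standard input. The function name default_gateway is hypothetical, and it assumes the gateway is the second column, which holds for the usual Linux and BSD layouts.

```shell
#!/bin/sh
# Hypothetical helper: print the gateway of the default route.
# Expects "netstat -rn" output on stdin.
default_gateway() {
    # Linux shows the default route as 0.0.0.0, the BSDs as "default";
    # print the second column of the first such line and stop
    awk '$1 == "0.0.0.0" || $1 == "default" { print $2; exit }'
}
```

Typical usage would be netstat -rn | default_gateway, which prints a single address or nothing at all if no default route is set.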
SSH Server Problems
All of these issues revolve around SSH server configuration settings - not misconfigurations necessarily, just settings you may not be aware of.
1. Permissions can be a problem - in its default configuration, OpenSSH sets StrictModes to yes and won’t allow any connections if the account you’re trying to SSH into has group- or world-writable permissions on its home directory, ~/.ssh directory, or ~/.ssh/authorized_keys file. I typically just make the two directories mode 700 and the authorized_keys file mode 600. The sshd man page suggests this one-liner:
chmod go-w ~/ ~/.ssh ~/.ssh/authorized_keys
2. On Debian or Ubuntu systems, it is possible the keys you are using to connect are blacklisted. This is only an issue for keys generated on Debian or Debian-based clients, and stems from the now-famous Debian OpenSSL vulnerability disclosed in May of 2008. To detect any such blacklisted keys, run ssh-vulnkey on the client, while logged into the account you are connecting from. Debian and Ubuntu SSH servers will reject any such keys unless the PermitBlacklistedKeys directive in the /etc/ssh/sshd_config file is set to yes. I don’t recommend you actually leave this security check disabled, but it can be useful to temporarily disable it during testing.
3. Finally, if all else fails, you can see exactly what the SSH server is doing by running it in debug mode on a non-standard port:
/usr/sbin/sshd -d -p 2222
Then, on the client, connect and watch the server output:
ssh -vv -p 2222 [Server IP]
Note the -vv option to provide verbose client output. This alone can sometimes help debug connection issues.
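For the permissions problem in item 1, it may help to check which path is actually at fault before changing anything. The sketch below reports any of the three paths that are group- or world-writable. The function name check_ssh_perms is hypothetical, and it only tests the write bits mentioned above, not ownership.

```shell
#!/bin/sh
# Hypothetical helper: report paths StrictModes would object to.
check_ssh_perms() {
    home=${1:-$HOME}

    for p in "$home" "$home/.ssh" "$home/.ssh/authorized_keys"; do
        [ -e "$p" ] || continue
        # -prune stops find descending into directories; the path is printed
        # only if the group-write or world-write bit is set
        if [ -n "$(find "$p" -prune \( -perm -020 -o -perm -002 \) -print)" ]; then
            echo "too permissive: $p"
        fi
    done
}
```

Run it as the account you are trying to log into, or pass that account’s home directory as the first argument. No output means none of the three paths is group- or world-writable.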
posted at: 22:17 | tags: Sysadmin Linux SSH Networking Tips
Mon, 10 Aug 2009
The Forgotten Power of Unix Text Utilities
I’m the first to extol the virtues of scripting languages, Python and Perl in particular. But they aren’t always the best tool for the job. It’s often forgotten how powerful the original Unix (and now GNU) text processing utilities are. Recently on linuxquestions.org, someone was asking how to combine specific columns from multiple CSV files into a new CSV file. They had the start of a Perl solution that was not working correctly, and wanted advice on it. My advice was to go with a one-line shell solution which is simply this:
paste -d, <(cut -d, -f3 file1.csv) <(cut -d, -f3 file2.csv) > output.csv
This will combine the third column from each specified file into a new file. It relies on process substitution, a feature of the more modern Bourne-style shells such as bash, ksh and zsh - the two parts that look like <(…). Here it is in action:
dmaxwell@kaylee:~$ cat foo1.txt
a1,a2,a3
b1,b2,b3
dmaxwell@kaylee:~$ cat foo2.txt
A1,A2,A3
B1,B2,B3
dmaxwell@kaylee:~$ paste -d, <(cut -d, -f3 foo1.txt) <(cut -d, -f3 foo2.txt)
a3,A3
b3,B3
You can paste columns from as many files as you need here. One catch, of course, is that this only works with simple CSV data - meaning there are no embedded commas in the data fields themselves. But this is much more understandable than any lengthy scripting language solution.
One other tip: if you have to get rid of the first row, which might contain column header data, just pipe the output through tail:
paste -d, <(cut -d, -f8 file1.csv) <(cut -d, -f8 file2.csv) | tail -n +2 > output.csv
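Process substitution is not available in a plain POSIX sh, and the one-liner gets unwieldy with many files. As a sketch of one way around both, the function below cuts the requested column from each file into a temporary file and pastes the results together. The name paste_column is hypothetical, and the same caveat about embedded commas applies.

```shell
#!/bin/sh
# Hypothetical helper: paste_column COLUMN FILE...
# Joins the given column of every file, in argument order, with commas.
paste_column() {
    col=$1
    shift
    [ "$#" -ge 1 ] || return 1

    tmpdir=$(mktemp -d) || return 1
    n=0
    for f in "$@"; do
        n=$((n + 1))
        # Zero-padded names keep the glob below in argument order
        cut -d, -f"$col" "$f" > "$tmpdir/$(printf '%04d' "$n")"
    done

    paste -d, "$tmpdir"/*
    rm -rf "$tmpdir"
}
```

With the sample files above, paste_column 3 foo1.txt foo2.txt prints the same two lines as the one-liner, and the output can still be piped through tail -n +2 to drop a header row.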
posted at: 15:33 | tags: Unix Linux Sysadmin Tips Textutils
Thu, 30 Jul 2009
Introduction to the Command Line
There is a manual available from the FSF for those wishing to learn how to use the command line and associated tools. It’s quite good. You can get it online at Flossmanuals, or support the FSF and buy a printed copy. At about 165 pages, it covers all the Bash shell basics, and has sections on the various text utilities, scripting, SSH, text editors and the indispensable GNU screen. It also has a nice command reference as an appendix. Here is an outline of the book content.
posted at: 18:22 | tags: GNU Linux Unix CLI Books
Sun, 19 Jul 2009
Comments on “10 Things for Linux Desktop Evangelists to Ponder”
Technewsworld has an opinion piece listing 10 things needed to bring desktop Linux closer to reality. Here is a snippet:
8. Convince the killer-apps owners to create real and usable ports of their products.
…
7. Find a sponsor willing to step up to real publicity for Linux.
…
5. Pay for Linux!
…
1. Lose the attitude! Lose the edge! Stop whining already!
I’ve said it before, and I’ll say it again (and again), it’s all about the OEMs. None of this stuff matters to anyone but us geeks. People use Windows on the desktop because of the lock Microsoft has on the OEM market. It’s not about the apps, or the OS, or the Free Software. Generally people will use whatever comes with whatever they buy. That’s why Google’s announcement of a new OS was so important - they were very public about the OEM agreements they have in place to put their OS on hardware that consumers will buy. They’re giving plenty of warning to the app developers to get ready, in this case to web-enable their apps.
posted at: 07:01 | tags: Linux Desktop Opinion
Fri, 17 Jul 2009
Monitoring and Alerting on Linux Logfiles
As a sysadmin, I’ve found it’s always useful to monitor system logs on your Linux or Unix servers for specific patterns of activity, things that can indicate security or system issues. It’s even nicer to get alerts when that activity occurs. Some time ago I wanted a simple solution that would allow me to continuously monitor the ClamAV updater (freshclam) logfiles and send email alerts - the result was this script. Recently I wanted something a bit more general, so I wrote this Perl script that monitors any logfile for a specific pattern and generates email or syslog alerts.
Installing Logmon
It needs a few non-core Perl modules to run, namely Mail::Mailer, Proc::Daemon, Unix::Syslog and File::Tail, but these can be installed pretty easily as packaged modules or via CPAN. On Debian/Ubuntu systems, all the needed modules are pre-packaged for you:
apt-get install libmailtools-perl libunix-syslog-perl libfile-tail-perl libproc-daemon-perl
On Red Hat/Fedora servers, you can use yum:
yum install perl-MailTools perl-Unix-Syslog perl-File-Tail perl-Proc-Daemon
To pull in all the modules from CPAN, you can use this one-liner:
for m in Mail::Mailer Proc::Daemon Unix::Syslog File::Tail; do perl -MCPAN -e "install $m"; done
Once the modules are installed, download the script and double check that it will run without error, then copy it to your path and make it executable. You should see no errors about missing modules when you run the script under ‘perl -cwT’.
perl -cwT ./logmon.pl
install -m 755 logmon.pl /usr/local/bin/
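If the syntax check does complain, a quick way to narrow down which module is missing is to ask perl to load each one in turn. This is a small sketch; the function name missing_perl_modules is hypothetical, and the module list is the one given above.

```shell
#!/bin/sh
# Hypothetical helper: print every named Perl module that fails to load.
missing_perl_modules() {
    for m in "$@"; do
        # perl exits non-zero if the module cannot be loaded
        perl -M"$m" -e 1 2>/dev/null || echo "missing: $m"
    done
}

missing_perl_modules Mail::Mailer Proc::Daemon Unix::Syslog File::Tail
```

No output means all four modules were found.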
Running Logmon
It’s meant to be both simple and secure. Here is the usage summary:
logmon.pl synopsis: Daemon that periodically checks logfile for a pattern and send alerts
Pattern is always required. If no other options are given, defaults to syslog alerts and monitors /var/log/messages for given pattern.
Usage: logmon.pl -p pattern [-m alerts@example.com] [-f logfile] [-u run as user] [-g run as group] [-i max interval] [-v] [-d] [-h]
-m: Email destination for alerts
-f: logfile to monitor
-p: Pattern to match against lines in logfile (Perl regexp, match is case-insensitive)
-u: Run with permissions of user
-g: Run with permissions of group
-i: Max time to sleep between checks
-d: Debug output to STDOUT, do not daemonize
-v: Verbose logging (use with caution or you may have endless alerts)
-h: This help text
Running the script is pretty straightforward: you specify a pattern to match against with -p (this is the only required parameter), and optionally an email recipient (-m) and logfile to watch (-f). Here is an example. Let’s say you want to get alerts whenever MySQL detects a crashed table. The syslog event for this looks like this on my Ubuntu box:
Jul 17 08:02:49 kaylee mysqld[1532]: 090717 8:02:49 [ERROR] /usr/sbin/mysqld: Table './mysql/user' is marked as crashed and should be repaired
And here is the command line usage:
logmon.pl -p 'mysqld.+?table.+?crashed' -m you@example.com -u nobody -g adm -f /var/log/syslog -i 30
Note that the pattern match is case-insensitive. When run this way, Logmon will detach itself from your terminal and run as a daemon, checking /var/log/syslog every 30 seconds for the supplied pattern. I recommend you use the -u and -g options to force Logmon to drop its privileges; just make sure you specify a user or group that has read permission on the specified logfile. On Debian and Ubuntu servers, all the system logs are readable by the group ‘adm’.
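Before starting the daemon, it can save time to try the pattern against a sample log line. The sketch below does this with a perl one-liner, since the pattern is a Perl regexp matched case-insensitively. The function name test_pattern and the LOGMON_PATTERN variable are hypothetical, and this only approximates how Logmon applies the pattern internally.

```shell
#!/bin/sh
# Hypothetical helper: print the stdin lines that a Perl regexp matches,
# ignoring case, as a rough stand-in for Logmon's own matching.
test_pattern() {
    # Pass the pattern through the environment so it reaches perl unaltered
    LOGMON_PATTERN=$1 perl -ne 'print if /$ENV{LOGMON_PATTERN}/i'
}
```

For example, grep mysqld /var/log/syslog | test_pattern 'mysqld.+?table.+?crashed' prints only the lines that should trigger an alert; if nothing is printed for a line you expected to match, the pattern needs work.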
Other Options and Tips for Testing Logmon
Logmon will dump alerts to your default system log if you leave off the -m option. If you also use the -v option for verbose logging, these syslog entries and alerts will include pattern and match data. Be careful: if you end up monitoring the same file you are dumping alerts into, you’ll get an endless series of alerts continuously being added to the system log. For this reason, alerts sent to syslog are very generic by default (email alerts are always verbose).
To test Logmon, use the -v and -d options together. Run this way, Logmon will not daemonize itself, and will just print alerts and activity to your console (STDERR).
Errors
Any errors that cause the script to die while it’s running as a daemon can be found in your system log.
Startup and Shutdown
I haven’t yet written any startup scripts for Logmon, although I plan to. For now, just start it from one of your system’s startup scripts, and if you have to stop it you can just use pkill logmon. Please send any bugs or suggestions to the email address in the script header, or leave a comment here.
posted at: 02:17 | tags: Linux Sysadmin Logs Tips Code
Thu, 16 Jul 2009
Process Monitoring on Linux Servers
I’m updating some of the older articles on this blog, making sure the links work, updating the referenced software with newer versions and generally re-testing everything to make sure it still works on the latest crop of distros. Since I’m on the topic of processes, I updated Monitoring Unix System Processes with Psmon. Psmon is a very useful tool for monitoring running processes - every sysadmin should have it in their toolbox. I encourage you to take a look.
posted at: 23:10 | tags: Linux Sysadmin Processes Monitoring Tips