Sun, 23 Aug 2009
Move Over, Grep. Hello, Ack
As someone who has been using grep and its variants like egrep for years, I admit they have been insanely useful. But every once in a while something comes along that improves an idea so much, you can’t ignore it. Such a thing is Ack, the grep replacement.
I do a lot of software development in large codebases, and the ability to find snippets of text is paramount. Tags can be used and integrated with Emacs (or Vim, we’re not all perfect), which is great for function names, but not useful for general text searches. Using grep in a code repository is a pain, and usually means some sort of hack to ignore VC directories like .svn and RCS. Enter ack - similar to grep but with some more thought behind it. It ignores VC meta-data directories by default and is written in pure Perl - so it’s portable and supports the full Perl regexp syntax. Having a pure-Perl version available with no dependencies also means it’s easy to install in shared hosting environments, where you don’t have root access.
Install ack by downloading the standalone version and putting it in your command path, by using CPAN (cpan App::Ack), or by installing a pre-packaged binary (on Debian/Ubuntu systems, the package name is ack-grep). Ack output is very readable, with highlighted matches by default as well as line numbers and file names. Here is an example:
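For comparison, the kind of grep hack mentioned above might look like the following sketch. It assumes GNU grep (for the --exclude-dir option), and the function name grep_code is made up for illustration.

```shell
# Recursive grep that skips version-control directories.
# Assumes GNU grep; grep_code is a hypothetical helper name.
grep_code() {
    pattern="$1"
    dir="${2:-.}"
    grep -rn --exclude-dir=.svn --exclude-dir=RCS -e "$pattern" "$dir"
}

# Usage:
#   grep_code 'RLIMIT_AS' emacs-22.3
```

Every new kind of meta-data directory needs another --exclude-dir, which is exactly the bookkeeping ack does for you.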
dmaxwell@kaylee:~/tmp$ ack-grep -ai 'limit_as.+?\&rlimit' emacs-22.3
emacs-22.3/src/vm-limit.c
76: getrlimit (RLIMIT_AS, &rlimit);
The -ai means ‘search all, case insensitively’: it tells Ack to search all filetypes (but still not including common VCS directories or files) while ignoring case. Ack searches are recursive by default, so there is no need for a -r switch. You can see we used Perl’s non-greedy match quantifier in the search regexp, something egrep can’t do. This speeds the search up considerably.
There is much more to ack, so read the docs and give it a try. I hope you’ll find it as useful as I have.
posted at: 19:20 | path: / | permanent link to this entry | 0 comments | tags: Perl Grep Ack Tips
Sun, 16 Aug 2009
Five Minutes to an Even More Secure SSH
One of the most popular articles on this blog was Five-Minutes to a More Secure SSH. My impetus for writing it was seeing too many clients’ servers left in a default state where they are vulnerable to brute-force attacks. In it, I basically advocate three things:
- Disabling password authentication
- Disabling root login
- Enabling key-based authentication
Three years later, those recommendations still hold true and I would encourage you to follow them. However, OpenSSH has many features and there is more you can do to secure your SSH servers, without resorting to external software.
Important Notes: The main OpenSSH server configuration file is called sshd_config and will typically be in the /etc/ssh or /etc/sshd directories. Like all of the configuration files used by OpenSSH, it is in plain text and so can be edited with any text editor. After editing your sshd_config file, you will need to reload your SSH server’s configuration - restarting the SSH daemon is not necessary. The command typically looks like this (this is on Debian or Ubuntu):
/etc/init.d/ssh reload
or (on Red Hat/Fedora):
service sshd reload
Also be careful not to lock yourself out of your SSH server when experimenting with these access controls. It’s a good idea to always have two SSH sessions into the server, and to always make a backup of the relevant configuration files. If you log out of one session and get denied access, you still have one active session to fix things.
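As a minimal sketch of that backup habit, the helper below copies a configuration file to a timestamped backup before you edit it. The name backup_config is made up for illustration. The usage comments also show sshd -t, which only checks the configuration for validity, so the reload runs only if the check passes.

```shell
# Copy a config file to a timestamped backup and print the backup's path.
# backup_config is a hypothetical helper name, not part of OpenSSH.
backup_config() {
    cfg="$1"
    backup="${cfg}.$(date +%Y%m%d%H%M%S).bak"
    cp -p "$cfg" "$backup" && printf '%s\n' "$backup"
}

# Typical use, as root, on Debian or Ubuntu:
#   backup_config /etc/ssh/sshd_config
#   ... edit /etc/ssh/sshd_config ...
#   sshd -t && /etc/init.d/ssh reload   # reload only if the syntax check passes
```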
Restricting Users and Hosts
OpenSSH allows you to restrict users and groups by host or IP address. There are four different directives you can use in your sshd_config file (they are evaluated in this order):
DenyUsers
AllowUsers
DenyGroups
AllowGroups
The format for all of them will be the same - a space-separated list of users or group names, with optional host names. Here is an example:
AllowUsers vader@10.0.0.1 maul@sproing.evillittleman.net sidious tyranus@*.evillittleman.net
AllowGroups wheel staff
This tells sshd to allow connections from the user vader, but only from the IP address 10.0.0.1. The user maul is also allowed, but only from the host sproing.evillittleman.net. User sidious is allowed from anywhere, and the user tyranus is also allowed, from any host in the evillittleman.net domain (the asterisk matches zero or more characters).
The AllowGroups line allows login only from users whose primary group name or supplementary group list match one of ‘wheel’ or ‘staff’.
Keep in mind that using AllowUsers or AllowGroups means that anyone not matching one of the supplied patterns will be denied access by default. Also, in order for sshd to allow access based on full or partial hostnames, it needs to do a DNS lookup on the incoming IP address. That means the connecting IP address must have a PTR (reverse) entry that maps back to a real hostname. These aren’t hard to get if you have a static IP address; usually your ISP or server hosting provider can do this for you on request. If your server is internal, you probably have your own DNS server and can add appropriate PTR entries yourself.
In addition to the asterisk in hostname or group patterns, you can use a question-mark to mean exactly one character, and an exclamation point to negate the sense of a match:
* - Matches zero or more characters
? - Matches exactly one character
! - Negates the host pattern match
Note: In my tests, using ! to negate the sense of the hostname match did not work with the AllowUsers directive. It only seems to work when used with authorized_keys file restrictions (see below).
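To get a feel for how * and ? behave, here is a small sketch that uses the shell's own glob matching as a stand-in for sshd's pattern matcher. For plain hostnames and IP addresses the two wildcards work the same way. The function name host_matches is made up for illustration, and the ! negation is deliberately not handled.

```shell
# Return 0 if host matches pattern, 1 otherwise.
# Shell globbing stands in for sshd's own matcher here; this is an
# illustration of * and ? only, and does not handle ! negation.
host_matches() {
    pattern="$1"
    host="$2"
    case "$host" in
        $pattern) return 0 ;;
        *)        return 1 ;;
    esac
}

host_matches '*.evillittleman.net' sproing.evillittleman.net && echo "match"
host_matches '10.0.0.?' 10.0.0.15 || echo "no match"
```

The second call prints "no match" because ? stands for exactly one character, so 10.0.0.? covers 10.0.0.0 through 10.0.0.9 only.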
Restricting Access and Commands
SSH has the concept of authorized keys. If you are using key-based auth, like I suggested in my first article, the user accounts on the SSH server will have an authorized_keys file (which is by default in the ~/.ssh directory of whatever user account you are logging into). This file lists the public keys, one per line, that are authorized for access to that account. Apart from just specifying which public keys are allowed access, there are some more options that you can use to further restrict SSH sessions. Here are the most useful ones:
from="hostname1,hostname2,…" - Restricts access from the specified IP or hostname patterns
command="command" - Runs the specified command after authentication
no-pty - Does not allocate a pty (does not allow interactive login)
no-port-forwarding - Does not allow port forwarding
Here is an example showing part of an authorized_keys file:
from="deathstar.example.com,!jedi.example.com,10.0.0.?" ssh-rsa AAAAB5...2BQ== vader@evillittleman.net
from="pitofdespair.example.com",command="ls",no-pty,no-port-forwarding ssh-dss AAAAZ7...22Q== droidQBX12@evillittleman.net
The first line allows login with the specified RSA key from deathstar.example.com, from any host with IP address in 10.0.0.[0-9], but not from the host jedi.example.com. The second line merely runs the ‘ls’ command whenever the specified DSA key is used - it does not allow any other commands to be run, does not allow interactive login, and does not allow port-forwarding. It also restricts the source of that key to the host pitofdespair.example.com.
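Option strings like these are easy to mistype, so here is a small sketch that assembles a restricted entry from its parts. The function name restricted_entry and the key file name in the usage comment are made up for illustration. It also assumes the forced command contains no double quotes, which would need escaping.

```shell
# Print an authorized_keys line that limits a key to one source pattern
# and one forced command. restricted_entry is a hypothetical helper name.
# Assumes the command contains no double quotes.
restricted_entry() {
    from_hosts="$1"
    forced_cmd="$2"
    pubkey="$3"
    printf 'from="%s",command="%s",no-pty,no-port-forwarding %s\n' \
        "$from_hosts" "$forced_cmd" "$pubkey"
}

# Usage (droid_key.pub is a placeholder file name):
#   restricted_entry 'pitofdespair.example.com' 'ls' "$(cat droid_key.pub)" >> ~/.ssh/authorized_keys
```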
Running sshd on a Non-Standard Port
Admittedly this is an attempt at ‘security through obscurity’, but that doesn’t mean it’s not useful when combined with other security measures. You may not be able to restrict access by hostname or IP, for example - you may always be sourcing your connections from a dynamic IP address, or you may not be able to get a proper PTR record created. It’s also very easy to do. In your sshd_config file, just change Port=22 to Port=nnnnn (where nnnnn is some high port), then reload the sshd configuration.
How do we pick a port number? Some are better than others. First, assume that most port scans are being done with Nmap, and take a look at the nmap-services file. This is a list of ports that Nmap will use by default if you don’t specify a port range on the command line. It’s probably a fair bet that most script-kiddies are using nmap in this manner. Just pick a high port not on this list, and most nmap scans won’t notice it. You can also use multiple Port= directives, meaning you can have sshd listen on multiple ports.
Connecting to an alternate port is also very easy. Use the following options, depending on the command used:
ssh -p 65502 vader@deathstar.example.com
sftp -oPort=65502 vader@deathstar.example.com
scp -P 65502 deathstar_plans.doc vader@deathstar.example.com:
You can also edit your ~/.ssh/config file, and add the Port= directive to one of your host blocks:
...
Host evil
Hostname deathstar.example.com
User vader
Port 65502
...
Then running the command ssh evil will connect with the specified user and port.
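Going back to the question of picking a port, checking a candidate against the nmap-services file can be sketched as follows. This assumes each entry in the file has a second field of the form 22/tcp. The function name port_listed is made up, and the file path in the usage comment is only a common location, not a guaranteed one.

```shell
# Return 0 if the given TCP port appears in an nmap-services style file.
# Assumes each entry's second field looks like "22/tcp".
port_listed() {
    port="$1"
    services="$2"
    awk -v want="$port/tcp" '$2 == want { found = 1 } END { exit !found }' "$services"
}

# Usage (the file location varies between systems):
#   port_listed 65502 /usr/share/nmap/nmap-services || echo "65502 is not listed"
```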
Hashing Known Hosts Files
When you connect to an SSH server, the ssh client stores the server’s hostname, IP address and host key in a file named known_hosts. It will by default be in your ~/.ssh directory. Having the IP addresses of the servers you connect to regularly in plaintext can be a security risk if you are on a shared host, or your client gets compromised (stolen laptop, for example). An easy way to avoid this problem is to obscure the information in the known_hosts file by hashing it. Hashing your known_hosts file is easy: you just use the ssh-keygen command, giving it the file path.
ssh-keygen -H -f ~/.ssh/known_hosts
While this hashes all existing host keys, any host keys that get added to your known_hosts file after you hash it do not get hashed by default. To make it the default, add the directive HashKnownHosts yes to your ~/.ssh/config file. Here is an example of hashing a known_hosts file. First, here is what the file looks like beforehand:
dmaxwell@kaylee:~/.ssh$ head known_hosts
10.100.6.151 ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAIEAuVgRdptT3xsQoGkiNnJb4Zb02p07MaZX02MFs5JhoqmvV5X5Z/LEQH0S7ngSn3b8kQUnocGulJgLchwfThrd/1OkdyOKdpgXxH/rmDXfwh/MZBNBxnMWBa1HpXSc1gxyDfSSxo+VPa1NCP+ob0dWx4sI+JFJ5cVzbQng4rKp3x8=
10.100.6.162 ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAIEAxpQuMJR4Dq/MmrpUryYlNbP+BIWgJlr0LAfaHTIU64Ho6F58Bb1QzlUeeHQSI9f6qFW9aPsBC3Gd5wgQBUj3byinXXHC/10c3vmb2aEujmyL6en2Pef4AN8bKgaRtJq2G/H4MkPWBzxqZPb/k9c3a26P/DjG4y01TMw9vCld+As=
...
Here we run the ssh-keygen command:
dmaxwell@kaylee:~/.ssh$ ssh-keygen -H -f ~/.ssh/known_hosts
/home/dmaxwell/.ssh/known_hosts updated.
Original contents retained as /home/dmaxwell/.ssh/known_hosts.old
WARNING: /home/dmaxwell/.ssh/known_hosts.old contains unhashed entries
Delete this file to ensure privacy of hostnames
And here is what the file looks like afterward (Note that we deleted the backup file when we were done):
dmaxwell@kaylee:~/.ssh$ head known_hosts
|1|PdThGCuhg23t9bcURxyitJTmfKk=|/z+Xvh4xPuDni8PTB5iK7KKnGdA= ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAIEAuVgRdptT3xsQoGkiNnJb4Zb02p07MaZX02MFs5JhoqmvV5X5Z/LEQH0S7ngSn3b8kQUnocGulJgLchwfThrd/1OkdyOKdpgXxH/rmDXfwh/MZBNBxnMWBa1HpXSc1gxyDfSSxo+VPa1NCP+ob0dWx4sI+JFJ5cVzbQng4rKp3x8=
|1|vkLZ22nl30gyJ3gIX74FUF7b3eg=|uy5oSZ8avgZQZE+dwMd/mXGoA38= ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAIEAxpQuMJR4Dq/MmrpUryYlNbP+BIWgJlr0LAfaHTIU64Ho6F58Bb1QzlUeeHQSI9f6qFW9aPsBC3Gd5wgQBUj3byinXXHC/10c3vmb2aEujmyL6en2Pef4AN8bKgaRtJq2G/H4MkPWBzxqZPb/k9c3a26P/DjG4y01TMw9vCld+As=
...
dmaxwell@kaylee:~/.ssh$ rm known_hosts.old
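To check whether new plain-text entries have crept in later, something like the following sketch can count them. It relies on hashed entries starting with |1|, as in the output above. The function name count_unhashed is made up for illustration.

```shell
# Count known_hosts entries whose host field is still in plain text.
# Hashed entries start with "|1|"; blank lines and comments are skipped.
count_unhashed() {
    awk '!/^\|1\|/ && !/^#/ && NF { n++ } END { print n + 0 }' "$1"
}

# Usage:
#   count_unhashed ~/.ssh/known_hosts
```

A result of 0 means every entry in the file is hashed.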
Donate!
OpenSSH is an amazing tool, one most system and network admins couldn’t live without. I encourage you to donate to the OpenSSH project.
More Information
- sshd man page
- sshd_config man page
- ssh man page
- ssh_config man page
- SSH, The Secure Shell: The Definitive Guide
posted at: 08:42 | path: / | permanent link to this entry | 0 comments | tags: Sysadmin SSH Security Networking Tips
Mon, 10 Aug 2009
The Forgotten Power of Unix Text Utilities
I’m the first to extol the virtues of scripting languages like Python and Perl in particular. But they aren’t always the best tool for the job. It’s often forgotten how powerful the original Unix (and now GNU) text processing utilities are. Recently on linuxquestions.org, someone was asking how to combine specific columns from multiple CSV files into a new CSV file. They had the start of a Perl solution that was not working correctly, and wanted advice on it. My advice was to go with a one-line shell solution which is simply this:
paste -d, <(cut -d, -f3 file1.csv) <(cut -d, -f3 file2.csv) > output.csv
This will combine the third column from each specified file into a new file. It relies on a feature of the more modern Bourne shells, process substitution - the two parts that look like <(…). Here it is in action:
dmaxwell@kaylee:~$ cat foo1.txt
a1,a2,a3
b1,b2,b3
dmaxwell@kaylee:~$ cat foo2.txt
A1,A2,A3
B1,B2,B3
dmaxwell@kaylee:~$ paste -d, <(cut -d, -f3 foo1.txt) <(cut -d, -f3 foo2.txt)
a3,A3
b3,B3
You can paste columns from as many files as you need here. One catch, of course, is that this only works with simple CSV data - meaning there are no embedded commas in the data fields themselves. But this is much more understandable than any lengthy scripting language solution.
One other tip, if you had to get rid of the first row, which might contain column header data, just pipe the output through tail:
paste -d, <(cut -d, -f8 file1.csv) <(cut -d, -f8 file2.csv) | tail -n +2 > output.csv
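If your shell lacks process substitution, the same idea can be sketched with temporary files. This is one possible variant rather than the solution posted on the forum. The function name combine_column is made up, and it assumes mktemp is available.

```shell
# Paste column number $1 from each CSV file named after it.
# Uses temporary files instead of process substitution.
combine_column() {
    col="$1"
    shift
    tmpdir=$(mktemp -d) || return 1
    i=0
    for f in "$@"; do
        i=$((i + 1))
        # Zero-padded names keep the glob below in argument order.
        cut -d, -f"$col" "$f" > "$tmpdir/$(printf '%04d' "$i")"
    done
    paste -d, "$tmpdir"/*
    rm -rf "$tmpdir"
}

# Usage:
#   combine_column 3 file1.csv file2.csv > output.csv
```

It takes any number of files, with the same caveat about embedded commas as the one-liner.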
posted at: 15:33 | path: / | permanent link to this entry | 0 comments | tags: Unix Linux Sysadmin Tips Textutils
Tue, 04 Aug 2009
Colophon
A note on the new blog format. I've opted for simplicity and changed from WordPress to PyBlosxom. WordPress is great, but I found my normal workflow (pretty much all shell and Emacs) conflicted with the web interface. This took a bit more effort to set up, but the ease of use is worth it. You might be interested in the plugins I'm using:
- tags
- pyarchives
- pycalendar
- conditionalhttp
- readmore
I also wrote a couple of shell scripts, one to publish the posts from a staging directory, and another to save or restore the post mtimes, so I can edit old entries and still preserve the original blog order. I converted the old site from its dynamic form to a static one, using wget and Perl's wonderful regular expression engine. That served two purposes: the pages are now served much more quickly, and the original URLs are preserved for the search engines.
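For the curious, saving and restoring mtimes can be sketched roughly as below. This is not the actual script used for this blog. It assumes GNU stat and GNU touch, and the function and file names are made up for illustration.

```shell
# Save and restore file modification times.
# Assumes GNU stat and GNU touch; not the scripts used for this blog.
save_mtimes() {
    # One line per file: epoch seconds, a space, then the path.
    stat -c '%Y %n' "$@"
}

restore_mtimes() {
    # Read the list written by save_mtimes from standard input.
    while read -r epoch path; do
        touch -d "@$epoch" "$path"
    done
}

# Usage (entries/ and mtimes.txt are placeholder names):
#   save_mtimes entries/*.txt > mtimes.txt
#   ... edit old entries ...
#   restore_mtimes < mtimes.txt
```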
posted at: 11:39 | path: / | permanent link to this entry | 0 comments | tags: Meta Blog Pyblosxom Plugins
