Recently I’ve been looking at a lot of log files, and sorting through them can be a bit of a pain. However, there are some tools out there that make the process easier. The first of these is sort. If I have a log with 1,000 lines, I like to start by seeing which lines are repeated numerous times, so I can tell when servers are throwing a lot of errors, but combing through every line gets tedious. sort helps reduce the volume and organize the lines in a manner that makes sense. For example, to sort the log and remove duplicate lines we could use the following:
sort -u /Library/FileSystems/Xsan/data/MYVOLUME/logs/cvlog
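The uniq command can also be used to remove duplicates. It only compares each line with the one right before it, so it collapses consecutive repeats and leaves the order of the log alone. For example, if we’re looking to cat the log without changing the order and then remove consecutive duplicate entries: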
cat /Library/FileSystems/Xsan/data/MYVOLUME/logs/cvlog | uniq
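Since uniq only compares neighboring lines, a repeat that shows up later in the log survives unless the input is sorted first. Here’s a minimal sketch of the difference. The sample log lines are made up for illustration and are not real cvlog output.

```shell
# Hypothetical sample log, written to a temp file (not real cvlog output).
log=$(mktemp "${TMPDIR:-/tmp}/sample-log.XXXXXX")
printf '%s\n' \
  'Error: disk full' \
  'Error: disk full' \
  'Info: mounted' \
  'Error: disk full' > "$log"

# uniq alone collapses only the adjacent pair, so the last repeat survives.
adjacent_only=$(uniq "$log")

# Sorting first puts identical lines next to each other, so uniq removes them all.
all_removed=$(sort "$log" | uniq)

printf 'uniq alone:\n%s\n\nsort | uniq:\n%s\n' "$adjacent_only" "$all_removed"
rm -f "$log"
```

The first result keeps three lines in their original order, while the second keeps two lines but loses the order.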
If we want to get a little more granular, we can also constrain the output to lines containing a word using grep. For example, if we only want to see the lines that have the word Error in them and we want to collapse consecutive duplicate entries:
cat /Library/FileSystems/Xsan/data/MYVOLUME/logs/cvlog | grep Error | uniq
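A variation on the same idea, sketched under some assumptions: grep can read the file directly, so the cat is optional, and piping the matches through sort -u removes every duplicate rather than just the consecutive ones. The sample lines below are hypothetical stand-ins for a real log.

```shell
# Hypothetical sample log, written to a temp file (not real cvlog output).
log=$(mktemp "${TMPDIR:-/tmp}/sample-log.XXXXXX")
printf '%s\n' \
  'Error: disk full' \
  'Info: mounted' \
  'Error: disk full' \
  'Error: lost connection' \
  'Info: unmounted' > "$log"

# Keep only lines containing "Error", then sort and drop all duplicates.
distinct_errors=$(grep Error "$log" | sort -u)

# grep -c counts the matching lines before any de-duplication.
error_lines=$(grep -c Error "$log")

printf 'Distinct errors:\n%s\n\nTotal error lines: %s\n' "$distinct_errors" "$error_lines"
rm -f "$log"
```

Comparing the total with the number of distinct errors gives a quick feel for how noisy the log is.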
There are also times when we only want to see lines that are not repeated. To leverage sort and only look at lines that appear once:
sort /Library/FileSystems/Xsan/data/MYVOLUME/logs/cvlog | uniq -u
The volume of a given error can be indicative of an issue. To see each distinct line prefixed with the number of times it appears:
sort /Library/FileSystems/Xsan/data/MYVOLUME/logs/cvlog | uniq -c
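Once the lines carry a count, a second sort can rank them so the noisiest entries float to the top. This is a minimal sketch using made-up sample lines. The extra sort -rn step is my addition here, not part of the command above.

```shell
# Hypothetical sample log, written to a temp file (not real cvlog output).
log=$(mktemp "${TMPDIR:-/tmp}/sample-log.XXXXXX")
printf '%s\n' \
  'Error: disk full' \
  'Info: mounted' \
  'Error: disk full' \
  'Error: lost connection' \
  'Info: mounted' \
  'Error: disk full' > "$log"

# Count each distinct line, then sort numerically in reverse so the
# most frequent line comes first.
ranked=$(sort "$log" | uniq -c | sort -rn)

# The single most frequent line, still prefixed with its count.
top=$(printf '%s\n' "$ranked" | head -n 1)

printf '%s\n' "$ranked"
rm -f "$log"
```

The count column is padded with leading spaces, and the amount of padding differs between uniq implementations, so trim it before comparing or parsing the output.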
Now, uniqueness can be totally thrown off by the date and time stamp typically prefixed to each line in a log. The -f and -s options of uniq help get that out of the way: they ignore a given number of leading fields and characters, respectively (older versions of uniq spelled these -n and +n, where n is the number itself). For example, let’s say that we wanted to look at lines that appear only once, but we also wanted to ignore the first 14 characters of each line because they hold nothing but the date and time stamp. Since uniq still only compares neighboring lines, this catches consecutive repeats. Then we could use:
cat /Library/FileSystems/Xsan/data/MYVOLUME/logs/cvlog | uniq -u -s 14
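Skipping the timestamp works best when identical messages sit next to each other, and a plain sort would order the lines by timestamp instead. One way around that, sketched here under assumptions, is to sort on the message portion first. The 14-character "MMDD HH:MM:SS " prefix below is a made-up format chosen to match the example, so a real log will likely need a different field number and character count.

```shell
# Hypothetical sample log with a 14-character timestamp prefix
# ("MMDD HH:MM:SS "); this layout is an assumption, not the real cvlog format.
log=$(mktemp "${TMPDIR:-/tmp}/sample-log.XXXXXX")
printf '%s\n' \
  '0412 10:15:01 Error: disk full' \
  '0412 10:15:02 Error: disk full' \
  '0412 10:15:03 Info: mounted' \
  '0412 10:15:04 Error: disk full' > "$log"

# Sort on everything from the third field onward (the message), so identical
# messages become adjacent no matter when they were logged.
# Then ignore the first 14 characters when comparing lines.

# Messages that appear only once:
once=$(sort -k3 "$log" | uniq -u -s 14)

# Every distinct message with its count; uniq keeps the first line of each group.
counts=$(sort -k3 "$log" | uniq -c -s 14)

printf 'Seen once:\n%s\n\nCounts:\n%s\n' "$once" "$counts"
rm -f "$log"
```

With this sample layout, uniq -f 2 (skip two fields) would give the same grouping as -s 14, which is handy when the stamp is not a fixed width.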
If you haven’t started to leverage sort and uniq in your log perusal, then getting started may seem like it takes longer to figure out the commands than it would to just go through the logs. But keep in mind that the more you do it, the faster you get. And the faster you get zipping through those logs, the more quickly you can restore service, triangulate corruption and, most importantly, go surfin’. Happy huntin’.