5 Ways To Manage Background Jobs In A Shell Environment

When running commands that are going to take a while, I frequently start them with the nohup command, disown them from the current session, or queue them for later execution. The reason is that if I’m running them from a Terminal or SSH session and the session is broken, I want to make sure they complete.
To schedule a job for later execution, use at. For example, if I want to run a simple command, I can schedule it for two minutes from now by piping an echo to at:
echo "goldengirlsfix.sh" | at now + 2 minutes
Note that if you use 1 minute, the unit needs to be singular (now + 1 minute).
But you can also disown the job. To do so, end a command with an & symbol. Running a command or script that will take a while with an ampersand at the end displays the job number for the command, and you can then disown it by running disown with the -h option, which tells the shell not to send the job a hangup signal when the session ends. For example:
du -d 0 &
disown -h
If you choose not to disown the job, you can check running jobs using the jobs command at any time:
jobs
nohup runs a command or script so that it ignores the hangup signal and keeps running after the shell that started it has exited:
nohup cvfsck -nv goldengirls &
The above runs the command between nohup and the & symbol in the background. By default, the output of the command goes to a nohup.out file in the directory you started it from (or in your home directory if that directory isn’t writable). So if your username were krypted and you started the job from your home directory, you could tail the output using the following command:
tail -f /Users/krypted/nohup.out
You can also use screen and then reconnect to that screen. For example, use screen with a -t to create a new, titled screen:
screen -t sanconfigchange
Then run a command:
xsanctl sanConfigChanged
Then later, reconnect to your screen:
screen -x
You can then press control-a followed by n (next) or p (previous) to cycle through running background processes this way, provided each is in its own screen.
Finally, in AIX you can use the bg command. I used to really like this, as I could move an existing job into the background if I’d already invoked it from a screen or session. For example, say you have pid 88909 running, you have suspended it with control-z, and you want to put it into the background. You can just run:
bg 88909
That throws it into the background and frees up the tty (pair it with disown or nohup if the job needs to survive the session closing). If you’d like to look at it later, you can always pop it back using, you guessed it, fg. Passing a PID like this only really worked for me in AIX, whose default ksh accepts a process ID as a job reference. bash and most other shells have bg and fg as well, but they expect a job spec such as %1 from the jobs listing. Either way, it is a great process management tool.
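As a rough sketch of the nohup approach described above, the helper below starts a command so that it ignores hangups, appends its output to a log file of your choosing instead of nohup.out, and prints the PID so you can check on it later. The function name run_detached and the log path in the usage line are made up for illustration.

```shell
#!/bin/sh
# run_detached LOGFILE COMMAND [ARGS...]
# Hypothetical helper: start COMMAND under nohup, in the background,
# with stdout and stderr appended to LOGFILE and stdin detached.
# COMMAND must be an external program, not a shell function or builtin.
# Prints the PID of the background process.
run_detached() {
  log=$1
  shift
  nohup "$@" >>"$log" 2>&1 </dev/null &
  echo $!
}
```

For example, pid=$(run_detached /tmp/du.log du -d 0 /Volumes) would kick off the du, and tail -f /tmp/du.log would follow it. Picking the log file explicitly avoids having to remember which directory nohup.out landed in.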

Don't Defrag the Whole SAN

I see a number of environments that are running routine defragmentation scripts on Xsan volumes. I do not agree with this practice, but given certain edge cases I have watched it happen. When defragmenting a volume, there is no reason to do so to the entire volume, especially if much of the content is static and not changing very often. And if specific files don’t have a lot of extents, then they are easily skipped.
Let’s look at a couple of quick ways to narrow down your defrag using snfsdefrag. The first is by specifying the path. In this case you would specify the -r option and follow that with the starting path from which you want to recursively seek fragmented files. The second is to skip files that have fewer than a given number of extents, using the -m option. To combine these, let’s assume that we are looking to defragment a folder called Seldon on an Xsan volume called Harry, ignoring files with fewer than 25 extents:
snfsdefrag -r -m 25 /Volumes/Harry/Seldon
You should also build logic into scripts if you are automating the events. For example, you could use the -c option to just look at how many extents there are and perform the actual defragmentation as part of an if/then only when there are more than a specified threshold. Another example is to check that there isn’t already an snfsdefrag process running and, if there is, not fire up yet another instance:
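# Look for a running snfsdefrag; awk copes with the padded PID column
currentPID=$(ps -ewo pid,user,command | grep snfsdefrag | grep -v grep | awk '{print $1}')
if [ -n "$currentPID" ]; then
  echo "The current snfsdefrag PID is ${currentPID} so we are aborting the process." > "$logfile"
  exit 1
fi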
If you insist on automating the defragmentation of an Xsan volume, then there are lots of other little sanity checks that you can do as well. Oh, you’re backing up, right?
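The two checks described above (an extent threshold and an already-running guard) could be wired together along these lines. This is only a sketch: the function names, PROC and THRESHOLD are illustrative, and the extent count is passed in as an argument because I am not assuming anything about the layout of the snfsdefrag -c output.

```shell
#!/bin/sh
# Sketch of two sanity checks for an automated defrag script.
# PROC and THRESHOLD are illustrative defaults, not Xsan settings.
PROC=${PROC:-snfsdefrag}
THRESHOLD=${THRESHOLD:-25}

# Succeeds (exit 0) if a process with exactly this name is running.
already_running() {
  pgrep -x "$1" >/dev/null 2>&1
}

# Succeeds if the extent count in $1 is above the threshold in $2.
over_threshold() {
  [ "$1" -gt "$2" ]
}

# should_defrag EXTENTS
# Runs both checks, prints the decision, and returns 0 only when a
# defrag looks warranted.
should_defrag() {
  if already_running "$PROC"; then
    echo "skip: $PROC is already running"
    return 1
  fi
  if ! over_threshold "$1" "$THRESHOLD"; then
    echo "skip: $1 extents is not above $THRESHOLD"
    return 1
  fi
  echo "defrag: $1 extents is above $THRESHOLD"
}
```

A calling script would then only run the real snfsdefrag when should_defrag returns success, and log the printed reason otherwise.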

Isolating iNodes in Xsan cvfsck Output

I’ve noticed a couple of occasions where data corruption in Xsan causes a perceived data loss on a volume. This does not always mean that you have to restore from backup. Given the cvfsck output, you can isolate the iNodes using the following:
grep -F '*Error*' cvfsck.txt | cut -c 27-36 > iNodeList.txt
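The fixed character range passed to cut depends on the error lines all having the same layout. As a hedged alternative, the sketch below pulls anything that looks like a 0x-prefixed hex value off the *Error* lines and removes duplicates. It assumes the iNode is written in that form, as in the 0x20643c8 example, and it will also pick up any other hex values that happen to sit on those lines. The function name extract_inodes is made up.

```shell
#!/bin/sh
# extract_inodes FILE
# Print each distinct 0x-prefixed hex value found on lines that contain
# the literal string *Error*, one per line, sorted.
extract_inodes() {
  grep -F '*Error*' "$1" | grep -Eo '0x[0-9a-fA-F]+' | sort -u
}
```

Running extract_inodes cvfsck.txt > iNodeList.txt would then produce the same kind of list without counting columns.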
Once isolated you can then use the cvfsdb tool to correlate this to file names. For example, if you have an iNode of 0x20643c8 then you can convert this into a file name using the following:
cvfsdb> show inode 0x20643c8
The output will be similar to the following:
000: 0100 8000 3f04 0327 5250 2daa 0000 0000 |….?..’RPL…..
010: 0000 024d 6163 506f 7274 1233 3455 362e |…MyFile-9.6.
020: 302d 2222 2e35 1ca4 656f 7061 7264 2e64 |0-Leopard.d
030: 6d67 0404 084e 5453 4400 0000 0000 0000 |mg…NTSD…….
040: 0000 0000 0000 0000 0000 0000 0000 0000 |…………….
050: 0000 0000 0000 0000 0000 0000 0000 0000 |…………….
The string to the right of the | and between the … characters can then be used to obtain a file name. Note that the name can wrap from one row to the next, as it does here. Using that file name you can then put humpty dumpty back together. If cvfsck has fixed a lot of corruption, then you may have a lot of files to piece back together and would therefore want to automate the task in a script.
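Because a file name can wrap across rows of the dump, it can help to join the right-hand column into a single line before reading it. The sketch below does only that. It assumes each row is on its own line with the text column after the first | character, as in the sample output, and it leaves picking the name out of the padding to you. The function name dump_ascii is made up.

```shell
#!/bin/sh
# dump_ascii
# Read hex dump rows on stdin and print the text column (everything
# after the first |, minus a trailing | if there is one) joined into a
# single line. Rows without a | are ignored.
dump_ascii() {
  grep '|' | sed -e 's/^[^|]*|//' -e 's/|$//' | tr -d '\n'
  echo
}
```

Saving the cvfsdb output to a file and running dump_ascii < thatfile turns the sample rows above into one string in which MyFile-9.6.0-Leopard.dmg reads straight through.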

Xsan: Corruption

Volumes can become corrupt no matter what file system you are talking about (er, there might be a magical file system out there that cannot become corrupted, but I’ve never heard of it and would like to sell a certain bridge to you if you have). Xsan is no different, so you need to be ready to use the command line to combat said corruption. fsck is the traditional *nix tool to fix issues with volume corruption. cvfsck is the weird cousin that’s used for Xsan. If you see any iNode errors in your logs, corruption errors, high latency or just too many weird issues to shake a stick at, then use cvfsck to check for errors. It can be run in a non-destructive mode (it is by default, actually). If errors are found then, if possible, back up the SAN immediately, as cvfsck could cause the volume to get shredded (or, more commonly, cause specific files on the volume to become unusable). Then you can use cvfsck to repair the volume.
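The check-first half of that workflow might look something like the sketch below. It only runs the read-only, verbose check (the -nv options used in the nohup example earlier), saves the output, and counts lines containing the *Error* marker. The function name check_volume, the CVFSCK override and the log path are made up, and the repair itself is deliberately left as a manual step after the backup.

```shell
#!/bin/sh
# CVFSCK can be overridden, for example to point at a stub while testing.
CVFSCK=${CVFSCK:-cvfsck}

# check_volume VOLUME LOGFILE
# Run a read-only (-n), verbose (-v) check, save the output to LOGFILE,
# and report whether any lines containing *Error* were logged.
# Nothing on the volume is modified by this function.
check_volume() {
  if ! command -v "$CVFSCK" >/dev/null 2>&1; then
    echo "$CVFSCK not found"
    return 2
  fi
  "$CVFSCK" -nv "$1" >"$2" 2>&1
  errors=$(grep -cF '*Error*' "$2")
  if [ "$errors" -gt 0 ]; then
    echo "$errors error line(s) in $2: back up before attempting a repair"
    return 1
  fi
  echo "no error lines in $2"
}
```

Something like check_volume goldengirls /tmp/goldengirls-cvfsck.log gives you a saved log to work from and a clear prompt to back up before any repair is attempted.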