Wednesday, January 30, 2008

Shell Script to Tunnel VNC Sessions over SSH (Improved)

Following is a parameterized shell script which you can use with this syntax:
Usage: vnctunnel [ssh session] [screen] [vnc options]
Example: vnctunnel user@server :1

It's "improved" because it's completely parameterized--you can specify the SSH session, screen, and even vnc options.

(This assumes you copy it to your ~/bin directory and make it executable.)

#!/bin/sh
# This is a parameterized script to SSH tunnel a VNC session
# Ryan Helinski

# Local settings
# Change this if you don't like `vncviewer'
VNCVIEWER=vncviewer
# Change this if you use ports near 25900 for something else locally
PORTOFFSET=20000
# Apply extra options to vncviewer
VNCOPTS="--AutoSelect=0 $3"

if [ $# -lt 2 ] ; then
echo "Usage: $0 [ssh session] [screen] [vnc options]";
echo "Example: $0 user@server :1";
exit 1;
fi

SSHPARM=$1;
SCREEN=`echo $2 | cut -d':' -f2`;
SSHPORT=$(($SCREEN + 5900));

echo "Session: $SSHPARM, Screen: $SCREEN, Port: $SSHPORT"

ssh -f -L $(($PORTOFFSET + $SSHPORT)):localhost:$SSHPORT $SSHPARM sleep 10; \
$VNCVIEWER localhost::$(($PORTOFFSET + $SSHPORT)) $VNCOPTS

exit

Using 7-zip for Archival

I've been using 7-zip to archive large collections of files--mainly college work and documents, which have high potential to be compressed. 7-zip has a high compression ratio--higher than bzip2, but also higher than the evil RAR archive everyone seems to like--and it's LGPL and widely available for GNU+Linux, UNIX and Mac via `p7zip'. However, I've been slightly concerned that it doesn't retain POSIX file modes. For files like text, graphics, etc., this doesn't really bother me. However, if I were going to archive something that contains scripts, or a shared directory where user ownership needs to be preserved, I'd need to retain this extra information.

I had seen the trick of piping tar through 7-zip a couple of times while browsing and thought "oh...". When I actually tried to look it up, I found a real lack of examples. After careful reading of the manuals for tar and 7z, the following is the solution:

To encode:
tar cf - data/ | 7za a data.tar.7z -si


To decode:
7za x ~/data.tar.7z -so | tar xf -
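
To spot-check that ownership and permissions actually made it into the archive, you can list the embedded tar without extracting anything (a quick check, assuming the archive sits in the current directory):

7za x data.tar.7z -so | tar tvf -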

Monday, January 28, 2008

Trash Management Scripts

Moving files to trash



Today's post is about some trash management scripts I have written for myself and would like to share. They should run on most GNU+Linux environments.


If you're like me, you prefer to mv file ~/.Trash/ rather than rm file, but a plain mv is both hard to type and dangerous, since it will silently overwrite any file of the same name already sitting in the trash. I created a (very small) script that acts as a command you can use to move files to your Trash bin without overwriting anything.


It uses the 'numbered' backup scheme of mv, so the most recently trashed file keeps its name intact, and the backups have the suffix ".~n~" appended, where a higher n means a more recent backup.



#!/bin/sh
#
# Moves files in the command arguments to the trash bin, while keeping
# backups of any files already in that directory.
#
# Uses 'numbered' backup scheme of 'mv', so the most recently trashed
# file will have its name intact, and the backups will have the suffix
# .~n~ appended where the higher the n, the more recent the backup.
#
# It uses the basename of the file so that no (absolute or relative)
# directory is preserved.
#
TRASH="$HOME/.Trash";

while [ $# -gt 0 ];
do
mv --backup=numbered "$1" "$TRASH/$(basename "$1")";

shift;
done


After you copy this file to your favorite bin directory and make it executable, you can use the following syntax:

trash file1 file2 path/to/file3 path/to/file4

And, even if file1 and file3 have the same name, you'll still be able to find both in your trash bin.
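
For example (hypothetical file names), after running:

trash notes.txt
trash backup/notes.txt

the bin will contain notes.txt (the more recently trashed copy) and notes.txt.~1~ (the first one).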


Rounding up and deleting trash automatically



If your trash is anything like mine, the longer a file sits in there, the less likely it is that I'll ever want it back. For this reason, I wanted to somehow tag when I threw something out so I could tell how old it was, and maybe delete everything after a specific number of days.


The solution I came up with was to create siblings of the .Trash directory, using UNIX time-stamps. In conjunction with a cron task, yesterday's trash will be in a directory at ~/.Trash-XXXXXXXXXX where the X's are the time-stamp for 4:00am today. In this manner, you'll have a bin for each day you throw something out. The script follows.



#!/bin/sh
#
# The first step is to move all files under ~/.Trash, other than
# ., .., and .#bin-n into a new trash bin for yesterday.
#
#

NEWBIN=`date +%s`;
NEWBINPATH="$HOME/.Trash-$NEWBIN";
OLDESTBIN="20"; # days

OLDESTBINTIME=`date -d "now - $OLDESTBIN days" +%s`;
OLDESTBINPATH="$HOME/.Trash-$OLDESTBINTIME";

echo "Moving current trash to $NEWBINPATH";
mkdir $NEWBINPATH;
mv $HOME/.Trash/* $HOME/.Trash/.[!.]* $NEWBINPATH/;

for BIN in $HOME/.Trash-*; do

STAMP=`echo "$BIN" | cut -d'-' -f2`;
# echo $BIN $STAMP $OLDESTBINTIME;
if [ $STAMP -lt $OLDESTBINTIME ] ;
then
echo "Deleting $BIN";
# rm -Rf $BIN;
fi

done


So I copied this file into my ~/bin/ directory and used crontab -e to add the following line to my crontab:

00 4 * * * /home/ryan/bin/trash-roundup.sh


Right now this script will send errors to crond, which should deliver them to your local mail (read using the mail command). Also, deleting old bins is disabled since I haven't had a chance to thoroughly test it.


To actually delete old trash, choose a value for the OLDESTBIN variable in the script; this is the longest time, in days, that a bin will hang around. Then, un-comment the line with rm in the script.
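
Before enabling deletion, it can help to see how old each bin actually is by converting the epoch suffix back into a readable date. A minimal sketch, assuming the bins follow the ~/.Trash-<timestamp> naming used above:

for BIN in $HOME/.Trash-*; do
STAMP=${BIN##*-};
echo "$BIN was rounded up on `date -d @$STAMP`";
done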

Friday, January 25, 2008

Using the Seagate FreeAgent as a (periodic) Mirror Backup

I recently purchased a Seagate FreeAgent 100D USB drive. The drive is quite nice, small, quiet, and has advanced power management. I plugged the device in on a Fedora 7 installation, and it came right up.



The first problem I had was the NTFS partition that came on the device. This is easily fixed with fdisk and mkfs.ext3. Don't forget to use e2label, especially if you have more than one of the same device.
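
For reference, the repartition-and-format steps look roughly like this. This is only a sketch: the /dev/sdb device name and the FreeAgent label are assumptions, so check dmesg for the real device name before touching anything.

# replace the NTFS partition with a Linux partition (interactive)
fdisk /dev/sdb
mkfs.ext3 /dev/sdb1
e2label /dev/sdb1 FreeAgent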



The next thing you want to do is determine the "id" of the device, so that you can uniquely identify it. The problem is that udev sets up devices in order of connection (/dev/sda, /dev/sdb, ...). To remedy this, use the symbolic links that udev creates under /dev/disk/ to identify the disk by one of the following means:




  • /dev/disk/by-id/ - Uses connection, make, model, and serial number

  • /dev/disk/by-uuid/ - Uses the UUID given to the partition when it was created

  • /dev/disk/by-label/ - If you have used e2label, this is more meaningful than the UUID

  • /dev/disk/by-path/ - Based on which physical port the drive is plugged into, so this is the one you do not want to use!
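
To see what udev has generated, just list the links (the by-id name is the one I end up using below; your make, model and serial number will differ):

ls -l /dev/disk/by-id/ /dev/disk/by-label/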



I want to prevent normal users (including myself) from modifying the contents of the backup, so the plan is to keep it unmounted and to keep users from mounting it. Therefore, this is the appropriate entry for me in /etc/fstab:



/dev/disk/by-id/usb-Seagate_FreeAgentDesktop_30DFK39D-0:0-part1 /media/FreeAgent        ext3    defaults        1 2


Note this will require the device to be present and consistent at boot-time. If you don't need this, change the last two numbers in the fstab entry to zero.

Before automating the backup, I had a problem: the drive spins down automatically, so if you run a command like rsync against it, the command may terminate with an I/O error simply because the drive timed out while spinning back up. This happened to me with a plain ls, but it doesn't always happen -- so you should be paranoid about the power mode.

The solution is to use a utility called sdparm, which is made to control SCSI (and SATA) disks. This is available for Fedora in a yum package called simply sdparm. This allows you to send a command to the drive to "start" (spin up) or "stop" (spin down).
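
For example, to spin the drive up or down by hand (the by-id path here is the one from my fstab entry above; substitute your own):

sdparm --command=start /dev/disk/by-id/usb-Seagate_FreeAgentDesktop_30DFK39D-0:0
sdparm --command=stop /dev/disk/by-id/usb-Seagate_FreeAgentDesktop_30DFK39D-0:0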

Then, the final task is to create a backup procedure that is invoked by crond. Rather than writing something to /etc/crontab, I added the following file to /etc/cron.daily/. Make sure it has execute permissions, and call it (as root) a couple times to make sure it's working.

#!/bin/sh
#
# Backup to a USB disk
#

# Env variables
SHELL=/bin/bash
PATH=/sbin:/bin:/usr/sbin:/usr/bin
MAILTO=root
HOME=/

# OK to change these
LOGFILE=/var/log/backup-usb.log
DISKDEVICE=/dev/disk/by-id/usb-Seagate_FreeAgentDesktop_30DFK39D-0\:0
DISKPARTITION=$DISKDEVICE-part1
MOUNTPOINT=/media/FreeAgent

# Leave these alone
DEVICEFILE=`readlink -f $DISKDEVICE`
PARTITIONFILE=`readlink -f $DISKPARTITION`

# Close stdout
exec 1>&-;
# Close stderr
exec 2>&-;

# Log needs to be rotated manually
#mv $LOGFILE $LOGFILE.1

# Direct stdout and stderr to logfile
exec 1>>$LOGFILE;
exec 2>>$LOGFILE;

echo Begin $0, `date`;

# Preliminary work (get backup device online)
if [ -e $DISKDEVICE ] ;
then
echo "Device exists (is connected).";

# /usr/bin/sdparm --all $DISKDEVICE

echo "Sending start (wake up) signal...";
/usr/bin/sdparm --command=start $DISKDEVICE
if [ "$?" -eq "0" ] ;
then
echo "Success";
else
echo "Failed to start device";
exit;
fi

# Get backup partition mounted
if [ `grep $PARTITIONFILE /etc/mtab | wc -l` -le "0" ] ;
then
echo "Device $PARTITIONFILE is not mounted!";
mount $PARTITIONFILE $MOUNTPOINT;
if [ "$?" -eq "0" ] ;
then
echo "Mounted OK.";
else
echo "Failed to mount.";
exit;
fi

else
echo "Partition $PARTITIONFILE is already mounted.";
fi
else
echo "Device doesn't exist, must not be connected!";
exit;
fi

# Backup procedure
#
# For now I'm just using RSYNC because these files don't often change
#
rsync --verbose --itemize-changes \
--archive --hard-links --partial --delete-before \
/opt/srv/ $MOUNTPOINT/srv/ && \
date && \
df -h $MOUNTPOINT && \
echo ;

# NOTE might also want to un-mount the device at this point
# to prevent users from modifying it directly !
echo "Unmounting $PARTITIONFILE"
umount $PARTITIONFILE
echo "Stopping device $DISKDEVICE"
/usr/bin/sdparm --command=stop $DISKDEVICE
# /usr/bin/sdparm --all $DISKDEVICE && \

echo End $0, `date`;


The variables section and the rsync command have to be updated to fit your installation and your backup scheme.
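
To install it, I copied the script into /etc/cron.daily/, made it executable, and ran it by hand as root a couple of times while watching the log. Roughly (backup-usb is just what I happened to name the file):

cp backup-usb /etc/cron.daily/
chmod 755 /etc/cron.daily/backup-usb
/etc/cron.daily/backup-usb
tail /var/log/backup-usb.log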



Again, this worked well for me; you may have to make adjustments.

Friday, January 18, 2008

Shell Script to Reorganize a Mirror to Match a Newer Mirror

This script has a very specific purpose, but you've probably hit a situation where it could have come in handy. The idea is, if there are two mirrors of a bunch of files, and you reorganize (move around and/or rename) files on one mirror, then when you go to rsync the other mirror, it thinks the files that were only moved were added and deleted.

To use this script, you would first move the files on one mirror, and generate an MD5 sum file, as in:
$ find ./ -type f -exec md5sum {} \; | tee checksums.md5

Then, copy this small file to the root of the other mirror, and invoke the script with the name of the checksums file:
$ md5sum-reorg.sh checksums.md5

The script will process every file under the working directory and attempt to correct its path so that it matches a file in the new 'checksums.md5'. Files not found in the checksums file are left alone. Files already in place are left alone (obviously), without even calculating their md5sum.
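
Once it finishes, you can verify the result against the same checksum file and print only the entries that did not check out (a quick check, assuming checksums.md5 is still sitting in the mirror root):

$ md5sum -c checksums.md5 | grep -v ': OK$'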

Script follows:


#!/bin/bash
#
# This file takes a checksum file (output from md5sum utility)
# and attempts to reorganize the files in the directory to
# match the listing in the md5 file.
#
# Files not found in the md5 input are left alone.
# Files already in the right place are left alone.
# Other files have their checksums computed and, if they are found
# in the md5 input, they are moved to the appropriate location.
#
# WARNING: It confuses duplicate files!!!
#

if [ $# -lt 1 ];
then
echo "Usage: $0 [checksums file]";
exit;
fi

declare -a SUMS
declare -a NEWPATHS

exec 10<$1
let count=0

echo "Parsing checksums and paths from input file...";

while read LINE <&10; do
SUM=`echo "$LINE" | cut -d' ' -f1`;
NEWPATH=`echo "$LINE" | cut -d' ' -f1 --complement | cut -d' ' -f1 --complement`;
SUMS[$count]=$SUM;
NEWPATHS[$count]=$NEWPATH;
((count++));
done
# close file
exec 10>&-

echo "Compiling list of files that need to be checked...";

TMPFILE=`mktemp`;
find ./ -type f -printf "%p\n" > $TMPFILE;

exec 10<$TMPFILE
while read OLDFILE <&10; do
echo "Trying to find new path for $OLDFILE";

SKIP=0;
let count=0;
while [ $count -lt ${#NEWPATHS[@]} ] ; do
if [ "${NEWPATHS[$count]}" == "$OLDFILE" ]; then
echo "File already exists at '${NEWPATHS[$count]}'";
SKIP=1;
break;
fi
((count++));
done

if [ $SKIP -eq 1 ]; then
continue; #skip the rest of this iteration
fi


echo "Computing checksum of $OLDFILE";
OLDSUM=`md5sum "$OLDFILE" | cut -d' ' -f1`;

# iterate over the pair of arrays until we might find a matching sum
let count=0;
while [ "$count" -lt ${#SUMS[@]} ]; do
SUM=${SUMS[$count]};
NEWPATH=${NEWPATHS[$count]};

if [ "$SUM" == "$OLDSUM" ];
then
if [ "$OLDFILE" != "$NEWPATH" ] ;
then
NEWPARENT=`dirname "$NEWPATH"`;
if [ ! -d "$NEWPARENT" -a "$NEWPARENT" != "." ];
then
echo "Making directory $NEWPARENT";
mkdir "-p" "$NEWPARENT";
fi
echo "Moving $OLDFILE to $NEWPATH";
mv "$OLDFILE" "$NEWPATH";
else
echo "Path hasn't changed.";
fi
break;
fi
((count++));
done

done

exec 10>&-

exit



In case it's not clear, this is offered without any warranty or guarantee whatsoever.

Friday, January 11, 2008

Scripts for Moving Large Files to DVD (GNU+Linux environment)

Often, we have tons of files (videos, ISOs, software packages, etc.) that we're done with but don't want to throw away. I have written a few scripts to pack a directory of these kinds of files up into volumes (without any kind of compression) so that they can be written to DVD. Each volume gets an MD5 sum file, which serves both as a catalog of that volume (you can keep a copy locally and search it using grep) and as a means to verify its integrity.



The first script attempts to "pack" files at the current directory level (it's non-recursive) into volumes of a specified size. This is really the only part of this post that warrants a real script since it's a nested loop. Note that really small files will always end up in the extra space on the first volume, so you may want to move them first.




#!/bin/bash
#
# Script takes files (or directories) in the current working directory
# and moves them to subdirectories for writing out to discs
#
# This allows collections of relatively same-sized files or directories
# of files to be packed into volumes for storage on optical media.
#
# Modified to output a shell script instead of making any changes
# Disk capacity can now be entered

# Defaults
discSize="0";
discDefaultSize="4380000";
discInitNumDef="0";
discInitNum="0";
scriptPathDef="pack.sh";
diskPath="disks";

echo -n "Enter the volume number at which to start [$discInitNumDef]: ";
read discNumOffset;

if [ "$discNumOffset" == "" ] ;
then
discNumOffset=$discInitNumDef;
fi
echo $discNumOffset;

echo -n "Enter the maximum capacity of the media [$discDefaultSize]: ";
read discMaxSize;

if [ "$discMaxSize" == "" ] ;
then
discMaxSize=$discDefaultSize;
fi
echo $discMaxSize;

echo -n "A shell script will be output, move files now? [y/N]";
read moveFiles;

if [ "$moveFiles" == "" ];
then
moveFiles="N";
echo "Not going to move files.";
fi

echo -n "Enter the path to save the shell script [$scriptPathDef]: ";
read scriptPath;

if [ "$scriptPath" == "" ] ;
then
scriptPath=$scriptPathDef;
fi

echo "Going to write shell script to '$scriptPath'.";

# Declare disk size array
diskSizes[0]=0;
arraySize=1;

echo "#!/bin/sh" > $scriptPath;

if [ ! -d "$diskPath" ];
then
echo "mkdir \"$diskPath\";" >> $scriptPath;
fi

if [ ! -d "$diskPath/`expr $discNum + $discNumOffSet`" ] ;
then
echo "mkdir \"$diskPath/`expr $discInitNum + $discNumOffset`\";" >> $scriptPath;
fi

for file in * ;
do

if [ "$file" != "$diskPath" -a "$file" != "$scriptPath" ] ;
then
echo "$file";

discNum=$discInitNum;

newSize=`du -s "$file" | cut -f1`;
#discSize=`du -s todisk/$discNum/ | cut -f1`;
#discSize=`expr $diskSizes[$discNum] + $newSize`;
discSize=${diskSizes[$discNum]};

echo "newSize = $newSize, discSize = $discSize";

if [ $newSize -gt $discMaxSize ] ;
then
echo "$file is larger than the disc size, skipping it.";
else
while [ `expr $discSize + $newSize` -gt $discMaxSize ]
do
echo "Won't fit in $discNum + $discNumOffset: $discSize + $newSize > $discMaxSize";

discNum=`expr $discNum + 1`;

if [ $discNum -ge $arraySize ] ;
then
#diskSizes[$diskNum]=0;
diskSizes=( ${diskSizes[@]} 0 );
arraySize=`expr $arraySize + 1`;

if [ ! -d "$diskPath/`expr $discNum + $discNumOffset`" ];
then
echo "mkdir \"$diskPath/`expr $discNum + $discNumOffset`\";" >> $scriptPath;
fi


fi

discSize=${diskSizes[$discNum]};
done

echo "Going to move $file into $discNum to make $discSize kb `expr $discSize + $newSize`";

echo "mv \"$file\" \"$diskPath/`expr $discNum + $discNumOffset`/\";" >> $scriptPath;

# Update disc size entry
diskSizes[$discNum]=`expr $discSize + $newSize`;

fi

fi

done

echo "Disk sizes:";

for DISC in ${diskSizes[@]}
do
echo "$DISC kb";
done

exit;


The next, albeit simpler, script creates checksum files (for later use with md5sum -c ...) in each volume and provides the option to save a copy of each checksum file to another location.




#!/bin/sh
#
# Create checksum files for disk volumes generated by 'disk-pack'.
# These files allow the fidelity of the optical media to be
# evaluated, and allow the contents of the disk to be catalogued.
#
# This file should not change any files; only add new files.
#

CATDIRDEF="`pwd`";
echo -n "Path to save a duplicate of the MD5 checksums [$CATDIRDEF]: ";
read CATDIR;

if [ "$CATDIR" == "" ];
then
CATDIR=$CATDIRDEF;
fi

echo "Saving duplicate checksums in '$CATDIR'";

if [ ! -d "$CATDIR" ];
then
echo "Directory doesn't exist.";
exit;
fi

PREFIXDEF="disk";
echo -n "Prefix to use in checksum file names [$PREFIXDEF]: ";
read PREFIX;

if [ "$PREFIX" == "" ];
then
PREFIX=$PREFIXDEF;
fi
echo "Using prefix '$PREFIX'.";

for DISK in [0-9]* ;
do

if [ "$DISK" != "." -a "$DISK" != ".." ]
then
echo "Processing volume $DISK";
cd $DISK;
find . -type f -exec md5sum {} \; | tee ../tempsums.md5;
cd ..;

if [ -e "$CATDIR/$PREFIX$DISK.md5" ];
then
echo "WARNING: Catalog file already exists, using alternate name.";
NUM="0";
while [ -e ""$CATDIR/$PREFIX$DISK-$NUM.md5 ]; do
NUM=`expr $NUM + 1`;
done
cp tempsums.md5 $CATDIR/$PREFIX$DISK-$NUM.md5;
else
cp tempsums.md5 $CATDIR/$PREFIX$DISK.md5;
fi

if [ -e "$DISK/$PREFIX$DISK.md5" ];
then
echo "WARNING: File $DISK/$PREFIX$DISK.md5 already exists, using alternate name.";
NUM="0";
while [ -e $DISK/$PREFIX$DISK-$NUM.md5 ]; do
NUM=`expr $NUM + 1`;
done
mv tempsums.md5 $DISK/$PREFIX$DISK-$NUM.md5;
else
mv tempsums.md5 $DISK/$PREFIX$DISK.md5;
fi
fi
done


Finally, you're ready to put these volumes out to optical media (since you've minimized internal fragmentation, captured a catalog of the files, and taken an extra step to preserve integrity). You can do so using your favorite method, but when there are many volumes (say, more than three) I prefer to take the following steps.



The following command, if you have genisoimage, will create a .iso file for the directory '40', and the image will have the volume name "Volume40" when you mount it.


genisoimage -o volume40.iso -J -r -V Volume40 40/


After you have a .iso file, you're almost ready to burn. Always, always, always mount the ISO image (mount -o loop -t iso9660 volume40.iso isotest/), enter it and check some of the MD5 sums to make sure you have a good .iso file! You'll have to check the man page for genisoimage and make sure you're providing the command-line options correctly if the files in the ISO seem corrupted.
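
A minimal check sequence (run as root) might look like this; the disk40.md5 name assumes the default "disk" prefix from the checksum script above:

mkdir isotest
mount -o loop -t iso9660 volume40.iso isotest/
cd isotest && md5sum -c disk40.md5
cd .. && umount isotest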



If you're familiar with cdrecord, it is now provided by wodim. You need to be root. The command looks like:

wodim -v -eject speed=8 dev='/dev/scd0' volume40.iso


Then, before I delete anything, I always insert the disc, preferably into another optical drive, and run md5sum -c volume40.md5. Now that you know you have an exact copy, you can put it in a case and delete the original. Note I'm assuming that if the disc fidelity decays, the files can be found again on the Internet--make sure you have even more redundancy if these are your personal files!

New Split Directory Script

Follow-up to: "Making directory traversals more efficient"

Back at the beginning, I posted a lengthy script that would split up a congested directory alphabetically. Recently, I needed it again, but needed it to be smarter, so I re-wrote it. Also, I figured out how to insert code into Blogger. Enjoy.




#!/bin/bash
#
# by Ryan Helinski, January 2008
#
# This is the second revision of a script that should be used
# when there are too many files or directories at a single level
# on the file system.
#
# It now recognizes the word, "the", and that the name should
# be alphabetized by the words following.
#
# A script is output so the changes can be reviewed before
# any are made.
#
# The script should add to existing bin directories if they
# already exist.
#
# A further improvement would be to allow the split to be
# multi-level.

BINS=(0-9 a b c d e f g h i j k l m n o p q r s t u v w x y z);
BIN_EXPS=(0-9 Aa Bb Cc Dd Ee Ff Gg Hh Ii Jj Kk Ll Mm Nn Oo Pp Qq Rr Ss Tt Uu Vv Ww Xx Yy Zz);

SCRIPT_FILE=".script.sh";

echo "#!/bin/sh" > $SCRIPT_FILE;

for BIN in ${BINS[*]};
do
if [ -d $BIN ];
then
echo "mv $BIN .$BIN" >> $SCRIPT_FILE;
else
echo "mkdir .$BIN" >> $SCRIPT_FILE;
fi
done

INDEX="0";
while [ "$INDEX" -lt "${#BINS[*]}" ];
do
echo "mv [Tt][Hh][Ee]\ [${BIN_EXPS[$INDEX]}]* .${BINS[$INDEX]}/" >> $SCRIPT_FILE;
INDEX=`expr $INDEX + 1`;
done

INDEX="0";
while [ "$INDEX" -lt "${#BINS[*]}" ];
do
echo "mv [${BIN_EXPS[$INDEX]}]* .${BINS[$INDEX]}/" >> $SCRIPT_FILE;
INDEX=`expr $INDEX + 1`;
done

for BIN in ${BINS[*]};
do
echo "mv .$BIN $BIN" >> $SCRIPT_FILE;
done

ANSWER="";
while [ "$ANSWER" != "yes" -a "$ANSWER" != "no" ];
do
echo "Script written to \"$SCRIPT_FILE\", execute now? (yes, no)";
read ANSWER;
done

if [ "$ANSWER" == "yes" ];
then
sh $SCRIPT_FILE;

ANSWER="";
while [ "$ANSWER" != "yes" -a "$ANSWER" != "no" ];
do
echo "Delete script file? (yes, no)";
read ANSWER;
done

if [ "$ANSWER" == "yes" ];
then
rm "$SCRIPT_FILE";
fi
fi

exit;