Saturday, October 28, 2006

Simple Web File Upload Form

I put the following form and post-processing files together from a manual on PHP. The HTML could really be embedded in the PHP file, but keeping them separate might be better if you only want one copy of the PHP code on the system. Modify both to fit your environment and make sure the PHP file is executable.




Example .HEADER.html


Example getfile.php


This code is derived from a few examples.
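
If you're running the PHP file as a CGI rather than through mod_php, permissions like the following would do it; the upload directory name and the apache user are assumptions for illustration, not part of the original setup:

chmod 755 getfile.php
mkdir -p /var/www/html/uploads
chown apache:apache /var/www/html/uploads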

Web File Search CGI Program for Linux

So you can install Fedora and choose the Web Server package. If you are running any kind of file server, you will probably want to be able to find files over the web. This script lets users search the mlocate database, which is usually rebuilt by your Linux installation every night. It's a fast database that avoids having to search the filesystem manually. I had to write it myself, so I'm sharing the source code in the hope that it will help someone.

It has some thought put into security. The results are displayed as links to facilitate viewing/downloading the located file. It should be used to search one part of the file system. In my case, the public area is /srv/. This part of the filesystem should be linked into the Apache document root via
ln -s /srv /var/www/html/srv
to allow the users access to the files with your HTTP server.
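
At its core, the script just runs a locate query and filters the results down to the public area. A rough shell sketch of that query (not the actual Perl-CGI code, which is linked below) would be:

#!/bin/sh
# Search the mlocate database for the user's pattern, case-insensitively,
# and keep only hits under the public area so nothing outside /srv/ is shown.
pattern="$1"
locate -i "$pattern" | grep '^/srv/'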

The Perl-CGI code should be named locate.cgi and go in your cgi-bin directory. Don't forget to chmod 755 the file so it's executable. Because of technical limitations of Blogger, I can't paste the code here, so please follow this link:

http://www.cs.umbc.edu/~rhelins1/getfile.php?id=26

Thursday, October 26, 2006

Using the Same Thunderbird Storage Folder on Windows and Linux

Today I finally solved the problem of having multiple e-mail clients and multiple storage folders. First, I used an IMAP account I have access to in order to move all the old e-mail I had on multiple computers into the Thunderbird "Local Folders" on my Linux server. This is as easy as copying all the messages from one computer's Inbox to the Inbox of the IMAP account, and then, on the other computer, moving the messages out of the IMAP Inbox into the local Inbox. It's not the fastest method, but it's the most compatible, since Thunderbird doesn't offer much support for importing from other e-mail storage formats.

So once I had everything on the "Local Folders" account in Thunderbird on Linux, I changed the Local Message Storage directory on Windows (not on the Server) to the storage directory in my home folder on my Linux server via a drive-mapped Samba share:
Y:\.thunderbird\ayrxordj.default\Mail\Local Folders
This can be found by going to Tools->Account Settings->Local Folders and clicking Browse. You'll have to have "Show Hidden Files" turned on to locate your Local Folders path with the GUI. You also need to make sure that all the e-mail accounts you have are set up to store their mail in "Local Folders" rather than in an individual folder for each account.
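
A second Linux machine can share the same folder in much the same way by mounting the Samba share and pointing Local Folders at it. A rough sketch (the server name, share name, and user are made up):

sudo mkdir -p /mnt/serverhome
sudo mount -t cifs //server/ryan /mnt/serverhome -o username=ryan
# then point Local Folders at /mnt/serverhome/.thunderbird/ayrxordj.default/Mail/Local Folders
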
After some testing with networking disabled on the Windows computer: if the client doesn't have access to the network drive, you just get nothing under "Local Folders". In other words, it doesn't crash or anything.

So what does this mean? If I download a message on my Windows Thunderbird client, it is exactly as if I had downloaded it on the Linux Thunderbird client, and vice versa. It also means that I can now have "Delete message from POP after download" set to "Yes" on both clients and avoid downloading e-mail that I've already seen.

Issues I can imagine running into include having both clients open simultaneously and possible operating-system differences in Thunderbird's implementation. Also, it took a long time to open the gigantic storage folder the first time over the network, but cleaning up the 2-3 thousand messages and "compacting" the folders should help. The other issue is that, since I merged the two clients, there were several duplicate e-mails; however, there is an extension to find and delete duplicate messages which has already solved this problem for me. Removing all the duplicates (leaving me with only roughly two thirds of the messages, because the merge was so large) and compacting the folders did solve the slow loading problem.

Now I can load up Thunderbird on Windows or Linux using the same "Local Folders" storage directory. Finally, no more getting on different computers/clients to find different mail.


There are still issues with using this technique. If you have a laptop or are losing connectivity, Windows will let you know that a "Delayed Write" failed. Also, a solution like IMAP storage would be much more network-friendly in terms of bandwidth. See my post on setting up Cyrus IMAPD.

Tuesday, October 24, 2006

Shell Script for Tunneling VNC over SSH

If you use VNC over the Internet rather than just over your LAN, it is not recommended to forward the 590x VNC port itself to the Internet. Instead, if you already have port 22 for SSH forwarded to the machine you want to reach with VNC, you're all set to connect securely via an SSH tunnel. Note that you still have to have VNC set up and working on the LAN and know which screen number you are using (if you've used the GUI to allow your desktop to be connected to remotely, then you are using screen 1). This can also be accomplished using PuTTY with its GUI.

Replace {user} with your username on the remote machine, and {WAN Address} with the public IP address of the remote machine. Replace {screen} with the (single-digit) number of the screen you use (1, 2, 3, ...). The ssh command and its arguments form a single command, continued here with a backslash. The ssh invocation with the sleep argument needs to be followed by a command that uses the forwarded port, or the tunnel will close immediately.

#!/bin/sh

ssh -f -L 2590{screen}:127.0.0.1:590{screen} \
{user}@{WAN Address} sleep 10;
vncviewer 127.0.0.1::2590{screen};

exit


So an example script would look like:

#!/bin/sh

ssh -f -L 25901:127.0.0.1:5901 user@mysubnet.domain.com sleep 10;
vncviewer 127.0.0.1::25901;

exit
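
The PuTTY suite's command-line tool, plink, understands the same -L option, so the tunnel half of the script can be reproduced on Windows roughly like this (assuming plink.exe is on the PATH; the host name is the same placeholder as above):

plink -N -L 25901:127.0.0.1:5901 user@mysubnet.domain.com

The -N flag just keeps the tunnel open without starting a remote shell; run your VNC viewer against 127.0.0.1::25901 in another window.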

Shell Scripts for Making Links Unique or Sole Copies

Here are scripts to convert soft links to unique copies, or to sole copies (the latter moves the target of the link to the location of the link).

#!/bin/sh

# Check that link exists
ls "$1" > /dev/null
if test $? -eq 0
then
    echo -n "File found. "
else
    echo "Error: File not found."
    exit 1
fi

linktarget=`find "$1" -printf "%l\0"`
linktargettype=`stat --format=%F "$linktarget"`

# Check that link target exists
if test -z "$linktarget"
then
    echo "Error: Null link target, not a valid soft link."
    exit 1
else
    echo -n "Found soft link. "
fi

# Check that link target is NOT a directory
#if test "$linktargettype" = "directory"
#then
#    echo "Link to directory, skipping."
#    exit 0
#else
#    echo "Not a directory."
#fi

# Remove soft link
echo -n "Unlinking $1... "
unlink "$1"
if test $? -eq 0
then
    echo "Done."
else
    echo "Error!"
    exit 1
fi

# Replace soft link with unique copy
echo "Creating copy from $linktarget"
echo -n " to $1... "
cp -a "$linktarget" "$1"
if test $? -eq 0
then
    echo "Done."
else
    echo "Error!"
    exit 1
fi


exit 0
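
To run this over a whole directory tree, the same find/xargs pattern used for soft2hard.sh below works; assuming the script above has been saved as soft2unique.sh (the name is arbitrary) and made executable:

find ./ -type l | tr \\n \\0 | xargs -0 -n 1 ./soft2unique.sh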


Here's the same script, modified to replace the link with the sole copy (it moves the link target to the location of the link):

#!/bin/sh

# Check that link exists
ls "$1" > /dev/null
if test $? -eq 0
then
    echo -n "File found. "
else
    echo "Error: File not found."
    exit 1
fi

linktarget=`find "$1" -printf "%l\0"`
linktargettype=`stat --format=%F "$linktarget"`

# Check that link target exists
if test -z "$linktarget"
then
    echo "Error: Null link target, not a valid soft link."
    exit 1
else
    echo -n "Found soft link. "
fi

# Check that link target is NOT a directory
#if test "$linktargettype" = "directory"
#then
#    echo "Link to directory, skipping."
#    exit 0
#else
#    echo "Not a directory."
#fi

# Remove soft link
echo -n "Unlinking $1... "
unlink "$1"
if test $? -eq 0
then
    echo "Done."
else
    echo "Error!"
    exit 1
fi

# Replace soft link with only copy
echo "Moving $linktarget"
echo -n " to $1... "
mv "$linktarget" "$1"
if test $? -eq 0
then
    echo "Done."
else
    echo "Error!"
    exit 1
fi


exit 0

Convert Soft Links to Hard Links

Here's a Shell script I wrote to convert large numbers of soft links to hard links on Linux.
It now handles soft links on different file systems properly.

#!/bin/sh
# soft2hard.sh by Ryan Helinski
# Replace a soft (symbolic) link with a hard one.
#
# $1 is name of soft link
# Returns 0 on success, 1 otherwise
#
# Example: To replace all the soft links in a particular directory:
# find ./ -type l | tr \\n \\0 | xargs -0 -n 1 soft2hard.sh
#
# finds all files under ./ of type link (l), replaces (tr) the newline
# characters with null characters and then pipes each filename one-by-one
# to soft2hard.sh

# Check that link exists
ls "$1" > /dev/null
if test $? -eq 0
then
    echo -n "File found. "
else
    echo "Error: File not found."
    exit 1
fi

linktarget=`find "$1" -printf "%l\0"`
linktargettype=`stat --format=%F "$linktarget"`

# Check that link target exists
if test -z "$linktarget"
then
    echo "Error: Null link target, not a soft link."
    exit 1
else
    echo -n "Found soft link. "
fi


# Check that link target is NOT a directory
if test "$linktargettype" = "directory"
then
    echo "Link to directory, skipping."
    exit 0
else
    echo "Not a directory."
fi

# Remove soft link
echo -n "Unlinking $1... "
unlink "$1"
if test $? -eq 0
then
    echo "Done."
else
    echo "Error!"
    exit 1
fi

# Replace with hard link
echo "Creating hard link from $1"
echo -n " to $linktarget... "
ln "$linktarget" "$1"
if test $? -eq 0
then
    echo "Done."
else
    echo "Error creating hard link, replacing soft link"
    ln -s "$linktarget" "$1"
    exit 1
fi


exit 0

Tuesday, October 10, 2006

GRUB Read Error

If you've ever gotten the following error:


Loading GRUB... Read error


It looks like the GRUB read error occurs not because I shut the system down abruptly or modified the GRUB configuration incorrectly, but because I'm trying to boot the system up without a keyboard attached. So, for now, I'll just keep the keyboard and mouse plugged in, but I'd like to be able to boot without these eventually. Some settings in the BIOS (like enabling USB keyboard support) might let me get around this error.

Optimizing Multimedia and Backup Storage with Hard or Soft Links

Here are the notes on this subject from my Server Log, with some annotations. Basically, hard links are a quick fix for duplicate files to free up space.

I had md5 files generated for everything that was copied onto the server, so using the shell commands sort, grep, and uniq, I was able to clean up a lot of space taken by files that had been copied over twice as a result of using WinMerge to prepare for deleting the backups on one of the hard drives that went into the RAID.
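
With md5sum-format files (one checksum and filename per line), duplicated checksums can be pulled out with something along these lines (a rough sketch, not the exact pipeline used here):

# sort groups identical checksums together; uniq -w 32 -D prints every line
# whose first 32 characters (the MD5 digest) appear more than once
cat *.md5 | sort | uniq -w 32 -D | less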



I was about to start playing with FSlint, but by chance came across a perl program called dupseek. I was looking for a script which replaced one of two duplicate files with a link.



As far as I can tell, this is an excellent program with a well thought-out algorithm for finding duplicate files on a Unix system (based on personal experience and what it says on their page), but more importantly for me, it has a function for creating Unix soft links in place of the duplicate files. Although it's text-mode, this is the best program I've used for dealing with duplicate files, and text-mode is just fine. This is a real life-saver because I don't want duplicates sitting on the file system, and I'd like to keep some files cross-referenced in multiple directories. Also, removing duplicates in directories which are already backed up will make the copy on the hard disk appear to have fewer files than the copy on the CD.


Hard links would really be better (for me), because soft (symbolic) links in Unix probably raise some compatibility issues when, for instance, trying to put the directory on a CD. Unless interpreted correctly, these links are just files. With hard links, however, the same file system inode is simply referenced by two different directory entries. Since the directories in question won't be changing, this wouldn't raise any issues (deletion, separation, etc.).
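
The difference is easy to see with a couple of throwaway files (names here are just for illustration):

echo "data" > original.txt
ln -s original.txt soft.txt     # soft link: a tiny file that just stores the path
ln original.txt hard.txt        # hard link: a second directory entry for the same inode
ls -li original.txt soft.txt hard.txt

ls -li shows that original.txt and hard.txt share one inode number (with a link count of 2), while soft.txt has its own inode and merely points at the path.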


In fact, gnomebaker currently has a bug where soft links are dereferenced for computing the size of the CD image, but the link file itself is put onto the CD.


I was able to save less than 10 GB by realizing there were duplicate media files as a result of combining directories before the big move to the server RAID, and more than 2 GB by using dupseek.


http://www.pixelbeat.org/fslint/
http://www.beautylabs.net/software/dupseek.html


Notes:

You can use a text file, generated yourself or by filtering the report produced by the "-b report" function of dupseek, which contains the filenames to be removed. You can pipe these names to xargs, which calls rm. This is useful in case files have been copied to more specific directories and many duplicates lie in a general directory (e.g. "downloads" vs. "singles").


cat [name of report file] | grep "/Downloads/" | tr \\n \\0 | xargs -0 rm


This filters the report down to files whose path includes "/Downloads/", replaces the newline characters with null characters, and has xargs pass each name to rm. It removes the duplicate files that sit in the common directory (the one you want to clean out rather than preserve). Note that this must be executed from the same directory the report file was created in, so that the relative filenames match up. For safety, once you have a list containing only the filenames you intend to remove, run it through ls instead of rm first to make sure you're about to remove the right files:


cat [name of report file] | tr \\n \\0 | xargs -0 ls | less


Only after checking over this output, hit q and then run:


cat [name of report file] | tr \\n \\0 | xargs -0 rm


Note: A much safer plan is to use the interactive mode of dupseek, or the FSlint GUI (which takes hard links into account). The interactive mode of dupseek got too repetitive, so in one case I just used it to identify duplicates, but in general the interactive or batch mode is fine (be careful with batch mode unless you're running my version of dupseek).


In the end, dupseek is better for batch jobs (though I feel that only with my hard-link modification), and FSlint is better for compatibility and for running on directories like home folders, where you want to leave some unique files alone. Of course, a compressed file system would be a step better, but who has the (CPU) time for that?