Wednesday, April 24, 2013

How to Instantly Find Files on Flash Drives, Network Shares, DVDs, and More

You don’t have to be a computer power user to amass a pile of backup discs, removable drives, USB hard drives, and other non-localized media. Finding a file in that mess, especially when it’s not directly accessible by your computer anymore, is a headache. Read on as we show you how to build a lightning fast file index.

Why Do I Want to Do This?

When every single file you have is stored directly on your computer, it’s easy to find what you need. There are great search tools like Everything from VoidTools that rip through your master file table in a fraction of a second to find exactly what you’re searching for.
Once you start dealing with multiple disks, removable media (such as data DVDs, flash drives, backup files on USB HDDs, etc.) and network shares, however, searching gets progressively more difficult if not impossible. If you’ve come to depend on the lightning-fast local search that tools like Everything provide, it can be extremely frustrating trying to find files that are beyond the reach of such tools.
Today we’re going to show you how to index everything from your network shares to your flash drive to the backup hard drive you pull out once a month. You won’t have to perform any arcane edits to Windows, force Windows to jump through any indexing hoops, or any of the other nonsense many guides make you put up with in order to just get Windows to acknowledge that the file “Taxes 2009 1040.pdf” actually exists somewhere in your constellation of data storage.
Instead you’re going to enjoy dead simple searching and lightning-fast catalog creation, all in a lightweight and portable package you can pluck right off your computer and take with you. How lightweight? The apps take up less than 200k of space, and even indexing every local, network, and detached storage device in our entire office only yielded a collection of file indexes around 30MB in size.

What Do I Need?

For today’s tutorial you’ll need the following things:
  • 1 copy of Cathy.
  • Access to the disks you wish to index.
  • Optional: 1 copy of CathyCmd for automated local index updating.
Cathy is a simple and free tool that the author, Robert Vašíček, originally created back in the 1990s to catalog his collection of MP3 files. He’s done an admirable job tending to the little project over the years and still routinely updates it a few times a year.

Installing and Configuring Cathy


After you’ve downloaded Cathy, extract the single file Cathy.exe to a safe location. We opted to place our installation in /My Documents/My Dropbox/Drive Indexes/ so that 1) our drive indexes would get backed up to Dropbox and 2) we could easily search our file indexes away from our home computer/network.
Once you’ve extracted and placed the executable, go ahead and run it. You’ll be greeted with a bare installation as seen in the screenshot above. No files, no catalogs, nothing yet for us to search.
Note: If you get an error message indicating you need mfc100.dll, that just means you need to grab the Microsoft Visual C++ Redistributable Package to fulfill the program’s dependencies. You can download the 32-bit version here and the 64-bit version here.
Let’s get started by creating our first catalog. What kinds of things should you catalog? Any drive, disk, removable media, network drive, or other data source that you can access from your computer and read the directory structure is fair game. Here are some sources to consider indexing for your search convenience:
  • Local Hard Drives
  • Removable Hard Drives
  • CD/DVD Backups
  • Flash Drives
  • Network Shares
While you can start the project by creating a catalog for any of your file locations, we’re going to start by indexing our network shares–as 99% of the time if we can’t find a file on our local machine we’ll find it on the office server.
Create your first catalog by clicking on the Catalog tab in the main GUI. In the “Root” box, type in the pathname as it is understood by the computer you’re working from (e.g. G:\MyDVDBackup or \\server\MP3s). We’ll start by indexing \\Hive\Software, the location where we back up software installation files. In addition to specifying the location you want indexed, you can also edit the volume label.

This volume label will be seen both in Cathy and as the filename of the specific catalog created by Cathy for this location (every new Root directory you enter into Cathy becomes its own unique catalog). By default it takes the name of the last folder in the directory structure (in the case of our \\Hive\Software example, it makes the volume label [software]). We generally edit the volume label to indicate the source so we’ll change it now to \\Hive\Software\ to remind us the index points at the office server.
In addition to the above changes, you can also add comments in the Comment box (these comments will be displayed beside future search results returned from this source). By default Cathy ignores certain files (such as .tmp files); you can remove this restriction or add to it if you wish. Once you’ve checked over the settings for your first catalog entry, press the “Add” button.

The new catalog entry will appear in the list. In addition, a new file will be present in the directory where Cathy.exe is located:

If you navigate over to the Search tab in the main GUI, you can type a search expression in the “Pattern” box to look for files in the catalog. One of the things stored in the /Software/ folder on the office server is a collection of Windows Home Server add-ins, including LightsOut. We’ll search for that now to test the catalog:

Perfect! In addition to finding the file immediately, because we renamed the volume to the name of the network share we were indexing, it’s extremely easy to read across the columns and see exactly where the file is. Furthermore, if the search results point at a resource currently accessible to the computer (whether that’s because the search result is local, on a network share, or on an indexed DVD that is currently in the drive) you can right click on the entry and open the file or explore the path directly from Cathy.
Go ahead and add as many sources as you’d like. Remember, anything that can be seen by your computer (network shares, discs in the disc drive, even remote FTP folders you’ve mounted in Windows as directories) can be indexed. Keep in mind that the larger the number of files you’re indexing, the longer it will take. We found Cathy could index around a quarter million files in 30 seconds, so if the program seems to stop responding give it a minute or two to finish crunching the file tables.
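To make the idea of an offline catalog concrete, here is a minimal Python sketch of the general technique: walk a directory tree once, record each file’s location under a volume label, and then search that record even after the source is disconnected. This is not how Cathy is implemented (its internals and catalog file format aren’t covered here); the function names build_catalog and search, the tuple layout, and the demo file are all hypothetical and chosen only for illustration.

```python
import fnmatch
import os
import tempfile


def build_catalog(root, label):
    """Walk `root` and record one entry per file: (label, relative dir, name, size)."""
    entries = []
    for dirpath, _dirnames, filenames in os.walk(root):
        rel_dir = os.path.relpath(dirpath, root)
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(full_path)
            except OSError:
                continue  # skip files that vanish or cannot be read
            entries.append((label, rel_dir, name, size))
    return entries


def search(catalog, pattern):
    """Return entries whose file name matches a wildcard pattern, ignoring case."""
    pattern = pattern.lower()
    return [e for e in catalog if fnmatch.fnmatchcase(e[2].lower(), pattern)]


if __name__ == "__main__":
    # Demo on a throwaway directory so the sketch runs anywhere.
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "AddIns"))
        with open(os.path.join(root, "AddIns", "LightsOut.msi"), "w") as f:
            f.write("placeholder")
        catalog = build_catalog(root, label="demo-share")
    # The source directory is gone now, but the catalog is still searchable.
    for entry in search(catalog, "*lights*"):
        print(entry)
```

The point of the design is that the search only ever touches the recorded entries, never the original media, which is why a small index can stand in for a shelf full of discs and drives.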

Automating Catalog Updating for Local Drives and Network Shares

If you just follow along with the first part of the tutorial, you’re already light years ahead of most people in that you now have a searchable index of all your offline media–it’s now simple and super fast for you to discover exactly which backup disk or network share you left those old tax returns on.
There are a few simple tweaks you can make to your Cathy workflow, however, that greatly improve your experience and keep everything up to date.
If you’re using Cathy to search local drives or network drives where, unlike a burnt DVD backup, the contents of the directories can change, it’s worthwhile to set up a process to update those directories. You can, at any time, select a catalog in Cathy, right click, and Refresh the contents of that catalog, but that’s a hassle and it adds friction to our search system.
Instead we’re going to use CathyCmd, a tiny command line interface tool for Cathy, to write a simple batch script that updates all of our local and network directory catalogs. Go ahead and download CathyCmd from the Cathy website and extract the single executable to the same directory you installed Cathy.exe to.
Next we need to create a simple script to drive CathyCmd. Go ahead and create a new text file in the directory called update.txt and open it. Inside the text file we only need to create a few lines to instruct CathyCmd. The only inputs CathyCmd will read from this script are those lines that begin with #IGN and #DEV. Look at our sample script below to see how to structure your own script:
## The IGN command is used to indicate files\directories you want ignored:
#IGN *.tmp; \tmp; \Temp*;
## The DEV command indicates the folders\file locations you want cataloged:
## The format is: path , volume name
#DEV E:\ , DATA
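If you maintain more than a handful of locations, a script in the layout shown above could also be generated rather than typed by hand. The following Python sketch simply reproduces the #IGN and #DEV line format from the sample; the helper name render_update_script, the second device entry, and its HIVE-SOFTWARE label are hypothetical, and we have not verified how CathyCmd treats labels or paths beyond the sample above.

```python
def render_update_script(ignore_patterns, devices):
    """Return text for a CathyCmd script in the #IGN / #DEV layout shown above.

    ignore_patterns: list of patterns such as "*.tmp"
    devices: list of (path, volume_name) pairs
    """
    lines = []
    if ignore_patterns:
        # One #IGN line, each pattern terminated by a semicolon as in the sample.
        lines.append("#IGN " + " ".join(p + ";" for p in ignore_patterns))
    for path, volume_name in devices:
        # One #DEV line per location, in the "path , volume name" format.
        lines.append("#DEV {} , {}".format(path, volume_name))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    text = render_update_script(
        ignore_patterns=["*.tmp", "\\tmp", "\\Temp*"],
        # The second entry is a made-up example of a network share.
        devices=[("E:\\", "DATA"), ("\\\\Hive\\Software", "HIVE-SOFTWARE")],
    )
    print(text, end="")
```

Redirect the printed output into update.txt (or write it to the file from Python) and check the result by eye before pointing CathyCmd at it.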
Save the script once you’ve edited it to your liking. To test the script we recommend creating a dummy file in the location you’re refreshing. We made: whataintnocountry.txt on the E:\ drive.
Run the script by executing CathyCmd.exe from a command prompt with the parameter -f and the script file.
Let’s take a quick peek in Cathy to make sure everything updated as intended:

Success! The new file with the casual Pulp Fiction reference has been located. Our update script works perfectly.
Now all you need to do to finish the automation process is to make an entry in the Windows Task Scheduler (or an alternative tool if you use one) to fire off the script on a schedule. Given the frequency with which our local files and network files change, we’re comfortable setting it to refresh every 12 hours.
If you’re worried about setting the refresh rate too high because it might be a drain on system resources, don’t be. Once you do the initial grind through a large disk or directory structure, the refresh command for that catalog takes less than a second to check for new files and generates no noticeable drain on system resources.
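For readers who prefer the command line to the Task Scheduler GUI, the 12-hour schedule can be sketched with Windows’ built-in schtasks tool. The Python snippet below only assembles and prints the command so you can review it before running it yourself. The task name and both file paths are hypothetical placeholders, and the exact way CathyCmd expects the script name after -f (here, as a separate argument) is an assumption based on the description above.

```python
import subprocess


def build_schtasks_command(task_name, cathycmd_path, script_path, every_hours=12):
    """Build a schtasks.exe argument list that runs CathyCmd every N hours."""
    if not 1 <= every_hours <= 23:
        raise ValueError("schtasks accepts 1-23 for an hourly modifier")
    # Assumption: the script file follows -f as a separate argument.
    action = '"{}" -f "{}"'.format(cathycmd_path, script_path)
    return [
        "schtasks", "/Create",
        "/TN", task_name,        # name shown in Task Scheduler
        "/SC", "HOURLY",         # schedule type
        "/MO", str(every_hours), # repeat every N hours
        "/TR", action,           # program and arguments to run
    ]


if __name__ == "__main__":
    cmd = build_schtasks_command(
        task_name="Cathy Catalog Refresh",               # hypothetical name
        cathycmd_path=r"C:\Drive Indexes\CathyCmd.exe",  # hypothetical path
        script_path=r"C:\Drive Indexes\update.txt",      # hypothetical path
    )
    # Print a copy-pasteable command line instead of creating the task.
    print(subprocess.list2cmdline(cmd))
```

Full paths are used for both files because a scheduled task does not necessarily start in the folder where CathyCmd.exe lives.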

Have a clever way to use Cathy or another indexing tip or trick you’d like to share with your fellow How-To Geek readers? Jump into the conversation below and share your file search wisdom.
