Record overlay for Voyager ILS

Read this document to learn how to overlay the records in your Voyager ILS with records from OCLC batch processing. WMS libraries often use batch processing to ensure their holdings are set in WorldCat as part of the data migration process. The overlay of records ensures that each bibliographic record in your Voyager system has an OCLC number in a consistent location and format.

Set up a bibliographic duplicate detection profile

In System Administration, under the Cataloging tab on the left, click New. Give the profile a simple name and code; they can be the same thing (WMS, for example). The code will be in all caps.

Given that the records being imported are exact copies of the ones in the database (other than the OCLC number), we can simply replace the existing records, so select "Replace". If you need more detail about the differences between the options, press F1 to open the Help screen, which has a table explaining each one. "Merge", for instance, copies specified fields from the existing record into the incoming record before deleting the existing record; so if you had local bibliographic information in the 599 or other fields, you could specify those fields to be merged into the incoming record before the existing record is deleted and replaced. (Since the incoming record also contains the local bibliographic information, though, this isn't really necessary here.)

Check the "Discard incoming records that do not match existing records". If, for some reason (and this should not happen), records do not match, you don't want to add second copies of your bibliographic records. Leave Cancellation set at "None"... it's not pertinent to what we're doing.

Leave Duplicate Replace and Duplicate Warn at 100.

On the Field Definition tab, scroll all the way down the list on the left side. The second-to-last selection is BBID, the bib ID number. Click the right arrow to copy it to the list on the right; we're going to match on bib ID. Leave the field weight at 100, since that is sufficient to trigger a replacement.

You can ignore Quality Hierarchy because we're not going to check whether the incoming records are better or worse than the existing records, so just click Save.

Set bulk import rules

On the left side, click Bulk Import Rules, then click New. Give it a name and a code (WMS is fine here too; use all caps for the code).

On the Rules tab, select WMS for the Bib Dup Profile and leave Auth Dup Profile blank. Select your owning library (if you have more than one, use the one that handles cataloging; if more than one handles cataloging, you will need to create a rule for each, with everything else the same). Use whichever character set you specified in your batch processing order (if you pick the wrong one, the bulk import will give an error, and you should be able to find the right one by trial and error). Leave "Load Bib/Auth only" selected, leave everything else alone, and click Save.

Upload records from OCLC for overlay

Log into the server as voyager. You'll be in the voyager home directory (/export/home/voyager). Use the following command to create a wms folder for the bib files (this is optional, but I like to keep things a little tidy if I'm thinking about it):

mkdir wms

Put the bib files from the batch processing here.

How this works is heavily dependent on how your IT department has things set up, so it may be simpler to send them the files and tell them where you want them. If you do have access to upload files yourself, I'll tell you what I do. I use the free FileZilla FTP client. If you install and run it, you can generally open a connection by specifying your server name, voyager as the username, your voyager password, and port 21 for FTP or 22 for secure FTP. On the right side, you'll see the /export/home/voyager directory, and under that should be wms. Open that on the right side. Find your processed bib files (D######.R#####.bin, if the naming convention for these files is consistent) on the left side and drag them to the wms folder. You'll see their progress at the bottom of FileZilla.
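If you'd rather skip the GUI client, a plain sftp session should accomplish the same thing. This is just a sketch: it assumes SFTP is enabled on port 22, and your.server.name is a placeholder for your actual server name:

sftp voyager@your.server.name
sftp> cd wms
sftp> put D######.R#####.bin
sftp> bye

(Run the put command from the local directory containing the bib files, once for each file.)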

Bulk import a few records for testing

Once the files are in /export/home/voyager/wms, you are ready to run bulk import:

cd /m1/voyager/xxxdb/sbin

(using your library's xxxdb name, of course). You can type ./Pbulkimport -h to get a list of the options you may specify when running bulk import. -f will be the path to the input file, and -i will be the code from the bulk import rule created above (all caps). Note -b and -e: we want our first bulk import to cover only a handful of records (however many you don't mind going in and fixing if we did something wrong; I usually do 10). That's all we need.

First, type ls /export/home/voyager/wms so that you can see the file names of the processed MARC files. Then note the time (it'll come in handy later) and type the following to do a run of 10 records (substituting your bulk import rule code if it's something other than WMS):

./Pbulkimport -f /export/home/voyager/wms/D######.R#####.bin -i WMS -b 1 -e 10

Check results of bulk import test

To check the results, first go to the rpt directory: cd /m1/voyager/xxxdb/rpt (or cd ../rpt). This folder usually has a lot of junk in it, so I find it helps to do the following:

ls -l *imp.20120607*

(Use the current date in yyyymmdd format.) This gives you a list of all import files pertaining to today. You'll get a delete file, a discard file, an err file, a log file, a reject file, and a replace file. Ideally, everything but the log and replace files will be size 0 (the size appears just left of the date in the listing).
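For reference, the listing might look something like the sketch below. The sizes and the hhmm suffix (the time the import started) are illustrative, and I'm assuming the other files follow the same naming pattern as the log file:

-rw-r--r--   1 voyager  voyager        0 Jun  7 09:45 delete.imp.20120607.0945
-rw-r--r--   1 voyager  voyager        0 Jun  7 09:45 discard.imp.20120607.0945
-rw-r--r--   1 voyager  voyager        0 Jun  7 09:45 err.imp.20120607.0945
-rw-r--r--   1 voyager  voyager     4268 Jun  7 09:45 log.imp.20120607.0945
-rw-r--r--   1 voyager  voyager        0 Jun  7 09:45 reject.imp.20120607.0945
-rw-r--r--   1 voyager  voyager    21140 Jun  7 09:45 replace.imp.20120607.0945

Next, view the log: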

more log.imp.20120607.####

Use the hhmm that most closely matches the hour and minute when you ran the bulk import in place of ####; you'll need to keep track of this as you do subsequent runs today. This will show you a summary of your import run.

For each record, you should see something like this:

1(1):Duplicate Bibs above threshold: replace 1, warning 0.

BibID & rank

4 – 100

REPLACE Existing DB Bib record replaced.

This means that the bib record with ID 4 has been successfully replaced. All of the records should look like this. This run is small, so you can check all of them. Hit the spacebar to page through the file. At the end, you should see the following:

BIBLIOGRAPHIC or AUTHORITY Records

Processed: 10

Added: 0

Discarded: 0

Rejected: 0

Errored: 0

Replaced: 10

Merged: 0

Deleted: 0

Mfhds created: 0

Items created: 0

Thu Jun 7 09:47:03 2012

This is what you want to see: all 10 replaced. Next time, when you run the rest of the records, all you really need to look at is the end of the log file, so use the following command instead:

tail -20 log.imp.20120607.####

(Again, use the appropriate time and date.) You should see just the summary. The tail command shows the specified number of lines from the end of a file; the summary takes up the last 12 lines, but I like to see the stuff right before that to make sure there are no errors or messages.

The replace file is a MARC file of the records that were replaced, in case you need to put them back. That shouldn't be necessary.

Next, open the cataloging client and pull up the records from the run by their bib IDs. Make sure they have the expected OCLC numbers and that everything else looks OK. You should see an OCLC number (035 field) starting with (OCoLC), and you may also see an 035 field containing the bib ID.
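For example, the OCLC number field on an overlaid record would look something like this (the number here is made up, and subfield delimiter display varies by client settings):

035    ‡a (OCoLC)123456789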

Run bulk import

Assuming the first 10 went through OK, go back to the sbin directory (cd ../sbin) and redo the bulk import without the -b and -e options (a sketch of the full command appears after the speed-up tip below). This is a time-consuming process, so if you want to know when it's done, then after you start the bulk import, cd ../rpt, do an ls -l log.imp.20120607*, look for the hhmm timestamp of the most recent log file (it's always the timestamp of when the import began, by the way), and then type the following:

tail -f log.imp.20120607.#### (using the appropriate hhmm in place of the ####)

This lets you see everything that is written to the log file as it is written. So, when you see the summary, you'll know the import is done (and how it went). Hit CTRL-C to break out of the tail -f (otherwise it will just sit there watching the log file forever).

Shortcut to speed up bulk import: if you add -X NOKEY to the Pbulkimport command line, it will skip the keyword file portion of the import, and that will SIGNIFICANTLY improve import speed (even if the import starts fast, by the end of overlaying ALL the bibs in the database, the keyword index component of the import will drag it to a crawl). The keyword files are completely irrelevant here, not only because the library is moving away from Voyager, but because the keywords in the incoming records are identical to those in the database, so nothing is changing in the keyword indexes.
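Putting it together, the full-file run with the keyword skip would look something like this (same placeholder file name as before; adjust the path, file name, and rule code to match yours):

./Pbulkimport -f /export/home/voyager/wms/D######.R#####.bin -i WMS -X NOKEY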

I recommend doing the first 10 records of each file (with -b 1 -e 10, as above) before moving on to the rest of the file.

This should be everything you need to do to overlay your records with the ones from batch processing.
