Oracle Secure Enterprise Search – Quick Start Guide

An Oracle White Paper

September 2006

Implementing Oracle Secure Enterprise Search

Table of Contents:

Introduction

Planning Considerations

Hardware / OS platform

The need for secure versus public searching

The Identity Platform

User interface (standard or customized Web Services application)

The data sources to be crawled

Custom Connectors

Recrawl schedules

Index defragmentation schedules

High Availability Strategies

Firewall / DMZ issues

Process and Timescales

Installation

Source Setup

User Interface Building

Administration

User Training

Conclusion

Introduction

Secure Enterprise Search is a product from Oracle which allows you to search all your enterprise data sources from a single simple, convenient interface. Searching enterprise data is made as simple for the users as searching the internet.

One of the overriding design goals of Secure Enterprise Search was simplicity of installation and implementation.

This guide runs you through the basic steps of installing SES 10.1.8 and running some basic crawls. It is intended for people evaluating or reviewing the product, and is not intended to replace the standard SES documentation library.

Installation

Installation is straightforward and takes around 15 minutes, depending on the speed of the system and installation media.

Windows

Insert the installation CD (if appropriate) and allow it to autorun. If autorun is switched off, or the installation is from downloaded files, navigate to the “setup.exe” file in the root (or “Disk1”) directory and open it.

Unix / Linux

Ensure you are in a graphical X-Windows environment and your DISPLAY environment variable is set correctly. From a terminal window, run the command “xterm” to check this. If an xterm window appears (you can close it immediately), your environment is set correctly.
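As a quick pre-check before trying xterm, you can confirm that DISPLAY has a value at all. The following is only a minimal sketch: the function name display_is_set is made up for this illustration, and a non-empty DISPLAY does not prove that the X server will accept connections, so the xterm test above remains the real check.

```shell
# Hypothetical helper: succeed if DISPLAY is set to a non-empty value.
# This does not contact the X server; it only inspects the variable.
display_is_set() {
    [ -n "${DISPLAY:-}" ]
}

if display_is_set; then
    echo "DISPLAY is set to: $DISPLAY"
else
    echo "DISPLAY is not set; export it before running the installer"
fi
```

If the second message appears, set DISPLAY for your environment and repeat the xterm test.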

The installer program is called “runInstaller” and may be found on the root directory of the installation CD, or in “Disk1” directory of downloaded software.

If the installation is from read-only media (such as a CD), you must run the runInstaller program from a writable directory (such as your user’s home directory). For example:

$ cd

$ /media/cdrom/runInstaller

If the installation files are on a writable file system then this is not necessary.

The installer screen is shown below.

You need to choose a name for your SES server, a password, a connection port, and a location for the files.

The name may be anything, but note that if you are connecting multiple SES installations to a single identity plug-in, the names must be unique. It therefore makes sense to change this from the default, perhaps to a name derived from the server host name.

The password will be used for all administrative functions. It may be changed later, but there is no way to reset it if it is forgotten.

The port number defaults to 7777. Note that this is the same as the default for Oracle Application Server, so if you intend to use that on the same machine you should choose a different port. To avoid the need for a port number in the SES URL, you can choose port 80 (the default for HTTP). There are, however, implications for doing this on Unix and Linux, where ports below 1024, including port 80, can normally be bound only by the “root” user. See the Installation and Upgrade Guide for Linux x86 for further details.
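The port trade-off can be illustrated with a small check. This is a sketch only: port_needs_root is a hypothetical helper name, and it simply applies the general Unix/Linux convention that ports below 1024 are privileged. It does not test whether a port is already in use.

```shell
# Hypothetical helper: succeed if the port is in the privileged range
# (below 1024), which on Unix and Linux normally requires root to bind.
port_needs_root() {
    [ "$1" -lt 1024 ]
}

for port in 80 7777; do
    if port_needs_root "$port"; then
        echo "port $port: privileged, normally needs root to bind"
    else
        echo "port $port: can be bound by an ordinary user"
    fi
done
```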

“Destination Path” is the path for the SES software and log files. The 10.1.8 software takes up about 1GB of space, so allow at least 1.2GB to leave room for growing log files.

“Data Storage Location” is used by the SES database files (used to store index and configuration data) and the cache files, which are HTML renderings of all the documents which are indexed. The files stored here will start at about 800MB, but will increase rapidly as sources are crawled. The actual space required will vary according to the types of data (for example cached files for PDF and MS Word files are considerably smaller than the originals, whereas cached HTML files are the same size), but as a rule of thumb allow at least as much space as the overall size of the dataset which will be indexed.
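The sizing guidance for the two locations can be turned into a rough calculation. This is a sketch under stated assumptions, not an official sizing formula: the variable and function names are invented for the example, the figures are the approximate ones quoted in the text, and adding the 800MB starting size to the dataset size is a cautious reading of the rule of thumb rather than something the text prescribes.

```shell
# All sizes are in MB and are approximations taken from the text above.
DEST_PATH_MB=1200      # software (about 1GB) plus room for log files
INITIAL_DATA_MB=800    # approximate starting size of the data files

# Hypothetical helper: estimate space for the Data Storage Location as the
# starting size plus the size of the dataset to be indexed (an assumption).
data_storage_estimate_mb() {
    echo $(( INITIAL_DATA_MB + $1 ))
}

echo "Destination Path: ${DEST_PATH_MB} MB"
echo "Data Storage Location: $(data_storage_estimate_mb 5000) MB for a 5000 MB dataset"
```

Treat the result as a lower bound to plan around, since the real figure depends on the mix of document types.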

If you wish to use SES with prompts and messages in languages other than English, go to Product Languages and choose extra languages to install.

When complete, press the “Install” button to start installation. Assuming you pass all the prerequisite tests, the installation will now proceed. If you don’t pass them, please refer to the Installation and Upgrade Guide for your platform to resolve any issues.

On Unix/Linux, you may be prompted part way through the installation to run a script as the root user. This merely sets the correct permissions on a file used to locate Oracle software on your machine.

Crawling public information – a first test

When the installation is complete, you will be shown two URLs, one for the admin screens and one for the query screens. Go to the admin URL from any web browser, and enter the password you gave during installation. (You did remember it, right? If not, you’ll need to reinstall, I’m afraid; see “De-installation” at the end of this paper.)

After logging in to the admin screen, you will be presented with some (mostly empty) summary information. Click on the “Sources” tab on the top left. You will then be on the “Sources” page. Choose “Web” from the pull down list (it should be the default anyway), and click on “Create”.

We’re going to index Yahoo.com as a first test.

Enter “Source Name: Yahoo” and, under “Starting URLs”, the address of the Yahoo home page. Then click on “Create & Customize”. Now click on the “Crawling Parameters” tab, and set “Number of Crawler Threads” to 1. (We want to be nice to Yahoo and not overload their server! When crawling your own data sources you will probably want to leave this at 5, or even increase it.)

After changing the number of crawler threads, click on “Apply”.

We will now go to “Schedules”, where crawls are launched. Click on the “Schedules” tab at the top left. Select Yahoo and press “Start”. The status column should change from “Scheduled” to “Launching” (or “Executing”). Click on “Launching” or “Executing” to go to the Schedule Status page. On the Status page, when the Status column is showing “Executing”, you can click on the pencil icon under “Statistics” to get to the Crawler Progress Summary page to see exactly what’s happening.

Why have so many documents been rejected? Probably because they didn’t fit the default URL boundary rules, which reject any URL that does not contain the host name of the starting URL. To change this, go to Sources -> Yahoo -> Edit -> URL boundary rules, and add a new rule, such as “contains: yahoo.com”.
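To see what a “contains” rule does to a list of links, the idea can be mimicked with a simple substring test. This is an illustrative sketch only, not how SES evaluates its rules: the function name url_matches_contains_rule and the sample URLs are made up for the example.

```shell
# Hypothetical helper: succeed if the URL ($1) contains the rule string ($2).
# This imitates the effect of a "contains" boundary rule; it is not SES code.
url_matches_contains_rule() {
    case "$1" in
        *"$2"*) return 0 ;;
        *)      return 1 ;;
    esac
}

# Sample URLs chosen for illustration only.
for url in "http://news.yahoo.com/" "http://www.example.com/"; do
    if url_matches_contains_rule "$url" "yahoo.com"; then
        echo "accept $url"
    else
        echo "reject $url"
    fi
done
```

A broader rule string accepts more links, so the crawl will also take longer and index more documents.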

Didn’t get any documents indexed at all? You probably need to set a proxy. Go to “Global Settings” on the right hand tabset, then Sources -> Proxy Settings, and set your proxy there.

After making changes, go to Home -> Schedules and start the Yahoo schedule again.

Simple Searching

To launch a Search window, click on the very top “Search” link (next to Help and Logout, not the tab between Home and Global Settings). Alternatively, on Windows choose Start -> All Programs -> Oracle – YourServerName -> SES Search.

If you expanded the boundary rules, you should be able to search for a currently newsworthy term (“iraq” seems a good bet for the foreseeable future). If not, then “yahoo” should get you a few hits.

Alternatively, click on “Browse Source Groups” to see all the documents you have indexed. Currently, as we haven’t defined any source groups, all the documents will appear in the “Miscellaneous Group”.

Connecting to a Directory for Secure Search

So far we have demonstrated simple indexing and searching of public sources. To deal with secure searching, we must first connect our SES installation to an identity plugin. Normally this will be for a directory such as Oracle Internet Directory (OID) or Microsoft Active Directory (AD).

De-installation

Complete de-installation of SES is very simple (so be careful not to do it by accident!)

On Windows, go to Start -> All Programs -> Oracle YourSESName -> Uninstall SES.

On Linux, execute $SES_DIR/install/deinstall_ses where $SES_DIR represents the directory where SES was installed (the “Destination Path” during the install). Run this command from a different directory to ensure a clean deinstallation.

Implementing Oracle Secure Enterprise Search

September 2006

Author: Roger Ford

Contributing Authors: TBA

Oracle Corporation

World Headquarters

500 Oracle Parkway

Redwood Shores, CA 94065

U.S.A.

Worldwide Inquiries:

Phone: +1.650.506.7000

Fax: +1.650.506.7200

This Document Is For Informational Purposes Only And May Not Be Incorporated Into A Contract or Agreement.

Oracle is a registered trademark of Oracle Corporation. Various product and service names referenced herein may be trademarks of Oracle Corporation. All other product and service names mentioned may be trademarks of their respective owners.

Copyright © 2001 Oracle Corporation

All rights reserved.
