Deciphering OneWorld Logs

Deciphering OneWorld Logs

Primary Contact/Author / Roger Miller
Documentation Specialist / Bill Calkins

Technology Demographic Table

Product / OneWorld
Version / All
Platform/OS / All
Industry / All
Application / All
Database / All
Key words / error logs, jde log, trace logs, debug, jdedebug logs
Date / September 2001
Appendix A added November 2001 by John Lee

Table of Contents

Overview 3

Security Server Errors 3

Call Object Debugging 4

Example: Application Hangs 5

Example: Problems Without Application Hang 7

Example: Application Completes Without Expected Results 8

Viewing Server Call Object Logs from the Client 8

UBE Debugging 9

Conclusion 10

Contact Information 10

Appendix A: Determining Cause of jdeCacheInit Re-initialization Warning Message 11


“The art of being wise is the art of knowing what to overlook."

- William James

Overview

One of the keys – if not THE key – to OneWorld® problem determination is the ability to effectively read the OneWorld logs. Much of this effectiveness comes with experience, but some basic principles can be learned. This document provides some guidelines for the log reader that can help diagnose a problem.

Most OneWorld processes can create two types of logs: error logs (often called the jde log) and trace logs (also called debug or jdedebug logs). Error logs generally have error information, but also may contain general high-level information about the process. Trace logs contain much more detailed information about what the process is doing. Because of the overhead associated with writing this detailed information, trace logs are not used on a day-to-day basis, but are invaluable in solving certain kinds of problems.

A OneWorld client will have one set of logs (jde and jdedebug), while a OneWorld server will have one set of logs for each process that runs. The logs on the client machine are in files named jde.log and jdedebug.log, respectively. The logs on the server are in files named jde_12345.log and jdedebug_12345.log, where “12345” represents the process ID or job ID of the process for the log. It is important to understand that in a client-server environment, logs on the server need to be looked at – in addition to those on the client – to determine what the problem is. This document assumes the user has some familiarity with OneWorld server processes and how to determine process IDs for various kernel types.

This document is divided into the following sections:

§  Security Server Errors

§  Call Object Debugging

§  UBE Debugging

Security Server Errors

One common problem when first starting OneWorld is a security server that does not respond. In this case, the client machine gets a popup dialog box like the following:

This popup is accompanied by a message in the error log similar to the following:

433/307 Tue Jul 10 09:06:04 2001 jdeksec768

SEC0000002 - Unable to connect to security server sundev1

Running OneWorld with user's signon and password

This may or may not work depending on the databases' security.

The first thing to check is to make sure the enterprise server (sundev1 in this case) is running correctly and that the right port number is being used to communicate with the security server. Check the client jde.ini file to make sure that the SecurityServer parameter is correct, and that the port numbers specified by the serviceNameListen and serviceNameConnect parameters are also correct. If these appear to be correct, sign on to the enterprise server and use the porttest program to check the security server.

If the porttest program fails, the log for the porttest process will most likely have an error message similar to the one above. This same message might show up in the error log for any jdequeue processes - or other OneWorld server processes that are being run. Unfortunately, this message says very little about what the problem really is - only that there is some problem with the security server. In order to determine the root cause of the problem, the logs for the security server process that is running on our OneWorld enterprise server must be looked at.

Use the Server Administration Workbench program (SAW) to determine which process is the security server (detailed information on using SAW can be found on the Knowledge Garden in the Server and Workstation Administration guide). If you are signed on to the server and there are a small number of logs, however, it might be easier to simply search the logs to determine which are for the security server. Looking through the jde (error) logs, you should find one (or more) with a line similar to the following:

15982 Fri Jul 13 08:53:20 2001 jdeksec.c2753

INITIALIZING SECURITY SERVER KERNEL

If there are error messages, they should give a good idea of what is causing the problem. If there are no error messages, look at the corresponding jdedebug (trace) log to try to determine what the process is currently doing, or the last statement that was processed before problems occurred.

Call Object Debugging

“Call Object” is the OneWorld term for running business logic. A given piece of business logic can generally be run either on an application server or locally on the client machine. The applicable OCM records determine where this logic runs. By looking at the trace log (jdedebug.log) from a OneWorld client machine, you can tell where that user is running their business logic. Here are some examples:

Jul 10 11:07:18 ** 248/363 Calling Business function GetLocalComputerId for RM5797194. Application Name [P4210], Version [ZJDE0001] (BSFNLevel = 1)

Jul 10 11:06:38 ** 248/363 On sundev1, calling Business function FormatCompany for RM5797194. (BSFNLevel = 1)

The first example shows a business function that is running locally (on the client machine). The second shows a business function that is being called on an application server named sundev1. Each line contains the date and time of execution, followed by the process ID and thread ID of the OneWorld process, the name of the business function being run, and the OneWorld user ID of who is running this logic. For a function running locally, the application and version are listed if known. Both examples show a business function “level”, which indicates from where the current function is called (discussed later in this document). The process ID and thread ID are not important in the client logs, but the fact that there are two numbers here indicates that the log is from a Windows-based machine. On a Unix machine, you would see a single number indicating the process ID of the process that created the log, and on an AS/400, the process ID is not listed on every line, but in a single line at the top of the file.

Example: Application Hangs

One common problem encountered with call object is that the application “hangs” for some amount of time, perhaps several minutes, and then finally returns with a popup dialog box similar to the following:

This is accompanied by a message similar to the following in the client jde.log:

402/371 Mon Jul 16 13:43:22 2001 jdekinit588

KNT0000031 - JDENET_ReceiveMsg Error = eTimeOut

and also:

402/371 Mon Jul 16 13:43:22 2001 JDEOBJ258

COB0005000 - FAILOVER! All BSFNs mapped to sundev1 will now run on the LOCAL machine.

Lines on the client jdedebug.log show something like:

Jul 16 13:41:52 ** 402/371 On sundev1, calling Business function F4211FSEndDoc for RM5797194. (BSFNLevel = 1)

Jul 16 13:43:22 ** 402/371 Return value is 99 for F4211FSEndDoc. (BSFNLevel = 1)

Note the time difference between the business function call and the return. If you looked at the jdedebug.log during the time the application was hanging, you would have seen the following on the last line:

Jul 16 13:41:52 ** 402/371 On sundev1, calling Business function F4211FSEndDoc for RM5797194. (BSFNLevel = 1)

This gives a quick idea of what business function is running, where it is running, and therefore, where to look for the problem (if this message indicates an unexpected server name, an OCM or logical data source is probably not set up correctly). In this case, to see what is really causing the problem go to the application server (sundev1) and find the call object kernel that is running your business function. There are various ways of doing this, but one might be to search the logs on the server for the business function name and your user ID. For the case above, assuming that debug logs are turned on, a log containing something like the following line should be available:

Jul 16 13:41:51 ** 10849 Calling Business function F4211FSEndDoc for RM5797194. Application Name [P4210], Version [ZJDE0001] (BSFNLevel = 1)

A couple of things can be determined right away by looking at this line. First, because of the single number after the date and time, the log is from a Unix machine. Also, the process ID of the process that created this log is 10849. Since the application is hanging, or timing out, on the client, your first step should be to check whether or not this process is still running (either by using SAW, or by looking for the process on the server itself). The way to do this varies by platform. For example, on a Windows platform you might use the task manager, while on a Unix platform you might use the “ps” command. If the process is no longer running, it has died the logs should be examined to try to determine why. Even if the process is still running, the trace log should be examined to see if new statements are still being written to the log.

In this case, process 10849 is no longer running, so the server log files jde_10849.log and jdedebug_10849.log should be examined. The error log does not show any errors, and the last line in the trace log is the following:

Jul 16 13:41:51 ** 10849 Calling Business function RetrieveCurrencyConvFlag for RM5797194. Application Name [P4210], Version [ZJDE0001] (BSFNLevel = 4)

Note that this is not the original business function that was called from the client machine. Also notice the last part of the line that says BSFNLevel = 4. This indicates that there are three levels of business functions that were called before getting to this particular function. To determine what business function called this one, search back through the trace log until a call to a function with BSFNLevel = 3 is called. Subsequently, it is possible to search backwards for BSFNLevel = 2 and BSFNLevel = 1, which should be the original function that was called from the client. Looking at these calls shows the sequence of events that led up to the problem. In addition, with a case such as this – where the call object kernel has died – the actual business function that was causing the problem can be determined. In this case, it is RetrieveCurrencyConvFlag.

Example: Problems Without Application Hang

Another example might be that the application doesn’t hang, but a dialog box appears saying something like:

This is accompanied by messages in the client jdedebug.log file similar to the following:

Jul 16 11:16:33 ** 359/334 On sundev1, calling Business function F4211FSBeginDoc for RM5797194. (BSFNLevel = 1)

Jul 16 11:16:33 ** 359/334 Return value is 99 for F4211FSBeginDoc. (BSFNLevel = 1)

Jul 16 11:16:36 ** 359/334 On sundev1, calling Business function F4211FSBeginDoc for RM5797194. (BSFNLevel = 1)

Jul 16 11:16:36 ** 359/334 Entering JDEK_InitRemoteUser for server sundev1 with commit mode 0.

Jul 16 11:16:36 ** 359/334 Exiting JDEK_InitRemoteUser for server sundev1 with commit mode 0.

Jul 16 11:16:36 ** 359/334 Return value is 99 for F4211FSBeginDoc. (BSFNLevel = 1)

In addition, messages in the client jde.log appear that may (or may not) help pinpoint where the problem lies. The following is one example of a possible message:

359/334 Mon Jul 16 11:16:36 2001 jdeknobj1406

KNT00000181 - Library load failed CSALES.dll function F4211FSBeginDoc server sundev1

This message indicates what the problem is – i.e. the CSALES library could not be loaded – but at this point, it hasn't been determined why the library could not be loaded. For this, the call object process on the server must be found, and the logs for that process examined to get more information about what is happening.

Example: Application Completes Without Expected Results

A third example is where the application completes but does not give the expected results. If there is a clear error message, it may be determined that there is a problem with the setup, or something similar. But if there is no error message, or the message is unclear, the situation is more difficult to debug. Familiarity with how the application should work is required, or access to an expert on the particular application being run.

From a logging point of view, the primary things to look for are errors in the jde log and the return values (in the jdedebug log) for the business functions that are being called. If a business function is returning a value other than “0”, it is a candidate for research into the problem.

Once again, keep in mind that the business functions may be running either locally on the client, or on a business function application server. If the client trace log has messages like:

Jul 16 11:16:33 ** 359/334 On sundev1, calling Business function F4211FSBeginDoc for RM5797194. (BSFNLevel = 1)

Jul 16 11:16:33 ** 359/334 Return value is 2 for F4211FSBeginDoc. (BSFNLevel = 1)

The server logs should be consulted for more information. This function may actually be calling many other business functions on the server, which cannot be seen from the client logs.