SAP AMERICA

Host: Margaret Anderson

May 15, 2003/10:00 a.m. CDT

Coordinator Ladies and gentlemen, thank you for standing by and welcome to the BW Know How Teleconference. At this time, all participants are in a listen-only mode. Later we will conduct a question and answer session with instructions given at that time. As a reminder, this teleconference is being recorded. I would now like to turn the conference over to Mr. Oliver Mayer, from the BW Rig. Please go ahead, sir.

O. Mayer Okay, thank you, Bill. This is Oliver Mayer. First of all, I’d like to thank everybody for dialing in to today’s call and taking time out of their busy schedules to listen to this presentation. And, of course, I’d like to thank Rudolf Hennecke, my colleague from the EMEA Rig in Germany, for putting this presentation together and for taking the time, towards the end of his work day, to present it.

So without further ado, I would like to turn the call over to Rudolf and the topic, “Modeling Aspects in Process Chains.”

R. Hennecke So, hello, everybody, and welcome to this conference call on modeling aspects in process chains. First of all, I hope that everybody could download the presentation. For the next 30 to 40 minutes I will give an overview of how to model process chains, and then we will have about 20 to 30 minutes for questions.

So let’s come to page one, which shows you the agenda of today’s conference call. You will see that the conference call is divided into four blocks: a very short introduction, followed by three blocks on modeling aspects of process chains. These are, first, conceptual and project-related issues; then, from processes to process chains (that is, collecting processes and then building process chains out of them); and finally some general modeling principles.

So let’s start with the introduction. We are now on page three. You’ll find the page number in the upper-right corner of each slide. As you see, process chains are built for managing and automating BW administration. This means not only the data-load processes, which we could already automate in BW 2.0 (using info package groups), but also all other administration processes that might run after or even before loading data (like updating database statistics or rolling up aggregates).

This means process chains are the central point of administration for all processes that need to run in BW before data can be made available to end users. On page four you find a definition of the term “process”. I will use this term several times in this presentation, so I want to make precise what a process in a process chain is.

A process is composed, as you can see, of a process type and a variant. The process type describes which administration task has to be undertaken; for example, a data load or the roll-up of aggregates.

The variant is the configuration of that process at the time of definition. For example, it is the info package to be loaded, or it is the info cube for which the roll-up of aggregates has to be done. Therefore, a process equals process type plus variant.
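A minimal sketch of this definition, written in Python rather than in BW itself. The class, the field names and the example variants are hypothetical and serve only to show that a process is fully described by the pair of process type and variant:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Process:
    """One step of a process chain: a process type plus a variant."""
    process_type: str  # which administration task is performed
    variant: str       # the configuration fixed at definition time


# Hypothetical names, chosen only for illustration.
load_sales = Process(process_type="data load", variant="info package SALES_DAILY")
rollup_sales = Process(process_type="roll-up of aggregates", variant="info cube SALES")

for process in (load_sales, rollup_sales):
    print(f"{process.process_type} + {process.variant}")
```

Two processes with the same process type but different variants compare as different processes, which matches the "process type plus variant" definition.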

You can see from this presentation that I will not talk in detail about the tool. This information can be found in last year’s conference call on operating BW 3.0 using process chains. In this conference call from 2002 you’ll find information on the tool itself and how to use it.

Today’s conference call is more about these questions: How do I set up process chains? How do I model process chains?

We are now on page six. Within the conceptual and project-related issues, two points are very important. The first is that you have to think about when to start implementing process chains. I recommend doing this quite early in the implementation phase of the project, that is, right after you have implemented your data flows. In other words, do not start implementing process chains five days before going live with your project.

Then, after having implemented your process chains, use the final preparation phase of your project to test them for performance, correctness and robustness. As there is a lot of modeling involved in process chains, you really need to start early in your project.

The next point is on the project team and its know-how. The project team members who work with process chains need to know not only the tool but also data warehouse administration (the basic tasks involved in administering a huge data warehouse). They also need to know the individual process types they start and schedule in process chains. That is, they need to know what the system is doing when data is loaded or when aggregates are rolled up.

The other thing is that it is usually very good to have a central project team implementing these process chains. If this is not the case, or if you need to work in parallel and decentrally, then make sure that the different project teams implementing process chains share their knowledge and that they also document what they schedule at which time.

Now on slide seven I give you the prerequisites to check before you start implementing a scenario with process chains.

First question: what is the frequency with which this administration scenario has to be performed? Is it something you do once, or is it really something you have to do periodically?

For example, you should evaluate whether it makes sense to implement the initialization of a data-loading scenario within a process chain, or whether this is ad hoc processing that does not need to be automated using process chains because it is executed just once.

The next thing is data quality. Is the quality of the data good enough that subsequent processing can be started periodically? If data quality is very bad, you usually have to work on the correctness of the data by repairing it (e.g., by contacting the source system people).

If this is the case for your data-load scenario, then you should first try to stabilize data quality before implementing this with process chains.

Last point: Is the availability of data foreseeable? An example is flat file loading: is it foreseeable that a certain flat file is made available every week in the same folder, with the same name and the same data quality? If this data availability is not foreseeable, then it is, as you can imagine, hard to implement periodic scheduling using process chains.
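The three prerequisite questions can be written down as a small checklist. The sketch below is only an illustration in Python; the `Scenario` class, its field names and the example scenario are assumptions and do not correspond to anything in BW:

```python
from dataclasses import dataclass


@dataclass
class Scenario:
    name: str
    runs_periodically: bool         # frequency: periodic rather than one-off
    data_quality_stable: bool       # data rarely needs manual repair
    availability_foreseeable: bool  # data arrives at a predictable time and place


def open_prerequisites(scenario: Scenario) -> list[str]:
    """Return the prerequisites that are not yet met for this scenario."""
    checks = {
        "frequency": scenario.runs_periodically,
        "data quality": scenario.data_quality_stable,
        "data availability": scenario.availability_foreseeable,
    }
    return [name for name, met in checks.items() if not met]


# Hypothetical scenario: the weekly file does not arrive reliably.
weekly_flat_file = Scenario("weekly flat file load", True, True, False)
print(open_prerequisites(weekly_flat_file))  # ['data availability']
```

An empty result would suggest the scenario is a candidate for automation; any remaining entry points to conceptual work that should come first.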

So, if one of these prerequisites is not met, then you should first work on the conceptual side of administration before implementing process chains. This means, as we can see on slide number eight, that you have to think of building administration procedures or operation procedures in order to support process chain implementation. That is, you have to discuss and settle certain topics like data ownership and availability of data. It also covers the support organization: who monitors the process chains and who supports them?

The next thing is that you have to give guidelines for process chain implementation and for process chain administration. What do I do if something fails? How do I react in case of an error? What are the repair procedures? These are very important questions to answer before you start automating a scenario using process chains.

The next thing that should be included in such administration procedures is, as you can see on slide nine, naming conventions.

It is also important to think of transport management guidelines for process chains. My recommendation is that you transport process chains from the development system to the test system, test them there, and then transport them to production. This is because process chains are not an ad hoc scheduling tool like info packages; they are really a static tool, which is used for periodic scheduling. Therefore, process chains should be transported from the development system to the test system and then to the production system.

Now let’s come to the next part of this presentation, from processes to process chains. We are now on page number 11. In this chapter I want to give you advice on how to collect the relevant processes that have to run within a process chain and then how to distribute these processes to individual process chains.

First, you have to collect all processes that have to run before data can be released to the end users. These are not only data-load processes but also administration processes like change runs and roll-ups of aggregates, and also reporting agent processes.

Then, once you have all these processes, you have to think about the time windows for your process chains. When can I start the process chain during the night, and when does it need to finish so that data is available to the end users early in the morning?

Then finally, you have to think of special dependencies or priorities. Does one process depend on another, or is it more important for one process in a process chain to succeed than for another?

Slide 12 shows an example of a central document that collects all processes in order to prepare the final modeling of the process chain.
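The slide itself is not reproduced in this transcript, so the following is only a sketch of what such a central document might capture, written in Python. The columns (frequency, estimated runtime, dependencies), the process names, the runtimes and the time window are all assumptions derived from the points just listed:

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class ProcessEntry:
    name: str
    frequency: str    # e.g. "daily" or "monthly"
    est_minutes: int  # estimated runtime in minutes
    depends_on: list[str] = field(default_factory=list)


def fits_window(entries, start: datetime, deadline: datetime) -> bool:
    """Worst case: all entries run strictly one after another."""
    total = timedelta(minutes=sum(e.est_minutes for e in entries))
    return start + total <= deadline


# Hypothetical nightly scenario.
entries = [
    ProcessEntry("load master data", "daily", 30),
    ProcessEntry("attribute change run", "daily", 20, ["load master data"]),
    ProcessEntry("load transaction data", "daily", 90, ["attribute change run"]),
    ProcessEntry("roll up aggregates", "daily", 40, ["load transaction data"]),
]

start = datetime(2003, 5, 15, 1, 0)     # chain may start at 01:00
deadline = datetime(2003, 5, 15, 6, 0)  # data must be available by 06:00
print(fits_window(entries, start, deadline))  # True
```

Summing the runtimes sequentially is a deliberately pessimistic estimate; processes that can run in parallel would shorten the real runtime.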

When we now think of distributing processes to process chains, we might use some typical criteria for doing this. You can find these criteria on page 13 and page 14. The criteria I give there are ordered according to their priority. This means that the first criterion, frequency, seems to me to be the most important one.

This is because frequency is a criterion that automatically leads to separately scheduled process chains. If one process needs to be scheduled daily and the other needs to be scheduled monthly, then in 99% of the cases you need to build separately scheduled process chains so that one is started once a day and the other once a month.

The next criterion that will help you distribute processes to process chains is the business scenario in question, that is, the reporting scenario; for example, inventory management or headcount analysis. The business scenario criterion can lead to separate scheduling, but most of the time it is used for administration purposes. This means that you might have different support teams and different support organizations for different business scenarios.

The next criterion is the data type you load, that is, whether you are loading master data or transaction data. This criterion does not automatically lead to separately scheduled process chains, but it will very often be used for the creation of sub chains: one sub chain for master data, another one for transaction data.

So the next criterion, on slide 14, is source systems. The type of source system might sometimes also lead to separate scheduling…availability of source systems or data in the source systems will be different. So that will be used for creating sub chains.

And the last criterion applies to BW landscapes where we find multiple BWs.

As a final conclusion, you can see that process chain design and the distribution of processes to process chains will usually depend heavily on data-load processes. Whenever you have something to schedule that does not depend on any data load, that is, a process chain that does not need to include a data load, you might schedule this group of processes totally independently from all other process chains that rely on data loads.
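The distribution criteria can be illustrated with a small grouping sketch in Python. It applies only the first criterion (frequency, which leads to separately scheduled chains) and the data type criterion (which leads to sub chains); the process names are hypothetical, and the other criteria are left out to keep the example short:

```python
from collections import defaultdict

# (process name, frequency, data type): hypothetical entries
processes = [
    ("load customer attributes", "daily", "master data"),
    ("load sales orders", "daily", "transaction data"),
    ("load customer texts", "daily", "master data"),
    ("load headcount", "monthly", "transaction data"),
]

# frequency -> data type -> process names
chains = defaultdict(lambda: defaultdict(list))
for name, frequency, data_type in processes:
    chains[frequency][data_type].append(name)

for frequency, sub_chains in chains.items():
    print(f"{frequency} chain")
    for data_type, names in sub_chains.items():
        print(f"  sub chain for {data_type}: {names}")
```

Grouping by frequency first reflects the priority order of the criteria: processes with different frequencies end up in different chains before any sub chains are formed.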

The next very, very important topic is this: once you can see how your processes will be grouped into process chains, you have to make sure that the complexity of a single chain is still reasonable. You should not include all the processes you have in a single process chain, because with over 100 or 200 processes in one chain, administration of this chain will be very difficult.

So, to keep the chain and its administration simple, you should think of splitting up this very big chain into several sub chains. The benefits you will then have are that you can better visualize dependencies and lock situations, and that administration will be simplified.

To do this, as you can see, you can use the process type “local process chain” which you can find in the process chain maintenance transaction RSPC.

When doing this, you create a so-called process chain hierarchy. My recommendation on process chain hierarchies is that you should not have more than two to three hierarchy levels in one hierarchy.
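The recommendation on hierarchy levels can be checked with a short recursive count. This is a sketch in Python under the assumption that a chain is described as a nested dictionary with an optional `sub_chains` list; that representation and the chain names are hypothetical and have nothing to do with how BW stores chains:

```python
def hierarchy_levels(chain: dict) -> int:
    """Count hierarchy levels; a chain without sub chains has one level."""
    sub_chains = chain.get("sub_chains", [])
    return 1 + max((hierarchy_levels(sub) for sub in sub_chains), default=0)


# Hypothetical hierarchy: a main chain with two sub chains, one of them nested.
nightly = {
    "name": "NIGHTLY",
    "sub_chains": [
        {"name": "MASTER_DATA", "sub_chains": []},
        {"name": "TRANSACTION_DATA", "sub_chains": [{"name": "SALES"}]},
    ],
}

levels = hierarchy_levels(nightly)
print(levels, "levels;", "ok" if levels <= 3 else "consider flattening")
```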

Now we come to slide number 19, to the general modeling principles. As you can see, there are five basic principles I will mention. On the one hand, there are technical considerations like maximum performance and maximum stability, and then on the business side, fulfilling all my requirements and doing so at as little cost as possible.

So now we are on page 20. The first topic I want to talk about is how to avoid unwanted dependencies in a process chain. This is about how to improve the termination security of a process chain.

The first example you can find there is on modeling the links that connect processes to each other in a process chain. The color of these links should always be based on the priority of the individual process.

An example is that in some cases the failure of a single text load should not stop the whole process chain. Therefore, in the good example, on the right side, you see that the link between the text load and the attribute change run is modeled with green/red. This means the attribute change run is executed in all cases, even if the text load failed.
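The effect of the link color can be sketched as a small decision function in Python. It assumes that a green link means "continue on success" and a red link means "continue on failure", which is what the green/red example implies; the function name and the string values are hypothetical:

```python
def successor_runs(link: str, predecessor_ok: bool) -> bool:
    """Decide whether the next process starts, given the link color."""
    if link == "green":      # assumed: continue only on success
        return predecessor_ok
    if link == "red":        # assumed: continue only on failure
        return not predecessor_ok
    if link == "green/red":  # continue in all cases
        return True
    raise ValueError(f"unknown link type: {link}")


text_load_ok = False  # the text load failed
print(successor_runs("green", text_load_ok))      # False: the chain stops here
print(successor_runs("green/red", text_load_ok))  # True: the change run still starts
```

With a plain green link the failed text load would block the attribute change run, which is the unwanted dependency the slide warns about.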

In order to use this green/red link, you should, as a prerequisite, avoid the undefined yellow status on processes by using two customizing settings.

The first is that you customize the traffic light colors in BW (customizing transaction SPRO). The second customizing setting that might help you is the polling flag.