BO FAQ
BUSINESS OBJECTS
Frequently Asked Questions
Prepared by Raj
Date: July 12th, 2011
Version: BO XI 3.x
Note: This material was prepared by referring to many websites, blogs of many BO gurus and material from various sources. I would like to thank all of them for their contribution. I have gathered it here for the sole purpose of helping BO newbies and anyone attending interviews to have a quick glance at the subject.
Chapter 1: Data warehousing & BO Concepts
1)What is Data warehouse?
Ans. A Data warehouse is a Subject-oriented, Integrated, Time-variant and Non-volatile collection of data in support of management's decision making process.
Subject-Oriented: A data warehouse can be used to analyze a particular subject area. For example, "sales" can be a particular subject.
Integrated: A data warehouse integrates data from multiple data sources (transactional systems - OLTP). For example, source A and source B may have different ways of identifying a product, but in a data warehouse, there will be only a single way of identifying a product.
Time-Variant: Historical data is kept in a data warehouse. For example, one can retrieve data from 3 months, 6 months, 12 months, or even older data from a data warehouse.
Non-volatile: Once data is in the data warehouse, it will not change. Only inserts are done to the data warehouse, no updates.
So, historical data in a data warehouse should never be altered.
2)What is Data mart?
Ans. Data mart is a subset of a Data warehouse.
3)What is Rapid Mart?
Ans. Rapid Marts are pre-packaged data marts for SAP, Oracle, PeopleSoft and Siebel applications that accelerate the delivery of analytical data.
4)What is the difference between OLTP and OLAP?
Ans. OLTP stands for On-Line Transaction Processing. It deals with day-to-day transactions, stores the current data, is normalized because updates are very frequent, and each transaction deals with a small amount of data.
OLAP stands for On-Line Analytical Processing. It stores historical data sourced from OLTP systems, the database is de-normalized because frequent updates do not happen, and it deals with bulk amounts of data to support trend analysis and future predictions.
5)What are the types of Dimensions?
Ans. The types of dimensions are:
- Conformed Dimension
- Junk Dimension
- De-generate Dimension
- Role-playing Dimension
6)What is Conformed Dimension?
Ans. A dimension which is shared by all fact tables or shared across different data marts is called a Conformed dimension.
Example: The date dimension table connected to the sales facts is identical to the date dimension connected to the inventory facts.
7)What is Junk Dimension?
Ans. It is a dimension table consisting of attributes that do not belong to the fact table or to any of the existing dimension tables. These attributes are usually text or flags with yes/no or true/false indicators.
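As a hedged illustration (the table and column names are assumptions), the low-cardinality flags are removed from the fact table and collected into one small junk dimension keyed by a surrogate key:
-- Hypothetical junk dimension collecting unrelated flags (illustrative names)
CREATE TABLE dim_order_junk (
    order_junk_key   INT PRIMARY KEY,   -- surrogate key referenced by the fact table
    payment_type     VARCHAR(10),       -- e.g. 'CASH', 'CARD'
    is_gift_wrapped  CHAR(1),           -- 'Y' / 'N'
    is_expedited     CHAR(1)            -- 'Y' / 'N'
);
-- The fact table then carries a single foreign key (order_junk_key)
-- instead of several separate text/flag columns.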
8)What is Degenerate Dimension?
Ans. It is a fact table primary key and represents the unique identifier of the parent. It has no attributes and doesn’t join to an actual dimension table. Example: Invoice/Tran number.
9)What are Degenerated Objects?
Ans. Objects created using SQL queries or stored procedures are called Degenerated Objects.
10)What is Role playing Dimension?
Ans. Dimensions which are used in multiple applications within the same database. For example a "Date" dimension can be used for "Date of Sale", as well as "Date of Delivery", or "Date of Hire". This is often referred to as a "role-playing dimension".
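In SQL this is usually implemented by joining the same physical dimension table more than once under different aliases (in a universe, via alias tables). A minimal sketch, with assumed table and column names:
-- One physical date dimension playing two roles via aliases (illustrative names)
SELECT f.order_id,
       order_date.calendar_date    AS date_of_sale,
       delivery_date.calendar_date AS date_of_delivery
FROM   sales_fact f
JOIN   dim_date order_date    ON f.order_date_key    = order_date.date_key
JOIN   dim_date delivery_date ON f.delivery_date_key = delivery_date.date_key;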
11)What is Casual Dimension?
Ans. A dimension which does not change the fundamental grain of the fact table is called a casual dimension.
Example: Gender - Male, Female.
12)What are slowly changing dimensions?
Ans. Slowly Changing Dimensions are dimensions whose key value remains static but whose description might change over a period of time. For example, a product ID might remain the same while the product description changes from time to time.
13)What are the types of slowly changing dimension (SCD)?
Ans. There are 3 types of SCD.
Type 1: History of the dimension is not stored; the old value is simply overwritten, so no trace of the old record exists.
Type 2: A new record is added to the dimension table whenever an attribute of the dimension changes, so full history is maintained (see the sketch after this list).
Type 3: The original record is modified by adding a column that holds the previous attribute value alongside the current one, so partial history is maintained.
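A minimal SQL sketch of Type 2 handling, assuming a customer dimension with a surrogate key and current-flag/date columns (all names are illustrative assumptions): the current row is expired and a new row with a new surrogate key is inserted, so both versions are kept.
-- SCD Type 2: keep history by expiring the old row and inserting a new one
UPDATE dim_customer
SET    end_date   = CURRENT_DATE,
       is_current = 'N'
WHERE  customer_id = 1001
AND    is_current  = 'Y';

INSERT INTO dim_customer
       (customer_key, customer_id, city,      start_date,   end_date, is_current)
VALUES (98765,        1001,        'Chicago', CURRENT_DATE, NULL,     'Y');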
14)What is MOLAP? (Multidimensional)
Ans. In MOLAP, data is stored in a multidimensional cube. Data can be retrieved fast, slicing and dicing operations are optimal, and complex calculations can be performed, but only a limited amount of data can be handled.
15)What is ROLAP? (Relational)
Ans. In ROLAP, data is stored in a relational database. It can handle large amounts of data but is limited by SQL functionality, and performance can be slow.
16)What is HOLAP?
Ans. It is the combination of MOLAP and ROLAP. For summary type information, it uses the cube technology for fast performance and when detail information is needed, it can drill through from the cube into underlying relational data.
17)What is Dimension Modeling?
Ans. Dimensional modeling is a logical design technique to present the data in a standard framework that allows high-performance access. It is inherently dimensional and uses the relational model with some restrictions.
Every dimensional model is composed of one table with a multipart key called the fact table and a set of smaller tables called dimension tables. Each dimension table has a single-part primary key that corresponds exactly to one of the components of the multipart key in the fact table.
18)What is Fact table?
Ans. It is a table which contains two types of columns: columns that contain numeric facts (measurements) and columns that hold foreign keys to the dimension tables.
A fact table contains either detail-level facts or facts that have been aggregated (Summary tables).
19)What is Dimension table?
Ans. It is a table which contains the descriptive attributes that give context to the measures in a fact table.
A foreign key of a fact table references the primary key in a dimension table in a many-to-one relationship.
20)What are the different measure (fact) types?
Ans.
Additive: Measures that can be added across all dimensions.
Example: Sales.
Semi-Additive: Measures that can be added across some dimensions and not across others.
Example: Inventory level, which can be summed across stores but not across time, because a level is a point-in-time snapshot and adding snapshots over time is meaningless.
Non-Additive: Measures that cannot be added across any dimension.
Example: Average, since an average of averages is not the overall average.
21)What is ODS(Operational Data Store)?
Ans. An operational data store (ODS) is an integrated database whose sources include legacy systems, and it contains current or near-term data, meaning the data is not static. An ODS may contain 30 to 60 days of information, while a data warehouse typically contains years of data and the data is static.
22)What is a Star schema?
Ans. A typical star schema has completely de-normalized dimension and fact tables whose Entity-Relationship (ER) diagram looks like a star.
Dimension tables have primary keys, and the fact table has foreign keys referencing the dimension table primary keys.
A star schema can have any number of dimension tables. The crow's feet at the end of the links connecting the tables indicate a many-to-one relationship between the fact table and each dimension table.
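To illustrate the shape of a star-schema query (the table and column names below are assumptions used only for illustration), the fact table is joined directly to each dimension on its surrogate key and the measures are aggregated:
-- Star join: fact table joined directly to each dimension (illustrative names)
SELECT d.calendar_year,
       p.product_name,
       SUM(f.sales_amount) AS total_sales
FROM   sales_fact  f
JOIN   dim_date    d ON f.date_key    = d.date_key
JOIN   dim_product p ON f.product_key = p.product_key
GROUP BY d.calendar_year, p.product_name;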
23)What is a Snowflake schema?
Ans. In a snowflake schema one or more dimension tables are partially or completely normalized.
A snowflake schema can have any number of dimensions and each dimension can have any number of levels.
24)What is the difference between Star and Snow flake schema?
Ans. STAR SCHEMA:
- De-normalized data structure
- Single dimension table per category
- More data dependency and redundancy
- No need for complicated joins
- Faster query results
- No parent tables
- Simple DB structure
SNOWFLAKE SCHEMA:
- Normalized data structure
- Dimension tables split into many pieces
- Less data dependency and no redundancy
- Complicated joins
- Some delay in query processing
- May contain parent tables
- Complicated DB structure
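To make the "complicated joins" point concrete, here is a hedged sketch (table and column names are assumptions): in a snowflake the product dimension is normalized into a separate category table, so the same query needs one more join than it would in a star schema.
-- Snowflake: the product dimension is normalized, adding an extra join
SELECT c.category_name,
       SUM(f.sales_amount) AS total_sales
FROM   sales_fact   f
JOIN   dim_product  p ON f.product_key  = p.product_key
JOIN   dim_category c ON p.category_key = c.category_key   -- extra join vs. star
GROUP BY c.category_name;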
25)What is the use of AFD? Where can it be stored?
Ans. AFD stands for Automatic File Distributor. It is used to create dashboards. It can be stored in repository, corporate or personal.
26)What is Data Integrator (BODI)?
Ans. Data Integrator is a data movement and integration tool and has the capability of Extracting, Transforming and Loading data from multiple sources into a target database or data warehouse.
27)What is Data Federator?
Ans. It provides federated query capabilities that can accelerate deployment and improve performance. It creates a virtual view of all data sources, allows a single BO universe or Crystal Report to map to multiple sources, and optimally federates (integrates) queries against the individual sources directly.
28)Are data mart and data warehouse normalized or de-normalized? Can both exist in same tier?
Ans. Data in a data mart and a data warehouse is de-normalized and optimized for OLAP systems. As a data mart is a subset of a data warehouse, both can exist in the same tier, and a data mart supports a particular business unit.
29)Can you explain BO architecture?
Ans. BO has a 5-tier architecture consisting of:
- Client tier
- Application tier
- Intelligence tier
- Processing tier
- Data tier
Client tier: There are two types of clients.
a)Windows based clients – These are also called Thick clients and need to be installed on your machine to work on them.
Example: Central Configuration Manager, Designer, Web Intelligence Rich Client, Desktop Intelligence, Report Conversion Tool, etc.
b)Web based clients – These are also called as Thin clients and are accessed by any web browser.
Example: Infoview, Web Intelligence, Central Management Console (CMC).
Application tier: The BO web applications are hosted on either a Java application server or a .NET application server.
Intelligence tier: It has four servers.
a)Central Management Server - It is the primary server which provides services for all other servers in the BI platform, including management of
Security
Objects
Servers
Auditing
b)Event Server - It is responsible for monitoring file-based events and notifying the CMS of any events that occur.
c)Input/Output File Repository Servers - They are responsible for the creation of file system objects, such as exported reports and imported files in non-native formats.
d)Cache Server - It stores previously processed report view requests and avoids accessing the database each time a report is requested, which accelerates viewing performance and reduces network traffic.
Processing tier: It is the only tier that directly interacts with the reporting database; it accesses the data tier and generates reports for clients. The main servers include:
Job servers
Processing servers
Report Application server
Multi-Dimensional Analysis server
Dashboard and Analytics servers
Data tier: It is made up of the data sources that contain the information used in the reports and documents managed by BO enterprise system, and supports wide range of corporate databases.
The Connection Server is responsible for handling connections and interaction with the various data sources. It supports relational databases (Oracle, MySQL, Microsoft SQL Server, DB2, Sybase) as well as OLAP sources (SAP BW, Microsoft Analysis Services, Hyperion Essbase).
30)What are the different authentication modes you know?
Ans. Version 3.0 supports the authentication modes below:
- Enterprise
- Windows NT
- Windows AD
- LDAP (Lightweight Directory Access Protocol), which will be mapped in the CMC.
31)What is Semantic layer?
Ans. It is a business representation layer between the database and the reports. The semantic layer in BO is called the Universe.
Business rules are defined once in the universe and these rules generate the SQL, so if two users ask for the same information, both users get the same result.
It makes users independent of the underlying technology: just by dragging objects, the query is built or modified.
It is an extra layer and is not free. It has to be created, maintained and managed, and it must be kept in sync with any database changes that occur.
A universe can connect to only one database at a time.
32)What is meant by ZABO?
Ans. ZABO stands for Zero Administration Business Objects. It uses the hardware resources of the client machine for application-level processing and communicates through the browser's protocol to use server resources for processing external requests (such as running a database query or creating a data provider to access data).
Chapter2: Designer
1)What are the differences between BO XI R2 and 3.0?
Ans. BO XI 3.0 supports the following features which are not supported in XI R2. These are:
Smart measure: A measure is called a smart measure in the universe when its data is aggregated in a way not supported by WEBI. A smart measure has its projection function set to “Database delegated” on the properties tab of the object properties.
Optional prompts: In WEBI you can ignore an optional prompt by not specifying any value, and the prompt then does not apply a filter on the data.
ForceMerge function: With this WEBI function you can calculate measures for merged dimensions and is similar to Multicube function in DESKI.
Data tracking: WEBI can track data changes by highlighting the changed data and displaying the previous value of a dimension or measure along with its current value.
It highlights the changed data according to parameters you set. With the Auto-update option, the current data becomes the reference data after each refresh; with the “use the current data as reference data” option, the report always shows the difference between the most recent data and this fixed reference data.
Multilingual support: From the same report, you can decide in which available language you want to view the report’s metadata and prompts. Using Translation manager you can add translations to the universe on which the report is created or translate the report itself. Once the document is translated, it is published and available to users in their preferred viewing language.
Note: Data that comes from the database source cannot be translated.
Document linking: Using WEBI you can create links to other WEBI documents, to web sites, or to any resource accessible from a WEBI report using a simple user interface. From WEBI you can even pass data to the linked resource.
Stored Procedures: In Designer, you can create a universe based on one or more existing stored procedures.
2)How can you optimize a Universe?
Ans. Query time can be shortened by optimizing a universe. There are several ways to optimize a universe:
- Optimizing the Array fetch parameter in the Universe parameters
- Allocating weight to each table
- Using shortcut joins
- Creating and using aggregate tables in database
Optimizing the Array fetch parameter
The array fetch parameter allows you to set the maximum number of rows permitted in a fetch procedure and determines the packet size on the network. For example, if the array fetch size is set to 20 and you plan to retrieve 40 rows, then two fetches will be executed to retrieve the data.
Some data sources do not allow the fetch size to be modified; in such cases all rows are returned in a single fetch. If you want to retrieve binary large objects (BLOBs), set the fetch size to 1.
If your network allows you to send a large array fetch, you can set a larger value. This speeds up the fetch procedure and reduces query processing time. Values can be set from 1 to 999.
Allocating table weights
Table weight is a measure of how many rows a table contains. By default BO sorts the tables from lighter to heavier, and this determines the table order in the FROM clause of the SQL statement.
If you are using an Oracle database, you can optimize the SQL by reversing the order in which BO sorts the tables. To do this, modify the oracle.prm file by changing the parameter REVERSE_TABLE_WEIGHT from ‘Y’ to ‘N’ (default: Y). This forces BO to sort the tables from those with the most rows to those with the fewest rows.
Path of oracle.PRM file => <INSTDIR>\dataAccess\RDBMS\connectionServer\oracle\oracle.prm
Restart Designer to apply the changes to the file.
In Designer you can manually change the number of rows for any table.
To view the row counts, select View menu => Number of rows in tables. To modify them:
- Open a universe in designer, Right click the relevant table.
- Select number of rows in table from contextual menu. A dialog box appears.
- Select the Modify manually tables row count radio button. A text box appears.
- Type a number in the text box. This is the number of rows that you want to use for the table.
- Click ok and save the universe.
Using shortcut joins
A shortcut join is an alternative path between two tables. Shortcut joins are used to reduce the number of tables used in a query, as shown in the example below.
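For example, assuming an orders fact table that also carries a country_id key directly (all table and column names here are illustrative assumptions), a shortcut join defined on that key lets the generated SQL skip the customer table when only country-level results are needed:
-- Normal path: orders -> customer -> country (two joins)
SELECT co.country_name, SUM(o.order_amount) AS total_amount
FROM   orders o
JOIN   customer cu ON o.customer_id = cu.customer_id
JOIN   country  co ON cu.country_id = co.country_id
GROUP BY co.country_name;

-- With a shortcut join defined on orders.country_id = country.country_id,
-- the same result needs one join and one fewer table:
SELECT co.country_name, SUM(o.order_amount) AS total_amount
FROM   orders o
JOIN   country co ON o.country_id = co.country_id
GROUP BY co.country_name;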
Using Aggregate Awareness
The ability of a universe to make use of Aggregate tables to optimize SQL query is called Aggregate Awareness. These are the tables which contain pre-calculated data.
You can use @Aggregare_Aware function in the select statement for an object which directs SQL query to run against Aggregate tables rather than on base tables.
Using Aggregate tables speeds up the execution of query and thus improves the performance.
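As a hedged illustration of how the SELECT of a measure object can be written (the table and column names are assumptions, not from a real universe), the arguments to @Aggregate_Aware are listed from the most aggregated table down to the base fact table, and Designer uses the first definition that is compatible with the objects in the query:
-- Universe object SELECT (illustrative table/column names)
@Aggregate_Aware(
    sum(AGG_YEARLY_SALES.revenue),    -- most aggregated table first
    sum(AGG_MONTHLY_SALES.revenue),
    sum(SALES_FACT.revenue)           -- base (detail) fact table last
)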
Setting up aggregate awareness
- Build the Objects
1. Identify all the possible definitions (table/column combinations) of the objects.
2. Arrange the objects by level of aggregation.
3. Build the objects using the @Aggregate_Aware function.
- Specify the incompatible objects (Tools > Aggregate Navigation)
1. Build an objects/aggregate tables matrix.