Wednesday, January 4, 2012

Replacing complex LIKE with REGEXP_LIKE in Oracle 10g

Oracle Database 10g offers four regular expression functions. You can use these equally in your SQL and PL/SQL statements. Among these, REGEXP_LIKE is the most useful. Like the LIKE operator, it returns a boolean result, so it can easily be used in WHERE and HAVING clauses. The signature of this function is -

REGEXP_LIKE(source_string, pattern [, match_parameter])

The third parameter is optional and rarely used.

To know more about this function, visit http://www.oracle-developer.net/. A recommended site for regular expressions in general is http://www.regular-expressions.info

Here I will show how it makes queries simpler and faster -

'^(880|88018)?7777[0-9]{0,2}$' - this regular expression matches strings that start with 880 or 88018 (or neither), followed by 7777, and then end immediately or after at most two more digits. So it matches the following strings -

8807777, 880187777, 7777, 77776, 777758.

So, if you want to query a database table with this filter on a column, you can write the following query -

select * from table where regexp_like(colA, '^(880|88018)?7777[0-9]{0,2}$')

or

select * from table where colA = '7777' or colA='8807777' or colA='880187777'
or colA like '7777_' or colA like '8807777_' or colA like '880187777_'
or colA like '7777__' or colA like '8807777__' or colA like '880187777__'

Thus the regexp_like() function makes the query simple, comprehensive, and easy to manage. In some cases you will find it nearly impossible to express with LIKE a filter that can easily be written using regexp_like().
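As a quick sanity check, the pattern can be tested against sample values straight from SQL; the sample strings and the use of SYS.ODCIVARCHAR2LIST here are purely for illustration:

-- only the values that match the pattern are returned
select column_value as sample_value
from   table(sys.odcivarchar2list(
         '8807777', '880187777', '7777', '77776', '777758',   -- should match
         '8807',    '88077771234'))                           -- should not match
where  regexp_like(column_value, '^(880|88018)?7777[0-9]{0,2}$');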

Tuesday, October 4, 2011

High Level Steps of Creating a Dashboard in BO XI R2

Here are the steps -

1. Build a metric universe & export it to Performance Management
  • This universe need not have any calendar.
  • All the measures must carry a date restriction, i.e., in the WHERE clause the event date should be bounded between the start date and the end date (a fuller sketch follows these steps).
    Example - DATE_ID between @Prompt('Begin_Date','D',,mono,free)
    and @Prompt('End_Date','D',,mono,free)
  • The SELECT clause should not contain any aggregate function such as SUM or AVG.
  • Import the universe into Performance Management -
    Performance management>Setup>System Setup>Universes>Add Universe
  • Select the time dimension you require from the time configuration -
    Use Performance management>Setup>Time Config
2. Create/Edit Metrics
  • Create the Metrics -
    Performance management>Dashboard Manager>Metrics>
  • Select proper calculation interval
  • Check "Stop at current period" to stop calculation till today. This helps you to see the dashboard till today.
  • After creating a metric, don't forget to refresh it. Otherwise, you will not be able to see the data.
  • Use purge and refresh after editing the calculation period.
3. Create Metric Analytics
  • Create the analytics -
    Performance management>Dashboard Manager>Metric Analytics>
  • Use "Save as" to save it in proper folder
  • Use proper render style and graph properties
4. Create a dashboard
  • Add a dashboard
  • Save the dashboard in the proper directory
5. Place the Metric Reports on dashboard
  • Open the dashboard from its directory.
  • Select Customize, then pick the analytics that you want to see on your dashboard
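For reference, here is a minimal sketch of how a measure in such a metric universe might be defined; the fact table and column names (FACT_CALLS, CALL_COUNT, DATE_ID) are hypothetical:

-- SELECT expression of the measure: a plain column, no SUM/AVG wrapper
FACT_CALLS.CALL_COUNT

-- WHERE expression of the measure: the event date bounded by the two prompts
FACT_CALLS.DATE_ID between @Prompt('Begin_Date','D',,mono,free)
                       and @Prompt('End_Date','D',,mono,free)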

Monday, September 5, 2011

A Solution for File-based Data Collection in Exadata

When I used Oracle Exadata in 2010, it was the first installation of this awesome product in South Asia. The biggest problem we faced during the Data Warehouse implementation on this machine was that Oracle forbade us to collect file-based data directly on Exadata. This is because -
  • There is no space available on the local disks of the RAC nodes, and using them may cause problems such as running out of space or endangering instance high availability.
  • Exadata's external storage is managed by Automatic Storage Management (ASM); i.e., from Oracle Enterprise Linux (OEL) it appears as raw disks managed by the database engine, so this storage cannot be accessed as a conventional file system.
This problem can be solved in the following ways -
  1. Read an external table using a DB link.
  2. Store the data in a database table using SQL*Loader, then read the table using a DB link. The problem with this solution and the previous one is that you cannot use Exadata's parallel processing.
  3. Use Database File System (DBFS). This is the ultimate solution, because external tables can then be accessed in parallel, so the extreme performance of Exadata can be utilized.
Configuring DBFS is really part of the system & DB admin job: they configure the DBFS server on Exadata and the client on the staging-area OS. We, the ETL team, mainly use the DBFS client. Here are the steps -
  1. First, run the mount script on the client, which connects to the DBFS service (not the database service) on Exadata. This script actually runs the DBFS client with the following parameters -

    dbfs_client dbfs_user/dbfs_pass@SRV_DBFS -o allow_root -o direct_io /ORABASE/DBFS_DIR

  2. After executing the above script, the mount point of the DBFS server becomes available under the mount point of the DBFS client. Note that this mount point can have a different name on the DBFS server. As an example, in our case,

    In DBFS server the mount point is /u01/app/DBFS_DIR/
    In DBFS client the mount point is /ORABASE/DBFS_DIR/

    So, if you now create a directory on the DBFS server under the server mount point, the directory becomes available under the same name under the DBFS client mount point. The system & DB admin will ensure that both server and client are in the same user group so that no privilege-related problem appears when accessing the directories within the DBFS mount point.

  3. One piece of advice - collect the files on the client's own local file system first, then copy them to the DBFS client mount point. That gives optimal performance.

  4. The rest of the work happens inside the Exadata server. Create an Oracle directory on an OS directory under the DBFS server mount point; this directory is visible to OEL as long as the DBFS server is up and running.

  5. Create an external table in the Exadata database server using the above database directory.
Thus you can create and use external tables within Exadata and benefit from their power.
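Putting steps 4 and 5 together, a minimal sketch might look like the following; the directory path, table name, columns, and file name are assumptions for illustration:

-- step 4: an Oracle directory on an OS path under the DBFS server mount point
create or replace directory dbfs_stage as '/u01/app/DBFS_DIR/stage';

-- step 5: an external table over a file in that directory; the PARALLEL
-- clause is what lets Exadata scan the file with multiple slaves
create table ext_cdr (
  msisdn   varchar2(15),
  duration number
)
organization external (
  type oracle_loader
  default directory dbfs_stage
  access parameters (
    records delimited by newline
    fields terminated by ','
  )
  location ('cdr_20100901.csv')
)
parallel
reject limit unlimited;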

Wednesday, June 15, 2011

Calculation Context : Ultimate Solution for Complex Queries in Business Objects

Calculation contexts give you more control over how a formula or measure is evaluated. To understand calculation contexts, you need to be familiar with basic report concepts.

A report contains two kinds of objects:

  • dimensions - which are types of data about your business that can have measures associated with them (for example: products, years, states)

  • measures - that you can calculate in relation to dimensions (for example: sales revenue, number of sales).

For example, a report could show sales revenue (a measure) by year (a dimension). In this case the sales revenue results are calculated by year.

The important thing to remember about measures is that they are semantically dynamic. This means that the results returned by a measure depend on the dimension(s) with which the measure is associated. In other words, the results returned by a measure depend on the context in which the measure is placed.

Web Intelligence calculates measures according to default contexts depending on where the measures appear in a report. However, you can change these default contexts; this is what is meant by defining the calculation context. You define the calculation context by including context operators and context keywords (Report, Section, Break, Block and Body) in your formulas.

A calculation has an input context and an output context. You can specify neither, either or both explicitly. The input context is the set of dimensions used to make the calculation. The output context is a set of dimensions that functions like a break in the calculation - it causes the calculation to output values as if they are calculated in the footer of a table break (or an SQL COMPUTE BY clause).

Example

The final column in the table below is calculated as follows:

  • calculation - minimum
  • input context - Year, Quarter
  • output context - Year

This combination of input and output contexts tells Web Intelligence to sum the revenue for each year/quarter combination (the dimensions used to make the calculation), then output the minimum such value in each year (the value that appears in the break footer if the table contains a break on Year and a minimum calculation in the break footer).

Year   Quarter   Product     Revenue   Minimum
2005   Q1        Product 1   10000     41000
2005   Q1        Product 2   15000     41000
2005   Q1        Product 3   16000     41000
2005   Q2        Product 1   17000     41000
2005   Q2        Product 2   17000     41000
2005   Q2        Product 3   16000     41000
2006   Q1        Product 1   14000     33000
2006   Q1        Product 2   15000     33000
2006   Q1        Product 3   16000     33000
2006   Q2        Product 1   12000     33000
2006   Q2        Product 2   11000     33000
2006   Q2        Product 3   10000     33000

This is how the calculation looks when made explicit in a table:

Year   Quarter   Revenue
2005   Q1        41000
2005   Q2        50000
       Min:      41000
2006   Q1        45000
2006   Q2        33000
       Min:      33000

In Web Intelligence formula syntax, this can be expressed as either

Min ([Revenue] In ([Year];[Quarter])) In ([Year])

or

Min ([Revenue] ForAll ([Product])) In ([Year])

The input context appears inside the function parentheses and the output context appears after the function parentheses. The set of dimensions in each context is itself enclosed in parentheses and the dimensions separated by semi-colons.

You use the context operators In, ForEach and ForAll to set up input and output contexts. You can also use context keywords to specify the data referenced in the contexts.
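For readers who think in SQL, roughly the same input/output context behaviour can be sketched with grouped and analytic aggregates; the table and column names below are hypothetical:

-- input context (Year;Quarter): aggregate revenue per year/quarter
-- output context (Year): take the minimum of those sums within each year
select year,
       quarter,
       sum(revenue)                               as quarterly_revenue,
       min(sum(revenue)) over (partition by year) as yearly_minimum
from   revenue_fact
group  by year, quarter
order  by year, quarter;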

For more information, please download the manual from help.sap.com and read chapter 3 - Understanding calculation contexts.

Sunday, May 29, 2011

Pagination in Data View : How to Emulate MySQL's LIMIT in Oracle

The solution provided here is probably not new to many. I have tried to combine the best solutions I know into one post.

Those who are acquainted with MySQL surely know about its LIMIT keyword. Oracle does not have a LIMIT keyword. But you can build better pagination for a data view using two features of Oracle - ROWNUM and REF CURSOR.

create or replace procedure paging(
    p_pagenum  in number,
    p_pagesize in number,
    p_cursor   out sys_refcursor
) is
begin
  open p_cursor for
    select data
    from (
           -- ROWNUM is assigned as rows are returned, so it must be
           -- captured here, after the inner ORDER BY has been applied
           select rownum rnum, a.*
           from (
                  select object_name data
                  from   tbl_source
                  order  by created
                ) a
           where rownum <= p_pagesize * p_pagenum       -- upper bound of the page
         )
    where rnum > (p_pagesize * p_pagenum) - p_pagesize  -- lower bound of the page
    order by rnum;
end paging;

This procedure is sufficient for pagination. It is somewhat complex because ROWNUM is a pseudo-column that is assigned before sorting, so it must be captured in a subquery after the inner ORDER BY. From the SELECT statement it is clear that loading the first page will be the fastest and loading the last page the slowest, because for the last page all preceding rows have to be fetched.

Next comes the REF CURSOR. Like other cursors, it is a handle on the source data. Instead of fetching all the data at once, it fetches rows one by one and provides streaming. But unlike other cursors, a REF CURSOR can be consumed by other PL/SQL code or by high-level languages like C# or Java.
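For example, a PL/SQL consumer of the procedure above might look like this sketch (it assumes the procedure is compiled against an existing TBL_SOURCE and that server output is enabled):

declare
  l_cur  sys_refcursor;
  l_row  varchar2(4000);
begin
  -- fetch the second page, ten rows per page
  paging(p_pagenum => 2, p_pagesize => 10, p_cursor => l_cur);
  loop
    fetch l_cur into l_row;
    exit when l_cur%notfound;
    dbms_output.put_line(l_row);
  end loop;
  close l_cur;
end;
/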

See also - http://www.oradev.com/ref_cursor.jsp and http://support.microsoft.com/kb/322160

If you use only ROWNUM and skip the REF CURSOR, you will still get the performance benefit. In that case, you have to write the inline SQL in your Java or C# code.

Tuesday, March 29, 2011

Oracle External Table with strong data type & datetime mask

Those who are familiar with Oracle data warehousing and ETL must be familiar with Oracle external tables. An external table is actually nothing but file content exposed as an Oracle table through SQL*Loader-style parsing.

I have seen some developers who always use external tables without strong data types, i.e., they keep the data types of all columns as VARCHAR/VARCHAR2. So even when a field should hold datetime or number data, they declare it as VARCHAR2 and do the type casting in stored procedures or views. This is certainly a bad practice, for the following reasons -

  1. Explicit type casting is slower than imposing the type at parse (load) time.

  2. If data quality is not good, it will be horrible to use VARCHAR2 as the type of every column. With strongly typed columns, the rows in the file that do not meet the type constraints go to the bad file, and the log file tells you why those records were discarded. With weakly typed columns, you get lost during type casting: the mismatched records are loaded into the external table instead of the bad file, so when type casting in a procedure or view raises an exception, you cannot identify which column and row of the file caused the problem.
Using strong data types in an external table is not at all a difficult task. You can easily do it by specifying the type in the external table definition. Your file access parameters can stay the same; you just need to change the column type in the external table. But there is one exception, and that is the date type column.

For a date type column, you need to change the access parameters as well. It is not at all complex; rather, it is very easy. Just instead of using char as the field type in the file access parameters, use date and mention the date mask. As an example, to load date type data into a date type column of an external table we can use -

EVENT_TIME date 'YYYYMMDDHH24MISS'
instead of
EVENT_TIME char
where the sample data in this field is - 20110522180652
As you can see in the example, the date mask takes the same pattern mask as the to_date() function. For more detail you can visit download.oracle.com and forums.oracle.com.
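To put the fragment in context, here is a minimal external table definition using such a mask; the table name, directory, field delimiter, and file name are assumptions for illustration:

create table ext_events (
  event_time date,
  event_code number
)
organization external (
  type oracle_loader
  default directory data_dir               -- hypothetical Oracle directory
  access parameters (
    records delimited by newline
    fields terminated by '|'
    (
      event_time date 'YYYYMMDDHH24MISS',  -- strong date type with mask
      event_code char                      -- char fields still load into number columns
    )
  )
  location ('events_20110522.dat')
)
reject limit unlimited;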

Friday, March 4, 2011

Task scheduler in Business Objects XI

Task scheduling in BO (or any other BI tool) is very significant. By scheduling and viewing instances (an instance is a version of the object or report that contains data from the time the object was run), you can ensure that objects have the most up-to-date information available for viewing, printing, and distributing. Its purposes can be -
  1. regular (or periodic) report refreshing, so that users face no delay when they view a report or dashboard.
  2. checking the data integrity of the data mart on a regular basis without running the SQL manually.
  3. sending emails with the report attached on a regular basis.
Here are the steps:-
  • Configure your job server from -
    administration launch pad > central management console > servers > respective server (e.g. Web_IntelligenceJobServer).
    Select the proper destination. For example, if you want to use email, you have to configure the mail template and the SMTP account.
  • Go to InfoView, find your report, and click the schedule hyperlink. Then set the schedule time (daily, monthly, ...) and the destination (default location, inbox, or an email group configured from the job server).
  • You have to make sure that proper OS authentication is configured. You can check it from -
    Performance Management > Set up > Parameters > Scheduler Parameters > Scheduler Parameter.
  • You can also troubleshoot your scheduler from -
    Performance Management > Set up > Tools > Check & Cleanup > Scheduled Program
For more detail, visit http://www.sdn.sap.com and download the guide book.