Tuesday, March 29, 2011

Oracle External Table with strong data type & datetime mask

Those who are familiar with Oracle Data Warehousing and ETL must be familiar with Oracle External Table. It is actually nothing but showing the file content in Oracle table by SQL Loader parsing.

I have seen some developers who always use external table without any strong data type i.e. they keep data types of all columns as varchar/varchar2. So, though the data should be datetime field or number field they put the data type of that field as varchar2 and do the type casting in stored procedures or views. This is certainly a bad practice due to the following reasons -

  1. Explicit type casting is slower than the parsing time type imposing.

  2. If data quality is not good it will be horrible to use varchar2 as all columns types. In case of using strong data type column, the rows in the files that do not meet the type constraints will go to the bad record and you will find the reason in log file why these records are discarded. While in case of weak type column, you will get lost during type casting. Because, these type miss-match records are loaded in the external table not in the bad file and during type casting in procedures or views you will get exception and you cannot identify which column and row in the file causes this problem.
Using strong data in external table is not at all a difficult task. You can easily do it by mentioning the type in external table definition. Your file access parameter can be same; you just need to change the column type in the external table. But there is one exception and that is the date type column.

For date type column, you need to change the access parameter also. It is not at all complex rather very easy. Just instead of using char as the field type in file access parameter use date and mention the date mask. As an example, to load date type data in date type column of external table we can use -

EVENT_TIME date 'YYYYMMDDHH24MISS'
instead of
EVENT_TIME char
where the sample data in this field is - 20110522180652
You see in the example date mask takes same pattern mask as the to_date() function. For more detail you can visit download.oracle.com and forums.oracle.com.

4 comments:

RIYAD said...

Elaboration of "Explicit type casting is slower than the parsing time type imposing." will be helpful to understand more about the benefit of type casting

Moinul Al-Mamun said...

Hello, here is a sort of code. Hope it will helpful for beginner to understand.

CREATE TABLE OPSO.ETBL_R0_PR_KPI_TEST
(
FILE_NAME VARCHAR2(300 BYTE),
SL_NUMBER NUMBER,
OPERATE_TYPE NUMBER,
OPERATE_RESULT NUMBER,
OPERATE_TIME DATE,
CALLING_PARTY VARCHAR2(120 BYTE),
CALLED_PARTY VARCHAR2(120 BYTE)
)
ORGANIZATION EXTERNAL
( TYPE ORACLE_LOADER
DEFAULT DIRECTORY ETL_KPI_PR_DATA
ACCESS PARAMETERS
( RECORDS DELIMITED BY NEWLINE FIELDS TERMINATED BY '|'
(
FILE_NAME CHAR,
SL_NUMBER CHAR,
Operate_Type CHAR,
Operate_Result CHAR,
Operate_Time DATE 'YYYYMMDDHH24MISS',
Calling_Party CHAR,
Called_Party CHAR
)
)
LOCATION (ETL_KPI_PR_DATA:'R010001_PR_KPI.dat')
)
REJECT LIMIT 1000
NOPARALLEL
NOMONITORING;

Thanks! Moinul

Sadique said...

"Explicit type casting is slower than the parsing time type imposing."

- This means using of to_number() or to_date() function for type casting is very much slower than imposing the data type number or date in External table column. Probably this is because, it imposes the type casting during parsing. This explicit type casting is evident specially when you query the external table through a db link. As an example, when I tested it for a 17 GB external table, I found the data loading was about 1 hour faster for external table with strong data type.

obaid said...

good findings!!