I have seen some developers who always use external table without any strong data type i.e. they keep data types of all columns as varchar/varchar2. So, though the data should be datetime field or number field they put the data type of that field as varchar2 and do the type casting in stored procedures or views. This is certainly a bad practice due to the following reasons -
- Explicit type casting is slower than the parsing time type imposing.
- If data quality is not good it will be horrible to use varchar2 as all columns types. In case of using strong data type column, the rows in the files that do not meet the type constraints will go to the bad record and you will find the reason in log file why these records are discarded. While in case of weak type column, you will get lost during type casting. Because, these type miss-match records are loaded in the external table not in the bad file and during type casting in procedures or views you will get exception and you cannot identify which column and row in the file causes this problem.
For date type column, you need to change the access parameter also. It is not at all complex rather very easy. Just instead of using char as the field type in file access parameter use date and mention the date mask. As an example, to load date type data in date type column of external table we can use -
EVENT_TIME date 'YYYYMMDDHH24MISS'You see in the example date mask takes same pattern mask as the to_date() function. For more detail you can visit download.oracle.com and forums.oracle.com.
instead of
EVENT_TIME char
where the sample data in this field is - 20110522180652
4 comments:
Elaboration of "Explicit type casting is slower than the parsing time type imposing." will be helpful to understand more about the benefit of type casting
Hello, here is a sort of code. Hope it will helpful for beginner to understand.
CREATE TABLE OPSO.ETBL_R0_PR_KPI_TEST
(
FILE_NAME VARCHAR2(300 BYTE),
SL_NUMBER NUMBER,
OPERATE_TYPE NUMBER,
OPERATE_RESULT NUMBER,
OPERATE_TIME DATE,
CALLING_PARTY VARCHAR2(120 BYTE),
CALLED_PARTY VARCHAR2(120 BYTE)
)
ORGANIZATION EXTERNAL
( TYPE ORACLE_LOADER
DEFAULT DIRECTORY ETL_KPI_PR_DATA
ACCESS PARAMETERS
( RECORDS DELIMITED BY NEWLINE FIELDS TERMINATED BY '|'
(
FILE_NAME CHAR,
SL_NUMBER CHAR,
Operate_Type CHAR,
Operate_Result CHAR,
Operate_Time DATE 'YYYYMMDDHH24MISS',
Calling_Party CHAR,
Called_Party CHAR
)
)
LOCATION (ETL_KPI_PR_DATA:'R010001_PR_KPI.dat')
)
REJECT LIMIT 1000
NOPARALLEL
NOMONITORING;
Thanks! Moinul
"Explicit type casting is slower than the parsing time type imposing."
- This means using of to_number() or to_date() function for type casting is very much slower than imposing the data type number or date in External table column. Probably this is because, it imposes the type casting during parsing. This explicit type casting is evident specially when you query the external table through a db link. As an example, when I tested it for a 17 GB external table, I found the data loading was about 1 hour faster for external table with strong data type.
good findings!!
Post a Comment