Monday, September 5, 2011

Solution of File-based Data Collection in Exadata

When I use Oracle Exadata Database in 2010, it was the first instance of this awesome product in South Asia. The biggest problem that we face during Data Warehouse implementation on this machine was that Oracle forbid us to collect file-based data on Exadata. This is because -
  • There is no space available in local disk of the RAC instances as this may create problem such as no space available or instance high availability.
  • External storage of Exadata are used as Automatic Storage Management (ASM). I.e., from Oracle Enterprise Linux (OEL) these are raw disks and the oracle database engine manage these storage. So, these storage cannot be accessed in conventional way.
The solution of this problem can be solved in two ways -
  1. Read External table using DB Link
  2. Store data in a database table using sql loader then read the table using DB Link. The problem in this solution and its previous solution is that you cannot use the parallel processing of Exadata
  3. Use Database File System (DBFS). This is the ultimate solution as in this case external tables can be accessed in parallel manner. So, the extreme performance of exadata can be utilized.
Actually configuring DBFS is fully part of System & DB Admin job. They configure the DBFS server in Exadata and client in the staging area OS. We, the ETL team, mainly use the client of DBFS. Here are the steps -
  1. At first, run the mount script in client which connects to DBFS service (not database service) in Exadata. This script actually run the DBFS client with the following parameter -

    dbfs_client dbfs_user/dbfs_pass@SRV_DBFS -o allow_root -o direct_io /ORABASE/DBFS_DIR

  2. After execution of above script mount point in DBFS server will be available under mount point of DBFS client. Note that the original name of this mount point can be different in DBFS server. As an example in our case,

    In DBFS server the mount point is /u01/app/DBFS_DIR/
    In DBFS client the mount point is /ORABASE/DBFS_DIR/

    So, now if you create a directory in DBFS server under the server mount point, the directory will be available in same name under DBFS client mount point. The system & db admin will ensure that both server and client should be in same user group so that no privilege related problem will appear during accessing the directories within DBFS mount point.

  3. One advice - please collect the files in the client's own mount point then copy those files to DBFS client mount point. Its performance is optimum.

  4. Now rest of the things are within Exadata server. Create Oracle Directory using OS directory under DBFS server mount point which is visible to OEL if DBFS server is up and running.

  5. Create an external table in Exadata database server using the above database directory.
Thus, you can create and use External table within Exadata and can be benefited by its power.