Query External Partitioned Data (with Partitioning Clause)

If you want to query multiple data files in the Object Store as a single external table and the files can be represented as multiple logical partitions, then it is highly recommended to use an external partitioned table. Using an external partitioned table preserves the logical partitioning of your data files for query access. Use the procedure DBMS_CLOUD.CREATE_EXTERNAL_PART_TABLE to create an external partitioned table.

There are two ways to create an external partitioned table on Autonomous AI Database:

  • The first version is for. See Query External Partitioned Data with Hive Format Source File Organization for information on this type of external partitioned table usage.

  • The second version is for. When you a create a partitioned external table in this way you include a partitioning clause in the partitioning_clause parameter. The partitioning clause that you include depends upon the data files in the Cloud and the type of partitioning you use. This section describes this type of external partitioned table usage.

  1. Store your object store credentials using the procedure DBMS_CLOUD.CREATE_CREDENTIAL.

    For example:

    BEGIN
      DBMS_CLOUD.CREATE_CREDENTIAL (
        credential_name => 'DEF_CRED_NAME',
        username => '[email protected]',
        password => 'password' );
    END;
    /
    

    Creating a credential to access Oracle Cloud Infrastructure Object Store is not required if you enable resource principal credentials. See Use Resource Principal to Access Oracle Cloud Infrastructure Resources for more information.

    This operation stores the credentials in the database in an encrypted format. You can use any name for the credential name. Note that this step is required only once unless your object store credentials change. Once you store the credentials you can then use the same credential name for creating external tables.

    See CREATE_CREDENTIAL Procedure for information about the username and password parameters for different object storage services.

  2. Create an external partitioned table on top of your source files using the procedure  DBMS_CLOUD.CREATE_EXTERNAL_PART_TABLE.

    The procedure DBMS_CLOUD.CREATE_EXTERNAL_PART_TABLE supports external partitioned files in the supported cloud object storage services. The credential is a table level property; therefore, the external files must be on the same object store.

    For example:

    BEGIN
      DBMS_CLOUD.CREATE_EXTERNAL_PART_TABLE(
          table_name =>'PET1',  
          credential_name =>'DEF_CRED_NAME',
          format => json_object('delimiter' value ',', 'recorddelimiter' value 'newline', 'characterset' value 'us7ascii'),  
          column_list => 'col1 number, col2 number, col3 number',
          partitioning_clause => 'partition by range (col1)
             (partition p1 values less than (1000) location
                 ( ''https://objectstorage.us-phoenix-1.oraclecloud.com/n/namespace-string/b/bucketname/o/file_11.txt''),
              partition p2 values less than (2000) location 
                 ( ''https://objectstorage.us-phoenix-1.oraclecloud.com/n/namespace-string/b/bucketname/o/file_21.txt''),
              partition p3 values less than (3000) location 
                 ( ''https://objectstorage.us-phoenix-1.oraclecloud.com/n/namespace-string/b/bucketname/o/file_31.txt'') )'
         );
    END;
    / 
    

    The parameters are:

    • table_name: is the external table name.

    • credential_name: is the name of the credential created in the previous step.

    • partitioning_clause: is the complete partitioning clause, including the location information for individual partitions.

    • format: defines the options you can specify to describe the format of the source file. format: defines the options you can specify to describe the format of the source file.

      If the data in your source files is encrypted, decrypt the data by specifying the format parameter with the encryption option. See Decrypt Data While Importing from Object Storage for more information on decrypting data.

    • column_list: is a comma delimited list of the column definitions in the source files.

    In this example, namespace-string is the Oracle Cloud Infrastructure object storage namespace and bucketname is the bucket name. See Understanding Object Storage Namespaces for more information.

    You can now run queries on the external partitioned table you created in the previous step. Your Autonomous AI Database takes advantage of the partitioning information of your external partitioned table, ensuring that the query only accesses the relevant data files in the Object Store. For example, the following query only reads data files from partition P1:

    SELECT * FROM pet1 WHERE col1 < 750;

    The external partitioned tables you create with DBMS_CLOUD.CREATE_EXTERNAL_PART_TABLE include two invisible columns file$path and file$name. These columns help identify which file a record is coming from. See External Table Metadata Columns for more information.

    If there are any rows in the source files that do not match the format options you specified, the query reports an error. You can use DBMS_CLOUD parameters, like rejectlimit, to suppress these errors. As an alternative, you can also validate the external partitioned table you created to see the error messages and the rejected rows so that you can change your format options accordingly. See Validate External Data and Validate External Partitioned Data for more information.

    See CREATE_EXTERNAL_PART_TABLE Procedure for detailed information about the parameters.

    See DBMS_CLOUD URI Formats for more information on the supported cloud object storage services.