Oracle Autonomous Database (ADB), true to its name, is a fully managed database service. Most interactions with ADB require nothing more than a connect string for access. However, certain scenarios necessitate loading or accessing files in the local directory where the ADB instance is running. A notable example of this arises when loading ONNX models into the database to perform embeddings. This can be achieved using the DBMS_CLOUD.GET_OBJECTS procedure. Despite the longstanding availability of this feature and used internally by impdp to import data from object storage, it is not widely used.
This blog post will guide you through the steps to download files from Oracle Object Storage into the ADB filesystem using DBMS_CLOUD.GET_OBJECTS, ensuring a smooth and efficient data transfer process.
Prerequisites
Before we dive into the procedure, make sure you have the following:
- Oracle Cloud Account: You need an active Oracle Cloud account with access to Oracle Object Storage and Oracle Autonomous Database.
- Object Storage Bucket: Create a bucket in Oracle Object Storage and upload the files you want to download into Oracle ADB.
- Credentials: Ensure you have the necessary credentials (access key and secret key) to access Oracle Object Storage.
- Oracle Autonomous Database: Make sure you have an Oracle Autonomous Database instance running.
Step-by-Step Guide
1. Create Credential Object in Oracle ADB
First, create a credential object in your Oracle Autonomous Database that will store your Object Storage credentials. This is required for authenticating with Oracle Object Storage.
BEGIN
DBMS_CLOUD.CREATE_CREDENTIAL(
credential_name => 'OBJ_STORE_CRED',
username => '<your_object_storage_access_key>',
password => '<your_object_storage_secret_key>'
);
END;
Replace <your_object_storage_access_key> and <your_object_storage_secret_key> with your actual Oracle Object Storage access key and secret key.
2. Grant Necessary Privileges
Ensure your database user has the necessary privileges to use DBMS_CLOUD procedures.
GRANT EXECUTE ON DBMS_CLOUD TO <your_database_user>;
Replace <your_database_user> with your actual database user.
3. Download Files Using DBMS_CLOUD.GET_OBJECTS
To download files from your Oracle Object Storage bucket into Oracle Autonomous Database, use the DBMS_CLOUD.GET_OBJECTS procedure.
About the DATA_PUMP_DIR Directory
The DATA_PUMP_DIR directory is a default directory created during the setup of an ADB instance. Initially intended for handling export dump files used in data import operations, this directory can serve as a temporary staging area to download any file from Oracle Object Storage into the ADB filesystem. Note that the ADB filesystem is temporary and will be automatically cleaned up.
If your application requires a permanent filesystem, you can mount an NFS (Network File System) to ADB. However, DATA_PUMP_DIR is suitable for temporary file storage and processing.
BEGIN
DBMS_CLOUD.GET_OBJECTS(
credential_name => 'OBJ_STORE_CRED',
object_uri => 'https://objectstorage.<region>.oraclecloud.com/n/<namespace>/b/<bucket>/o/<file>',
directory_name => 'DATA_PUMP_DIR',
file_name => '<file_name_in_adb>'
);
END;
/
Replace the placeholders as follows:
- <region>: Your Oracle Cloud region.
- <namespace>: Your Oracle Cloud Object Storage namespace.
- <bucket>: The name of your Object Storage bucket.
- <file>: The name of the file in your Object Storage bucket.
- <file_name_in_adb>: The name you want to give the file in Oracle ADB.
For example, if your object URI is https://objectstorage.us-ashburn-1.oraclecloud.com/n/my_namespace/b/my_bucket/o/my_file.csv, and you want to store it in ADB as my_file.csv, the command will look like this:
BEGIN
DBMS_CLOUD.GET_OBJECTS(
credential_name => 'OBJ_STORE_CRED',
object_uri => 'https://objectstorage.<region>.oraclecloud.com/n/<namespace>/b/<bucket>/o/<file>',
directory_name => 'DATA_PUMP_DIR',
file_name => '<file_name_in_adb>'
);
END;
/
4. Verify the File in Oracle ADB
After downloading the file, you can verify its existence in Oracle ADB by listing the contents of the directory.
SELECT * FROM TABLE(DBMS_CLOUD.LIST_FILES('DATA_PUMP_DIR'));
This query will show you the files present in the specified directory, ensuring that your file has been successfully downloaded.
5. Upload ONNX Files into the Database
Once the ONNX files are downloaded and verified, you can upload them into the database using DBMS_VECTOR.LOAD_ONNX_MODEL. This step involves loading the models from the downloaded files and configuring them for use in Oracle ADB.
BEGIN
DBMS_VECTOR.LOAD_ONNX_MODEL(
'DATA_PUMP_DIR',
'tinybert.onnx',
'TINYBERT_MODEL',
json('{"function":"embedding","embeddingOutput":"embedding","input":{"input":["DATA"]}}')
);
DBMS_VECTOR.LOAD_ONNX_MODEL(
'DATA_PUMP_DIR',
'all-MiniLM-L6-v2.onnx',
'ALL_MINILM_L6V2MODEL',
json('{"function":"embedding","input":{"input":["DATA"]}}')
);
END;
This code uploads two ONNX models (tinybert.onnx and all-MiniLM-L6-v2.onnx) into Oracle ADB, making them available as TINYBERT_MODEL and ALL_MINILM_L6V2MODEL respectively. The json configuration specifies how the models should handle input and output data.
Conclusion
Downloading files from Oracle Object Storage into Oracle Autonomous Database using DBMS_CLOUD.GET_OBJECTS and the default file system directory DATA_DUMP_DIR is a straightforward process. By following the steps outlined in this guide, you can efficiently manage your data transfers within the Oracle Cloud ecosystem. Once your files are downloaded, you can upload ONNX models into your database, enabling which can be used in embedding data.
