Environment: AWS CLI 1.17.7, Linux CentOS 7, Oracle XE 18c
In my previous blog post I explained how you can load AWS CLI CloudWatch data into a MySQL database. In this blog post I will explain how to do something similar using Oracle as the database. The goal is to retrieve CPU CloudWatch metrics from an Oracle database on RDS and load the data into a local Oracle database using the SQL*Loader utility.
Note: I assume that you have already installed the AWS CLI tools and have access and connectivity to an Oracle database (for this example I used Oracle XE 18c, installed on the same host).
Let’s first create a table to store the AWS CloudWatch data:
```sql
CREATE TABLE MATTHIAS.AWS_CLOUDWATCH_CPU_MON (
  AC_ID           NUMBER GENERATED ALWAYS AS IDENTITY,
  INSTANCE_ID     VARCHAR2(50) NOT NULL,
  CPU_MAX_PERCENT NUMBER NOT NULL,
  MON_TIMESTAMP   TIMESTAMP NOT NULL,
  PRIMARY KEY (AC_ID)
);

ALTER TABLE MATTHIAS.AWS_CLOUDWATCH_CPU_MON
  ADD CONSTRAINT aws_cl_uc1 UNIQUE (INSTANCE_ID, MON_TIMESTAMP);
```
Some explanation about the columns in the database table:
- AC_ID: auto-generated ID and primary key constraint
- INSTANCE_ID: the name of the RDS database instance
- CPU_MAX_PERCENT: the max CPU CloudWatch metric (a decimal number)
- MON_TIMESTAMP: date and time for the CPU metric
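For illustration, a row loaded into this table would look like the following hypothetical insert (SQL*Loader will do this for us later; AC_ID is generated automatically, and the values shown are made up):

```sql
-- Hypothetical sample row; AC_ID is filled in by the identity column
INSERT INTO MATTHIAS.AWS_CLOUDWATCH_CPU_MON (INSTANCE_ID, CPU_MAX_PERCENT, MON_TIMESTAMP)
VALUES ('rdsoradb001', 25.00, TIMESTAMP '2020-02-05 12:46:00');
```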
Similar to the MySQL example in my previous blog post, we will first get the data using AWS CLI, print the data to a text file, filter out the unneeded columns, and finally load the text file into Oracle using SQL*Loader.
Let’s start by creating two files to store the SQL*Loader configuration data: the parameter file (sqlldr.par) and the control file (sqlldr.ctl).
In the parameter file we will store the Oracle database connection details, the name of the control file, and the name of the log file:
```shell
$ cd /home/mhoys/oracle-scripts
$ vi sqlldr.par

USERID=MATTHIAS/<password>@xepdb1
CONTROL=sqlldr.ctl
LOG=sqlldr.log

$ chmod 600 sqlldr.par
```
MATTHIAS is the database user that will be used to connect to the Oracle database that has the connect identifier “xepdb1”.
Don’t forget to change the file permissions to 600, so that only the file owner can read the file.
Now let’s create the SQL*Loader control file:
```shell
$ cd /home/mhoys/oracle-scripts
$ vi sqlldr.ctl

OPTIONS (SKIP=1)
LOAD DATA
INFILE 'aws_cloudwatch_cpu_mon.txt'
INTO TABLE aws_cloudwatch_cpu_mon
APPEND
FIELDS TERMINATED BY "\t"
TRAILING NULLCOLS
(
  INSTANCE_ID CONSTANT "rdsoradb001",
  CPU_MAX_PERCENT DECIMAL EXTERNAL,
  MON_TIMESTAMP "to_timestamp(:MON_TIMESTAMP,'YYYY-MM-DD\"T\"hh24:mi:ss\"Z\"')"
)
```
Some explanation:
- OPTIONS (SKIP=1): the first row in the file aws_cloudwatch_cpu_mon.txt (the output of the AWS CLI command) does not contain a datapoint, so we're going to skip it
- INFILE 'aws_cloudwatch_cpu_mon.txt': the name of the AWS CLI output file where we are reading the data from
- APPEND: this means we are going to append new data to the existing data in our table
- FIELDS TERMINATED BY "\t": the fields in the text file are delimited by the TAB character
- INSTANCE_ID CONSTANT "rdsoradb001": the name of the RDS database instance, treated as a constant here
- CPU_MAX_PERCENT DECIMAL EXTERNAL: the max CPU CloudWatch metric is a decimal value (for example: 25.00)
- MON_TIMESTAMP "to_timestamp(:MON_TIMESTAMP,'YYYY-MM-DD\"T\"hh24:mi:ss\"Z\"')": the date value from AWS CloudWatch is returned in the format "2020-02-05T12:46:00Z", so we are converting it here to the Oracle TIMESTAMP format.
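As a quick sanity check of that format, the same `date` invocation used later in the main script produces timestamps in exactly the shape the to_timestamp mask expects (a sketch; any GNU `date` will do):

```shell
# Print the current time in the ISO-8601 UTC format that CloudWatch uses,
# e.g. 2020-02-05T12:46:00Z (%F = YYYY-MM-DD, %T = hh:mm:ss)
date --utc +%FT%TZ
```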
For your info, the full SQL*Loader documentation can be found in the Oracle Database Utilities guide.
Now that we have the SQL*Loader parameter file and control file in place, let’s create the main shell script to call the AWS CLI and SQL*Loader commands:
```shell
$ cd /home/mhoys/oracle-scripts
$ vi aws_cw_cpu_ora_load.sh

#!/bin/bash

# change to the script directory
cd /home/mhoys/oracle-scripts

# set Oracle environment variables (needed for cron)
ORACLE_BASE=/opt/oracle; export ORACLE_BASE
ORACLE_HOME=/opt/oracle/product/18c/dbhomeXE; export ORACLE_HOME

start_time=$(date --utc +%FT%TZ -d "12 hours ago")
end_time=$(date --utc +%FT%TZ)
instance_id="rdsoradb001"
file_name="aws_cloudwatch_cpu_mon.txt"

# run aws cloudwatch command and print output to text file
/home/mhoys/.local/bin/aws cloudwatch get-metric-statistics \
  --metric-name CPUUtilization \
  --start-time "$start_time" \
  --end-time "$end_time" \
  --period 300 \
  --namespace AWS/RDS \
  --statistics Maximum \
  --dimensions Name=DBInstanceIdentifier,Value="$instance_id" \
  --output text | awk '{print $2 "\t" $3}' > "$file_name"

# run sqlldr to load the text file into the Oracle database
/opt/oracle/product/18c/dbhomeXE/bin/sqlldr parfile=sqlldr.par
```
Here we are getting the maximum CPU metric for our RDS database instance “rdsoradb001” over the last 12 hours, with an interval of 5 minutes (300 seconds) – so there will be 144 records in total. We only need the second and third columns of the output, which is why we pipe the results through the “awk” command.
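To make the awk step concrete, here is a small sketch with hypothetical CLI output (the exact line layout can vary between AWS CLI versions, so treat the sample lines as an assumption: a label line followed by tab-separated DATAPOINTS rows with value, timestamp, and unit):

```shell
# Hypothetical raw lines from `aws cloudwatch get-metric-statistics --output text`;
# the label line has no second or third field, so awk emits only a tab for it
# (which is why the control file uses OPTIONS (SKIP=1))
printf 'CPUUtilization\nDATAPOINTS\t25.0\t2020-02-05T12:46:00Z\tPercent\n' |
  awk '{print $2 "\t" $3}'
# each datapoint line keeps just the metric value and the timestamp
```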
In the last line we are calling the SQL*Loader executable (sqlldr), passing the parameter file that we created earlier (the parameter file contains the connection details and the name of the control file to be used).
Once you have saved the script, let’s try to run it manually:
```shell
$ cd /home/mhoys/oracle-scripts
$ ./aws_cw_cpu_ora_load.sh

SQL*Loader: Release 18.0.0.0.0 - Production on Wed Feb 5 17:43:07 2020
Version 18.4.0.0.0

Copyright (c) 1982, 2018, Oracle and/or its affiliates.  All rights reserved.

Path used:      Conventional
Commit point reached - logical record count 144

Table AWS_CLOUDWATCH_CPU_MON:
  144 Rows successfully loaded.

Check the log file:
  sqlldr.log
for more information about the load.
```
Victory! 144 records were loaded into the table AWS_CLOUDWATCH_CPU_MON. Check the sqlldr.log and aws_cloudwatch_cpu_mon.txt files that should have been created in the same directory. If there are rejected records, there should also be a file called “aws_cloudwatch_cpu_mon.bad”.
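To double-check from the database side as well, a quick query (illustrative; adjust the schema name to your own setup) shows the row count and the most recent datapoints:

```sql
-- Row count and the five most recent datapoints
SELECT COUNT(*) FROM MATTHIAS.AWS_CLOUDWATCH_CPU_MON;

SELECT INSTANCE_ID, CPU_MAX_PERCENT, MON_TIMESTAMP
  FROM MATTHIAS.AWS_CLOUDWATCH_CPU_MON
 ORDER BY MON_TIMESTAMP DESC
 FETCH FIRST 5 ROWS ONLY;
```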
We can now schedule the script using the cron scheduler:
```shell
$ crontab -e

0 8,20 * * * /home/mhoys/oracle-scripts/aws_cw_cpu_ora_load.sh > /dev/null
```
In this example we scheduled the script to run twice daily (at 08:00 and 20:00). The standard output is redirected to /dev/null so that cron does not send an e-mail on every execution of the script; standard error is not redirected, so error messages can still reach you by mail if the script fails.
That’s it!
HTH
Matthias