Lab7: Create dataset and Auto Scale

Aurora Auto Scaling enables your Aurora DB cluster to handle sudden increases in connectivity or workload by dynamically adjusting the number of Aurora Replicas for a provisioned Aurora DB cluster. When the connectivity or workload decreases, Aurora Auto Scaling removes unnecessary Aurora Replicas so that you don’t pay for unused DB instances.

In this lab, we will walk through how Aurora read replica auto scaling works in practice using a load generator script.

This lab contains the following tasks:

  1. Create a replica auto scaling policy
  2. Initialize pgbench and Create a Dataset
  3. Run a read-only workload

Prerequisites

This lab requires the following lab modules to be completed first:

1. Create a replica auto scaling policy

You will add a read replica auto scaling configuration to the DB cluster. This will allow the DB cluster to scale the number of reader DB instances that operate in the DB cluster at any given point in time based on the load.

In the RDS console, click the Aurora cluster name and go to the Logs & events tab. Click the Add auto scaling policy button.

07-autoscaling-1

Enter auroralab-autoscale-readers as the Policy name. For the Target metric, choose Average CPU utilization of Aurora Replicas. Enter a Target value of 20%. In a production use case this value may need to be set much higher, but we are using a lower value for demonstration purposes.

Next, expand the Additional configuration section, and change both the Scale in cooldown period and Scale out cooldown period to a value of 180 seconds. This will reduce the time you have to wait between scaling operations in subsequent labs.

In the Cluster capacity details section, set the Minimum capacity to 1 and the Maximum capacity to 2. In a production use case you may need to use different values, but for demonstration purposes, and to limit the cost associated with the labs, we limit the number of readers to two.

07-autoscaling-2

Next, click Add policy.
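For reference, Aurora Auto Scaling policies are managed through Application Auto Scaling, so the console settings above can also be expressed with the AWS CLI. The following is a minimal sketch, assuming the AWS CLI is installed and configured with sufficient permissions. The cluster identifier auroralab-postgres-cluster and the CLUSTER_ID variable are hypothetical placeholders, not names defined by this lab. The console steps above remain the path this lab follows.

```shell
#!/usr/bin/env bash
# Sketch only: mirrors the console settings used in this lab.
# CLUSTER_ID is a hypothetical placeholder; replace it with your cluster name.
CLUSTER_ID="${CLUSTER_ID:-auroralab-postgres-cluster}"

# Build the target tracking configuration: 20% average reader CPU,
# 180-second scale-in and scale-out cooldowns.
policy_config() {
  cat <<'EOF'
{
  "TargetValue": 20.0,
  "PredefinedMetricSpecification": {
    "PredefinedMetricType": "RDSReaderAverageCPUUtilization"
  },
  "ScaleInCooldown": 180,
  "ScaleOutCooldown": 180
}
EOF
}

# Register the cluster as a scalable target (1 to 2 Aurora Replicas),
# then attach the target tracking policy. Not called automatically.
apply_policy() {
  aws application-autoscaling register-scalable-target \
    --service-namespace rds \
    --resource-id "cluster:${CLUSTER_ID}" \
    --scalable-dimension rds:cluster:ReadReplicaCount \
    --min-capacity 1 \
    --max-capacity 2 &&
  aws application-autoscaling put-scaling-policy \
    --policy-name auroralab-autoscale-readers \
    --service-namespace rds \
    --resource-id "cluster:${CLUSTER_ID}" \
    --scalable-dimension rds:cluster:ReadReplicaCount \
    --policy-type TargetTrackingScaling \
    --target-tracking-scaling-policy-configuration "$(policy_config)"
}

# Print the configuration for review; run apply_policy explicitly to create it.
policy_config
```

Running the script only prints the JSON configuration. If you already created the policy in the console, there is no need to call apply_policy as well.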

2. Initialize pgbench and Create a Dataset

You can find the Cloud9 URL by selecting the CloudFormation stack with the description “Amazon Aurora PostgreSQL Labs Stackset” in the CloudFormation console and referring to the “Value” for the key Cloud9URL in the Outputs tab. Click the https address in the “Value” column to open Cloud9. Click the Window menu and select New Terminal.

Initialize pgbench and start creating the dataset by pasting the command below into the Cloud9 terminal window.

pgbench -i --scale=1000

Data loading may take several minutes. Once it completes, you will see output similar to the following:

dropping old tables...
NOTICE:  table "pgbench_accounts" does not exist, skipping
NOTICE:  table "pgbench_branches" does not exist, skipping
NOTICE:  table "pgbench_history" does not exist, skipping
NOTICE:  table "pgbench_tellers" does not exist, skipping
creating tables...
generating data...
100000 of 100000000 tuples (0%) done (elapsed 0.06 s, remaining 55.63 s)
200000 of 100000000 tuples (0%) done (elapsed 0.16 s, remaining 77.78 s)
300000 of 100000000 tuples (0%) done (elapsed 0.23 s, remaining 76.16 s)
400000 of 100000000 tuples (0%) done (elapsed 0.33 s, remaining 82.14 s)
500000 of 100000000 tuples (0%) done (elapsed 0.46 s, remaining 91.65 s)
600000 of 100000000 tuples (0%) done (elapsed 0.56 s, remaining 92.33 s)
700000 of 100000000 tuples (0%) done (elapsed 0.66 s, remaining 94.29 s)
800000 of 100000000 tuples (0%) done (elapsed 0.77 s, remaining 94.92 s)
900000 of 100000000 tuples (0%) done (elapsed 0.87 s, remaining 95.93 s)
1000000 of 100000000 tuples (1%) done (elapsed 0.95 s, remaining 94.27 s)
1100000 of 100000000 tuples (1%) done (elapsed 1.04 s, remaining 93.48 s)
1200000 of 100000000 tuples (1%) done (elapsed 1.14 s, remaining 93.96 s)
1300000 of 100000000 tuples (1%) done (elapsed 1.24 s, remaining 94.43 s)
1400000 of 100000000 tuples (1%) done (elapsed 1.35 s, remaining 95.05 s)
1500000 of 100000000 tuples (1%) done (elapsed 1.43 s, remaining 94.22 s)
...
98400000 of 100000000 tuples (98%) done (elapsed 119.10 s, remaining 1.94 s)
98500000 of 100000000 tuples (98%) done (elapsed 119.23 s, remaining 1.82 s)
98600000 of 100000000 tuples (98%) done (elapsed 119.34 s, remaining 1.69 s)
98700000 of 100000000 tuples (98%) done (elapsed 119.49 s, remaining 1.57 s)
98800000 of 100000000 tuples (98%) done (elapsed 119.60 s, remaining 1.45 s)
98900000 of 100000000 tuples (98%) done (elapsed 119.72 s, remaining 1.33 s)
99000000 of 100000000 tuples (99%) done (elapsed 119.82 s, remaining 1.21 s)
99100000 of 100000000 tuples (99%) done (elapsed 119.94 s, remaining 1.09 s)
99200000 of 100000000 tuples (99%) done (elapsed 120.05 s, remaining 0.97 s)
99300000 of 100000000 tuples (99%) done (elapsed 120.12 s, remaining 0.85 s)
99400000 of 100000000 tuples (99%) done (elapsed 120.21 s, remaining 0.73 s)
99500000 of 100000000 tuples (99%) done (elapsed 120.32 s, remaining 0.60 s)
99600000 of 100000000 tuples (99%) done (elapsed 120.41 s, remaining 0.48 s)
99700000 of 100000000 tuples (99%) done (elapsed 120.55 s, remaining 0.36 s)
99800000 of 100000000 tuples (99%) done (elapsed 120.68 s, remaining 0.24 s)
99900000 of 100000000 tuples (99%) done (elapsed 120.80 s, remaining 0.12 s)
100000000 of 100000000 tuples (100%) done (elapsed 120.89 s, remaining 0.00 s)
vacuuming...
creating primary keys...
done.
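
The tuple count in the output above follows from the scale factor: pgbench creates 100,000 rows in pgbench_accounts per unit of scale, along with 10 rows in pgbench_tellers and 1 row in pgbench_branches. As a quick check of that arithmetic for --scale=1000 (the function name pgbench_rows is only illustrative):

```shell
# Row counts pgbench generates for a given scale factor.
pgbench_rows() {
  scale="$1"
  echo "pgbench_accounts=$((scale * 100000))"
  echo "pgbench_tellers=$((scale * 10))"
  echo "pgbench_branches=$((scale))"
}

pgbench_rows 1000
```

For a scale of 1000 this gives 100,000,000 account rows, which matches the "100000000 tuples" reported during data generation.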

3. Run a read-only workload

Once the data load completes successfully, you can run a read-only workload on the cluster to trigger the auto scaling policy. You will also observe the effects on the DB cluster topology.

For this step you will use the Reader Endpoint of the cluster. If the DB cluster was created automatically for you, the reader endpoint can be found in the Outputs of the CloudFormation stack with the description “Amazon Aurora PostgreSQL Labs Stackset”. Refer to the Output key called readerEndpoint. If you created the cluster manually, you can find the reader endpoint by going to the Databases section of the RDS console, clicking the name of the Aurora cluster, and going to the Connectivity & security tab.

Run the load generation script from your Cloud9 terminal window, replacing the [readerEndpoint] placeholder with the actual Aurora cluster reader endpoint:

pgbench -h [readerEndpoint] -c 100 --select-only -T 600 -C
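
The flags in this command are standard pgbench options: -c 100 simulates 100 concurrent clients, --select-only runs only the built-in SELECT transaction, -T 600 runs the test for 600 seconds, and -C opens a new connection for each transaction. The last flag is likely important here because the reader endpoint balances connections rather than individual queries, so reconnecting gives a newly added reader a chance to pick up load. A common mistake is to paste the command without replacing the placeholder. The sketch below guards against that; the READER_ENDPOINT variable and the function names are assumptions made for illustration, not part of the lab environment.

```shell
# READER_ENDPOINT is an assumed variable name; set it to your reader endpoint.
READER_ENDPOINT="${READER_ENDPOINT:-[readerEndpoint]}"

# Succeeds only if the value is non-empty and no longer contains the
# square brackets of the unreplaced placeholder.
endpoint_is_set() {
  case "$1" in
    ""|*"["*|*"]"*) return 1 ;;
    *) return 0 ;;
  esac
}

# Run the same read-only workload as above, but refuse to start
# while the placeholder is still in place.
run_read_workload() {
  if ! endpoint_is_set "$READER_ENDPOINT"; then
    echo "Set READER_ENDPOINT to the cluster reader endpoint first." >&2
    return 1
  fi
  pgbench -h "$READER_ENDPOINT" -c 100 --select-only -T 600 -C
}
```

After exporting READER_ENDPOINT with your actual endpoint, call run_read_workload to start the test.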

pgbench-output

Now, open the Amazon RDS management console in a different browser tab.

Take note that the reader node is currently receiving load. It may take a minute or more for the metrics to fully reflect the incoming load.

07-autoscaling-3

After several minutes, return to the list of instances and notice that a new reader is being provisioned in your cluster.

07-autoscaling-4

It will take 5-7 minutes to add a new replica. Once the new replica becomes available, note that the load distributes and stabilizes (it may take a few minutes to stabilize).

07-autoscaling-5

You can now toggle back to your Cloud9 terminal window, and press CTRL+C to quit the running pgbench job. After a while the additional reader will be removed automatically.
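
As a complement to watching the console, the number of readers can also be polled from a second terminal. This is a rough sketch, assuming the AWS CLI is installed and configured; the cluster identifier auroralab-postgres-cluster, the CLUSTER_ID variable, and the function names are hypothetical and should be adapted to your environment.

```shell
# CLUSTER_ID is a hypothetical placeholder; replace it with your cluster name.
CLUSTER_ID="${CLUSTER_ID:-auroralab-postgres-cluster}"

# List the identifiers of all non-writer members of the cluster.
list_readers() {
  aws rds describe-db-clusters \
    --db-cluster-identifier "$CLUSTER_ID" \
    --query 'DBClusters[0].DBClusterMembers[?IsClusterWriter==`false`].DBInstanceIdentifier' \
    --output text
}

# Count whitespace-separated identifiers in a string.
count_words() {
  set -- $1
  echo "$#"
}

# Poll every 30 seconds and print the current reader count. Stop with CTRL+C.
watch_readers() {
  while true; do
    echo "$(date +%T) readers: $(count_words "$(list_readers)")"
    sleep 30
  done
}
```

Calling watch_readers while the workload runs, and again after you stop it, should show the reader count change as the policy scales out and back in.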