PHP Developer News

Making Aurora Write Latency 15x Higher (or More!) by Choosing a Bad Primary Key

Primary Key design is an important thing for InnoDB performance, and choosing a poor PK definition will have an impact on performance and also write propagation in databases. When this comes to Aurora, this impact is even worse than you may notice.
In short, we consider a poor definition of a Primary Key in InnoDB as “anything but quasi sequential values”, which may cause very random access to data and thus increase the IO dependency.
In this post, I’ll try to demonstrate the potential impact of the primary key design when running on Aurora, and how a bad design can lead to a 15x write latency penalty (or more).
The Analysis
Recently I worked on a case where a customer was having issues with scaling writes in Aurora MySQL. While this is a known limitation in Aurora considering how the distributed storage layer syncs out data among all nodes of the cluster, we observed additional latency occurring when more clients were writing to the database.
The first thing I noticed is that their main table had a poor definition of the Primary Key as they were using UUID-like columns based on VARCHAR data types. In this case, the nature of values for the Primary Key was very random, which is really bad for a b-tree based storage like InnoDB.
With this in mind, I referred to a great post from my colleague Yves Trudeau explaining why UUIDs are bad for performance, so based on this premise I decided to try to measure how big this impact can be in the Aurora world.
The set of tests I’ve run were using a db.r5.2xlarge Aurora MySQL Cluster (8vCPU and 64GB of ram) which is pretty similar to the cluster my customer was using.
First, I’ve started with two very basic tables to avoid any extra overhead but something close to a real case:CREATE TABLE `test_sequential_PK` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`number` int(10) NOT NULL,
PRIMARY KEY (`id`),
KEY `idx_number` (`number`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

CREATE TABLE `test_random_PK` (
`id` varchar(26) NOT NULL,
`number` int(10) NOT NULL,
PRIMARY KEY (`id`),
KEY `idx_number` (`number`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;Then I created two simple lua scripts to execute with sysbench – super simple, I’d say – as they were just for doing inserts using either the auto_increment property in table test_sequential_PK or creating random values for test_random_PK table.
The final purpose was not to measure Aurora performance as a whole but the write latency when the Primary Key is not optimal.
I’ve started the process by warming up the instances for few days by running both sysbench scripts and pushing the instance really hard while I was filling up the InnoDB Buffer Pool, results were pretty good for some time until the traffic became IO-bound:

It took a few days but after some time we started to see an increase in the write latency. I created an initial set of data using 50 concurrent threads, which means the graphic above is not very helpful for the analysis I’m going to make.
After I was sure the buffer pool was filled and the instance was warmed up, I verified that the dataset is bigger than the buffer pool:SELECT concat(table_schema,'.',table_name) schema_table,concat(round(table_rows/1000000,2),'M') rows,concat(round(data_length/(1024*1024*1024),2),'G') Data,round(data_length/table_rows,0) DataRow ,concat(round(index_length/(1024*1024*1024),2),'G') idx,round(index_length/table_rows,0) as IdxRow,concat(round((data_length+index_length)/(1024*1024*1024),2),'G') totSize,round(index_length/data_length,2) idxfrac
FROM information_schema.TABLES
where table_schema in ('percona')
ORDER BY data_length+index_length DESC LIMIT 10;
+-------------------------------+----------+---------+---------+--------+--------+---------+---------+
| schema_table                  | rows     | Data    | DataRow | idx    | IdxRow | totSize | idxfrac |
+-------------------------------+----------+---------+---------+--------+--------+---------+---------+
| percona.test_random_PK        | 1586.25M | 104.53G |      71 | 73.04G |     49 | 177.57G |    0.70 |
| percona.test_sequential_PK    | 1840.49M | 54.74G  |      32 | 24.54G |     14 | 79.27G  |    0.45 |
+-------------------------------+----------+---------+---------+--------+--------+---------+---------+I’ll explain the difference between table sizes later in this post.
After this, I started to run separate tests to check how the write latency is affected by our table design.
First I ran sysbench as follows:sysbench /usr/share/sysbench/insert_sequence.lua    --mysql-user=admin --mysql-password=xxxxxx --mysql-host=francisco-sysbench.cluster --mysql-db=percona --report-interval=1   --threads=15  --events=0 --time=0 run

We can see that performance remains steady roughly at 124 microseconds average with all threads running so it looks there is no impact for about 24 hours.
Then I tested the random insert by running:sysbench /usr/share/sysbench/insert_random.lua    --mysql-user=admin --mysql-password=xxxxxx --mysql-host=francisco-sysbench.cluster --mysql-db=percona --report-interval=1   --threads=15  --events=0 --time=0 run

This is a huge impact if you consider that random access is causing the instance to suffer from performance, roughly 15x writing latency increase.
These numbers were very impressive compared to my previous experience so, being extra curious, I checked what was reported in Cloudwatch for Aurora Write latency for the previous 3 days.

It’s quite clear the 3 stages of the checks:
Warming up the instance and buffer pool by pushing really hard with sysbench in both tables (peak of load at the beginning)
First round of sequential inserts using sequential write (sequential writes)
Last round of random access (random writes)
As I said, the code used on each script was very simple. This is on purpose to avoid adding overhead somewhere other than the database. For reference, this is the interesting portion of code for random writes:local inserts = {
"INSERT INTO percona.test_random_PK (id, number) VALUES ('%s', %i)"
}

local rnd_str_tmpl = "###########-###########"

function execute_inserts()
             local id = sysbench.rand.string(rnd_str_tmpl)
             con:query(string.format(inserts[1], id, sb_rand(1, sysbench.opt.table_size)))
endAnd for sequential writes:local inserts = {

"INSERT INTO percona.test_sequential_PK (number) VALUES (%i)"

}

function execute_inserts()

-- INSERT

con:query(string.format(inserts[1], sb_rand(1, sysbench.opt.table_size)))

end
Conclusion
Primary key design for InnoDB tables was largely discussed in several posts, and specially UUID format impact was perfectly described in the post I mentioned, above so there are no surprises. What I’ve found interesting in this analysis is that in Aurora there seems to be an extra impact of this random access.
Given that, what’s happening underneath is not quite clear. I just can try to elaborate a theory:
In Aurora, there are no dirty pages. Basically, every commit is synchronized to disk immediately so all replicas can be virtually in sync with the source server. This is reducing the chance of hitting the same page in memory every time you perform a write operation so the whole sequence of grabbing the page, placing it in memory, updating information, flushing, and committing synchronously to the block storage is particularly expensive in Aurora.
Additionally given that every secondary index adds an entry of the Primary Key, the increase in data size (and thus the disk footprint) can be impacted a lot as we have seen before, so this may also cause an extra disk utilization.
So be warned, plan your PKs properly if you don’t want to see a huge performance degradation in AWS Aurora as the database grows and workload is increased.
Percona Distribution for MySQL is the most complete, stable, scalable, and secure, open-source MySQL solution available, delivering enterprise-grade database environments for your most critical business applications… and it’s free to use!
Download Percona Distribution for MySQL Today

Most Popular in Database