Every database migration must go through the standard data migration phases, such as compatibility analysis and schema validation. Two challenges are worth mentioning: the speed at which we could dump and reload over a billion rows (nearly 1 TB), and planning a migration that would let us react to unexpected situations by rolling forward to a safe environment with minimal downtime and no data loss.
We spent quite some time figuring out how to load data into Aurora faster. The usual bottleneck is the data export/import, and the import has to finish before the source's binlog is rotated away, otherwise the new database has no position left to start replicating from and can never catch up. After exploring different options on our own, we turned to AWS support for a second opinion, and they referred us to mydumper and myloader. By exploiting the elasticity of AWS, and RDS Aurora in particular, we were able to scale up, finish the data load within hours instead of days, and then scale back down for normal usage.
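For illustration, this is roughly what the parallel dump, temporary scale-up, and bulk load could look like. The hostnames, credentials, database name, thread counts, and instance classes below are placeholder assumptions, not our exact settings; a sketch, not the precise commands we ran.

```
# Dump the source database in parallel with a consistent snapshot.
# mydumper records the binlog coordinates of the snapshot in the
# metadata file inside the output directory.
mydumper \
  --host source-db.example.com \
  --user migrator \
  --password "$SRC_PASSWORD" \
  --database appdb \
  --threads 16 \
  --rows 500000 \
  --compress \
  --outputdir /data/dump

# Temporarily scale the Aurora writer up for the bulk load
# (instance identifier and classes are hypothetical).
aws rds modify-db-instance \
  --db-instance-identifier aurora-writer \
  --db-instance-class db.r4.8xlarge \
  --apply-immediately

# Load the dump into Aurora in parallel.
myloader \
  --host aurora-cluster.cluster-xxxx.eu-west-1.rds.amazonaws.com \
  --user migrator \
  --password "$DST_PASSWORD" \
  --database appdb \
  --threads 16 \
  --queries-per-transaction 1000 \
  --directory /data/dump

# Scale back down to the normal instance class once the load is done.
aws rds modify-db-instance \
  --db-instance-identifier aurora-writer \
  --db-instance-class db.r4.xlarge \
  --apply-immediately
```

The point of the scale-up is that the load is a one-off burst: paying for a large instance class for a few hours is what turns a multi-day import into one that beats the binlog retention window.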
The second challenge, live migration with a roll-forward path in case of disaster, was a rather complicated task. First, we created the new Aurora database as a replica of the current database. Once it had caught up with replication, we created another database running the old DB engine as a replica of the new Aurora database, as the diagram shows. It looks simple on a PowerPoint slide, but in reality there are bits and bytes and some dark magic involved. Thanks to extensive testing and planning, we managed to perform the live migration without obstacles, and after a couple of days the roll-forward replica was decommissioned as well.
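As a sketch, such a replication chain can be wired up on RDS/Aurora with the mysql.rds_* stored procedures. All endpoints, credentials, and binlog coordinates below are hypothetical; the coordinates in step 1 would come from the mydumper metadata file, and the old-engine replica in step 2 requires binary logging to be enabled on the Aurora cluster.

```
# 1) Make the new Aurora cluster a replica of the current database.
#    Hostnames, user, password, and binlog file/position are placeholders.
mysql -h aurora-cluster.cluster-xxxx.eu-west-1.rds.amazonaws.com -u admin -p <<'SQL'
CALL mysql.rds_set_external_master(
  'source-db.example.com', 3306,
  'repl_user', 'repl_password',
  'mysql-bin.000123', 4567, 0);
CALL mysql.rds_start_replication;
SQL

# 2) Once Aurora has caught up, make an instance running the old engine
#    a replica of Aurora, giving us a safe roll-forward target.
mysql -h rollforward-db.xxxx.eu-west-1.rds.amazonaws.com -u admin -p <<'SQL'
CALL mysql.rds_set_external_master(
  'aurora-cluster.cluster-xxxx.eu-west-1.rds.amazonaws.com', 3306,
  'repl_user', 'repl_password',
  'mysql-bin-changelog.000001', 4, 0);
CALL mysql.rds_start_replication;
SQL

# Watch replication lag on either replica before cutting traffic over:
mysql -h aurora-cluster.cluster-xxxx.eu-west-1.rds.amazonaws.com \
  -u admin -p -e 'SHOW SLAVE STATUS\G'
```

With this chain in place, cutover is just repointing the application at Aurora once lag reaches zero; if Aurora misbehaves, traffic rolls forward to the old-engine replica, which has been receiving every write all along, so no data is lost.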