Swami Sivasubramanian’s keynote on data and ML kicks off the third day of re:Invent 2021 with the quote “Data is about the survival of the most well informed” before explaining that being informed allows you to respond best to the unexpected. [Tips fedora to shady character lurking in the back of the room “Oh hey COVID”]
As you’d expect with their broad set of data services, AWS is focusing on an end-to-end data and ML journey, which it defines as having capability in data, analytics and ML. One theme that keeps coming through in the data presentations is security and access control.
One of the reasons we acquired privacy specialist TwoBlackLabs was to focus on security and privacy by design up front when building on the cloud. Nothing illustrates these two domains coming together better than data storage, particularly where granular controls are available. Securing data only at the server or database level, using network or OS-focused controls, has been a massive blind spot for many organisations. Granular data controls allow you to define and implement access control policies on individual fields, rows or other levels of the data stack, taking into account who is actually accessing what. Expect this to be a big change in the industry over the next few years.
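To make the idea of granular controls a little more concrete, here is a minimal Python sketch of row-level and field-level filtering driven by who is asking. Everything in it is hypothetical and purely for illustration: the role names, the `Policy` class, the `read_rows` helper and the sample records are my own assumptions, and it is not how Lake Formation or any other AWS service implements this.

```python
from dataclasses import dataclass
from typing import Callable

Row = dict[str, object]


@dataclass(frozen=True)
class Policy:
    """Hypothetical policy: which fields a role may see, and which rows."""
    columns: frozenset[str]
    row_filter: Callable[[Row], bool]


# Illustrative roles only; a real system would load these from a governed catalog.
POLICIES: dict[str, Policy] = {
    "apac_analyst": Policy(
        columns=frozenset({"customer_id", "region", "spend"}),
        row_filter=lambda row: row["region"] == "APAC",
    ),
    "admin": Policy(
        columns=frozenset({"customer_id", "region", "spend", "email"}),
        row_filter=lambda row: True,
    ),
}

# Made-up sample data.
CUSTOMERS: list[Row] = [
    {"customer_id": 1, "region": "APAC", "spend": 120, "email": "a@example.com"},
    {"customer_id": 2, "region": "EMEA", "spend": 300, "email": "b@example.com"},
]


def read_rows(role: str, rows: list[Row]) -> list[Row]:
    """Return only the rows and fields the given role is allowed to see."""
    policy = POLICIES.get(role)
    if policy is None:
        # Deny by default: an unknown role gets nothing.
        raise PermissionError(f"no policy defined for role {role!r}")
    return [
        {key: value for key, value in row.items() if key in policy.columns}
        for row in rows
        if policy.row_filter(row)
    ]
```

Calling `read_rows("apac_analyst", CUSTOMERS)` returns only the APAC record with the `email` field stripped, while `read_rows("admin", CUSTOMERS)` returns both records in full. The decision is made per row and per field based on the caller, rather than by whether the caller can reach the server at all.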
Back to the keynote: a slide pitting moving fast with broad access to data against data governance is displayed, highlighting the potential conflict between security and access. Swami explains that strong data governance up front will in fact help you move fast, which echoes our experience of customers regularly getting tripped up trying to engineer in data governance after the fact.
Managed data services are another big push from AWS and one I personally advocate for. When Lake Formation was launched at re:Invent 2018 it was IMO one of the highlights of the event, as it took a multitude of AWS services and put them together in a manageable and integrated fashion. If we think about Big Data (a term I hate), prior to cloud services the cost of entry was astronomical and didn’t allow for experimentation, making it a tough business case. Managed and integrated cloud-based data services are the key to unlocking data for smaller and smaller businesses - make it easy, make it cost-effective, and make it dynamic.
The Aurora driver customer example is worth a watch if you have a spare 5-10 minutes. It really is one of those future-looking “dream big” examples which I love seeing at re:Invent.
As always there is a swag of new announcements, so let's get into them.
Announcements:
If you haven't heard it enough yet at re:Invent, a large portion of the newly announced services constantly reinforce that there is no coding required. NO CODING IS REQUIRED. Got it yet?
Probably the thing that amazes me most about the AWS Data and ML keynote is that if AWS only did Data and ML, it would still be an enormous business. To draw a strange parallel: when I visited the Boeing factory in Seattle many years ago, the scale was simply overwhelming. The building that housed the assembly line for 747s (yes, it was some time ago) had the capacity for six 747s and then some. A 747 is crazy big when you stand next to it, but the factory is so large it has a climate of its own (Google it). If a large cloud data company is a 747, AWS is the Boeing Everett factory that assembles them.
And I’ll leave you on that note...