Google's new approach to machine learning without the privacy worries
Wed, 28th Aug 2019

As we all know, artificial intelligence (AI) is transforming our digital experiences and it's going to impact almost all parts of the global economy.

A leader in the space is Google, which already infuses artificial intelligence and machine learning into many of its products.

It has also released its TensorFlow software library to the open source community. The library helps developers build machine learning (ML) applications, including neural networks.

A clear example of AI in use is Gboard, an advanced software keyboard for Android and iOS devices that uses AI for predictive typing and suggestions.

AI can bring many benefits to our personal and business lives, and corporations see opportunities for cost savings and productivity improvements.

One of the trade-offs of these kinds of benefits is privacy. This is because, traditionally, artificial intelligence uses enormous amounts of centralised data from users for machine learning.

This training data is what the artificial intelligence spends long periods working through and practising scenarios on, to gain insights and build neural networks. These are then used for automated decision-making.

Back in 2017, Google started experimenting with a new approach, now called ‘Federated Learning’.

The fundamental concept of this is that the learning isn't done centrally, but instead on many user devices.

These devices are currently smartphones, but in the future they could be computers, smart home appliances, smart cars or other internet of things (IoT) devices.

Each of these devices crunches its own training data, and this data isn't shared with anyone else.

The learnings from each of these devices are then sent daily to a parameter server, which aggregates them and redistributes updated learnings to each device.
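
Conceptually, this is the ‘federated averaging’ idea behind the approach: each device computes an update to a shared model from its own data, and the server only averages those updates, never the raw data. The sketch below is a minimal, illustrative Python example under simplified assumptions (a plain linear model, a hypothetical local_update function and fabricated device data), not Google's production implementation:

```python
import numpy as np

def local_update(global_weights, local_data, lr=0.1):
    """Hypothetical on-device step: train on private data and
    return a weight update, never the data itself."""
    X, y = local_data
    preds = X @ global_weights
    grad = X.T @ (preds - y) / len(y)   # gradient of a simple linear model
    return -lr * grad                   # weight delta sent to the server

def federated_round(global_weights, devices):
    """Server-side aggregation: average the updates from all devices
    and redistribute the improved global model."""
    updates = [local_update(global_weights, data) for data in devices]
    return global_weights + np.mean(updates, axis=0)

# Toy example: three devices, each holding private (X, y) data locally.
rng = np.random.default_rng(0)
devices = [(rng.normal(size=(20, 3)), rng.normal(size=20)) for _ in range(3)]
weights = np.zeros(3)
for _ in range(50):
    weights = federated_round(weights, devices)
```

In Google's published design, the updates are also compressed and combined using secure aggregation, so the server never inspects an individual device's contribution in the clear.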

The data that is sent back to the centralised systems isn't personally identifiable in any way.

Hence high-quality machine learning is achieved, without any private information being shared.

The machine learning is, in theory, no better or worse than a traditional centralised training data approach.

All of this can be done on pretty much any smartphone running Android or iOS, and it doesn't require a high-end phone, because the data crunching and transmission happen at off-peak times.

Blaise Agüera y Arcas, a distinguished scientist at Google AI, described this as similar to humans dreaming at night.

“Over a billion phones already have apps on them that use Federated Learning,” says Agüera y Arcas.

A key advantage of this approach is that the training data is very diverse, from lots of different users and devices. However, a disadvantage is that the learning and feedback cycle is a little slower as it needs to wait for off-peak time on the device.

There is also an environmental argument for the technology. With each device crunching its own data locally, far less data needs to be moved around our networks.

This could have real benefits in IoT and smart cars, where vast amounts of data can be continuously generated.

Google has extended its existing TensorFlow open source machine learning library to include this new Federated Learning approach.
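
That extension is published as the open source TensorFlow Federated (TFF) library. The snippet below is a rough sketch in the style of early TFF tutorials; exact function names and signatures have shifted between releases, and the tiny model and fabricated client datasets are assumptions used purely for illustration:

```python
import collections
import tensorflow as tf
import tensorflow_federated as tff

# Fabricated client data: each "device" holds a few private examples locally.
def make_client_dataset(seed):
    x = tf.random.stateless_normal([16, 4], seed=[seed, 0])
    y = tf.cast(tf.reduce_sum(x, axis=1) > 0, tf.int32)
    return tf.data.Dataset.from_tensor_slices(
        collections.OrderedDict(x=x, y=y)).batch(8)

client_data = [make_client_dataset(i) for i in range(3)]

def model_fn():
    # A deliberately tiny Keras model; TFF wraps it for federated training.
    keras_model = tf.keras.Sequential(
        [tf.keras.layers.Dense(2, input_shape=(4,))])
    return tff.learning.from_keras_model(
        keras_model,
        input_spec=client_data[0].element_spec,
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True))

# Federated Averaging: clients train locally, the server averages their updates.
trainer = tff.learning.build_federated_averaging_process(
    model_fn,
    client_optimizer_fn=lambda: tf.keras.optimizers.SGD(learning_rate=0.1))

state = trainer.initialize()
for round_num in range(5):
    state, metrics = trainer.next(state, client_data)
    print('round', round_num, metrics)
```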

The hope is that developers around the world will imagine many new uses for the technology.