Voice control and security

I am surely not the first one to write about this, but I feel the need to write about it anyway.

We saw during a few recent events that our beloved new always-listening devices can take an order from almost anyone (Someone ordered a Whopper? Burger King: OK Google!)

It seems trivial and a bit childish, but when you start integrating many services into a system like that, you may have to think about security.

This can be handled at different levels, from limiting commands to voice-print recognition.

https://xkcd.com/1807/

The first issue that comes to mind, related to several recent events, is that you may want to put some kind of limitation on your Google, Alexa or whatever voice-activated device you are using, just to prevent anyone within listening range from ordering everything on your favorite shopping website. This is pretty simple: you can set up your system so that a voice command only fills your shopping cart, and then waits for your physical confirmation on your phone or laptop.
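To make the idea concrete, here is a minimal sketch of that "cart only, confirm elsewhere" flow. Everything in it is an assumption made for illustration: `Cart`, `handle_voice_command` and `confirm_on_phone` are hypothetical names, not part of any real assistant or shop API, and the PIN stands in for whatever physical confirmation your phone offers.

```python
from dataclasses import dataclass, field


@dataclass
class Cart:
    """Hypothetical shopping cart; not a real shop API."""
    items: list = field(default_factory=list)
    confirmed: bool = False


def handle_voice_command(cart, command):
    """Voice input may only fill the cart, it never places the order."""
    prefix = "order "
    if command.lower().startswith(prefix):
        cart.items.append(command[len(prefix):].strip())
        return "Added to cart. Please confirm on your phone."
    return "Sorry, this command is not allowed by voice."


def confirm_on_phone(cart, pin, expected_pin):
    """Physical confirmation step, done on a second device (assumed PIN check)."""
    cart.confirmed = bool(cart.items) and pin == expected_pin
    return cart.confirmed
```

Anyone in the room can still add a Whopper to the cart, but nothing is bought until the owner confirms on a device that the voice cannot reach.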

Second, you may want to rethink the voice activation of many devices. It has been shown that this can be hacked quite easily (http://www.zdnet.com/article/how-to-hack-mobile-devices-using-youtube-videos/). The good thing is that you can limit what can be done on your phone without providing an unlock code. It depends on your phone, but you should at least check that out.

Then come the issues that are not really solved, at least on publicly sold devices. There are two that I can think of right now: voice printing and content securing.

Voice printing would allow your devices to recognize your voice only, so that no one else can use your device. Pretty simple, and some applications are already providing that, in a limited way. I know, it goes a bit against the current flow of speech recognition, which has been improved up to the point where it does not need to be trained any more. If anyone remembers having to go through hundreds of sentences with Dragon Dictate…

Content securing is the other end of the spectrum: how do you make sure that some content cannot be spoken out loud by your device when it is private? “Siri, how much is there on my bank account?” You might not want Siri to say the amount out loud on the bus, right? I agree, you should not ask the question if you do not want to hear the answer, but still, being able to limit some outputs might provide additional security.
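As a rough illustration of what limiting some outputs could look like, here is a small sketch that routes an answer to the screen instead of the speaker when the question touches a private topic. The keyword list, the `route_answer` function and the `private_audio` flag (think headphones) are all assumptions of mine, not features of Siri or any other assistant.

```python
# Assumed list of private topics; a real device would need something smarter.
SENSITIVE_TOPICS = {"bank", "balance", "password", "salary"}


def route_answer(question, answer, private_audio=False):
    """Return (channel, text): speak the answer, or show it on screen only."""
    cleaned = question.lower().replace("?", " ").replace(",", " ")
    words = set(cleaned.split())
    if words & SENSITIVE_TOPICS and not private_audio:
        # Private topic and a public speaker: do not say it out loud.
        return ("screen", answer)
    return ("speaker", answer)
```

With this, the bank account question asked on the bus would be answered on the screen, while the same question asked with headphones on could still be spoken.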

I have been told about a device designed to provide privacy for your conversations and voice commands: HushMe. I have to admit I am a bit puzzled by it: http://gethushme.com

I am not sure whether it is a real solution, and a viable one 🙂

Sharding your data, and protecting it

I am quite certain that there are many articles, posts and even books already written on that subject.
To be honest, I did not search for any of those. For some reason, I had to figure out sharding almost by myself while building a design for a customer.
So this post is just my way of walking through the process, and confirming that I can explain it again. If someone finds this useful, I will be happy 🙂

Here is the information I started with. We want to build an application that uses a database. In our case, we chose DocumentDB, but the technology itself is irrelevant. The pain point was that we wanted to be able to expand the application worldwide, but also to keep a single data set for all the users, wherever they lived or connected from.
That meant finding a way of having a local copy of the data, writable, in every location we needed.

Having a readable replica of a database is quite standard. You may even be able to get multiple replicas of this kind.
Having a writable replica is not very standard, and certainly not a simple operation to set up.
Having multiple writable replicas… let’s say that even after reading the official guide from Microsoft (https://docs.microsoft.com/fr-fr/azure/cosmos-db/multi-region-writers) it took us a while to fully understand.

As I said, we chose to use DocumentDB, which already lets you create a readable replica with a few clicks.
This is not enough, as we need to have a locally writable database. But we also need to be able to read data that is written from the other locations. What we can start with is a multi-way replica set.
We could have a writable database in each of our three locations, with a readable copy of it in each of the other two regions. Every region then holds three data sets: its own writable one, plus read-only replicas of the other two.

And that is where you have to realize that your database design is over.
Have a closer look at that design. A very close look. And think about our prerequisites: we need a locally writable database, check. We need to read data written from the other locations, check. The last step does not need to be solved with database mechanisms.

The final step is made in the application itself. The app needs to write into its local database to maximise performance, and limit data transfer costs between geographically distant regions. This data will then be replicated, with a small delay, to the other regions. And when the application needs to read data, it will access all of the three sets it has access to in its region, and consolidate the data from all three sets into a single view.
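Here is a minimal sketch of that application-side logic, as I understand it. It uses plain in-memory dictionaries instead of DocumentDB, so `RegionalStore`, `receive_replica` and the region names are hypothetical stand-ins, and the replication that the database would do for us is simulated by hand.

```python
class RegionalStore:
    """In-memory stand-in for the data sets visible from one region."""

    def __init__(self, local_region, all_regions):
        self.local_region = local_region
        # One set per region: the local one is writable, the others are
        # read-only replicas that the database replication would fill.
        self.sets = {region: {} for region in all_regions}

    def write(self, doc_id, document):
        """Writes always go to the local, writable set."""
        self.sets[self.local_region][doc_id] = document

    def receive_replica(self, source_region, doc_id, document):
        """Simulates replicated data arriving from another region."""
        if source_region == self.local_region:
            raise ValueError("the local set is not a replica")
        self.sets[source_region][doc_id] = document

    def read_all(self):
        """Consolidates the three sets into a single view."""
        view = {}
        for region, docs in self.sets.items():
            for doc_id, document in docs.items():
                view[(region, doc_id)] = document
        return view
```

Keying the consolidated view by `(region, doc_id)` is a design choice of this sketch: since each document is only ever written in one region, two regions never fight over the same entry.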

And there it is. Tell me now whether it was me that was a bit thick to not understand that from the Microsoft guide at first read, or please someone tell me that I am not alone in having struggled a bit with the design!

Note: this issue, at least for DocumentDB on Azure, has since been solved by the introduction of CosmosDB, which provides multiple writable replicas of a database, a click away.