For the past 8 months I have worked with the Open Voice Network (OVN) to put together guidelines and best practices for the governance of voice data.


Voice user interfaces are growing rapidly and have great potential for all online businesses. They are also proving very useful for particularly vulnerable categories of users: children, the elderly, users with disabilities, and people with minimal literacy. This in turn requires ethical standards that are particularly high, transparent and truly worthy of users' trust.


Voice still falls short of ethical standards that aim at inclusivity, diversity and gender balance. There is a significant gap to be filled at the training data level. This can be fertile ground for language service providers (LSPs), who have a broad global reach and experience with languages, locales, dialects, accents, inflections, etc.


Data governance is currently anchored on consent between user and service provider. However, voice presents many use cases that can circumvent the guarantees provided by the standard legislation in place:

  1. Voice data is biometric in nature and can reveal a lot of information about the user that goes well beyond identity. Currently, most users are not aware of how much information voice analysis might reveal about them, or of the potential for profiling and discrimination that this opens up.
  2. At the same time, voice user interfaces are particularly prone to accidental data recording: they are often used in public spaces, and within a household they can record people who interact with the UI but have never consented to their data being recorded and processed.

I will discuss bias in data collection, the challenges that voice poses to current regulatory standards, and possible solutions.
 
