Thirteen years after Apple launched the iPhone, smartphones have thoroughly disrupted the computer market, superseding our use of desktop and laptop devices. Among internet users aged 18-44, the number of people who access the web from smartphones exceeds the number who connect from PCs (Statista Global Consumer Survey, Sept 2018). For an array of functions, from messaging to navigation to playing music, smartphones dominate desktop use. The App Store’s open architecture has allowed everyone from private individuals to public corporations to expand the iPhone’s functionality, allowing it to tackle an endless number of tasks from the trivial to the technical.

But the explosive era of growth and innovation that the iPhone unleashed is undoubtedly slowing. Apple has reliably launched a series of sustaining innovations over the last several years, adding features with diminishing marginal utility in the absence of real technological breakthroughs. “Is this the end of the age of Apple?” tech columnist Kara Swisher asked in the New York Times in 2018. “We need the next wave of innovation, and we need it now.” That next wave may have arrived in the form of AI-enabled voice assistants. Whether it is Alexa, Google Assistant, Siri, or Cortana that ultimately wins out, voice will precipitate a paradigm shift in how humans interact with technology, essentially creating a new kind of operating system. Here are three reasons that voice-based operating systems will be the next disruptive platform innovation.

Voice will outperform current technologies on key elements of user experience.

In the early days of computing, the concept of a “desktop” with “folders” and “files,” and a nearby “trash bin,” was welcomed as a more familiar way to interact with computers than the complex commands and lines of code that preceded this metaphor. More recently, touchscreens have edged out physical keyboards and mice because they, too, present more intuitive ways for humans to tell technology what to do. What will happen when you can simply speak any command to a computer and have it executed? Screens full of text and icons have always been abstractions of the processing that’s going on behind the scenes. Voice-based interaction removes the need for visual interfaces or physical inputs at all, ushering in an era of more natural, instinctive human-to-computer communication.

Voice will compete against non-consumption, creating access for people who can’t interact with current technologies.

What Facebook has foreshadowed with its heavy marketing of Portal to seniors is actually a much larger opportunity. Some 16% of the world’s adult population is illiterate, with regional rates reaching as high as 30%. Removing manual typing and text-based interaction as prerequisites for communicating with computers will enable millions of people who previously lacked access due to physical, educational, or other barriers to come online. Even for those who comfortably use smartphones and computers today, voice technology will open up new markets in hands-free contexts, like driving and cooking, which are underserved by existing screen-based technologies. In fact, the car is already the second most frequent setting for voice assistant use (after mobile phones), with 45% of US adults speaking to one or more devices while driving. That is nearly double the largest in-home segment, smart speakers, at 23% (CB Insights, 2018). Meanwhile, 34% of smart speaker owners in the US today turn to their device for help cooking at least once a month (Statista).

It is already “good enough.”

Smart speaker adoption has grown dramatically since Amazon first introduced Alexa in 2014. At the time, the technology felt gimmicky, its primary use cases being things that were arguably easier to accomplish manually, such as turning the lights on and off. Then, in her 2016 internet trends report, venture capitalist Mary Meeker (at the time a partner at Kleiner Perkins) declared that accuracy and latency for voice AI were “finally at acceptable levels” (Information Age). By 2018, over 118 million homes had installed smart speakers, a 78% increase from the year prior (CB Insights). That same year, Google CEO Sundar Pichai demoed a recording of Google Assistant phoning a hair salon to book an appointment. In an interaction that The Verge described as “jaw-dropping,” Assistant seamlessly incorporated thoughtful pauses, and even a natural-sounding “mmhmm,” into its request. The person at the hair salon clearly had no idea they were talking to a piece of technology, and Pichai’s audience had to be reminded of the same. This level of ease in all of our voice interactions may still be years away. But while the majority of smart speaker use cases remain straightforward tasks like checking the weather or setting an alarm, a growing share involve more complex interactions such as “access my calendar” and “make a purchase” (Statista, 2019). As the technology advances and more third-party skills are developed, this universe will expand, just as it did for smartphones.

It’s no surprise, then, that Amazon, Google, Facebook, Apple, and Microsoft are devoting enormous amounts of energy and capital to developing voice technologies. True, they may be the only companies with the required resources and scale. But they’re also keenly aware that they are the ones who stand to lose the most from the coming voice revolution. They’re all perfectly motivated to develop the next winning platform…before it eats their lunch.

Shaye Roseman is a former Research Associate at The Forum for Growth & Innovation and a member of the HBS Class of 2021.

Photography by Kevin Bhagat.