- 26 Feb 2019
- Managing the Future of Work
AI and the value of expertise
Joe Fuller: From the hand-forged tools of the Iron Age, to the steam-powered looms of the Industrial Revolution, to the cloud-based solutions of today, technology has throughout history made people more productive at work. But new technologies powered by machine learning are leading many to question whether work will require humans at all in the future. And, companies are going to face hard choices deciding whether and how to deploy those technologies.
Welcome to the Managing the Future of Work podcast. I'm your host, Harvard Business School professor and visiting fellow at the American Enterprise Institute, Joe Fuller. Today I'm speaking with my colleague, Raj Choudhury, who studies the implications of new machine learning tools. He'll give advice to companies implementing those technologies and share findings that many may find surprising about the impact AI will have on the future of work. Welcome to the podcast, Raj.
Raj Choudhury: Hi Joe, good to be here.
Fuller: Raj, you study how new technology affects productivity, and that's a subject that's bandied about a lot in the press, by politicians, by opinion leaders. What do you think are the biggest misconceptions people have out there about productivity and how technology's going to affect it?
Choudhury: So, recently I've been studying how AI and machine learning would affect the productivity of knowledge workers, and if you read the press and even some academic or semi-academic work, there seems to be this scare of massive job loss.
Fuller: Yeah.
Choudhury: And humans being out of work and will machines take over our lives and our livelihood.
Fuller: 50% of jobs disappearing, and things like that.
Choudhury: Correct. I would dare say that at best these fears are overblown, and we need to do a lot more rigorous empirical work to figure out the complementarities between technologies such as AI and human capital.
Fuller: Certainly our research in a number of areas has indicated that technology seems to be creating new jobs that have a somewhat different content than the old work and may be more demanding and require a little bit more comfort with information devices or data, but the jobs still exist.
Choudhury: That's correct. So, I think just to back out, the academic research on AI and machine learning is still pretty early stage, and the consensus is that where AI and machine learning are helpful is in providing a more accurate and, in many cases, faster prediction that can be used by human managers to inform decision making. But as you said, humans still need to make the decisions. And what my research has looked at is how humans should use these tools to arrive at more accurate, more helpful predictions.
Fuller: Mm-hmm. I know you've done a very intriguing study of MBA students here in our school and others, could you just describe that to us and what some of your key findings were?
Choudhury: What we did is we took a group of MBA students and tried to turn them into novice patent examiners and we gave half of them a machine learning tool to conduct their patent examination and the other half got the regular tool which is Boolean-search-technology based.
Fuller: So, a patent examiner is someone that receives a patent application and evaluates its merits?
Choudhury: That's correct.
Fuller: Okay.
Choudhury: What happens in the patent office is inventors or firms would submit a patent with certain claims and the patent examiner would examine these claims against prior art and prior art being not only prior patents but any published material relating to those claims. And they use search technologies to conduct that search. And so, the claim is machine learning can make that search much faster and [more] accurate. And, that was the experiment that we tried to run. And, the question was whether ML, machine-learning-based search technologies would turn these MBA students, who you could think of as novice patent examiners, into more-expert examiners.
Fuller: How'd you structure the research and what kind of comparisons did you run?
Choudhury: So, what we did is we took about 220 MBA students here and we gave them all five patent claims, real patent claims, to examine. And, these are claims that had been examined prior, in the patent office, and the examiner, the US patent office examiner, actually had rejected these five claims based on prior art. But the students didn't know that.
Fuller: Okay.
Choudhury: And so what happened was, half of these students got a search technology based on machine learning, where the machine-learning technology read the five claims and, based on something called semantic search, identified prior art relevant for those five claims. And the other half of the students got the regular Boolean search technology that's actually used even today inside the patent office.
Fuller: So could you, for our listeners, just define Boolean? I think of it as essentially the underlying logic of a Google-type search engine.
Choudhury: That is correct. That is correct.
Fuller: So, you would type in key phrases and words and you just get unprocessed responses.
Choudhury: That's correct.
Fuller: Okay.
Choudhury: So, you would compose a search string, enter that into a Google-like interface, and then the interface would work on the back end and find relevant patents that comprise those keywords and would return those documents, often hundreds, maybe even thousands of documents.
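To make the contrast concrete, here is a minimal, purely illustrative Python sketch of the difference between Boolean keyword matching and an embedding-style semantic search over a toy prior-art list. This is not the USPTO tool: the `embed` function is a toy stand-in for a learned sentence-embedding model, and the documents and query are invented for illustration.

```python
# Illustrative toy only (not the USPTO system): Boolean keyword matching
# versus a crude similarity-based "semantic" search over made-up prior art.
import numpy as np

prior_art = [
    "A cylindrical object with rubber mounted on an axle",  # a "wheel", worded evasively
    "A method for steam-powered textile weaving",
    "A rotating wheel assembly for a bicycle",
]

def boolean_search(query_terms, corpus):
    """Return documents containing ALL query terms, as a Boolean interface would."""
    return [doc for doc in corpus
            if all(term.lower() in doc.lower() for term in query_terms)]

def embed(text):
    """Toy stand-in for a learned sentence-embedding model: a normalized
    bag-of-letters vector, just so the sketch runs on its own."""
    vec = np.zeros(26)
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord("a")] += 1
    return vec / (np.linalg.norm(vec) + 1e-9)

def semantic_search(query, corpus, top_k=2):
    """Rank documents by cosine similarity to the query embedding."""
    q = embed(query)
    scored = [(float(embed(doc) @ q), doc) for doc in corpus]
    return sorted(scored, reverse=True)[:top_k]

# Boolean search misses the evasively worded claim unless the right keyword is added:
print(boolean_search(["wheel"], prior_art))
# Similarity-based search can still rank the evasively worded document highly:
print(semantic_search("wheel mounted on an axle", prior_art))
```

The point of the toy is simply that Boolean search only returns documents that literally contain the query terms, while a similarity-based search can surface prior art that is worded differently.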
Fuller: Right. So, presumably the argument is that machine learning protocol would really narrow the scope of the previous patents that were proffered to the examiner to study and make it a much more efficient task.
Choudhury: So, we didn't have an exact hypothesis of how this would go, but you could argue that this predictive technology based on machine learning would return more precise prior art and make the job much easier and if you extended the argument, you could even argue in the spirit that we started the conversation with, that at some point maybe the examiner is not even needed. The machine learning technology would just read…
Fuller: …could just take over.
Choudhury: Yup. So, that's the sort of place where we started the experiment.
Choudhury: So, what happened was we gave half of the group using the machine-learning tool, as well as half of the group using the Boolean search tool, some advice from a real patent examiner at the patent office. So, what was this advice? It comprised search terms, keywords…
Fuller: Mm-hmm
Choudhury: …to be added to the search. The headline finding of our study is, without this advice, no student conducted the correct examination, quote-unquote, "correct examination." So, this advice was instrumental, was absolutely critical to find the relevant prior art that you needed to find.
Fuller: So, the experience and the expertise of the patent examiner materially improved the productivity of these novice examiners, who are students?
Choudhury: Correct. So the experience and the knowledge of the senior patent examiner which we essentially call domain-specific knowledge, really helped improve the accuracy of the search, not only for the machine-learning group, but also for the Boolean group. So, in a sense you could think of this expertise or domain-specific knowledge being critical for any kind of search involving either kinds of technology.
Fuller: When you think about why it works that way, what kind of hypothesis do you have?
Choudhury: So, that's a great question. So, I've spent a fair amount of time interviewing senior patent examiners of the patent office. And there's actually academic research also related to what I'm going to say. So, what happens in the patent office is, you could imagine, the patent lawyers and the firms submitting claims [are] trying to secure the broadest possible patent that they can, because that obviously has commercial value to them. And to do that, the mechanism they employ is language, [the] language of the patent claims. And the problem gets really complicated because, by law, the patent office in the US allows every assignee, every company or inventor submitting a patent…
Fuller: Submitting one, mm-hmm.
Choudhury: ...to be her own lexicographer. So, in the patent text, instead of saying wheel, you could actually say cylindrical object with rubber. And, that's completely legal.
Fuller: Got it.
Choudhury: The patent lawyers are framing the language in a certain, let's call it strategic way.
Fuller: Mm-hmm.
Choudhury: So the job of the patent examiner is to really read between the lines and figure out what's the missing set of keywords that needs to be added to the search. And so that's what we validate through this experiment.
Fuller: So, this is an exotic form of cryptography where you're trying to break the code of the person submitting, who's trying to get a very expansive, all-exclusive patent?
Choudhury: You could think of this in econ language as a kind of information asymmetry between the two…
Fuller: Right.
Choudhury: …agents involved.
Fuller: Mm-hmm.
Choudhury: And so the patent examiner, through her years and years of doing this kind of searching and examination, really knows how to play this game…
Fuller: Mm-hmm.
Choudhury: …and really can bring to the table the expertise of adding critical search terms to the search process.
Fuller: So, it's that accumulation of experience and what I would think of as almost the heuristic of how a patent is written and what to look for that accounts for the productivity improvement from experience?
Choudhury: That's correct.
Fuller: Yeah.
Choudhury: This expertise is developed over many, many years, and it's often developed through very informal conversations with colleagues. So when I was at the patent office, senior examiners told me that earlier, in Alexandria, they had this enormous basement full of prior art, prior patents, which they called “The Shoes.” And it was called “The Shoes” because President Jefferson used to be a patent examiner, and the story goes that he kept his prior art in his shoes.
Choudhury: So whenever you had this examination going on but you didn't know where exactly to look, you came down to “The Shoes.” You chatted with your colleagues. You got these suggestions from senior patent examiners of keywords to add. And this is a classic [example] of what we economists or strategy scholars would call absorptive capacity building.
Fuller: Mm-hmm (affirmative).
Choudhury: That by doing something repetitively over time, you develop the expertise and the knowledge of conducting that kind of a search.
Fuller: And there was, in the patent office, an institutional mechanism for that [to] get shared, and a network to be created at the same time.
Choudhury: That's correct, yes.
Fuller: Were there other attributes of the patent office that made it an attractive research site? How did you come to pick them as opposed to insurance underwriters or economics professors?
Choudhury: So, partly, to be honest, it was opportunistic. I was writing a case with the patent office, and they had this machine learning tool sitting on the table. It was not being used, actually, in the patent office, because they have a very influential union called POPA, and they were still in discussions with POPA on how to use this tool. But then we made an offer saying: Can we run the experiment at our end? And it was a win-win for the patent office and for the research team.
Choudhury: But I think, more generally, patent examination is a fairly cognitive task which involves this element of information asymmetry, where the patent lawyer is not, under all circumstances, revealing all the information. So, that made it really attractive to study patent examination.
Fuller: Well, as you said, being a patent examiner is a very cognitive task. We often think about work as being divided along a line of demarcation between primarily cognitive work and non-cognitive, physical-type work. Do you have any hypotheses about how this might extend outside the realm of highly-cognitive, intellectually rigorous tasks to other types of work?
Choudhury: My hypothesis is, and I am continuing to work on this as we speak, but my prior is that for tasks which are routine, where the past also informs the future, for instance, tasks such as facial-image recognition. So, I have an ongoing project where we're using facial-image-recognition algorithms to code the facial emotions expressed by CEOs during their interviews.
Choudhury: In such tasks, machine learning algorithms are great, because they enable you to do this task of predicting emotions way faster than human coders could do, and way cheaper. But if the task is not routine, if there's an element of human agency, it would be very, very difficult to replace the human completely with just the algorithm.
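As a rough illustration of the kind of coding task described, here is a hedged Python sketch that samples frames from an interview video and tallies predicted emotions. The frame handling uses OpenCV as an assumed choice, and `classify_emotion` is a placeholder for whatever pretrained facial-expression model a team might plug in; none of this is the research team's actual pipeline.

```python
# Hedged sketch of an emotion-coding pipeline: sample frames from an interview
# video and tally predicted emotions. OpenCV handles the frames here;
# classify_emotion is a placeholder for a pretrained facial-expression model.
from collections import Counter
import cv2

def classify_emotion(frame) -> str:
    """Placeholder: a real implementation would run a pretrained
    facial-expression model on the frame and return a label."""
    return "neutral"  # stub so the sketch is runnable end to end

def code_interview(video_path: str, every_n_frames: int = 30) -> Counter:
    """Count predicted emotions, sampling one frame per `every_n_frames`."""
    counts = Counter()
    cap = cv2.VideoCapture(video_path)
    frame_idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if frame_idx % every_n_frames == 0:
            counts[classify_emotion(frame)] += 1
        frame_idx += 1
    cap.release()
    return counts

# Example (hypothetical file name):
# print(code_interview("ceo_interview.mp4"))
```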
Fuller: As a former CEO, I'm increasingly delighted not to be in that game, where soon you'll be able to infer something about my sincerity or my confidence by running an algorithm on my facial expressions.
But, I've thought about routine work as types of jobs that have low variability around what's expected of the worker. Is that a good definition? Do you think that works as well?
Choudhury: So there are probably multiple definitions of routine vs non-routine tasks. But, I think, in this context of where machine-learning algorithms could potentially substitute [for] humans, I would say: If the task doesn't have uncertainty, if the task in the future is very similar to the task in the past, where the past training set data could predict the future – so, you know, it's easy to predict a smile or sadness in the future by looking at pictures of people smiling or being sad in the past, so I think that's pretty consistent temporally – such tasks would be very amenable to machine learning algorithms just being used on their own. But if the task has uncertainty, or has informational asymmetry, or any other form of human agency attached to it, I don't think the human could be substituted. That's my prior.
Fuller: I know elsewhere in your research you've thought about the impact of AI on building and testing theories. Could you talk a little bit about that?
Choudhury: Right. So we have a working paper again actually trying to make a case for how machine learning algorithms could be helpful for my field of strategy research. The overall idea in that paper is that theory building, in our world, but I would also extend that by saying theory building in several real-world applications such as strategy consulting, has traditionally been done in a very top-down way, in what we would call deduction. That you have the theory or hypothesis, you take the hypothesis to the data, you test it, and if it fails then you come back and try a different theory, and so on and so forth.
Now what machine learning does is provide the researcher or the manager or the CEO or the strategy consultant with a complementary way of developing theory, and that is bottom-up, something that we would call induction, where you would let the data speak. Machine learning could help reveal really interesting patterns in the data. And you could use those patterns in the data then to generate new hypotheses or new strategies or new recommendations for your strategy clients, for instance, that the top-down approach would not allow you to do.
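A minimal sketch of this bottom-up idea, under the assumption that something as simple as k-means clustering can stand in for "letting the data speak": the clusters it surfaces are only candidate patterns, which an expert would then turn into hypotheses worth testing. The firm-level data below is synthetic and purely illustrative.

```python
# Synthetic sketch of bottom-up pattern discovery: cluster toy firm-level data
# and report each cluster's profile. The clusters are candidate patterns only;
# an expert decides whether they are hypotheses worth testing.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Synthetic firms described by [R&D intensity, profit margin]
firms = np.vstack([
    rng.normal([0.15, 0.05], 0.02, size=(30, 2)),  # heavy R&D, thin margins
    rng.normal([0.02, 0.20], 0.02, size=(30, 2)),  # light R&D, fat margins
])

labels = KMeans(n_clusters=2, random_state=0).fit_predict(firms)
for k in range(2):
    group = firms[labels == k]
    print(f"cluster {k}: mean R&D {group[:, 0].mean():.2f}, "
          f"mean margin {group[:, 1].mean():.2f}")
# The two profiles are correlational patterns the data "spoke" about,
# not causal claims.
```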
Fuller: How do you relate that to what we were saying earlier about the role of expertise? This seems to be the episode where my previous life as the CEO of a consulting firm is going to come up a lot, and we certainly practiced that deductive logic. With the clients I was serving, that deductive logic was very often rooted in my expertise of having worked for dozens of companies over three decades. How does this inductive versus deductive paradigm compare with what you learned in the patent examiners study?
Choudhury: It's a really interesting question. So, I think the human expertise would be paramount in several ways, but let me just try to explain a couple of things. So first off, the way I think about induction is really an exploration, it's a first step, right. So, you could use these machine learning tools to come up with these really interesting patterns in the data. But, then the strategy consultant would jump in and say hey, that won't work in this industry.
Fuller: Okay.
Choudhury: Based on, as you described, the years and years of experience and expertise of knowing this industry…
Fuller: Mm-hmm
Choudhury: …or just jumping in and saying, that would completely work…
Fuller: Mm-hmm
Choudhury: …and we should double down on this. So, I think that is where the prior experience of the human really would come into play. But also in terms of telling the machine-learning tool what to look for, what kind of inductive patterns might be helpful to the strategy consulting team.
Fuller: So, use the inductive reasoning to inform the expertise of a deductive thinker or have that deductive thinker essentially query the AI in terms of: “look for patterns like this, are you seeing them?”
Choudhury: Absolutely.
Fuller: So, in some ways we're talking about AI as a different way to think about causality and the relationship between variables, and harnessing simultaneously what machine learning can do and what the individual brings in perspective, expertise, and experience.
Choudhury: So, I would caution and say that AI on its own, and machine learning, does not lend itself to causal analysis. It just reveals an interesting correlation in the data. But that knowledge of an interesting correlation may be super helpful because, if I just did a top-down analysis of the data, I might miss that interesting pattern. And so this is the data speaking for itself in a correlational way. And then I could take that correlation and then maybe run a small experiment to see if it's a causal…
Fuller: Mm-hmm
Choudhury: …relationship or not. But I think it's a really interesting way of combining the top-down with the bottom-up.
Fuller: Yeah. There's almost a metaphor to lab-based sciences, where you have a hypothesis generated by data, but then you want to recreate it on the lab bench and validate that hypothesis through experimentation or render it invalid.
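A toy Python sketch of that workflow, with invented numbers: an inductively surfaced correlation (say, a new page appears associated with higher spend) is then checked with a small randomized experiment, here summarized with a simple t-test as a stand-in for whatever causal method a team would actually choose.

```python
# Toy follow-up experiment: check an inductively surfaced correlation with a
# small randomized test. Numbers are invented; a t-test stands in for the
# causal-inference method a team would actually choose.
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(2)
# Users randomly assigned to control / treatment; outcome is spend:
control = rng.normal(100, 15, 200)
treatment = rng.normal(104, 15, 200)  # assumes a real +4 effect, for illustration

t_stat, p_value = ttest_ind(treatment, control)
print(f"estimated lift: {treatment.mean() - control.mean():.2f}, p = {p_value:.3f}")
# A small p-value supports treating the correlation as causal;
# otherwise it remains an interesting but unproven pattern.
```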
Raj, the many executives we meet – who come to our school and visit, or whom we see when visiting companies, writing about them, advising them – are all very anxious about the subject of AI. Many of them read the same papers with the breathless predictions we talked about at the opening of this episode. What would be two or three pieces of advice you'd give to a manager about how they should be thinking about this and what they should be looking for?
Choudhury: So first off, I would say we should think about AI and machine learning as an opportunity and not a threat, because of this extreme ability of AI to point out these really interesting patterns in the data. We live in the age of big data. So, machine learning, in a very cost-effective way, in a very time-effective way, can reveal these really interesting patterns in the data and could enable things like personalized pricing or a personalized experience for customers that we couldn't do earlier. So, think of these as tools that offer really new, interesting opportunities. And second, instead of thinking of experts and AI as two horses and trying to figure out which horse will win the race, I would almost imagine that the right AI with the right training data set in the hands of experts can be super powerful. So, experts should actually benefit in many tasks much more from the presence of these tools than novices.
Choudhury: So, we'll probably need people, data scientists, who can work with experts so that the experts can get what they need from the tools. But these are really tools. I think these tools offer great new opportunities for inductive analysis, and they're super valuable in the hands of experts. So, instead of experts being phobic or scared about these tools, we should translate these tools for the experts and tell them what they mean and really ensure that happy marriage.
Fuller: Think about how to make them productive for the experts as opposed to a substitute for the experts.
Choudhury: Correct.
Fuller: Raj, another question about AI and about the algorithms that underlie it: As you know, my field of study for this podcast is the future of work, and in that area there's a lot of discussion about implicit bias being built into algorithms unconsciously, subconsciously, by the people composing them. Do you have any observations about that and the extent to which it is a threat?
Choudhury: So, I think it's a really important question and I can only offer really high level thoughts on this.
Fuller: Uh huh.
Choudhury: But, even going back to the context I studied, I would argue that a sort of implicit bias got baked into the machine learning tool that the US patent office developed, because the patent office tool really was searching based on the keywords present in the patent text, which was being composed by the lawyers, and not looking for keywords beyond that text.
Fuller: Mm-hmm (affirmative).
Choudhury: Now, that's not a deliberate bias…
Fuller: Right.
Choudhury: …but it's a very implicit way of thinking about the bias issue. But, I think the general point is, if the machine learning algorithm is employing either past data which is biased for whatever reason it might be…
Fuller: Right.
Choudhury: …or the algorithm is only conducting its search or prediction in a biased way for whatever reason it might be, then the result and the output would also be biased. It could be a statistical bias, it could be a deliberate bias, but that bias would percolate from the past training data set into the future prediction.
Fuller: Mm-hmm (affirmative).
Choudhury: And I think we should be really aware of these possibilities and that's why we need humans, to really think this through, find the correct past data, find the correct way of search, and really generate a meaningful prediction.
Fuller: So, if I were to ask a data set to give the explanatory variables for the composition of the boards of directors of highly successful companies since 1950, just from the data set alone it would be hugely biased towards white male businesspeople, since they would be the vast majority of directors from 1950 probably through 1990. And if I didn't bring the expertise to bear to control for that, I'd get misleading data.
Choudhury: That's true.
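A small synthetic illustration of the mechanism being described, that bias in historical training data percolates into predictions: the selection labels below are generated to favour one group, and a model trained on them then scores equally qualified candidates differently. All data and the model choice are invented for illustration.

```python
# Synthetic illustration of bias percolating from training data into predictions.
# Historical "selected" labels favour group A; a model trained on them then
# scores equally skilled candidates differently by group. All numbers are made up.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 2000
group = rng.integers(0, 2, n)          # 0 = group A, 1 = group B
skill = rng.normal(0, 1, n)            # same skill distribution in both groups
# Historical selection depended on skill AND on group membership (the bias):
selected = (skill + 1.5 * (group == 0) + rng.normal(0, 0.5, n)) > 1.0

model = LogisticRegression().fit(np.column_stack([group, skill]), selected)

# Two equally skilled candidates, differing only in group:
candidates = np.array([[0, 1.0], [1, 1.0]])
print(model.predict_proba(candidates)[:, 1])  # group A gets a much higher score
```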
Fuller: Got it. Well, Raj, thanks so much for joining us and sharing the insights from your very innovative research here at HBS. It's really marrying deep academic expertise with practical problem solving, which is what we pride ourselves on here, so thanks for sharing with us.
Choudhury: Thank you so much Joe, I really appreciate it.
Fuller: From Harvard Business School, I'm Professor Joe Fuller. Thanks for listening.