Are We Giving Google More Than We Get?
A new study asks whether personalized search results are worth the private data we give up
Hi, this is Steven Cherry for IEEE Spectrum’s “This Week in Technology.”
Remember when you first discovered Google? There were other search engines—AltaVista, anyone?—and then Google came along with its uncanny ability to show you the exact information you wanted. And the deal Google implicitly proposed was a no-brainer. They catalog the entire Web, give you this totally clean page where you enter your query, answer your questions nearly perfectly and...well, they’ll figure out how to get paid somehow.
And they did. With advertising links. Then sponsored links. Then Google-run advertising on other sites. Then personalized search, with advertisements tailored to whatever Google knew about you. And so the implicit deal changed. At the same time, the rise of social networking means that we’re going to other places than Google for information we used to use search for—Yelp for restaurants, Pandora for music, Netflix for movies, and our Facebook friends for just about everything.
So Google is now trying to figure out how to include friends’ recommendations in its search results. The new +1 button is just the start. And so the deal with Google—still pretty much entirely implicit—is changing yet again. It’s delving into not only our search behaviors, but our social behaviors as well—our preferences, our friends, our friends’ preferences. That raises questions of privacy. And some people will stop using Google entirely—just as there are people who won’t use Facebook. For others, though, the question is a bit more pragmatic: Is Google giving us real value for the privacy we’re giving up?
My guest today decided to take a hard look at that question, with a somewhat soft research study. Martin Feuz is a researcher at the Center for Cultural Studies at the University of London, though he joins us by phone from Zurich. He and two colleagues have done a study that takes a first shot at answering the question: How useful is personalized search? His preliminary answer: Not so much.
Martin, welcome to the podcast.
Martin Feuz: Hi, Steven. Thank you for having me here today.
Steven Cherry: Let’s start with personalized search itself. What exactly is it and why study it?
Martin Feuz: So the reason we set out to study this is that relatively little is known about it. Especially if you look at the interface today, there is still no indication which of the search results are in fact personalized and on what basis. And after all it’s been well documented that Google massively collects user data, and that personalization potentially has a variety of detrimental effects and unintended consequences such as social sorting or the generation of so-called echo chambers.
Steven Cherry: What exactly is an echo chamber?
Martin Feuz: So an echo chamber is essentially the effect that, based on your selections of a more personalized view over time, you’ll only be exposed to the kinds of views that you are already in accordance with. And due to that, over a larger group of people, that would lead, as is being discussed in these theories, to a more kind of group stance, less engaged kinds of dialogues in public. So people over time would be much less able to discuss or come to common ground on issues that affect the larger public in the country, for example.
Steven Cherry: All right, so let’s turn to your research study and maybe start by telling us how you set it up.
Martin Feuz: Right. So essentially in terms of the setup we created search histories for three philosopher personas and compared their search results with those of anonymous users. This allowed us to identify which of the search results for the personas were personalized. To create the search histories, each of the three personas got a Gmail account, which has 34 of the Web history check boxes selected, which is in fact the service that then leads to personalized searches also. What we then did is we performed so-called training sessions for the philosopher personas, and the search queries for each of the training sessions were based on the index of one of the philosopher’s books. Each search query was simultaneously performed for an anonymous user, that is, one without the log-in credentials to a Google service, for example, or a cookie. There were seven training sessions for each persona, amounting to approximately 6000 search queries per persona. And that philosophy reflects, you know, an average use of two to four years of an average Google user, let’s say.
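Editor's note: the comparison Feuz describes can be pictured with a minimal sketch. It is not the study's code. It assumes that a result counts as personalized when it appears in the persona's result list but not in the anonymous user's list for the same query, and that a query counts as personalized when it has at least one such result. The function names, queries, and URLs are hypothetical.

```python
from typing import Dict, List


def personalized_results(persona: List[str], anonymous: List[str]) -> List[str]:
    """Return the URLs shown to the persona that the anonymous user did not get.

    Assumption: "personalized" means "present for the persona, absent for the
    anonymous user"; rank changes of shared URLs are ignored in this sketch.
    """
    anonymous_set = set(anonymous)
    return [url for url in persona if url not in anonymous_set]


def personalization_rate(
    persona_runs: Dict[str, List[str]],
    anonymous_runs: Dict[str, List[str]],
) -> float:
    """Fraction of shared queries with at least one personalized result."""
    queries = [q for q in persona_runs if q in anonymous_runs]
    if not queries:
        return 0.0
    hits = sum(
        1 for q in queries
        if personalized_results(persona_runs[q], anonymous_runs[q])
    )
    return hits / len(queries)


# Hypothetical data: two queries issued at the same time by both users.
persona_runs = {
    "categorical imperative": ["a.example", "b.example", "c.example"],
    "noumenon": ["x.example", "y.example"],
}
anonymous_runs = {
    "categorical imperative": ["a.example", "b.example", "d.example"],
    "noumenon": ["x.example", "y.example"],
}

print(personalized_results(persona_runs["categorical imperative"],
                           anonymous_runs["categorical imperative"]))
print(personalization_rate(persona_runs, anonymous_runs))
```

Running both users' queries simultaneously, as described in the interview, matters for a comparison like this because it keeps ordinary changes in the index from being mistaken for personalization.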
Steven Cherry: So just to be clear, you know, you said that you created these fake profiles of three philosophers. You actually picked three pretty famous philosophers from history right—Immanuel Kant, Friedrich Nietzsche, and Michel Foucault. How did you come to pick them?
Martin Feuz: That’s a very good question. I’m glad you asked me that. In setting up this study, I had to kind of think of a way in which, if I would make the data available to other researchers, they would also be able to come up with an albeit subjective opinion whether in fact they believe what they see as Google personalized search results for these philosophers are in fact any good or not. So I tried to kind of set up a persona that can be understood, what it vaguely reflects in terms of its search interests, and that seems to have worked out quite well. And what seems interesting in a way is also that these three guys—Kant, Nietzsche, and Foucault—they do have in fact rather idiosyncratic characteristics, as reflected in the search queries that we assembled from their book indexes, and what I find interesting about that is that I believe, at least, and I can only speculate, but I believe the findings that we show might reflect only quite a conservative activity of personalization. That is to say that if we were to take a search history or Web history for a more natural type of real person, I wouldn’t be terribly surprised that the intensity of personalized search results would be quite a bit higher, if not a lot higher actually.
So the first hypothesis is that, according to Google’s official statement, personalization is subtle and at first you as a user may not notice any difference. And our findings quite clearly indicate that Google personalization does kick in quite early, already after 10 search queries, and from then on steadily increases in terms of frequency. At about 1600 search queries, on average every third search query was served with personalized search results. And already after about 3000 search queries, which is, you know, a year’s or two years’ worth of an average person’s search history, more than every second search query we saw having personalized search results in some cases, and that is anything but subtle.
So the second hypothesis we tested is that the more user search history is gathered by a service such as Google personalized search, the more long-tail content is retrieved. That is to say, the more Google knows about you, the more relevant deep-down kind of search results you will be served. And again our findings indicate quite otherwise. In our tests we found that 37 percent of personalized search results were actually found on the second page of search results when compared to the anonymous, unpersonalized user, and another 43 percent were in fact from the first 100 search results from the anonymous user perspective. So 80 percent of these personalization effects can be said to represent rather dominant voices in terms of search rankings, so not much long-tail content showing up there. And lastly, the third hypothesis is that personalization in fact represents only the individual user’s past search and Web history, and that would mean that Google does not serve personalized search results for queries for which it has no direct means of assessing potential relevancy based on what they know about the users based on their factual search and Web history. And again we had to dismiss the hypothesis, in that our findings indicate that in fact all these search queries were, in fact, personalized by Google, and we believe that should not have been the case.
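Editor's note: the long-tail finding rests on a simple tally, 37 percent plus 43 percent giving the 80 percent Feuz cites. The sketch below shows one way such a tally could be made. It is not the study's code. It assumes 10 results per page, assumes the anonymous user's ranking was collected down to rank 100, and reads "the first 100" as ranks 21 to 100. The bucket names and data are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def provenance_bucket(url: str, anonymous_ranking: List[str],
                      page_size: int = 10, depth: int = 100) -> str:
    """Say where a personalized URL sits in the anonymous user's ranking."""
    top = anonymous_ranking[:depth]
    if url not in top:
        # Only these results would count as long-tail content.
        return "outside top 100"
    rank = top.index(url) + 1  # ranks are 1-based
    if rank <= page_size:
        return "first page"
    if rank <= 2 * page_size:
        return "second page"
    return "ranks 21-100"


def bucket_shares(urls: List[str], anonymous_ranking: List[str]) -> Dict[str, float]:
    """Share of personalized URLs that falls into each bucket."""
    counts = Counter(provenance_bucket(u, anonymous_ranking) for u in urls)
    total = sum(counts.values())
    return {bucket: n / total for bucket, n in counts.items()} if total else {}


# Hypothetical anonymous ranking of 100 results and four personalized URLs.
anonymous_ranking = [f"u{i}" for i in range(1, 101)]
personalized = ["u15", "u50", "u60", "not-ranked"]

print(bucket_shares(personalized, anonymous_ranking))
```

Under these assumptions, a high combined share for the "second page" and "ranks 21-100" buckets means personalization is mostly promoting pages that already rank well, which is the point Feuz makes about dominant voices.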
Steven Cherry: So maybe you could just summarize your conclusions, because it sounds like you found that the differences were significant but I think you also concluded that as far as the trade-off is concerned there really wasn’t much value to the users.
Martin Feuz: Yes, if you’re speaking in terms of the trade-off between how much personal information we give Google for the use of their services, as compared to the benefits that arise based on personalization, that is in fact how we see it. And the key question here really seems to be, what means do users have available today to assess on their very own terms whether they strike a good deal or not in terms of giving up information and receiving a better service. And if you look at the interface, currently there is no means available to the user by which he or she can actually make up her mind whether, you know, the quality of personalized search results she gets is in fact worth giving up all the information we do. So from our perspective we would strongly appreciate if Google in the near future would indicate which of the search results are in fact personalized and also allow users to kind of swiftly toggle between a personalized view and a nonpersonalized view for search results without having to, you know, log out every time, which I believe in terms of the microeconomics of the interaction nobody will actually do.
Steven Cherry: So let me just ask you one other question: Based on your results, are you now more or less concerned about that echo chamber effect?
Martin Feuz: That is a good question. I think it’s very difficult to assess, that it could go both ways in a way. Given that personalization is highly intense and frequently does show up very often, and given that it does render to a very large proportion very dominant voices, I could see that the echo chamber effect could be quite strong in fact because if you were to select—because you don’t know which of the search results you’re being served is the personalized one—if you were to select one that you liked that reflects your opinions and thus kind of confirm Google’s assumptions, that will embed you much deeper into that statistical group reflected by that opinion. And because the whole process is in fact black box and you don’t know what you’re actually clicking on, that could quite likely lead to such an echo chamber effect to a very large degree.
Steven Cherry: Yeah, so it occurs to me that, you know, one possible thing you could do going forward would be to, you know, maintain these personas, maybe expand them, and see how things change over time. I’ve got to say, by the way, you know, I personally would be thrilled with the idea of a Facebook page for Kant. You know, I could friend him and, you know, he might post a rousing defense of the categorical imperative and I could click the “like” button on it. Do you think you might do something like this?
Martin Feuz: Yeah, I will think about it. That should be a good thing.
Steven Cherry: Do you have any further plans for the research to extend it or what’s next?
Martin Feuz: Yeah, that is a good question. Generally speaking your suggestion to actually have these kinds of synthetic personas available and use them kind of as a proxy, I would see that or we would see that as quite a viable future to have you know if you were to choose or have to make some search queries that have to deal with certain philosophical interests, why shouldn’t you be able to rely on a search and Web history that gets personalized, that already reflects that type of interest so to speak in a way. But also at the time when I conducted and set it up as a study, I thought that would actually be quite interesting. I could quite well see that people trade their quite interesting search and Web histories on eBay. You know what would be the search experience if you were to be able to use a certain person’s, a certain academic’s, search and Web history and get that perspective on your search results wouldn’t that be interesting? I think also what I can say is that there is a number of things happening. I mean, my personal research interest is also shifting more towards creating kind of search interactions that better support what is being called exploratory type of search interactions. That is, when things get a bit more difficult than the standard search query “What is the weather tomorrow in New York City” or “What is the address or telephone number of a certain company.” More targeted towards people that maybe have a certain condition they’d like to find out a bit more because they’re not doctors themselves, and would like to assess what they’re being diagnosed for example.
Steven Cherry: Very good. Well, I wish you luck with the next steps.
Martin Feuz: Thank you.
Steven Cherry: We’ve been speaking with Martin Feuz of the Center for Cultural Studies at the University of London about the trade-off between the personal data we give Google and the services it gives us.
This interview was recorded 13 April 2011.
Segment producer: Ariel Bleicher; audio engineer: Francesco Ferorelli