The Structured Search Engine

Uploaded by GoogleTechTalks on 26.01.2011

HOGUE: Good evening everyone. Glad you guys could all make it. So let me talk about what, three months ago, I decided to call the Structured Search Engine, when [INDISTINCT] asked me for a title and an abstract for this talk. I actually kind of debated a bit, you know, what I was actually trying to talk about here. As he mentioned, what we try to do in my group here in New York, Search Quality, is try to really understand the queries, understand the documents, and try to put them together. Structure isn't a great word for that. It has a lot to do with what we're doing but, you know, it's not perfect.
[INDISTINCT] call it the Semantic Search Engine. That wasn't perfect either. I mean, semantics is kind of loaded with, well, semantics. You know, the semantic web people, other things like that; it means a lot of different things to different people. The Understanding Search Engine doesn't really have a good ring to it. Intelligent Search Engine, that's kind of what we're trying to do, but not perfect. That one's taken. So as we often do in this
kind of situation, I turn to Gary Larson to explain my feelings. I love this cartoon.
I mean, basically, you know, I think this is actually kind of what's happening with search engines today, right? We want to say something. We want to ask a search engine a question. We want to, you know, explain ourselves, give a lot of background context and things like that, and, you know, we wish we could do this. But really, all the search engine is really hearing is "Ginger, Ginger." You know, it's not totally getting what we're trying to say. A funny story actually happened just yesterday on
our internal quality mailing list. Someone brought up a query. He was trying to figure out the length of the sixth lane on a running track, like an Olympic-size running track. This is the guy that basically wrote our indexing and serving system in Mountain View. So he tried the query "Olympic track distance." He tried the query "track distance," "Olympic track running distance." He tried a whole bunch
of these queries. He was doing "Ginger, Ginger, Ginger," right? He was basically giving, like, these really broad queries and hoping that Google could magically come up with the answer. It turns out if you put in the query "how long is the sixth lane on an Olympic running track?", the top answer, it's right there. So it's so bad that even the best engineers at Google have been trained to say "Ginger, Ginger, Ginger" to the search engine. But in reality, we're actually getting to the point where Google can try to answer some of these questions, with a deeper understanding of longer queries and a deeper understanding of the documents and the content that go with them. So today I want to talk about some of
these efforts that are going on. Most of these are going on in New York on the search quality
team here. Key topics I guess: what can we understand about the world? It's kind of like
where we started. Like, what things are in the world? Can we better understand the queries
that we're getting from our users? Can we understand content of the web, documents,
user reviews, things like that? And then, finally, can we put all these together and actually let Google do some of the work for you, so that you don't have to work quite as hard to get the answers to your questions? So I won't read it all, but, you know, we're going to kind of go through a couple of technologies and results here and how we try to answer these questions. So the first question is, you know, can we actually understand what's going on outside? The search engine shouldn't operate in a vacuum. You know, we should actually understand that there are real things in the world. It's not just documents and N-grams and so on. So as many of you may have heard, we acquired
a company called Freebase over the summer. They've built a structured database of everything in the world. You know, this is a graph. It allows links between things, so we know that, you know, Bono and The Edge are members of U2 and so on. So for those
that haven't seen Freebase before, it's a database of entities, as I've said. It's got connections. It's got properties. It's got strong semantics; there's actually a schema for Freebase. It knows what a company is. It knows that it has employees and it has a CEO. It's got about 20 million topics right now. It started off, when we acquired it, at about 13 million, and we made a strong commitment to making it bigger. We've actually added a whole bunch of music data that's been published now. So Google's really committed to making sure that this becomes a canonical reference point for high quality, 99% precision entity topic data on the web. This data is all publicly available. This is an advertisement: it's Creative Commons licensed. You know, you can do whatever you want with it, with attribution to Freebase. It has great APIs; I mean, it's got a very simple
JSON-based query language. And there are some very good tools that they have, this thing called Gridworks, which was recently renamed Google Refine, which is basically for pulling in new databases of information and reconciling them into Freebase. So if you've got a great database of ancient Mayan art or something like that, and Freebase doesn't know about any of it, we'll give you the tools to actually import it; and if Freebase already knows about some of it, you're reconciling it properly and merging it properly, and then you can make
this better for everyone. So just to sort of give you a little bit of an example of the
things that Freebase knows about. You know, it knows about buildings. It knows about molecules.
It knows about Aztec gods. You know, this is--this is a place where we start to get
further away from standard kind of databases of everything like Wikipedia. It knows about
candy, international candy, U.S. candy. It knows about art, of course. Just a quick example
of how Freebase represents information. Freebase knows that there's an entity called Blade Runner. It knows that it's a film. It knows about other names for it: you can see that it's not only Blade Runner, but there's a name in Russian and various other languages for it as well. It knows that Harrison Ford acted in it. You know, the relationship between a movie and an actor is not a simple one. It's not just an actor; it's the character that he's playing. Freebase represents this with something called a compound value type, which is a way of kind of mediating between an entity and kind of complex information. So it's not just a simple [INDISTINCT] store. It's a way of actually representing complex real
information about the world. So Freebase is cool. We can do a lot of stuff with that.
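To make that compound value type idea concrete, here's a toy sketch in Python. The dictionary shape and field names are invented for illustration; this isn't Freebase's actual storage format or query syntax.

```python
# Toy sketch of a compound value type (CVT): the actor-film link is not
# a simple edge, but goes through a mediator node that also carries the
# character played. Field names here are illustrative, not Freebase's.
film = {
    "name": "Blade Runner",
    "type": "/film/film",
    "aliases": {"ru": "Бегущий по лезвию"},  # names in other languages
    "performances": [
        # Each entry is a CVT-style mediator node.
        {"actor": "Harrison Ford", "character": "Rick Deckard"},
    ],
}

def characters_played_by(film, actor):
    """Walk through the mediator nodes to answer a compound question."""
    return [p["character"] for p in film["performances"] if p["actor"] == actor]

print(characters_played_by(film, "Harrison Ford"))  # ['Rick Deckard']
```

The point of the mediator node is that "acted in" by itself can't carry the character; the CVT turns one edge into a small entity of its own.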
You know, we can build interesting search engines but--like I said, our goal here is
to actually have a database of everything. We want to know about everything in the world.
So, you know, 20 million entities, that's cool. You know what's really cool? A billion
entities. So how do we get to a billion entities, right? Well, we could do this manually, you know; we could kind of walk through and try to have each person contribute, you know, their own 20 entities or the thing that they know best. But really, we're going to have to do something automatically. And that's a lot of what we concentrate on here in New York. So what can we actually extract out of documents? There's the simple stuff, things we can recognize with patterns: dates
and times and measurements, phone numbers and things like that. Some things are harder
because they're kind of ambiguous; things like locations. You know, Starbucks has many locations; which one is this document actually talking about? People, obviously, are very ambiguous; can we actually recognize which people are being discussed in certain documents? And then even more complex types, like [INDISTINCT] with Freebase, you know,
factual data like what are facts about an entity? So here's an example of a page that
is an entity, this floor standing speaker that Freebase has no idea about, right? This
is a long-tailed entity. And this is actually a pretty good page. I think we can pull a
lot of information out of this. We [INDISTINCT] the manufacturer and the part numbers and so
on. You know, it's a good description of this thing. I would love to create an entity out
of this thing. If you scroll down the page, there's a really nice table here. You know,
it's got all kinds of [INDISTINCT] information. It'd be great to get this stuff in. So our
team here in New York works on pulling this information out. We use a whole slew of techniques. We look at tabulated data, data that looks like it's organized in some way. We look at data that looks like attributes that we've already heard of. So maybe we've already heard that weight is an interesting attribute; lots of things have weight, and Freebase knows that. So whenever we see "weight" followed by a measurement, let's pick it up and put it into the database. Even crazier things: once we pick up a couple of attributes, we can get a pattern. We can induce a wrapper for the data and actually pull out things like input impedance. You know, that's something that Freebase might not know about. We might
not have it in our schema; that's still useful information. So the approach here: I'm just not going to go into too much depth. This talk is going to be a huge fire hose, and you're probably going to have to join Google in order to find out the real details. But the approach here, as always at Google: we have a large amount of scale. We know the whole web. We're going to go for a very high coverage system here. We're going to give up something in precision, but we're going to hope to get it back by aggregating data from lots of sources. [INDISTINCT] tables. We'll look at, you know, things like attribute-value pairs. We know certain attributes take certain value types. So depth is always a
measurement of length. And if we can recognize a measurement of length with the word "depth" nearby, it's quite probable that someone is trying to describe an attribute of some entity here. We look at, you know, standard techniques like page segmentation. We look at wrapper induction, to kind of induce the pattern for what we're seeing on the page. So all of this, obviously, as I said, not much depth here, but lots of machine learning going on behind the scenes to produce this table; you know, lots of evaluation. We want to find out how we're doing, kind of iterate on the information we're seeing, what's working, what's not, and try to eventually build up this database. You'll notice the bottom here, by the way, is just to point out what the high coverage, low precision means here. You know, it's not just the good stuff. We've ranked this and we're doing pretty well.
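As a rough illustration of the "attribute followed by a measurement" heuristic just described: a known attribute name next to something that parses as a measurement yields a candidate triple, and a novel attribute picked up by the same pattern (like input impedance) gets a lower confidence. The vocabulary, the regex, and the confidence numbers here are all invented, not the production system.

```python
import re

# Attributes we've "already heard of" get a higher confidence than
# novel ones picked up by the same attribute-measurement pattern.
KNOWN_ATTRS = {"weight", "depth", "height"}
ATTR_MEASUREMENT = re.compile(
    r"(?P<attr>[A-Za-z ]+?)\s*[:\-]?\s*(?P<num>\d+(?:\.\d+)?)\s*(?P<unit>lbs|kg|in|cm|ohms)\b",
    re.IGNORECASE,
)

def extract_attributes(text):
    """Return (attribute, value, unit, confidence) tuples found in text."""
    triples = []
    for m in ATTR_MEASUREMENT.finditer(text):
        attr = m.group("attr").strip().lower()
        conf = 0.9 if attr in KNOWN_ATTRS else 0.5
        triples.append((attr, float(m.group("num")), m.group("unit").lower(), conf))
    return triples

spec = "Weight: 45 lbs. Depth: 14.5 in. Input impedance: 8 ohms."
print(extract_attributes(spec))
```

In a real system the low-confidence triples would then be corroborated by aggregating the same attribute across many pages, as described above.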
I think I'll take questions at the end. Is there anything very particular right now?
>> Yes. The number on the left of the--on the [INDISTINCT], the left hand [INDISTINCT].
>> HOGUE: Sorry. So the question was about the number on the left-hand side. That's just the confidence that we assigned. You can think of it as maybe like a probability of being true, although that might be glorifying it a little bit. And as for what I was saying, you know, this is just the cream of the crop here. If you look down a little further, you know, it starts to get a little sketchy as we go to lower and lower confidence values. But, you know, these things all kind of look like attributes and values to our system. Thankfully, we've been able to kind of pull out from the [INDISTINCT] here at the top, and that becomes the useful data that we have. So that was--that
was the first fire hose. That was kind of how we understand the world outside the web, and what we can do to kind of build up this database of everything and hopefully make it to a billion facts. We're not quite to a billion yet, but we're trying. So the next step is: can we actually understand queries? You know, can we understand what users are saying when they're trying to look for information? And so I want to talk a little bit about question answering. This was actually my starter project at Google, you know, six years ago. Still doing it. It's a tough problem.
As the [INDISTINCT] shows, it's a great shot, snapped by one of the guys on the team on the subway. You know, sometimes these answers aren't found on Google, but for the ones that are found on Google, we're hoping that we can surface them in a way that people find useful.
So just to give you an example of what I'm talking about: I mean this one box here at the top. If you ask, "when was Martin Luther King Jr. born?", we want to surface the answer right away and kind of give you the correct answer. The team is also responsible for highlighting the answer in the web search snippets, especially if we have lower confidence that this thing is right; we still want to give you some indication
that maybe this is a date that's useful for you. So I'll walk you through a quick worked example of how we understand a query like this. So the first thing we do, as I mentioned before: we have a lot of systems here for understanding simple text. We call them annotators; things like dates and times and measurements and so on. One of them is very good at recognizing names. So the first clue we have here for the query is that it mentions a name. That's a
good piece of information. Once we've got that bit, or even if we don't, we next try to split the query into the thing that the user is asking about, we call it the entity, and an attribute: you know, what property of the entity are they looking for? If we can't find that, we're done; this is probably not a question. But if we do, we're going to try to figure out how to retrieve a good value for that. The next
thing we look at, you know, we have a large database of entities. We know something about
how they relate to each other. Do we know any other names for the entity being mentioned? In this case we know that Martin Luther King Jr. is also referred to as Martin Luther King, often MLK, and so on. Another thing: you know, when somebody asks when someone was born, what are the ways our database could be representing that same value? Date of birth, date born, and so on. We look at the whole query to try to give us some other clues as to what the answer might be. We have a bunch of, to be honest, regular expressions that kind of say, well, if you say "when was", it's probably looking for a date. Even if we didn't have that, because we have this large database and we can kind of do a lot of aggregate analysis, we know that "date of birth" or "born" often has the value of a date. So we think it's probably a date. "Where" can also mean where they were born; [INDISTINCT] a little bit of confusion about that. But we're going to favor dates in this example. So now we go to the [INDISTINCT] database. As I mentioned,
we've got a big table full of information like this. You can see that we've got a couple
of potential answers here. The birthplace is there, also the birth date. And we're going
to go in now... Sorry. The big insight here is that we're not just going to look at the table. If we just looked at the table, lots and lots of queries would match, and we'd be able to answer pretty much any query with something out of the table; and as I've shown you before, there's a lot of low confidence values in there. So the big insight here is that we actually look at the top search results. Google search is very good at delivering on-topic information for pretty much any query. So when you say, "when was Martin Luther King Jr. born?", chances are, if it's a question, the top documents have the answer. And so
in this case, we look up our suggested answers in the documents and try to figure out if they appear. In this case, obviously, the dates appear. You'll notice also that the birthplace appears as well. But because of two factors, first, that we find the date more often, and second, that we are expecting to find a date, since "when" is what's being asked, we actually answer the question correctly. We get a lot of corroboration from the top search results, and we try to give the users some ways of actually backing this up, especially since we're probably going to be wrong five, ten percent of the time. We want to make sure these are kind of [INDISTINCT] checked, and also to figure out where this answer came from. So that was the end.
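Putting the pieces of that pipeline together, the corroboration step might be sketched like this. The fact table, the type boost, and the scoring rule are all hypothetical simplifications, not what the production system actually does.

```python
# Toy sketch of answer scoring: candidate values for the entity come
# from a fact table, and each is scored by how many top search results
# mention it, with a boost when the value matches the expected answer
# type ("when was X born" -> date). Names and weights are invented.
FACTS = {
    "martin luther king jr.": [
        ("date of birth", "January 15, 1929", "date"),
        ("place of birth", "Atlanta, Georgia", "location"),
    ],
}

def best_answer(entity, expected_type, top_documents):
    scores = {}
    for attr, value, vtype in FACTS.get(entity, []):
        support = sum(value in doc for doc in top_documents)  # corroboration
        boost = 2.0 if vtype == expected_type else 1.0        # type prior
        scores[value] = support * boost
    return max(scores, key=scores.get) if scores else None

docs = [
    "Martin Luther King Jr. was born on January 15, 1929 in Atlanta, Georgia.",
    "Born January 15, 1929, King led the American civil rights movement.",
]
print(best_answer("martin luther king jr.", "date", docs))  # January 15, 1929
```

The birthplace appears in the documents too, but the date wins on exactly the two factors mentioned above: it appears more often, and it matches the expected type.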
Another quick fire hose, another project we're doing here in New York: question answering. So the next topic I want to talk about a little bit is what we can do to understand content. We've looked at the world outside; you know, we can understand entities now. We've looked at queries; we can understand how users are asking questions. And now we want to look at how we actually understand the documents, the content on the web. I'm going to pick a particular slice of this; we've already talked a lot about extracting information from documents and how we understand that.
But I want to talk a little bit about Sentiment Analysis. So Sentiment Analysis is the field where we're looking at: how can we understand what users are saying about, let's say, a business, a product, a person? Is it positive? Is it negative? Are they happy about their experience? Are they upset? Or to put it another way, it's the difference between [INDISTINCT], you know, Erick and his googly eyes, or the double face palm, right? I mean, like, how are users really feeling about this? So just an example. This is one of the things that's actually out there today on the Place pages, which are part of local search. We have
a listing here for the Carnegie Deli. And you'll notice that we're actually summarizing the reviews for the Carnegie Deli. There are almost 1,500 reviews for the Carnegie Deli, and if you really want to get a good feeling for what's going on, should I go there, is it really all it's cracked up to be, you could read all 1,500 reviews. But if we can, we can actually summarize and try to give you a feel for it. One thing you'll notice is that the star rating for the Carnegie Deli is three and a half stars. That's kind of like an aggregate average of all the review text that we've picked up. And the snippets that we put up here, we call them Franken-snippets, because they're kind of, you know, sewn together. They're trying to be balanced in the same proportion as that three and a half star rating. So there's good stuff, obviously: great food, great service. Some not so good stuff: some people think the waiters and staff are unprofessional, and you pay the price for it, too. So we're going to try to balance the reviews. If somebody has five stars, I'm going to try to give you mostly positive reviews; one star, mostly negative reviews. Now I'm
just trying to give you a good feel for it. If you want to get into more detail, we also try to parse out what aspects, what things, people are discussing about this business. You know, in the case of a restaurant, there are the standard ones: obviously everyone talks about the food and the service. But the Carnegie Deli is known for its corned beef. So we're actually able to identify that, pull out the reviews automatically, and determine that corned beef is a pretty big hot topic; everyone's talking about it, in positive or negative terms. And you can obviously dig in even further and find out all about the desserts and so on. And you can go even further. I mean, we're really trying to give you a tool to really dig down: why is it that everyone's talking about the corned beef? What's wrong with the service? Why is that, you know, almost half red, and so on? So that's just a little bit of the set up.
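The balancing idea behind those Franken-snippets can be sketched as follows. The selection rule and the numbers are invented, just to show mixing snippets in proportion to the aggregate rating.

```python
# Toy sketch of "Franken-snippet" balancing: pick k snippets so the
# positive/negative mix roughly matches the aggregate star rating.
def balanced_snippets(reviews, avg_stars, k=4):
    """reviews: list of (snippet, stars) pairs, stars on a 1-5 scale."""
    positives = [r for r in reviews if r[1] >= 4]
    negatives = [r for r in reviews if r[1] <= 2]
    # A 5-star average would be all positive snippets, 1 star all negative.
    n_pos = round(k * (avg_stars - 1) / 4)
    picked = positives[:n_pos] + negatives[:k - n_pos]
    return [snippet for snippet, _ in picked]

reviews = [
    ("Great food, great service", 5),
    ("Best corned beef in town", 5),
    ("Waiters were unprofessional", 1),
    ("You pay the price for it", 2),
]
print(balanced_snippets(reviews, avg_stars=3.5))
```

At three and a half stars this yields a mix of good and bad snippets, which is the proportional balance described above.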
This is why we want to analyze sentiment and summarize. So Sentiment Analysis is a great field for machine learning and NLP. There's so much text, and it's such an interesting, you know, kind of deep natural language problem. We have to deal with a whole bunch of different issues here. Obviously, there's the basic one, which is: what is a positive sentiment? What is negative? What words represent positive stuff? What words represent negative? These aspects, as I discussed: what are people talking about? Then there are these deeper natural language problems. What happens when somebody negates a word? You know, "It wasn't the best." We don't want to give it credit for positive just because the word "best" appears there. And at the same time, we also don't want to apply the negative sentiment to the latter half of the sentence. And then finally, scope. Often times people say things--excuse me--like, "We came here because we couldn't stand the lines at the other restaurant." We don't want to give negative credit to this restaurant because of its lines. It's not the right scope; they're talking about some other restaurant. So these are the kinds of things we're dealing with if we want to accurately summarize information from reviews. So I won't go into all of this; I'll just go through a couple of worked examples here of how we deal with the positive and
negative words and how we deal with negation. So this positive and negative problem we call sentence classification: given a sentence, given a review, is it positive? Is it negative? You know, how is the user expressing their opinion? This is a pretty broad goal. This is Google, right? So we want to be able to use this in many domains. We don't want to just do local; we want to do products, news, people, whatever we can do. It's got to be international; we have a ton of traffic coming in from outside the U.S. and the U.K. and the English-language countries, and we don't want to just make a solution that works for English and then stop. It's got to be robust: a lot of content on the web is misspelled, in case nobody noticed, and people don't use proper grammar, so this can't be, like, really hardcore parsing and so on; that's going to mess up a lot just because somebody misspelled a word. And then obviously it's got to scale; we're dealing with millions and millions of documents, easily. I want this to be updated every day. I want to be able to take in reviews very quickly and so on. So
the approach for classification is to build what we call a lexicon, which is a set of words that have some meaning associated with them; words that have positive and negative associations. And we want to do that quickly, from a small set of seeds. That's how we're going to crack the internationalization problem: we just start with a small number of words and we'll be able to expand upon them. So, really quick: to build a lexicon you start off, as I said, with a small set of seed words. You know, these are simple things like good and bad, fantastic and so on. About 100 of them, it turns out, is actually enough to kind of give the system a sense of where to put the dividing line between good and bad. And you also take a large graph of N-grams. So you can think of it
just a little bit like WordNet. You know, it's a set of words or phrases that have associations
between them. In our case, we actually have something a little bit more powerful than WordNet. We take the whole web corpus and compute what's known as a distributional similarity metric between all the N-grams that we can find. So it basically asks, "What kinds of contexts do these words appear in, and how similar are those contexts?" We compute this over the whole web corpus, and we end up with a very, very large lexicon: several hundred million phrases, with weighted edges between them. So it's a big graph that kind of says which words are related to each other, as far as how they're used in the language. The nice thing about this, again: there's nothing English-specific about it. All we need is a bunch of N-grams and a bunch of documents in that language, and we can kind of go ahead and build this graph. Then we run what's known as label propagation over this. We start off by labeling the nodes that have positive or negative sentiment, and we kind of iteratively
propagate those weights through the graph until we reach some sort of, you know, steady
state that says this portion of the graph is positive, this portion of the graph is
negative, and this portion of the graph, well, we don't know. It's probably somewhere in
the middle. And remember, this isn't just words; this is all kinds of phrases. You know, this is things like "truly memorable", right? Not just "truly", not just "memorable", but "truly memorable" is a fairly positive phrase. "One of a kind", it turns out [INDISTINCT], is slightly negative. Who knew? "A pain in the ass", you know, "squeaking", "internal bleeding": clearly negative. When somebody mentions internal bleeding in a review of a restaurant, you probably don't want to go there, right? So, this is just a really small sample; the weights associated with these kind of range from negative five to five, you know, just based on how positive or negative we think these words are, based on the contexts we find them in. So just to give you an example of why it's good to use a lexicon like this:
it's built from lots and lots of phrases instead of just a simple dictionary, so we can actually tell the difference between nouns and other kinds of typically non-sentimental words based on the contexts that they occur in. So, things like "dog". Dog is not a particularly sentimental word, unless you have one. But "dogs barking" is negative; "dog friendly" is positive, right? "Self-sufficient" is good, but "self-serving" is a bad kind of phrase. "Painstakingly" is different than "painful". "Attention grabbing" turns out to be a good thing; "money grabbing" turns out to be a bad thing. You know, even positive words like "great" sometimes mean negative things: "great expense" is not necessarily a good thing. So this is kind of the power of having a lexicon that's built from lots and lots of documents, lots and lots of [INDISTINCT], you know, different ways of representing the same information. The other sentiment task that I want to talk about briefly
is negation. How do we actually handle it when a user negates a particular word? Say they write "not great". You know, how do we actually tell that "great" is not actually a positive in this case, but actually negative? We could do this through the lexicon, right? It's possible that we'll see the phrase "not great" or "wasn't great" enough in our web corpus that we'll be able to identify it as a negative thing. But it turns out to be a lot better if we actually build a specific tool for identifying negations.
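Stepping back to the lexicon for a moment: the label propagation described earlier can be sketched with a toy graph. The phrases, edge weights, and update rule here are invented, and the real system runs over hundreds of millions of phrases.

```python
# Toy label propagation: seed a few phrases with +/- sentiment, then
# iteratively set each unlabeled node's score to the similarity-weighted
# average of its neighbors' scores. Seeds stay clamped. The graph and
# weights are made up for illustration.
EDGES = {
    "good": [("great", 0.9), ("decent", 0.6)],
    "great": [("good", 0.9), ("truly memorable", 0.7)],
    "decent": [("good", 0.6)],
    "truly memorable": [("great", 0.7)],
    "bad": [("awful", 0.8), ("squeaking", 0.4)],
    "awful": [("bad", 0.8)],
    "squeaking": [("bad", 0.4)],
}
SEEDS = {"good": 1.0, "bad": -1.0}

def propagate(edges, seeds, iterations=20):
    scores = {node: seeds.get(node, 0.0) for node in edges}
    for _ in range(iterations):
        updated = {}
        for node, nbrs in edges.items():
            if node in seeds:  # seed labels stay clamped
                updated[node] = seeds[node]
            else:
                total = sum(w for _, w in nbrs)
                updated[node] = sum(scores[n] * w for n, w in nbrs) / total
        scores = updated
    return scores

scores = propagate(EDGES, SEEDS)
print(sorted(scores.items(), key=lambda kv: -kv[1]))
```

After a few iterations, phrases connected to the positive seed end up positive and phrases connected to the negative seed end up negative, which is the steady state described above.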
So, just to work a quick example. This is a piece of review text here; it's pretty standard: "Service wasn't the best but the food more than made up for it." And this is what our negation system tries to do with it. At the beginning of the sentence, zero here, or green, basically means not negated; there's no probability that this is a negated part of the sentence. And as we go towards one, it means something is probably negated. So "it wasn't the best": "best" is not actually a positive sentiment here. With high probability, the user is not talking about "best"; they're talking about the opposite of "best". And then, as you can see, as you go down the sentence, the probability that it's still negated kind of dwindles off. You know, this isn't a perfect example here, right? We actually want to treat "food" as a positive thing, and we're just barely, you know, kind of making it down towards where it's probably not negated. But if you set the right thresholds here, we actually do fairly well at identifying
negations. So the way we did this, we--[INDISTINCT], not me--we took golden data: we basically hand-labeled about 2,000 reviews, labeling the negations in them. You know, we tried to get good agreement between the people that were labeling these, so that everyone kind of said the same thing was negated. And we used a technique called a Conditional Random Field, which basically outputs probabilities: it takes a bunch of features and outputs the probability of whether or not the thing you're looking for is happening in this particular section of the text. And we trained it based on these labeled examples.
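The CRF itself is too much for a quick sketch, but the shape of its output, a per-token negation probability that spikes after a cue word and dwindles down the sentence, can be mimicked with a hand-written stand-in. The cue lists and the decay rate are invented; this is not the trained model.

```python
# A stand-in for the CRF's output shape: the probability that a token is
# inside a negation scope jumps after a cue like "wasn't" and decays as
# we move right, resetting at a contrast word like "but".
CUES = {"not", "wasn't", "isn't", "never", "hardly"}
RESETS = {"but", "however", "although"}
DECAY = 0.6

def negation_probabilities(tokens):
    probs, p = [], 0.0
    for tok in tokens:
        word = tok.lower()
        if word in RESETS:
            p = 0.0            # contrast word ends the negation scope
        probs.append(p)        # the cue word itself is not negated
        if word in CUES:
            p = 1.0            # everything right after the cue is negated
        else:
            p *= DECAY         # ...and the effect dwindles down the sentence
    return probs

tokens = "Service wasn't the best but the food more than made up for it".split()
probs = negation_probabilities(tokens)
# "best" ends up with a high negation probability; "food" near zero.
```

This reproduces the behavior of the worked example: high probability on "best" after "wasn't", and a drop back to not-negated by the time we reach "food".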
So just to give some of the results here: this was published last year at a workshop, and this was actually on a public benchmark [INDISTINCT] called BioScope, a bunch of biology papers. These are actually the best results that had been published so far. It also had a dramatic effect on our internal metrics when we looked at local reviews. I don't know if it's easy enough to see on the presentation, but there's a red line, which is the old system, and a blue line, which is the new system. Better is up and to the right. So we did a lot better here. Just a really quick point I wanted to make, you know, kind of [INDISTINCT], which is that Google does want to, you know, get this kind of work out and publish some of the stuff that we're doing. I mean, it's not just a kind of black hole here. We're actually trying to get out and publish papers at workshops, at conferences and so on. I think Sentiment Analysis is a great example of this: both the lexicon work and the negation work were published this year. And where applicable, I mean, we're really trying to get the data and the results out there and share them with the community. Just a few more examples of negation; some that worked out well and some that didn't.
The underlined bits here are the things that are being negated. So, "Your negations are hardly a simple problem when detecting sentiment": you know, "a simple problem" is being negated there; it's not a positive signal. You can see, you know, we do well on certain things, like "Don't cry for me Argentina." That's a good example. But it does have some issues. It doesn't do well on the kind of Yoda-Speak. It's very dependent on finding evidence to the left; it doesn't do, you know, very well to the right. So if you flip--if you invert the negation, we just don't see a lot of text like that. I'm sure we could find a few examples of Yoda-Speak, but they just don't come up very often in actual reviews of restaurants. So unless people start writing reviews in Yoda-Speak, I think we're all right. Okay. So that was the section on trying to understand
content: both the stuff that we talked about in the beginning, about understanding and extracting information from content, as well as understanding content at a deeper level of natural language processing with Sentiment Analysis. So in this last section I want to talk a little bit about: can we actually have Google do some of the work for you? We've got all this great stuff going on at the back end, but most of it still ends up being a single query and a single bit of information processing for you. Can we actually do something that does lots of queries on your behalf and saves you a lot of effort? Maybe this doesn't happen every day, but maybe there's some problem where it takes a lot of effort to put together the information that you need. And the question is: can we actually save you some of that time on these very complex tasks? So let me talk a bit about Google Squared,
which was our last product that we launched about a year and a half ago. For those who
haven't seen it, Google Squared is trying to tackle kind of hard decision-making
problems that don't occur every day but are kind of high value, right?
So we multiply frequency by value. These still have about the same
impact as a typical everyday [INDISTINCT] query because they're much more valuable even
though they don't occur as often. So these things are like buying a car, planning a vacation,
choosing a college, things like that. This is a personal example; my wife and I a couple
of years ago were buying a car, right? And we--this is before Squared launched, before
it was kind of being developed. So we did what probably a lot of people would do. We
actually--you know, maybe we kind of wanted a big car. We've got kids. So we actually made
a spreadsheet, you know. In that spreadsheet we put a list of cars. You know, we put--and
then we were interested in the--you know, the Toyota Highlander. You know, we were interested
in the Saturn, you know, whatever. Put those down on the left side. And there were things
that we cared about. We cared about the crash test rating. We cared about the price. We
cared about the number of seats, things like that. We put those [INDISTINCT] the top. And then we
did a search for each combination of those values, you know. We actually went through
and we found out what the crash test ratings were on each of these cars. And the problem
with this is that there's no one [INDISTINCT] that quite does this exactly the way you'd want,
that it's flexible enough to have both crash test rating and fuel economy and, you know,
the price, let's say. You know, like--you know, some [INDISTINCT] has a lot of information.
It's not as configurable as we wanted. And so we ended up using a spreadsheet and kind
of manipulating the data ourselves. So these are the kind of problems that Google Squared
is going after. The magpie here--we chose this as our internal codename, magpie, because
it's a bird that collects lots of little bits of, you know, foil and hay and garbage to
build its nest. And, you know, some good, some bad, but, you know, it's a process of
collection and kind of aggregation, which is very similar to the kinds of tasks that we
were going for. So for those who haven't seen it, Google Squared is kind of an example.
You type in one query [INDISTINCT] and Google Squared goes out and builds this whole table
for you. It's got a list of things. In this case, baseball players, pictures, descriptions,
facts about them. Not just people--you can do antibiotics. You can do things
like cheeses. Actually we're going to try and do a quick demo and see if this works.
So to see it in action--so this is just an example. We type cheeses, it builds this entire
square on its own. One thing to know about Google Squared is that it's not trying to
do all the work for you. We know that in the case of open-domain information extraction
about the world here, as I said before, this is a high coverage,
possibly low precision kind of, you know, domain. So we're actually going to only show
you the things we're pretty confident of, and when we actually aren't sure that we
have the right value, we're going to either hide it or, you know, kind of dim it out a
little bit. And we're going to work with the user to try to help them understand what information
might be right for their task. So, yeah, I liken this to search. If
search were perfect, every query would be "I'm Feeling Lucky," right? It wouldn't have 10
results. It's not, right? I mean, it's wrong. It's wrong a lot as it turns out. Trust
me. So the idea is you get lots of links. You get a lot of feedback. And even for things
like, you know, what effectively question answering here? We want to give the user some
flexibility to kinda correct things. So, you know, maybe--you know, I want to actually
go and look at Wikipedia and find out this, you know. I believe Wikipedia that [INDISTINCT]
and that's fine. So I want to include this in my--in my Squared. And Google Squared learns
from this feedback. We kind of, you know, look at users that are adding and deleting
rows, users that are correcting values, changing values, and these are all editable so I can
go in and just change. I don't--I mean, I don't like this one. You know, Squared is
good at kind of adding new items. I noticed that Swiss wasn't in here. So maybe you want
to add Swiss cheese. And it'll go off and it'll try to pull in new values for each one
of these and kind of fill in the table. Swiss cheese as it turns out [INDISTINCT]. You can
also--you can also add columns here. So let's say that I want to know what kind of wine
to pair this with. So this is a tough--this is a tough query. Google Squared doesn't really
know exactly, you know, what I'm talking about. This probably isn't a standard attribute
in Freebase. You know, we can't just go to a high quality source of information. But
we do have some--you know, some information here. You know, we picked out the wrong part
of the value but, you know, champagne. That seems like a good value so I'll put it in
here with my [INDISTINCT] computer [INDISTINCT]. All right. You know, this one--[INDISTINCT],
beers and [INDISTINCT]. So this is the kind of process. We want to give people a way of
very quickly pulling in information and building this--you know. We all want to decide what
cheese [INDISTINCT] in and kind of figure this stuff out. Google Squared, where did we
start from? What are the key premises here? You know, why are we building this? So--and then
how are we going to build it rather? So one thing we noticed as I--as I spoke about before
with the question answering and so on. Web search is a great--a great tool for information
extraction and trying to find the right information. When we do a search, it tends to
get relevant documents. Enough that we can use it, right? So we're not going to just
use it as a database. We're actually going to use all that information stored inside
Google Search and all that pent-up experience and so on. One--another point here is that
a key step to making a decision, you got to collect data, right? I mean, if you're buying
a car [INDISTINCT] college, you need lots and lots of data and you want to kind of put
all that together. Another premise, as I mentioned, is that we're never going to be 100% accurate. We're
not even going to claim it's 100% accurate. We'd be really lucky if we got 70% accuracy on open
domain information extraction. That would be world class. So we're going to make a trade-off
here. We're going to try to go for high coverage, but we're going to provide users with
tools to correct their data when it goes wrong. And finally, can we actually
make use of the search engine here and do a lot of work on the user's behalf? Can
we save them all that typing of each individual query to get the square--our
table built? And this is a kind of fun fact. I mean, Google Squared--to build that square
[INDISTINCT] cheese, it [INDISTINCT] 200 queries to [INDISTINCT] all that information
together. So just to talk a little bit about how this gets done: first phase, you got a
broad query, let's get a list of names. This is the thing that goes on the left side; what
kind of cheeses are there? You also want to expand that list. So if you already have
some cheeses, what other cheeses might be interesting? What kind of attributes are interesting
for this? I'm going to go into this in a little more detail. And finally, can we actually
pull out a value to put into the square? So first step, as I said, finding a list of names.
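Those phases can be sketched end to end as a toy; the canned "web" dict below stands in for real search, every value is invented, and attribute discovery (described later in the talk) is skipped by passing attributes in directly:

```python
# Toy end-to-end sketch of the Squared-style pipeline: get a list of
# names, then answer one small question per cell. Purely illustrative;
# the WEB dict stands in for real web search.

WEB = {
    "list of cheeses": ["brie", "gouda"],
    "texture of brie": "soft",
    "texture of gouda": "semi-hard",
    "country of brie": "France",
    "country of gouda": "Netherlands",
}

def build_square(category, attributes):
    names = WEB[f"list of {category}"]   # phase 1: find a list of names
    # phase 3: each cell is a little question-answering problem
    return {
        n: {a: WEB.get(f"{a} of {n}", "?") for a in attributes}
        for n in names
    }

square = build_square("cheeses", ["texture", "country"])
print(square["brie"])  # {'texture': 'soft', 'country': 'France'}
```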
So the approach here, we want to take [INDISTINCT] cheeses and go to something like Brie or Gouda.
We're going to do a search. As I said, that seems to be a pretty good tool. Here, look
at the search for cheeses. These are pretty good results as it turns out. Now this result
has a really nice comprehensive list of cheeses. So does this one. One thing you also notice
down here is that these are really comprehensive lists. They're not actually giving us much
information about which cheeses are actually interesting, which are the popular cheeses.
You know, the Wikipedia entry has on the order of 1,000 cheeses listed or something like that.
And if I'm just trying to figure out what I want to serve with my wine, just clicking on each
one of these links is probably not the best way to go about it. So we're going to try
to help organize these things, and that requires some ranking. So we've got a candidate list
of maybe 1,000 cheeses, or things that might not even be a cheese, and we're going to try
to rank these things. So one approach is to try to get more lists, as it turns out. Not less.
So we run other queries. We run
"list of cheeses," "kinds of cheeses," "top 10 cheeses," "popular cheeses." Some of these are
very good for getting comprehensive lists, like "list of cheeses." Some of these are good for
getting popularity of cheeses. So "top 10 cheeses" is actually a great way to get a list
of people's favorite cheeses, which gives us some way of separating out the interesting
stuff from the kind of long tail. There's also a lot of user feedback that we can use
here. So Squared users, as I've said, edit a lot of tables. So if someone has done this
query before and, you know, possibly they've added something--that's a pretty good signal to us that, you know,
this is probably a good cheese or whatever we're talking about. Web users also type in
the query of cheeses a lot as it turns out. We can look at what they type next. Do they
refine cheeses to brie or gouda? It's much more likely they're going to do that than
they're going to do something much more rare. And then we just have the raw popularity.
More pages contain brie than contain, say, danbo, a type of Danish
cheese that I just kind of randomly picked from the list and that I'd never heard of before.
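Combining those ranking signals--frequency across harvested lists, query refinements, raw popularity--might look like this toy sketch; the weights and counts are invented for illustration and are not Google's actual formula:

```python
# Toy combination of the ranking signals described above. The signal
# weights and the example counts are made up for illustration.

def rank(candidates, list_count, refinements, page_hits):
    def score(name):
        return (2.0 * list_count.get(name, 0)     # appears in many lists
                + 1.0 * refinements.get(name, 0)  # users refine "cheeses" -> name
                + 0.1 * page_hits.get(name, 0))   # raw web popularity
    return sorted(candidates, key=score, reverse=True)

print(rank(["danbo", "brie", "gouda"],
           list_count={"brie": 12, "gouda": 9, "danbo": 3},
           refinements={"brie": 5, "gouda": 2},
           page_hits={"brie": 80, "gouda": 40, "danbo": 2}))
# ['brie', 'gouda', 'danbo']
```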
So that's kind of how we get the initial lists of names. We're trying to do this two-step
process of pulling out lists and then ranking them. Now we want more names. We want to be
able to suggest additional ones especially if you've done some modification of the Squared
and you've added your own things, maybe you've gone down a particular path, you're just using--you
know, putting in European white cheeses or something like that. Can we actually suggest
more things that fit in to this category? So in this case, we use something very similar
to Google Sets; same basic algorithm. We go out and we look at lists on the web, [INDISTINCT]
offline, and we say: given that you have several items, what is the most likely other set of
items that would be in the list, in any list on the web? So given that you've already seen
brie and gouda, you look at other lists on the web that contain brie and gouda--what
other items tend to occur in the same lists? And it basically gives us a probability, and
we use that to kind of rank the suggestions that pop out at the bottom and suggest
other things you might want to add to your square. Next, we actually want to have some things
to go along the top. So I mentioned description is kind of a given. Every square is going to
have an image and a description. But other than that, there's almost no commonality between
domains as far as what kinds of attributes might be interesting. People don't want a
birthday for cheeses; you want to know, you know, where they come from and so on. So obviously
the first thing you think of: go and look at the fact table. You know, you've got this
big table. Hopefully, a billion facts. What [INDISTINCT] does it say about brie?
This [INDISTINCT] is actually pretty good. You know, it's got things like source milk
and aging time, which turn out to be pretty good. It doesn't always work. Cheddar turns
out to be a fairly conflated name. You know, we don't have, you know, kind of a strong
idea of which one of these it might be. It turns out it's a town in South Carolina,
another town in the United Kingdom; also apparently an album--no, a band I guess, or maybe an artist.
Do we know who Chadder is? So this doesn't always work. I mean, we can aggregate. That
helps. I mean, you look at a lot of cheeses and see what their attributes are. But I think
we want some sort of signal here to kind of help out. So there's a theme here. The second
approach is to go and look at web search. So we actually take each one of the cheeses and
we do a search on We do a search for brie. We do a search for gouda. We do
a search for each one--and we look at the original search for cheeses as well. We look at the
tables in that search. And it turns out that there are certain attributes that get
mentioned over and over and over again; things like texture and country. And this is the
way of kind of narrowing down the context, right? Not only does web search tend
to be on-topic most of the time for these queries, it also kind of allows us to aggregate
across many, many different queries of the same type, and it disambiguates. So even if there's
some ambiguous things in your list like maybe you're looking at cars or car manufacturers
and you've got Ford, then you might mix it up with Ford the president and get date of
birth--the likelihood of that happening over and over and over again for every individual
car manufacturer is fairly low. And so by doing this throughout the entire list, we
end up with a good set of attributes. The final step I basically already talked about.
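That aggregate-across-the-list idea can be sketched as follows; the entities, attributes, and support threshold are all invented for illustration, but the point is that a one-off ambiguity (Ford the president contributing "date of birth") gets outvoted:

```python
# Sketch of attribute selection by aggregation across the whole list.
# Attributes that recur for many entities win; a single ambiguous
# entity's spurious attributes do not. Input data is invented.
from collections import Counter

def pick_attributes(attrs_per_entity, min_support=0.5):
    counts = Counter()
    for attrs in attrs_per_entity.values():
        counts.update(set(attrs))  # count each attribute once per entity
    n = len(attrs_per_entity)
    return [a for a, c in counts.most_common() if c / n >= min_support]

observed = {
    "Ford":   ["fuel economy", "seats", "date of birth"],  # ambiguous hit
    "Toyota": ["fuel economy", "seats"],
    "Honda":  ["fuel economy", "crash test rating"],
}
print(pick_attributes(observed))  # ['fuel economy', 'seats']
```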
When you want to find the value to go into a cell, that's question answering. And actually
what ended up happening was: we launched Google Squared about a year and a half ago,
and that launched from Labs, and that was great. We got a lot of good feedback for it, but
it's Labs. You know, we only got a few--you know, tens of thousands of users per day. We
want to get this stuff out at So we took the same back end that runs the cells
for Google Squared and we put it on, and that's now our question answering
system. So this is, you know, an example of where labs is kind of the breeding ground
for the good stuff that comes out next on web search and we got a lot more good stuff
coming down the pipe from here on out. So just want to talk a little bit about what
we've learned in general from all of these tasks. So first, and I've kind of reiterated
this several times now, web search is really powerful; it's a great way to do information
extraction. knows, you know, if you say cheddar, it knows that you're probably
talking about the cheese, even if broad, you know, strings--you know, the strings in our
fact database sometimes are talking about a town or a musical artist. Web search kind
of stays on topic; it helps us stay on topic for the different things that we're looking
for. It also is very, very deep. You know, it has things for the long tail; it has lots
and lots of documents that are all on topic for a given subject. Another thing
is scale. As you saw from question answering, from extraction, as we saw it with the way
that we build [INDISTINCT] analysis, scale lets you aggregate. You know, having this
many documents, having this many queries, having this many machines allows us to do things
that kind of aren't able to be done anywhere else. We're tackling these kinds of
NLP and information extraction problems; I don't think it could really be done in any
other setting. Another thing is that I talked a lot about, you know, coverage and so on
and the trade off, but it turns out that precision is actually really key. We ran a survey at
the top of Google Squared to kind of ask users how they were feeling about their experience
with Squared and whether it was useful and whether it solved their task. We did a release
at one point shortly after we launched that improved our measured precision internally
on our evals by about 10% absolute. It turns out that the satisfaction of the users
also went up by 10%. So, you know, you're not doing this in a vacuum. It's
actually used out there, and improving quality actually brings substantial improvements
in user satisfaction as well. Another thing is that coverage is very, very hard. You want
to get into the tail, you want to understand the whole web, you want to understand the
whole world but, you know, it's a very, very hard, difficult thing to get into the tail,
but it's critical. These are some of the queries that we've gotten on Google Squared. People
actually typed in titanium rings, design software, artificial tears; you know, these are queries
that people really want to build a square out of and, you know, we need to be
able to find this for them and find ways to help people solve their problems. Another
thing we learned--probably from the [INDISTINCT]--is that when you fail, you can ask the
user. You shouldn't be shy about not, you know, being perfect and right every
time. That kind of makes a hard, brittle experience that isn't likely to succeed, you
know, every single time somebody tries something slightly outside of the domain of what you
planned to build it for. So you should build systems that are kind of robust to user feedback
and accepting of user feedback so that they can correct, and you should learn from it.
And the final thing, another pitch, this is--this kind of work on open domain information extraction.
It's hard, you know, but it's pretty exciting and I think this is the place where we can
make a lot of impact by having the scale that Google has. So I'll repeat Artie's pitch,
we are obviously hiring. And I wanted to take questions. If you could use the mics actually,
that would be great; that way the video and the--everyone else can kind of hear then.
>> FreeBase is an acquired technology, right?
>> HOGUE: That's correct. Yes--FreeBase, there was a company called MetaWeb.
>> That was Danny Hillis' people, isn't it? >> HOGUE: Danny Hillis founded it, yes, yes.
>> Yes, thanks. >> HOGUE: No problem. I'm going to have to
bribe you with a flag? >> Are you going to take the data that users
enter into Squared and put it back in to Search engine?
>> HOGUE: We do use the data for feedback, yes. We look at user corrections. We look
especially at when they're adding or removing rows, when they're adding columns, when they're correcting
values. Obviously, it's a source of spam, right? I mean, people can, you know, go on
and put in that, you know, their name is the current president of the United States. That's--you
know, it's a great wish, but it's not true. So, we have to look at it in aggregate and
we always have to compare to signals that, that exist outside of Squared but it does
turn out to be a very good signal, you know, and you can actually identify spammers. They're
the ones that fill the entire column in with their name. You know, they're not being subtle
as it turns out, so. Yes? >> Hi, do you guys do any normalization of
the data that goes into the Squares, like structural [INDISTINCT] or anything along
those lines or do you just pull it straight from the web?
>> HOGUE: We definitely do normalization; nothing kind of hand created like at the source
necessarily. But, for instance, I mentioned the [INDISTINCT] tiers that we have; we understand
measurements, we understand dates, we understand locations even, so if we see somebody refer
to London and we see somebody refer to London, England, you know, we know that there's a
high probability that those could be the same thing in the right context. Or if somebody
says, 39 inches and somebody else says a meter, we can kind of normalize those things and
we do that kind of normalization especially when we have some information about the semantics
of the string there. That's being used. >> Okay.
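The measurement normalization he describes--recognizing that "39 inches" and "a meter" are (nearly) the same value--might look roughly like this sketch; the conversion table and tolerance are my own illustrative choices, not Google's actual implementation:

```python
# Sketch of value normalization for measurements: convert everything to
# a canonical unit and compare with a relative tolerance. Illustrative.

TO_METERS = {"inch": 0.0254, "inches": 0.0254, "meter": 1.0,
             "meters": 1.0, "foot": 0.3048, "feet": 0.3048}

def to_meters(amount, unit):
    return amount * TO_METERS[unit.lower()]

def same_length(a, b, tolerance=0.02):  # 2% relative tolerance
    x, y = to_meters(*a), to_meters(*b)
    return abs(x - y) <= tolerance * max(x, y)

print(same_length((39, "inches"), (1, "meter")))  # True
print(same_length((3, "feet"), (1, "meter")))     # False
```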
>> HOGUE: Yes. >> Okay.
>> HOGUE: Hi. >> I have three questions, one of which is
a follow up on the normal--on the normalization question.
>> HOGUE: Sure. >> Do you guys do some lemmatization in your
extraction algorithms? That's question one. >> HOGUE: So what do you mean by lemmatization?
I'm not quite... >> So like root of a word.
>> HOGUE: Sure. So we do some of that, especially with getting lists, you know, and forming the
queries. So obviously, if somebody says cheeses, you know, we need to know it's the
same as cheese and so on, so we do get roots in that sense. I'd say that that's probably
the main extent of that kind of parsing. We do a little bit of sentiment analysis as well
with the aspects, so if you're talking about service and servers, we can kind of collapse
those two aspects into the same kind of discussion. But yes, that's probably the extent of
it. >> Okay. Next question, how do you--what's
your experience with bigrams and trigrams and how did you weigh the two? So would you
look at the bigrams more than the trigrams? Or would you see, if you can find enough data
for three words sequentially, would you use that more?
>> HOGUE: Yes, it's a good question. So, I'm not so sure if everybody--can everybody hear
the questions up through the mics? [INDISTINCT]. So there is a bit of light tuning there. Obviously,
there's a--but it comes out more in actual overall frequency in the corpus, right? So
we do some TF-IDF. For instance, the phrase "a great knight" turns up less often
than "great" or "a," you know, because it's just a more complex term and obviously, it's
going to show up less often. So, we do that kind of normalization. We have found that
it does help sometimes but, you know, actually you should probably talk to Isaac sitting
on the steps out there [INDISTINCT]. He's the TL for the sentiment team. If anybody
is really interested in that kind of NLP, he's a great guy to corner after the talk.
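The corpus-frequency normalization he mentions for phrases like "a great knight" versus "a" is essentially TF-IDF; here's a minimal sketch over a toy corpus (crude substring matching and invented documents, purely for illustration):

```python
# Minimal TF-IDF sketch: a rare phrase gets weight, an everywhere-word
# gets none. Substring matching is deliberately crude for brevity.
import math

corpus = [
    "a great knight rode out",
    "a great meal",
    "a quiet evening",
]

def tfidf(term, doc, docs):
    tf = doc.count(term)                         # term frequency in doc
    df = sum(1 for d in docs if term in d)       # document frequency
    return tf * math.log(len(docs) / df) if df else 0.0

# "a" appears in every document, so its IDF (and score) is zero;
# "great knight" appears in only one, so it scores higher.
print(tfidf("a", corpus[0], corpus))             # 0.0
print(tfidf("great knight", corpus[0], corpus) > 0)  # True
```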
>> Cool. And last question is, are there any papers in the field that you found useful
in classification IR? >> HOGUE: I--I'd actually refer you to Isaac
again. So I think, you know, probably check the references from the papers listed in the
talk. >> All right, cool. Thank you.
>> HOGUE: Yes, no problem. Let's go back to work.
>> Hi. Just wondering how you measure user satisfaction in search?
>> HOGUE: That's a good question. So we actually construct a survey that basically asks both
broad questions about the overall experience with something like Squared, so, you know,
"Did you solve the problem that you were trying to tackle?" as well as very specific ones
like "Did you use feature X? Did you add a row to the square? Did you know that you can
add a row to the square?" We also ask questions about specific updates that we make. So when
we initially launched Squared, we weren't color coding the cells based on the confidence.
We added that later on--the kind of, you know, gray squares and then the low confidence
squares. We ask people, "Did this help you?" you know. And we measure the difference between
those questions as we make updates. >> Thank you.
>> HOGUE: No problem. >> Okay. In the early part of your talk, you
talked about open source, Creative Commons content.
>> HOGUE: Yes. >> And of course, we got Google Squared, which
is Google property of course. What's the sort of relationship or current planning to say
create your own sort of corpus that you're allowed to sort of give back to the community?
>> HOGUE: So... >> If you're allowed to.
>> HOGUE: Yes, yes, sure. So FreeBase is still open, right? I mean, I think that's kind of
our plan: to keep FreeBase open under a Creative Commons license. As I mentioned before, you
know, it started off--well, it started very small, but by the summer we acquired
it, there were about 13 million concepts. Since then, we've added a ton of new data through
Google kind of sources that we've been able to also Creative Commons license; things like
a comprehensive music database with artists and, like, different facts and so on. So that's
our plan: to kind of keep that open, and that will be the resource. It also provides
kind of a reference point, to make another point here. You know, when you're talking about
an entity in the world, we want FreeBase to be the place
that you point to, because it's open, so that you're not worried about it changing. They
strive to keep it, kind of like Wikipedia, very, you know, even-keeled and steady,
and it has always had identifiers that are kind of unique. So, we would like that to
be kind of the way going forward. >> So is there regularity to additional corpus
material or is it just whenever your legal team gets through it?
>> HOGUE: That's a good question. So there's a couple of answers there. So first, obviously,
yes, we need to figure out how to get the data if we're doing it ourselves. I mean,
that's a big--that's a big problem and we're working through that but it's a community
effort as well, right? I mean, we have passionate users just like Wikipedia. There are
people that are experts in steam locomotives and they add a whole FreeBase category for
steam locomotives; there's an active discussion board about all that information. That's how
FreeBase has actually grown up until now and I think it's working pretty well.
>> Thank you. >> HOGUE: No problem.
>> Hey, just a question on top of this. Any APIs in plan for the dev community to build
products on top of Google Squared? >> HOGUE: That's a really good question. So,
to be completely honest, we haven't thought a lot about it yet as far as APIs and plans.
The ability to export a Square to a spreadsheet, through Google Spreadsheets, into CSV and so
on--it would be pretty easy to build an API on top of that allows the kind of iterative,
you know, querying and so on. And obviously, we have all the tools on Search
as well. I'd say that most of the effort right now on the team is focused on bringing what
we've learned from Squared onto, and I think, you know, from there, I think
hopefully the tools that we have available to the dev community on--and then Google Search,
you know, which are constantly being improved as well--will get the benefit of all the stuff
that we learn on Squared. >> Sure. Thank you for that.
>> One of the most interesting problems that I've seen on Google Squared is within the set
of attributes. If I understood correctly, you determine the set of entities from
the query, and from the set of entities, you try to find salient attributes on those [INDISTINCT].
>> HOGUE: Yes. >> Could you tell us about your ideas to determine
attributes from the query? For example, if I ask for Arctic explorers, I really don't
care when they were born and when they died.
>> HOGUE: Sure. >> But I'm mostly interested in when they
explored Antarctica? >> HOGUE: Right. Right. So first of all, I
think that the approach that we use which is to first build up a list of actual explorers
and then look for their attributes should get some of that information, right? Because
I think most of the time when you're discussing Arctic explorers, those are the kind of attributes
that you have. But as well as the fact that I mentioned when we do the initial query "List
of Arctic explorers" or, you know, "Top ten Arctic explorers," we look at those tables
as well and those pages. So maybe Wikipedia has a great table that includes a list of
Arctic explorers and the dates that they actually got to the South Pole or whatnot. And so we
look at those attributes as well as kind of another signal into picking good attributes
for the square. With that, I mean, those are the two sources that we have now which hopefully
get you focused. And then the last one, obviously, is user feedback. We'd like to see users add
that column to our square. >> So, also that I believe that the query
can be interpreted so that the Google Squared [INDISTINCT] could cleverly answer cars with
[INDISTINCT] by fuel economy. >> HOGUE: Yes, that would be awesome.
>> Well, will attribute--and so the entity set construction should not take into account
fuel economy. >> HOGUE: Right.
>> Should not initially take into account the number of seats, but use it as a post filter.
>> Yes. No, that would be really awesome. It's definitely something we've thought a
lot about. We see these queries on Google as well and obviously, this is something,
as I spoke about at the beginning, we want people to be giving us more information in
their queries, not less. Like, don't tell us the broad thing we want, tell us the specific
thing. So if you're really looking for a seven passenger vehicle with a good fuel economy,
it would be great if you could tell us that. So one of the things we have obviously in
that direction, we haven't used it yet, is we have a database of attributes so we know
what entities they're associated with. So, we can actually look at parsing the query
the same way we do with question answering and say like, fuel economy seems to be an
attribute; not, you know, a category. Let's see what we can do about fuel economy and
try to use that for parsing--for picking which items to list. But you're right. It's an open
area. I think it would be pretty cool. >> Thanks.
>> HOGUE: Okay. >> Google, famously, is strong in terms of
statistical machine learning, in the sense that the original Google indices were these sort
of graph-theoretic models of hubs and authorities. And then the entity and value extraction
that you're suggesting also seems to smack of parsing and all--[INDISTINCT] in
the statistical back-end. >> HOGUE: Yes.
>> I'm curious. One of the, as I understand it, benefits of building things semantically,
the so-called semantic web, is the notion that one can attach inference models and logic
models underneath, a la Cyc. >> HOGUE: Yes.
>> Does Google have any intention of attaching systems that, in a sense, would allow a sort
of reflection? Once a set of assertions is made inside an entity-and-values base,
a system could then sort of query itself, reinforce it based on the existing corpus of data, and draw
in new inferences and assertions. Do you have any comment towards that?
>> HOGUE: Yes, I mean, I think it's definitely an interesting line of research, and I think
you're right that the approach that we're taking, because it's so kind of almost anti-ontology--
with the exception of FreeBase, the good schema that FreeBase has--it
isn't actually very amenable to that. We're much more inclined towards building these kinds
of broad statistical frameworks as opposed to building, you know, very specific inference
engines. I think part of the reason is that in our experience, we've actually found those
to be somewhat brittle. You know, like, you can only get as many rules as you have time
to write up and we haven't found a good way of learning them. Obviously, you know, there's
other domains where that's fine if you have a very specific task that you're trying to
accomplish and there's a smaller set even, you know, a thousand [INDISTINCT] or something
like that that might accomplish that task, then that's a great domain for that. But we
found that, trust me, for the types of queries that we get, you know, you would just
never have thought of writing inference rules for them. So, we've actually--we just found that
using the scale to kind of build from the bottom up is a lot easier typically than trying
to write rules that go top down. That's just kind of an approach, I think. So I guess my
answer is that we don't really have any specific, you know, plans to build these kinds of inference
engines. >> Is it all a route of exploration on Google's
part in terms of its R&D? I mean, there is the notion that where domains, in essence,
tumble out of collections of entities, right? >> HOGUE: Right. Right.
>> So the notion of cheeses... >> HOGUE: Of course. Yes.
>> ...can come out that, you know, you may be able to then talk about the notion of dairy,
manufacturers of dairy products. >> HOGUE: Yes, yes.
>> So, is there at all any interest [INDISTINCT]? >> HOGUE: And so, yeah--actually, I mean,
I shouldn't--I shouldn't say that we're not working on it, because I think there's a lot
of interest in this. There's a lot of folks in the--in the Google research, you know,
kind of area that do think about these things a lot and as well as FreeBase, honestly, a
lot of the guys there, because they're so steeped in the schema and in administering the ontology.
There's a great language called MQL, it's Metaweb Query Language, that actually allows
you in JavaScript or in JSON to construct queries that would, you know, look an awful
lot like, in principle, "Show me all the Googlers who have written
books published since 2000." And you just kind of express that as a, you know, as a
bunch of JSON that has stars in it and open wildcards. So I think there's a lot of tools
for doing that. I guess within my group, it's not really an approach that we consider.
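As a rough sketch of the kind of MQL query described above, the request is expressed as a JSON template where `None`/`null` means "fill this in for me" and nested objects act as joins; the property names below are illustrative, not actual Freebase schema:

```python
import json

# MQL expresses graph queries as JSON templates: null means
# "return this value," and nested objects are joins against
# related entities. Property names here are made up for illustration.
query = [{
    "type": "/people/person",
    "name": None,                      # wildcard: return each person's name
    "employer": "Google",              # constraint: must work at Google
    "books_written": [{                # join: the books they wrote
        "name": None,
        "publication_date>=": "2000",  # comparison operators suffix the key
    }],
}]

print(json.dumps(query, indent=2))
```

The template doubles as the shape of the response: the service fills in the `null` slots for every matching entity.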
>> Thank you. >> HOGUE: Okay
>> Do you have any plans for--or do you suggest tools that are more Enterprise oriented?
>> HOGUE: So Enterprise tools, yes, so within my group now, we're mostly focused on doing
things for the open web. We obviously do have an Enterprise group and, actually, we use
a lot of the basic parsers that I was talking about before for dates and people's names
and so on. And there have been some great ideas there about looking for--looking at
those signals to do a better job at ranking of smaller enterprise corpora where you don't
have the kind of rich PageRank signals and other things like that like you do on the
open web. So yes, there's some talk about that. As far as, like, kind of Google Squared-level,
you know, complex analysis, I haven't--I haven't seen anything like that.
>> Thank you. >> HOGUE: No problem.
>> Okay. Going back to something you said on sentiment analysis.
>> HOGUE: Yes. >> At one point, you said that you're using
a conditional random field to determine if--whether in a certain sentence, something really
is happening and things like that. And you used the word that you were actually training
this. >> HOGUE: Yes.
>> So my question is, by training, you mean you actually have someone tagging a text corpus?
You know, like "I had a great time," and this is, you know, a great phrase?
>> HOGUE: Yes, so from the--this is specifically for negation detection.
>> Okay. >> HOGUE: And as I mentioned, we hand-labeled
a set of 2,000 sentences that had negations and didn't have negations in them, marked the
part of the sentence that was negated, and then trained on that. So that was a case where
we actually went and, you know, spent some actual engineer time to build up a golden
corpus rather than kind of learning, you know, from the ground up.
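The setup described here (a small hand-labeled corpus of negated spans used to train a sequence tagger) can be sketched roughly as follows. This is a toy illustration with invented data, and it substitutes a trivial per-token baseline for the real conditional random field, which would use richer features and model tag transitions:

```python
from collections import Counter, defaultdict

# Toy hand-labeled corpus: each token is tagged O (outside) or
# NEG (inside a negated span). A real CRF would score whole tag
# sequences; this baseline just learns per-token tag counts.
labeled = [
    [("i", "O"), ("did", "O"), ("not", "NEG"), ("like", "NEG"), ("it", "NEG")],
    [("the", "O"), ("food", "O"), ("was", "O"), ("never", "NEG"), ("good", "NEG")],
    [("we", "O"), ("enjoyed", "O"), ("it", "O")],
]

counts = defaultdict(Counter)
for sentence in labeled:
    for token, tag_label in sentence:
        counts[token][tag_label] += 1

def tag(tokens):
    # Pick each token's most frequent training tag; unseen tokens -> O.
    return [counts[t].most_common(1)[0][0] if counts[t] else "O"
            for t in tokens]

print(tag(["the", "food", "was", "never", "good"]))
```

With only 2,000 labeled sentences, the point is less the model than the golden corpus: a small, carefully labeled set can train a targeted component like negation-scope detection.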
>> Okay. >> HOGUE: Yes.
>> And so my follow-up question is kind of similar to what Vidal mentioned. It's like,
you know, a lot of problems with--you know, a lot of these problems have this sort of
sparseness [INDISTINCT], like, you know, for example, like saying, "I had a great night,"
is pretty much the same as having--saying, "I had a great evening."
>> HOGUE: Yes. >> And you can determine the--that the evening
and night is the same from some other synonym database. I mean, is there any idea of like,
you know, training inference engines and stuff like that to actually reduce the sparseness
or classify things or stuff like that? >> HOGUE: Sure. Yes, it's actually a really
good point. And so, yes, we've had a lot of thoughts down that path. I think there's obviously
context as--it's not just the words in the lexicon, the sentiment words; we can use
a distributional similarity metric to identify when they're being used in the same context.
You can also think about doing it for the aspects, for the nouns, other things like
that. So yeah, it's a great idea and I think figuring out that when people say evening
and night, they mean roughly the same thing because they're used in the same context,
that's one approach. And you can think of other ones based on, you know, the actual
text and so on. Hi. >> Hello. Good evening.
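The distributional-similarity idea mentioned just above (words like "evening" and "night" look alike because they appear in similar contexts) can be sketched with a toy corpus; the sentences and window size here are invented for illustration, and a real system would run this over web-scale text:

```python
from collections import Counter, defaultdict
from math import sqrt

# Toy corpus standing in for a large document collection.
corpus = [
    "i had a great evening at the restaurant",
    "i had a great night at the restaurant",
    "we spent a lovely evening downtown",
    "we spent a lovely night downtown",
    "the carburetor needs a new gasket",
]

# Build context vectors: count neighboring words within a +/-2 window.
vectors = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for i, w in enumerate(words):
        for j in range(max(0, i - 2), min(len(words), i + 3)):
            if j != i:
                vectors[w][words[j]] += 1

def cosine(a, b):
    # Cosine similarity between two words' context vectors.
    va, vb = vectors[a], vectors[b]
    dot = sum(va[k] * vb[k] for k in va)
    norm = sqrt(sum(v * v for v in va.values())) * sqrt(sum(v * v for v in vb.values()))
    return dot / norm if norm else 0.0

# "evening" and "night" share contexts; "carburetor" mostly does not.
print(cosine("evening", "night"), cosine("evening", "carburetor"))
```

Words that score highly similar can then be collapsed to fight sparseness, exactly the "evening"/"night" case raised in the question.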
>> HOGUE: Yes. >> I have two questions.
>> HOGUE: Sure. >> First one is on Google Squared. How do
you keep track on the back end what people are searching for and also to make better
searches for that same item in Google Squared? And then the follow-up to that one is,
how do you handle things overseas, where there are different phrases? It's one
thing to do it within the US or a country that has, you know, US-type slang. What about,
you know, all around the world? >> Yes, it's a good question. So, on the first
question, which is about how we actually keep track of things, we do have logging, right?
I mean, we have a logs policy where we keep the logs for a certain amount of time. Google
Squared is nice in terms of privacy in that we really don't need much information about
a particular user to kind of learn from the logs. Really, all we really need is their
query, so we can strip out all kinds of personally identifiable information and just use
the raw data. If the person searched for this, then they added this, then they corrected
this, you know, that kind of brief session is enough information for us to kind of say
Swiss is a good cheese; people keep adding it when they search for cheeses. So does that
kind of get at it? I mean, yeah. >> Yeah.
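The session-based learning described above (anonymized query-plus-edit sessions voting for items like Swiss under cheeses) amounts to simple aggregation over stripped-down logs; this is a toy sketch with invented data, not the production pipeline:

```python
from collections import Counter, defaultdict

# Each anonymized session keeps only the category queried and the
# item the user manually added to their square; no user identity.
sessions = [
    ("cheeses", "swiss"),
    ("cheeses", "cheddar"),
    ("cheeses", "swiss"),
    ("cheeses", "swiss"),
    ("roller coasters", "mantis"),
]

added = defaultdict(Counter)
for category, item in sessions:
    added[category][item] += 1

def suggestions(category, k=2):
    # Items users keep adding are good candidates for that category.
    return [item for item, _ in added[category].most_common(k)]

print(suggestions("cheeses"))
```

Because only the query and the edits are needed, the personally identifiable parts of the log can be discarded before any learning happens.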
>> HOGUE: Yeah. And then as far as the overseas, I think--so first, Google Square is only launched
in English right now. It's pretty tough to get it launched in English and we haven't
actually, you know, tackled it in other languages yet. But in general, the nice thing about
the web is that for many countries and many languages, there's a really rich set of documents
out there already in those languages. And in any of these statistical techniques that I was
mentioning, where we learn from a large corpus of information, if we can get it to work in
English in a general way, it's pretty easy to learn it in German or French or something
else like that, other places that have different slang, because they have a similarly large
corpus of data, as long as we can figure out which documents are in which languages. Does
that kind of get at what you're asking about? >> Yes. Yes. Thank you.
>> HOGUE: Yes, cool. Thanks. >> Hello. I was wondering if you guys--like
another person asked if you guys were going to do an API. With that API, would you--like,
do you think you would be able to, like, open up the Google Squared algorithms to kind of scan other
data that's not necessarily on Google? Like, kind of like--or maybe bundle it up like in
a Google Search Appliance type thing? >> HOGUE: That's an interesting question.
So, we certainly could. I think that the problem with Google Squared, as I mentioned, it's
very reliant on the kind of information that we get by doing a Search, right?
I mean, it's very reliant on being topical, being long tail, you know, understanding
lots and lots of information, having good ranking. So if you have another corpus that
has similar properties about, you know, having that kind of comprehensiveness, yes, I certainly
think that that would be possible. As far as the underlying techniques, the ones that
aren't based on web search, those are completely open, like identifying a date in the document
or identifying a fact in the document. Those kinds of things are much more generalized
and don't rely as much on web search, and as I mentioned before, I think that in the Enterprise
situation, people are already looking into adding those kinds of things to Google's Search
Appliance. >> Yes, so, you know, they'll be able to,
like, search internal blogs and some databases and stuff?
>> HOGUE: Sure. Yes, exactly. >> Okay. All right, thanks.
>> HOGUE: No problem. >> So you mentioned that there were about
20 million things in FreeBase, that most of them were hand inputs. Is that correct?
>> HOGUE: Hand is--if by hand you mean, like, a big script--scraping the site and, you know,
kind of reformatting it and dumping it into a database, then yeah, hand.
>> So my question is at what point do you think that essentially you're going to be
able to set out and just sort of scrape the open web, try to populate more of FreeBase?
>> HOGUE: Yes. >> How far off are we from that sort of thing
and how much human intervention is going to be necessary to sort of vet algorithms on
that? >> HOGUE: Yes, that's a really good question.
So I think, as I pointed out, I mean, the difference between FreeBase with 20 million
entities and the goal of a billion entities and, you know, many, many more facts even
than that from our kind of fact extraction, tabular extraction, and other techniques, I
think that we're already--now that, you know, Freebase is part of Google, we're already
looking at how we can take that information and find the best nuggets of it. The easiest
thing to do is just look to augment existing FreeBase entities. So, I mean, we probably
have more facts about the Eiffel Tower than they do just because the web talks about a
lot more things than perhaps they've been able to import. So that's an obvious one and
it's probably pretty easy to identify things that overlap. And then the harder problem
which we've just gotten started on is trying to identify the new things, you know, and
to try to identify whether--it's tough to tell whether something is just a different
name for a thing or it's actually a completely new concept you haven't seen before. But yes,
so definitely the direction we want to go is taking the kind of script writing and scraping
out of the equation and looking much more at these general techniques, which would hopefully
scale much further. >> Thank you.
>> HOGUE: No problem. >> Okay. So at some point you mentioned that
there was sort of a value to doing all the extra work, like a value to each query. How
do you figure that out, like, how valuable a query is?
>> HOGUE: That's a good question. I wish I had an answer that was not kind of poofy.
So I was kind of using my own internal metric, right? I mean, if I just need to know when
Martin Luther King was born, the value is probably somewhat low but the time to execute
that query is also low. Whereas if I'm trying to decide where I'm going to college, I mean,
that's a, you know, that's--you know, a $100,000 to $200,000 question these days, right?
I mean, like, that's a--that's a pretty big deal and so I'm willing to invest an awful
lot more time in putting that information together.
>> So does algorithm decide how much processing power to put into a certain query?
>> HOGUE: Yes, again, I wish that we--I wish that we could a priori kind of identify
how important a query is, but we can't. I mean, I think that that's, you know--I would love
if we could figure out an algorithm for that. I can--I can think of ways to start to estimate
it. You know, you could look at the amount of clicks and other things like that and how
much action did the query get, how long do the people spend on it, but no. And honestly,
we don't actually have a very good metric for that. I was using more kind of a made
up... >> Okay...
>> HOGUE: ...theoretical metric. >> Do you see this as integrating with advertising
at some point? >> HOGUE: In what way?
>> Well, I guess advertisers could buy, I don't know, spaces on the grid or...?
>> HOGUE: That's a good question. Now, that one I hadn't thought of, honestly. Yeah, you
could certainly imagine, I mean, having paid placement, but Google Squared is completely,
you know, ad free right now. But yeah. >> Yeah. Okay.
>> Hi. There's been a lot of buzz about Quora and how it's just getting a lot of people
using it as a resource. Do you think that--like what's your take on that site? And do you
think it will be a destination resource for search and, as they progress, people looking
for the right answer? >> HOGUE: Sure. Yeah. That's a good question.
I think Quora is very interesting. I think it has some quirks right now. I mean, I think,
obviously, it's much better at kind of tech questions. You know, if you go and you're
asking about some esoteric--you know, how long is the Olympic track, then you'll probably
get an answer obviously but as far as the existing content on there, it's much better
at telling you how Steve Jobs is doing, perhaps, than other kinds of domains. As far as--I
think what they do definitely have is this ability to do two things; first of all, to
answer much more complicated queries. I mean, kind of like Yahoo! Answers and, you know,
other things like that and that's something that's been done before. But I think the social
aspect is actually really important as well and the idea of kind of trusting the person
that is answering your question and having some sort of identity to that. Yes, I think
that they're doing a great job with that. You know, obviously, Aardvark, which we acquired,
is also doing things in that domain. But yes, I think that as far as becoming a destination
site, I don't know, it's probably at the whim of every other, you know, social startup and
whether or not they get enough traction. But yes, they seem to have a good take on it.
>> Thanks. >> HOGUE: No problem. Hi.
>> I was just--I was just wondering is there anything coming up with Google Finance because
now you've created a huge context, you know? And is there any idea about integrating it
to improve Google Finance people [INDISTINCT]? >> HOGUE: Yeah. That's actually a really good
question. I mean, I think we definitely talk to the team a lot. I think there's a couple
of things working against us in Finance. But--not to say that we don't want to work on it--but
one is that a lot of interesting information is proprietary. It's a little harder to get,
so finance tends to do more things like deals to actually get the data that they want. Another
thing is that because it's financial information, they have a lower tolerance for bad precision.
You know, if you report, you know, the incorrect value of some asset or whatnot, you know what,
I mean, it's a pretty big deal. Whereas if I tell you that Martin Luther King Jr. was
born in 1930 instead of 1929, chances are you're not going to lose a million dollars
over it. So I think that those two things are working against us. But you're right,
I think this kind of open extraction to be able to do this kind of analysis over large
pieces of information, especially with sentiment analysis as well, I think is definitely an
interesting area. >> So to summarize, it's because it's too
big of a risk in the market at the moment? >> HOGUE: To summarize, it's something that
we're interested in working on, I guess. I'm not saying we're not working on it; I'm just--I'm
just trying to point out some of the issues with actually getting something out the door.
>> Thank you. >> HOGUE: Yes, no problem. [INDISTINCT] the
room is popular. Hi. >> Hi. Thank you. So given how social the
web is and is becoming, and the reliance on social networking, is it possible for Google Squared
to be personalized to people based on the opinions, or beliefs, or habits of people in
their network, friends, things like that? >> Sure. Yes. I think that's actually a really
interesting direction that it could go. I think some of the things we tried to do when
we launched Squared, we allow you to share tables. We allow you to kind of edit, you
know, see another person's table. We didn't have a full read-write kind of system.
It was, you know, a little more complicated than we wanted to tackle for a labs product.
But we definitely had this idea that, you know, I'm planning a vacation; my wife obviously
has some say in that, you know, probably more say. You know, so we want to be able to collaborate
on this, where I think that was kind of an important thing, the idea of saving state
and, you know, kind of iterating through it. As far as getting feedback from your network,
I mean, I think to the point about Quora and other social answer sites, I think it's actually
a really interesting area where, you know, it mostly deals a lot with trust. Do you trust
the site providing information to you? School teachers use it that way, actually--a lot of
our email feedback, you know, is from teachers using Squared to teach students to gauge
the trustworthiness of a site and how to actually do deeper research and not just trust
everything you see, because Google Squared is wrong a
lot. That's definitely a good teaching moment there. But--so we deal with the trust in that
way. I think dealing with trust in terms of the people that you know are providing the
answers is another great way. It's not something that we've even explored but I think it would
be pretty cool. >> Searching through their friends on the
web? >> HOGUE: Their squares are there. Yes, yes,
yes, and ranking that. I mean, we do have a social search product that will look at
your, you know, your little profile and so on and try to surface results that your friends
have shared on Reader or whatnot. And obviously, that--those results would be available to
Squared as part of the ranks documents. >> Thank you.
>> HOGUE: You're welcome. Hi. >> Yes. So if you are dealing with sort of
relatively straightforward topics, say, service at a restaurant or service at a [INDISTINCT],
I think the sentiment analysis should be fairly straightforward, so you can do very
simple machine learning on that. But if you're dealing with much more specific questions
on a certain topic, say, a [INDISTINCT] and engines, how do you handle that, I mean,
to pick out just much more specific things? Do you have any ideas of how to handle that?
>> HOGUE: So just to make sure I understand what you're--so you're asking about the difference
between kind of a broad domain with lots of evidence like local businesses versus the
narrow domain. >> Yes. Yes.
>> HOGUE: Like, I don't know, a car--auto repair or something else, I think, right?
>> You know, specific part of an engine, you know.
>> HOGUE: Do people express strong opinions about specific parts of engines is one question?
Yes? >> No, but it could be valuable for a lot
of people if you are deciding what to do about, you know?
>> HOGUE: Sure. I mean, I think some things are universal, right? I mean, like, there's--most
sentiment terms, kind of, you know, the way that people express frustration or happiness
tend to apply to all domains, and it's the nouns and the aspects that change out for
[INDISTINCT]. They'll maybe be referring to the carburetor in one domain or the wait staff
in another domain. So I think the nice thing about the way that we approach sentiment analysis
is that lexicon that tells us what's positive and what's negative should be universal once
we've developed it. And then it would just be a matter of--I mean, I don't know if there's
different words that you might use to express happiness...
>> But I'm thinking if this is an engine, I mean, like, maybe positive is very specific,
what's the core temperature range or like, you have like, say, like your peak--your...
>> Yes, like peak torque or something like that is in a certain range and [INDISTINCT]
really happy about that. >> Yes.
>> HOGUE: Yes, yes. Yes, it's really interesting. I think we--I can imagine using some of the
tools that we have to build something like that but, as I said, most of our work is focused
on the broadest possible applications. So yes, I think we have the tools to build that.
It's not necessarily an active direction that we're going in right now, though.
>> Thanks. [INDISTINCT] >> HOGUE: No problem. Hi.
>> Hi. Do you ever think about incorporating videos and their content into Google Squared?
>> Yes. >> Like, you know, if I wanted to look up
fitness videos or something like that. >> HOGUE: Yes. Actually, the first time we
showed Google Squared, before we released it, to our Vice-President here in New York, Steve
Feldman, we led off with the query of roller coasters, which was kind of cool because we
can show the heights and the top speeds and all that. And his suggestion was to add videos.
He's like, "Wouldn't it be cool if you could, like, watch a video of each coaster going
down its main drop?" >> People vomiting.
>> HOGUE: Yes, exactly. So, yes, I think totally. I mean, I think the main issue there will
be getting a video that's on topic, right? I mean, like, we still struggle
with this in Squared. There's a roller coaster called Mantis and it's really tough for us to get
a picture of the roller coaster and not a picture of the bug. So I think video has even
fewer signals would be my intuition, and we'd have to work a little bit to make sure they're
on topic but I think we could probably do it.
>> Thank you. >> HOGUE: Yes? Hi.
>> Hey, a little bit, I've discussed this with you earlier, but just to give the background:
a similar application we've been working on is based on social networking, and one thing I
just wanted to check with you: how did you guys improve the intelligence from 50 to 60,
or the satisfaction? Because we found, working on those, that improving it manually, very specific
to particular queries, was a pretty tedious job and not always relevant, because it's a
time-sensitive query. Somebody's looking for a game on a particular day, and improving
that query won't make sense. >> HOGUE: Right. So we try very, very hard--and
you probably got the gist during the talk--not to do this kind of manual one off bespoke
kind of fixes, right? We have a little bit of white listing and black listing. We try
to avoid, you know, racial slurs and things like that and it's--you know, we want to make
sure to get those things right. But in general, we're really trying to do the most general
possible thing. So you asked about how we went from 50 to 60%, we've actually gone further
than that since then. It's through looking at the broadest class of errors that we might
make and attacking it in the broadest possible way. So, for instance, maybe--I don't know
the specific, you know, improvement at that point, but we did things like add a new
general extractor. You know, we added something that looked--maybe we were only looking at
tables and now we're going to look at those colon patterns, you know, attribute colon
value, and try to find more data that supports that. We might look at a new signal, like
user refinements, that might give us more of a signal as to
what's relevant and what's not. So typically, when we make those kinds of improvements,
we try to look at a very broad case and we try not to do any manual fixes, with the exception
of these kinds of, like, worst-case-scenario blacklists and things like that. And
as far as the [INDISTINCT] issue, you know, there's certainly things that we get wrong.
Question-answering: Amit Singhal, our Google Fellow for Search, submitted that we were getting
the--I think it was the Prime Minister of India or something like that, in a list
today, the error we had. Anyway, we were getting some question wrong; he sent it to
us and it was--it turned out that he had just gotten reelected, and the new table that we're
pushing is going to fix that. So we'll fix things kind of only in a general way, but
time-sensitive stuff is still something we struggle with.
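The attribute-colon-value pattern mentioned in this answer can be sketched as a simple general extractor; the regex and sample page text are illustrative, and a production system would layer statistical support and confidence on top:

```python
import re

# A minimal attribute:value extractor of the kind described:
# lines like "Height: 310 ft" yield (attribute, value) pairs.
PATTERN = re.compile(r"^\s*([A-Za-z][A-Za-z ]{0,40}?)\s*:\s*(.+?)\s*$")

def extract_pairs(text):
    pairs = []
    for line in text.splitlines():
        m = PATTERN.match(line)
        if m:
            pairs.append((m.group(1).lower(), m.group(2)))
    return pairs

page = """Millennium Force
Height: 310 ft
Top speed: 93 mph
A giga coaster at Cedar Point."""

print(extract_pairs(page))
```

An extractor like this is deliberately broad: rather than fixing one query, it adds supporting evidence for facts across every page that uses the attribute:value layout.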
>> Sure. >> HOGUE: Yes.
>> All right, thank you. >> HOGUE: Yes. No problem. Hi.
>> Hi. How would Google Squared deal with life-critical or mission critical questions
like my dog just swallowed rat poisoning; what should I do?
>> HOGUE: Google Squared wouldn't deal with it at all, I hope. Google Squared is mostly
about categorical queries. So, I mean, maybe it would answer "ways to kill my dog"
or something like that, but maybe not. But, you know, [INDISTINCT] is actually probably
a much better source of information for that, and there, I think, we're going to try to
aggregate them [INDISTINCT] into the web. Hopefully, it would go to Answers. Quora, I
think, would be a great site probably. In theory, it would be a great site to answer a question
like that, but it's not really a domain for Squared. Yes.
>> Could you actually turn it away if they had those kinds of questions or...?
>> HOGUE: No. It's an interesting point. We might want to. She had a good question about
whether we should turn users away if they ask something like that. One problem is that
it's hard to identify those. You know, as we were talking about before,
classifying queries like that can be difficult. Maybe if there are certain very,
very obvious buzzwords, you could build a classifier. These kinds of things are very
tough. You want them to be 100% precise with 100% coverage, and that just doesn't exist
in these kinds of domains. Yes. My hope would be that a user would never find themselves
on Google Squared for a question like that, to be completely honest. Yes? Hi.
>> Going forward, how do you see Google Squared integrating with mainstream web search? Or
I guess in other words, if you see that a user has searched for something that could
be better represented by Google Squared, do you ever see--do you ever think you would
see a Squared page showing up in place of URLs?
>> HOGUE: Yes. I mean, it's certainly something that we would love to do. As I mentioned,
we're taking parts of Squared that we believe could be really useful, like the question-answering
bit, and we've already launched that, and we're pushing that on [INDISTINCT]. We're taking other
pieces that are kind of in active development; they have to be kind of restructured and
rethought because they don't quite make sense in exactly the same way. But we think that
they're very valuable to answer these kinds of queries. As far as Google Squared knowing
enough to kind of take over the whole page when you type in, you know, hybrid SUV, it's
a little bit of a stretch but I can certainly imagine at least being kind of--there being
a way to get it immediately from your search results. But Google is mostly about getting
you to the destination of where you can best serve the information, so.
>> And do you have any figures on how many Google web searchers could be represented
by Squared? Is it--is it common to have searches come in that you would say, you know, let's
show this in the grid? >> HOGUE: That's a good question. I don't have
current figures. I do know that they're fairly frequent. You know, I don't necessarily want
to say exactly but, you know, they're--you know, they're not--it's a non-trivial portion
of our query stream. Hi. >> You said the users [INDISTINCT], like,
are willing to deal with the precision, and you can measure the users' [INDISTINCT] by, like, doing a survey.
But how do you--how do you measure your own precision? >> HOGUE: Yes, so measuring our own precision,
this is something we do kind of across Search Quality. Basically, we build up a
set of queries that we care about, we build up a set of answers we believe are correct,
and then we kind of measure it. We automatically query our system and check whether or not
we're getting things right, and we measure the precision of that and the recall of that, just
like, you know, any other kind of machine learning task.
>> So that's, like, a way that existed before. >> HOGUE: I'm sorry?
>> That's, like, an existing way from before Google Squared.
>> HOGUE: Sure. Yes, yes, we do this for all types of quality problems.
>> And the other question: is Google Squared, like, a good tool maybe for academic research?
>> HOGUE: Yes. In fact, as I said, as I mentioned, that's one of our key demographics: school
teachers and so on. But yes, I mean, the book report or, you know,
kind of research report domain is definitely one that we excel at.
>> Yes, because I see that you have images, like, embedded in the [INDISTINCT] sheet. What
about other formats, maybe PDFs or those [INDISTINCT]? >> HOGUE: Sure. Yes, I mean, as I said, if
we can get people the information that would help them--we do pull facts from PDFs because
they're part of the Google index, but surfacing them, yes, might be an interesting route.
>> Thank you.