Welcome to the Search Engines class, sponsored by the
School of Information. I'm very delighted to have as
our speaker today, Bradley Horowitz, who is a visionary
in the area of media spaces.
He co-founded a company called Virage, which is a market
leader in multimedia information analysis.
For the last several years he's been at Yahoo!
as a director of technology, and I'm going to let him give
the details about his job there and tell you about himself since
he has some slides on that.
But I think we're in for a really fascinating lecture,
so please join me in thanking Bradley for coming.
BRADLEY HOROWITZ: OK, so as Marti said, I'll start
by talking about myself.
I run a group at Yahoo!
called the Technology Development Group, and it's a
small group, there's about I'd say 20 people or so that are
affiliated with this group.
Some of them are, if not household names, people that might
be known to you -- folks like Jeremy Zawodny and Chad Dickerson,
the former CTO of InfoWorld.
We've got some members in this class that are part of the
broader technology development group, because part of what
we do is work with Berkeley.
We have a lab called Yahoo!
Research Berkeley that's over on University Avenue and it was
set up to do collaborations so that we could actually have
students onsite, interns passing through, do Yahoo!
research, take your research onto our facility, and that's
been really successful and a lot of fun for me to help get
off the ground and get going.
So if you'd like to know more about that you can
talk to me separately.
So, yeah, at Yahoo!
I've been doing this Technology Development Group for
about four months.
Part of that is focused on innovation, so I run an
internal website, which we call Hack Yahoo!
which is really focused around breaking the rules inside
of Yahoo!, hacking Yahoo!, and innovating outside the
boundaries of normal product development.
Before I did that, I was actually director of media and
client search, so that includes all the media search properties
like image search, video search, audio search, and
Flickr, as well as client search, which is toolbar
and desktop search.
So in the questions period, if you want to know something
about toolbar, desktop search, I'm pretty clued in to what
we're doing there as well.
And then, as Marti said, prior to that I had my own company,
and that kind of completes my work experience.
Until I joined Yahoo! I had never had a real job or
a resume or anything like that.
I dropped out of the PhD program at MIT to start Virage.
And Virage was really about automated media indexing, and
so we built workstations that basically took media in one
side and pumped out metadata the other side, and we did
that using all the technology that we could bring to
bear on the problem. So things like speech
recognition and speaker ID and optical character recognition
-- anything we could do to transform signal into
metadata we did.
And we sold Virage to a British company called Autonomy --
which recently purchased Verity -- in 2003, and that's when
I joined Yahoo!.
And as I said, before that I was an academic at MIT at the
media lab studying computer vision, computer graphics,
image processing, that kind of thing.
So, I'll explain these pictures later, but they're just
-- they're cool pictures.
So, before I get into multimedia search per se, I
want to cover a little bit about Yahoo!'s -- Yahoo!
Search in particular and the vision of Yahoo! Search.
And the vision of Yahoo!
Search is to enable people to find, use, share, and expand
all human knowledge, and that's a grand vision.
A vision is not like a mission statement -- it's not something
we hope to finish off next Tuesday or even next quarter,
this is something that will keep us going for a long time.
Now, Find is something you might expect to see in the
vision statement of a search engine, finding is kind
of the bare minimum that you have to do.
Use really speaks to a more holistic life cycle from the
moment of intention when a user realizes they want to get
something done or accomplish a task to the actual
completion of that task.
Share is something I'll talk about a lot today, so not only
finding stuff for your own purposes, but also sharing
that knowledge with others and benefiting from the knowledge
others have shared. And then finally Expand, so not
content to just index the documents that are out there
today, but creating platforms that allow more human knowledge
to flow into the system.
And it turns out that that forms an acronym, Find, Use,
Share, Expand, Fuse, which means to bring together
or unite or fusion, this great release of energy.
So we like Knowledge Fusion.
So this is the Yahoo! front page, you've probably
all seen it -- pretty exciting, and Search is right there
at the top of the page.
And multimedia search is also at the top of the page.
If you actually look at the tabs, we call those tabs up
there going across the top, they're kind of rank ordered by
utility and traffic, so image search and video search are
actually the number two and three vertical search engines
that we have in terms of traffic, so it really speaks to
how important they've become in the portfolio of products.
So, this is a pared down Search site for Yahoo!
-- if you go to search.yahoo.com, and not a lot
of people know this, but you get kind of an uncluttered
minimal search experience.
And if I were to type in Vishnu into the Yahoo!
search engine, you'd see a result like this with the
familiar sponsored listings on the right, the organic search
results, the number one listing there is the
Wikipedia entry for Vishnu.
And if I typed in Vishnu into image search or I clicked on
that image search tab, you would see a result like this,
which I thought was fun because it's very colorful and a
great example of what image search does.
And there's a bunch of things on this page which I think are
probably evident to all of you, but you can do a safe search
or an unsafe search, and that speaks to the adult queries --
or, to use the euphemism, the popular
queries -- especially in image search they account for a
good fraction of the traffic.
But you can also find images based on size, you can find
images based on color, so color or black and white images,
you can go into an advanced search that would allow me to do
things like site restrict.
So there's a lot of different ways I can get in here.
And we have what we call an Also Try up there, too, so
looking at the co-occurrence of words in the query stream and
giving you some hints as to what other people before you
have queried for to spur you on.
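The Also Try mechanism he describes -- mining the query stream for related queries -- can be sketched roughly like this. The ranking rule here (shared-word overlap plus frequency) is an illustrative assumption, not Yahoo!'s actual algorithm:

```python
from collections import Counter

def also_try(query_log, query, top_n=3):
    """Suggest 'Also try' refinements from the query stream: rank other
    queries that share a word with this one by frequency -- a crude
    stand-in for the co-occurrence mining described above."""
    words = set(query.lower().split())
    related = Counter()
    for past in query_log:
        p = past.lower()
        if p != query.lower() and words & set(p.split()):
            related[p] += 1
    return [q for q, _ in related.most_common(top_n)]

log = ["vishnu statue", "vishnu avatar", "vishnu statue",
       "shiva statue", "vishnu temple"]
print(also_try(log, "vishnu"))
# -> ['vishnu statue', 'vishnu avatar', 'vishnu temple']
```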
So, if I click on any one of those results, you drill down
into a page that looks like this, and this page
is interesting for a number of reasons.
One is that you see this kind of framed approach, and what
you see in the lower half of the screen is actually the web
page upon which we found that image unperturbed, and that's
very important due to copyright restrictions
and kind of IP use.
If we were to take that image and use it out of context or
God forbid make money off it by showing an ad next to it while
it was hosted on somebody else's site, that wouldn't make
them happy and there were a number of lawsuits around
this and it was determined that this is fair use.
So, framing the site such that we have the Yahoo!-hosted
top frame, which is basically a description and not
generating any revenue, but this bottom bit here, which is
the original site unperturbed with whatever revenue
generating mechanisms the publisher wanted to have there,
whether it's banner ads or whatever, was fair game.
And so that's how we do this today.
You can also click on that link there to jump in an unframed
fashion back to that page.
And you've also got some of these share aspects up there,
so I can mail this search result to a friend, and it's
actually not mailing the image itself, it's mailing a pointer
to this very framed rendering.
Now sometimes -- this is an example where the image that you
see up there is actually hosted by a provider that doesn't
like their content to appear on another person's site.
And so they can detect that it's actually a Yahoo!
hosted web page that is pulling the original image off and they
spoof, they cloak, and basically send up other bits as
opposed to the original image itself as found in context, and
that gives you a result like this, and the reason I wanted
to show that to you is later I'll talk about some
distribution strategies, but this speaks to kind of an old
school mentality, a Web 1.0 mentality, which is if I'm
going to host content for somebody then they're going to
see it on my site and I'm going to generate all the revenue and
we're not going to provide free hosting of images.
And so occasionally you run into this problem.
Here you see black and white photos, so I just clicked on
the black and white link there and we can restrict, and I'll
talk later about how we actually do this, but you can
qualify by color space or size.
What's interesting here is a mechanism we call
a direct display.
And so here as opposed to going into image search, I just typed
photos of Vishnu into web search, and it's always
surprising to discover how few of our users
even understand the concept of tabs across the top.
Even mechanisms like that, which are probably second
nature to everyone here, are hard for some users to
understand, and a lot of users will type in queries like photos
of Vishnu as opposed to going to image search to look
for photos of Vishnu.
And so to respond to that what we do is in line we show you
the first few search results right there, which are kind of
steering people and guiding people toward the vertical.
If you click on that you'll end up in image search, but it's
another mechanism beyond the tab to drive awareness and
discovery of the feature.
So, we've extended this so that you can actually type black and
white photos of Vishnu or you can type large black and white
photos of Vishnu -- there's a bunch of other trigger words
that will trigger this, so pictures of Vishnu will do
the same kind of thing. And again, this is all
geared at driving awareness.
And the same is true for other multimedia verticals.
If you type videos of Vishnu, the same phenomenon will happen
-- you'll get a direct display, which will help drive traffic
toward the video search vertical.
There's large black and white photos.
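The trigger-word direct display described above can be sketched as a small query-rewriting step. The trigger list and qualifier grammar here are illustrative guesses, not Yahoo!'s actual rules:

```python
import re

# Hypothetical trigger phrases and qualifiers; the real trigger list
# and qualifier grammar are not public.
TRIGGERS = r"(photos?|pictures?|images?|videos?) of"
QUALIFIERS = {"large": ("size", "large"),
              "small": ("size", "small"),
              "black and white": ("color", "bw")}

def rewrite_query(query):
    """Detect a media-vertical query like 'large black and white photos
    of Vishnu' and return (vertical, filters, subject), or None."""
    q = query.lower().strip()
    m = re.search(TRIGGERS, q)
    if not m:
        return None
    vertical = "video" if m.group(1).startswith("video") else "image"
    prefix, subject = q[:m.start()], q[m.end():].strip()
    filters = {}
    for phrase, (key, value) in QUALIFIERS.items():
        if phrase in prefix:
            filters[key] = value
    return vertical, filters, subject

print(rewrite_query("large black and white photos of Vishnu"))
# -> ('image', {'size': 'large', 'color': 'bw'}, 'vishnu')
```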
So, that's the basics of multimedia search, and all of
our search properties, whether it's image, video, music
work along the same basic principles.
You've heard this from the other speakers that have come
in this year, basically multimedia search in process
is not too different than other forms of search -- there's a
standard crawl, index, serve paradigm.
And so we have automated spiders, which go
around grab and harvest content off the web.
In our case, we actually don't run a separate spider for
multimedia content, we leverage what we call the YST, Yahoo!
Search Technology crawler.
So the same crawler that goes around and builds up our 20
billion object web index also harvests the media.
And so in addition to that we have feeds, so we have
relationships with providers and in the case of media, these
are providers like everyone from iTunes to Viacom that
basically spoon-feed us content, and I'll talk about
why that's important later.
But it's not just crawling, we also have access to kind of
behind-the-firewall scenarios, like iTunes, which are not
accessible via crawl, but fed to us directly from Apple.
And then in step two there is the generation of metadata
and dropping it into an index -- again, this is fairly
standard search technology.
We also do things like summarize with a thumbnail, so
as opposed to an abstract, we actually create miniature
renderings of the content, and in the case of an image, we scale
down and create a thumbnail image; in the case of a video,
we have to determine which frame of the video actually
best characterizes the content and then pull that frame
out, render that as a thumbnail and save it.
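The "best frame" step is the interesting part. One common heuristic (an assumption here, not necessarily what Yahoo! used) is to pick the frame whose histogram is closest to the average of the whole clip:

```python
def histogram(frame, bins=4):
    """Coarse grayscale histogram of a frame given as a flat list of
    0-255 pixel intensities."""
    h = [0] * bins
    for px in frame:
        h[min(px * bins // 256, bins - 1)] += 1
    total = float(len(frame))
    return [c / total for c in h]

def representative_frame(frames):
    """Pick the frame whose histogram is closest (L1 distance) to the
    mean histogram of the whole clip -- one heuristic for 'best frame'."""
    hists = [histogram(f) for f in frames]
    n = len(hists)
    mean = [sum(h[i] for h in hists) / n for i in range(len(hists[0]))]
    dist = lambda h: sum(abs(a - b) for a, b in zip(h, mean))
    return min(range(n), key=lambda i: dist(hists[i]))

# A black frame and two mid-tone frames: a mid-tone frame sits
# closest to the clip's average.
frames = [[0] * 100, [100] * 100, [120] * 100]
print(representative_frame(frames))  # -> 1
```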
And then the final step, of course, is a real-time query
processor, which is scalable and clustered so that as the
query stream comes in, we can serve back the results.
Now, there's a bunch of challenges that we have
in media search which are different than
traditional web search.
One of them is around relevance and ranking.
So in the case of traditional web search, you've got things
like link flux measuring the number of links pointing at a
site to give it a page rank or a means of determining how
relevant that is, and that's very important for the
navigational query. So if I type in IBM, I don't
want a site that has a lot of references to IBM, I want to go
to the IBM.com site and I can find that canonical site by
looking at this link flux.
But in the case of images, that is less meaningful, and often,
since you're not navigating to a site, you're not looking for
a canonical site on tigers, you're not looking for
Tiger.com necessarily, you're looking for a great
picture of a tiger.
And so one thing we can do, which you can't really do in
web search, is use clickstream analysis to determine that.
So, what we do is we present you with that grid, that page
of search results, and then we begin to look at what users
are clicking on and look for anomalies.
So, if the fourth position is a tiger that gets clicked more
than the fourth position does on average, we will
promote that up and say you know that must be a good
picture of a tiger because people keep clicking on it.
And one reason that's not possible in traditional web
search is when people spam and things like that, you don't
know what you're going to get until you take that secondary
action and actually travel off-site to the site and then
you find out it's garbage and then you go back.
In images the case is a little bit different because you're
actually able to digest and consume the results directly on
the search results page, so you don't have this kind of effect
of moving off the site before you know whether you like what
you've got -- you can at a glance understand that that's a
good image and we pay attention to that and actually treat
clicks as votes, which will promote the good
stuff to the top.
You do have the problem, of course, that today I imagine if
you typed in a query like tiger -- I mean I don't remember how
many Vishnu had, but you're talking about potentially
23,000 images, very few people will get off the first couple
pages of search results.
So you have the problem of the rich getting richer --
whichever ones we decide to place on those first couple
pages get to participate in this voting process, but all
those images that are kind of trapped in the bottom 20,000
never stand a chance.
So the problem of relevance is hard and different
for multimedia search.
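The clicks-as-votes idea, with a correction for position bias, might be sketched like this. The expected per-position click rates are made-up numbers standing in for aggregate log statistics:

```python
# Expected click-through rate per result position (made-up numbers --
# in practice these would come from aggregate query logs).
EXPECTED_CTR = [0.30, 0.18, 0.12, 0.08, 0.05]

def rerank_by_clicks(results, clicks, impressions):
    """Treat clicks as votes, normalized by position bias: an image in
    slot 4 that out-clicks the slot-4 average gets promoted."""
    scored = []
    for pos, doc in enumerate(results):
        ctr = clicks[pos] / float(impressions)
        lift = ctr / EXPECTED_CTR[pos]   # >1 means better than its slot
        scored.append((lift, doc))
    return [doc for lift, doc in sorted(scored, key=lambda t: -t[0])]

results = ["tiger_a", "tiger_b", "tiger_c", "tiger_d", "tiger_e"]
clicks = [300, 140, 100, 160, 30]        # per 1000 impressions
print(rerank_by_clicks(results, clicks, 1000))
# -> ['tiger_d', 'tiger_a', 'tiger_c', 'tiger_b', 'tiger_e']
```

This doesn't address the rich-get-richer problem he raises; a real system would also need some exploration, e.g. occasionally surfacing images from deeper in the result set so they can collect votes.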
Another problem is that crawling multimedia data can
be difficult, and this is for a couple of reasons.
One is the URLs are often dynamically generated and
people will intentionally make it hard to get to them, and
this is kind of, again, harking back to the Web 1.0 mindset,
but people didn't want to host images or video or audio for
the benefit of others, so they make it difficult for people to
actually discover the links therein.
So a simple crawl can't necessarily discover all
of the multimedia objects.
Media is often embedded in Flash, for that matter, so it can
be hard to actually get to the media itself.
Another challenge, and this is true for web search, but
it's more true for image and video search, the storage
implications are serious barriers to entry and operating
at scale you're talking about billions of objects in terms of
the storage of the thumbnails in the index itself, so that's
a consideration as well.
And then probably the biggest challenge is that multimedia
objects are opaque, they're binary objects that
aren't self-describing. The nice thing about HTML is
the same HTML that's used to render the page also tells you
the content that's on the page -- it's very easy to strip out
the markup in the tags, and what you're left with is
a full text rendering of the content on that page.
The same is not true for an image or a video or
a song out on the web.
When we bump into that object there's often very little
data to tell us what it is.
Occasionally you can depend on some header information that
kind of gives you the bare minimum, but there's not this
kind of self-describing nature that web pages have, which
makes it very difficult to extract the metadata, and I'll
talk about that a little bit.
And then as I mentioned, relevance and ranking is
difficult to determine.
So, our approach to metadata extraction is kind of
by all means possible.
We are shameless and we'll grab the easy stuff, so it's
relatively easy to look at an image on the web or a video on
the web and determine the width and the height and the duration
and the aspect ratio, and sometimes you can do things
like get out the color space from the codec.
So anything we can get through this kind of syntactical
analysis of the object we do, and we save all that and it's
very important metadata so that people can do queries like the
ones I showed you, show me large images of Vishnu.
But then you have to get smarter and begin to use
heuristics to get at the actual semantic content.
So a great one you can use is the name of the
file, so if the file's named Vishnu.gif, that's a big
hint as to what the file is.
You can also use alt tags, so this is a mechanism for people
to specify the content, and while the page is loading
the alt text will often appear there.
There's other things that I haven't put on here, things
like is the image a link and if it's a link what is it linked
to, or is that image linked to by other things and can we
grab the text off the links that point to that image.
You can also look at a visual rendering, so oftentimes images
are laid out in galleries or tables and we reassemble those
tables and look at the captions -- we look at the text that is
literally physically close to the image and rendered next to
the image and we scrape that text as additional context
to tell us what it is.
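Those context heuristics -- filename tokens, alt text, and nearby caption text -- can be sketched with a small HTML parser. This is a simplified stand-in, since a real system would also reassemble table layout and link text:

```python
from html.parser import HTMLParser
import posixpath
import re

class ImageContextParser(HTMLParser):
    """Collect, per <img>, the metadata heuristics from the talk:
    filename tokens, alt text, and the text that follows the image
    (a crude stand-in for a rendered caption)."""
    def __init__(self):
        super().__init__()
        self.images = []
        self._pending = None  # image awaiting its trailing text

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        a = dict(attrs)
        name = posixpath.basename(a.get("src", ""))
        tokens = [t for t in re.split(r"[^a-z0-9]+", name.lower()) if t]
        self._pending = {"src": a.get("src", ""),
                         "filename_tokens": tokens,
                         "alt": a.get("alt", ""),
                         "caption": ""}
        self.images.append(self._pending)

    def handle_data(self, data):
        if self._pending is not None and data.strip():
            self._pending["caption"] = data.strip()
            self._pending = None

page = ('<table><tr><td><img src="/img/vishnu.gif" alt="Vishnu statue">'
        '</td></tr><tr><td>Vishnu at the temple</td></tr></table>')
p = ImageContextParser()
p.feed(page)
print(p.images[0])
```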
And then as I mentioned the formats and the headers
are also valuable means of pulling out information.
So in MP3 files you have things like ID3 tags, in JPEG files
you have EXIF information, in QuickTime files you've
got rich data in there.
So, anything we can get out of the headers we do, but
unfortunately a lot of this is left unpopulated or stripped as
it moves down the workflow and pipeline, so we often find that
these things are empty or of nominal value.
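The simplest of those header formats is ID3v1: a fixed 128-byte block at the end of an MP3 file. A sketch of reading it (ID3v2 and EXIF are considerably more involved):

```python
import struct

def read_id3v1(data):
    """Parse an ID3v1 tag: the last 128 bytes of an MP3, starting with
    b'TAG', then fixed-width title/artist/album fields (year, comment,
    and genre follow but are skipped here). Returns None if the tag is
    absent -- which, as noted above, is common in practice."""
    if len(data) < 128:
        return None
    tag = data[-128:]
    if tag[:3] != b"TAG":
        return None
    title, artist, album = struct.unpack("3x30s30s30s35x", tag)
    clean = lambda b: b.split(b"\x00")[0].decode("latin-1").strip()
    return {"title": clean(title), "artist": clean(artist),
            "album": clean(album)}

# Build a fake MP3 tail: some audio bytes plus a padded ID3v1 tag.
tag = (b"TAG" + b"Beautiful Day".ljust(30, b"\x00")
       + b"U2".ljust(30, b"\x00")
       + b"All That You Can't Leave Behind"[:30].ljust(30, b"\x00")
       + b"\x00" * 35)
print(read_id3v1(b"\xffaudio..." + tag))
```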
And so that gets to kind of deeper analysis, so again, my
previous company looked at what could we do with video data to
automatically derive metadata from the signal, and there's a
bunch of things you can try.
So in the case of images, which I just showed you, it's
relatively simple to determine whether that's a black and
white or a color image -- that's not rocket science.
You can also determine is the image photographic or is it
clip art -- these things are relatively easy.
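The black-and-white test really is simple. A sketch of one way to do it, checking whether the color channels are near-equal for almost every pixel (the thresholds are arbitrary choices):

```python
def is_black_and_white(pixels, tolerance=8, allowed_fraction=0.02):
    """Classify a list of (r, g, b) pixels as grayscale if almost every
    pixel has near-equal channels."""
    off = sum(1 for r, g, b in pixels
              if max(r, g, b) - min(r, g, b) > tolerance)
    return off <= allowed_fraction * len(pixels)

gray = [(120, 122, 119)] * 100   # near-equal channels: grayscale
color = [(200, 40, 40)] * 100    # strongly red: color
print(is_black_and_white(gray), is_black_and_white(color))  # -> True False
```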
Then you get into more difficult and tentative kind
of levels of algorithmic recognition, things like
acoustic fingerprinting.
So what acoustic fingerprinting is: if I have an MP3
file of U2's 'Beautiful Day' and I've got a WAV file, is
there a way to determine that those things are actually
dupes, that they're the same content rendered in different
file formats or different bit rates? And acoustic
fingerprinting actually looks at the acoustic signature of the
content and compares that to determine that these two are
actually the same thing.
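A toy version of that idea, much cruder than real fingerprinting systems: reduce the signal to the up/down pattern of window energies, which tends to survive re-encoding even when the exact sample values don't:

```python
import math

def fingerprint(samples, window=64):
    """Toy acoustic fingerprint: chop the signal into windows, compute
    each window's energy, and keep only whether energy rises or falls
    from one window to the next."""
    energies = []
    for i in range(0, len(samples) - window + 1, window):
        w = samples[i:i + window]
        energies.append(sum(s * s for s in w))
    return [a < b for a, b in zip(energies, energies[1:])]

def similarity(fp_a, fp_b):
    """Fraction of matching up/down bits between two fingerprints."""
    return sum(a == b for a, b in zip(fp_a, fp_b)) / float(len(fp_a))

# One "song", the same song crudely re-quantized (a stand-in for a
# different bit rate), and an unrelated song.
song = [math.sin(0.05 * t) * (1 + 0.5 * math.sin(0.003 * t))
        for t in range(4096)]
requant = [round(s * 16) / 16.0 for s in song]
other = [math.sin(0.11 * t) * (1 + 0.5 * math.cos(0.002 * t))
         for t in range(4096)]

fp = fingerprint(song)
print(similarity(fp, fingerprint(requant)))  # near 1.0
print(similarity(fp, fingerprint(other)))    # much lower
```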
Speech recognition is the process of automatically
deriving text from speech -- that's something
I know a lot about. My previous company worked
in that space for seven years and it's fuzzy.
Right now the state-of-the-art is such that if you train using
a product like IBM's ViaVoice or Dragon NaturallySpeaking, if
you train your laptop to your voice and a mic like this
and have a controlled vocabulary and controlled
circumstance, it'll almost not drive you crazy.
But in the case where you're kind of plugging in CNN or, God
forbid, MTV and routing that through a speech recognition
engine, you're going to get poetry or worse
out the other end.
And it's interesting because there are a lot of cases where
you do have, you know, Peter Jennings, you've got somebody
who's paid to enunciate clearly and is well-miked and the
content is kind of known.
So there are cases where speech recognition can be applied and
it's of value, but there's a lot more cases where it will break
down and provide dubious value.
Speaker identification is actually a little bit simpler,
and that's the process of recognizing someone's vocal
signature, determining that it's Bradley or Marti speaking based
on profiles I have of what the timbre of our voices looks like.
And then you've got things like OCR, so stripping out any text
that may appear in a video or an image.
A lot of images on the web are graphics or logos, and we can
read, literally read, the text out of those images.
Then there's things like face recognition, which is actually
pretty robust -- face detection and recognition are pretty
robust if you can limit the scope and the domain.
So the problem of taking CNN and saying which of 1,000
people are in that real-time video stream is tough, but
limiting it to which of five people are in this interview,
you know, on Oprah -- if I train it for Oprah and her four
guests and I'm just asking the system to determine which of
those five people are on screen at any instant, you can
actually get pretty good results using face recognition.
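That closed-set trick can be sketched as nearest-profile matching. The feature vectors here are stand-ins for whatever face or voice features a real system would extract, and the threshold is arbitrary:

```python
import math

def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) *
                  math.sqrt(sum(b * b for b in v)))

def identify(features, profiles, threshold=0.8):
    """Nearest-profile matching over a closed set -- the 'Oprah and her
    four guests' case. `profiles` maps name -> enrolled feature vector;
    a frame whose best match is weak is labeled unknown."""
    best = max(profiles, key=lambda name: cosine(features, profiles[name]))
    return best if cosine(features, profiles[best]) >= threshold else None

profiles = {"oprah": [0.9, 0.1, 0.2], "guest1": [0.1, 0.9, 0.3]}
print(identify([0.85, 0.15, 0.25], profiles))  # -> oprah
print(identify([0.0, 0.1, -0.9], profiles))    # -> None (unknown face)
```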
Object recognition, the general case of object recognition
is really kind of the holy grail problem.
Any of you who've studied computer vision know that it's
more Star Trek and science fiction than it is
reality right now.
You can find things that look like other things, so a white
golf ball on a putting green will look kind of like your
white poodle curled up on your green couch from like,
you know, ten feet away.
And if you kind of squint and take a step back, you'll see
that these object recognition algorithms do a good job of
finding things that look the same kind of from that ten
feet away level, but they're not good at capturing the
semantics of the image.
It's very, very difficult.
It really gives you an appreciation for human vision
and a lot of the computational work in computer vision has
been modeled after biological vision systems and people doing
things like sticking electrodes in frogs and cats and trying to
figure out how it is we do what we do.
You can also do things like determine is it an indoor
or outdoor scene -- that's relatively simple and I've
seen some good results there.
Also things like music or speech classifiers, so is this
spoken word or is it music -- that's another relatively easy
one to determine by looking at the acoustic model.
And this is really dot, dot, dot -- there's a million things
you can do in an automated fashion, there's some clever
ways to kind of use context to help bound these problems so
you're not doing the general-purpose object recognition
problem each time.
And there are many, many PhDs to be issued in most of these
categories, so it's kind of an open-ended research
problem as to improving the state-of-the-art there.
There's some problems with these and I've mentioned
some of these. The first is that they're
typically very expensive, and by that I mean
computationally expensive.
So, reasonable speech recognition runs
in real-time now.
There are some techniques that run in kind of 2x real-time,
but the good ones run in real-time, and what that means
is I can't process audio any faster than I can play it.
To operate at a web scale, I would need huge farms of
machines constantly being fed content in order to run that
process on them, and we're very concerned with scalability at a
company like Yahoo!, and it's very difficult to imagine those
systems working at a web scale.
As I mentioned also, the results are noisy and prone to
error, so they're not robust enough to give us reliable
metadata all the time, although some of the speech techniques
are able to know when they're doing well and when they're
doing poorly, which is helpful, so you can kind of filter out
the noise and get left with some signal that is of value.
As I mentioned these can be mitigated by using context, so
if I train the system to recognize the five faces on
Oprah that day, it can do a reasonable job, or if I train
it to Peter Jennings' voice or the World News Tonight vocabulary,
as I've gathered it over the last year, they can do a
lot better than in the general case.
But one of the big problems is that they don't really extract
the most valuable level of information, and that problem
of speech recognition in automated metadata extraction
is so difficult that it can be very distracting.
You can dive into that problem and spend years or careers kind
of chasing after that without taking a step back to think
about the actual value.
And there's a lot of content, you know, for which a
transcript, an exact textual transcript, would not
be that valuable.
There's this layer of information that I think is
more valuable and I'm going to talk about that a little bit.
So, I'm going to digress right now and talk about -- this is a
picture of a robot, and a lot of the vision work I
was doing was really around autonomous robot navigation.
But I discovered something called the ESP game, and it
changed the way I thought about metadata and it's actually
just a clever hack that was put together by some very clever
CMU folks, and I encountered it, as you see it was on CNN,
and it's been around quite a while.
And the premise of the ESP game is that you log in and you're
paired up with an anonymous player somewhere else in the
world and you don't get to talk to or see or know anything
about this player except that while you're online they're
online and you're both shown the same screen, and you're
shown this screen, and you're staring at the picture there on
the left, and the game is to type in a word that matches
with your partner to describe that image.
And the challenge is that there's also these taboo words,
so once you've said pageant and beauty and dress and woman and
white, there's very little left that my partner and I have to
go on to kind of match against.
And so, if you get too frustrated you can pass
and go onto the next one.
And, you know, the taboo words for this one was 'by,' my next
guess was going to be 'your.'
I think I paired up with my partner on 'your,' maybe
we paired up on 'Jamaica.'
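The matching rule at the heart of the game might look like this. Requiring two strangers to agree, minus the taboo words, is the quality control he describes:

```python
def esp_match(guesses_a, guesses_b, taboo):
    """Play out two players' guess streams turn by turn and return the
    first non-taboo word both have typed, or None (a 'pass')."""
    taboo = {w.lower() for w in taboo}
    said_a, said_b = set(), set()
    for a, b in zip(guesses_a, guesses_b):
        a, b = a.lower(), b.lower()
        said_a.add(a)
        said_b.add(b)
        if a in said_b and a not in taboo:
            return a
        if b in said_a and b not in taboo:
            return b
    return None

taboo = ["pageant", "beauty", "dress", "woman", "white"]
print(esp_match(["crown", "sash", "jamaica"],
                ["miss", "jamaica", "crown"], taboo))  # -> jamaica
print(esp_match(["white"], ["white"], taboo))          # -> None (taboo)
```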
So, this game continues and it's a lot of fun, it's
actually quite addictive, and see I've got 190 points, and I
scored 250 this round and I've got a cumulative score of
24,000 and I've got only 900 to go for my next rank, so I dove
in and played another game.
And it's actually quite addictive and a lot of fun.
This game I wasn't doing so well because I was actually
taking these screen shots, which probably frustrated my
partner because I was very slow picture-to-picture.
But what's cool about it is once you kind of get over the
fascination with the game itself and the fun of the game,
the game is kind of a ruse in a way, and if you read the fine
print there, there's over 12 million labels collected
since October 5, 2003.
And so if you take a step back, the game is really a trick to
get human beings to attach high quality metadata to images that
are discovered on the web.
And if you think about those, the metadata itself, it's kind
of double-checked in the sense that if you and an anonymous
partner don't agree on the label, it doesn't get assigned
to the image, so there's kind of built-in quality
control there. And think about what
it would have cost Yahoo! or another company to
outsource the labor to annotate 12 million images.
If we outsourced that to China or India, I'm not sure the
quality control would have been what it was just through this
game. A light bulb went off for me that said this is a really
neat phenomenon, and if you can leverage people at scale and
provide the right incentives for them to do some of the
heavy lifting that I was trying to do algorithmically in my
career, that's incredibly powerful and incredibly
valuable, and I kind of filed that away and didn't
do anything with it. When I first encountered this
game I was still at Virage, so I wasn't really
thinking at Yahoo! scale at that point.
But I did have an opportunity to reconsider this concept
with a company that we discovered at Yahoo!: Flickr.
And what's interesting about Flickr is it's an online photo
sharing site, and it was something that was brought
to my attention by an engineer of ours in Bangalore,
and I brought it to my management's attention and said
I think we ought to consider buying this company.
And they reminded me that Yahoo!
already had the biggest photo site in the world and we
already had sharing features, and Flickr by any metric you
care about was rounding error for a company like Yahoo! --
it was just small.
So I had to get very practiced and skilled at trying to
describe to my management and other decision-makers at Yahoo!
why I thought Flickr was a value beyond
just the scale issue.
And so I came up with four things that I thought were
exciting about Flickr.
Rolling back to the images you saw as I was introducing
myself, those were all Flickr images, they were all
contributed by users, there were no professional
photographers, or if they were, they weren't paid for those
photos and chose to post them for free into the
Flickr service. And moreover, the fact that I
was showing you good photos, and there's a lot of photos in
Flickr, was also automatically determined, and I'll talk about
how the good stuff rises to the top in Flickr as well.
So, the first point I made is that Flickr is filled with
high-quality, timely, topical photographic content, and
all of it is user-generated.
And so we didn't have to go out and license images from Getty
or Corbis or any of these purveyors of digital
photography, the users filled up the platform.
And that's not necessarily new for Yahoo!.
We've had sites like GeoCities, we've had Yahoo!
groups, which are all user-generated as well.
So that's a practice that's well-known to us.
The second bit was the epiphany that it was all richly
annotated and indexed, searchable, browsable and
navigable and the metadata was also entirely user-generated,
so again, we didn't have to outsource or pay anybody to do
that, the users did it for their own benefit or for
each other's benefit.
Then they had tens of thousands of distribution partners,
each deal brokered by a Flickr user.
By that I mean their distribution strategy in
contrast to that image hosting that I showed you on Tripod,
their distribution strategy was to encourage the use of Flickr
on non-Flickr.com domains.
So they actually integrated with popular blogging software
so that many bloggers use Flickr as a back-end image
hosting service and they can very easily surface that Flickr
content on their blogs, and every time they do that it
links back to Flickr and drives awareness and discovery of
the Flickr service itself.
So the terms of service are such you can't deep-link and
steal the Flickr image with Flickr hosting it, but you're
welcome to kind of integrate it with your blog and drive
people back to Flickr, and they encourage that and it
was very successful.
And then the last bit, hundreds of applications written against
the Flickr platform by thousands of Flickr developers.
So, by opening up their system as a platform, as opposed to
just a website, they encouraged developers to build novel apps
against it, and they actually encouraged it by providing
bindings for Java and C++ and PHP and Perl and Python,
and the community kind of grew from that, and literally
thousands of developers had built value against this, and
I'll show you some of those apps.
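A flavor of what programming against that platform looks like: building (but not sending) a photos.search request. The method and parameter names follow Flickr's published REST API; the api_key value is a placeholder:

```python
from urllib.parse import urlencode

def flickr_search_url(tags, per_page=10,
                      endpoint="https://api.flickr.com/services/rest/"):
    """Construct a Flickr REST call for flickr.photos.search; the
    caller would then fetch the URL and parse the JSON response."""
    params = {"method": "flickr.photos.search",
              "api_key": "YOUR_API_KEY",   # placeholder
              "tags": ",".join(tags),
              "per_page": per_page,
              "format": "json",
              "nojsoncallback": 1}
    return endpoint + "?" + urlencode(params)

print(flickr_search_url(["vishnu", "temple"]))
```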
A lot of them are just kind of fun and frivolous, they're
just people playing around or doing something wacky.
Some of them are actually very important apps that help the
Flickr community get things done.
And what's neat about this is this entire ecosystem and
phenomenon was really less than ten people when
we encountered them.
So you've got millions of people pouring content in,
millions of people tagging and annotating it for the benefit
of others, thousands, tens of thousands of people
distributing that across the internet, and then thousands of
people working for you building additional value against it.
This is kind of a beautiful thing that seven or eight
people could generate a phenomenon in an ecosystem like
this, and that's a lot of the rationale around why we bought
Flickr -- it wasn't necessarily the Flickr.com site was driving
numbers that would take Yahoo! through the roof.
It was more about getting that kind of know-how and that kind
of expertise and that kind of value into the company where
we could take that premise and replicate it across different
parts of our business.
So more detail on Flickr.
This is kind of the login, the splash page, what you get when you log into Flickr, and you can see things broken down really into three categories, and this resonates through a lot of our social search work: I've got my photos, I've got my friends' photos, and I've got the world's photos, so you can kind of carve things into three buckets.
In Flickr I can actually get finer grained than that and I can look at my family's photos or my friends' photos.
So these contacts can actually be broken down into different levels, including acquaintances, family, and friends, but generally there are these three buckets.
This is the one that I go to most, and in a lot of ways it's
kind of a visual inbox -- at a glance I can understand all of
-- I can kind of read what my friends were up to over the
weekend or since I last looked, and I can see that Nathan was
with his cat, and I was with my puppy, I'm in the middle there,
and Lance was with his baby, and it's very easy for
me to digest this. At a glance I can kind of get
an understanding of what's interesting to me and what
I might want to dive into.
And that's really in contrast to a blog -- imagine how long
it would take me to read blog entries from each of
these dozens of friends.
There's a huge investment for me to kind of consume that
information in a textual blog, whereas visually I can
consume it very quickly.
And the same is true for the authoring aspect.
One of the things that's exciting about Flickr and I
think drives adoption is the very low barriers to entry.
So, I've always wanted to write a blog and I'm going to some
day -- Dana's gonna make sure of that, but I still don't have
a blog up and it's really a question of time -- I have not
yet found the time or made the commitment that that's
something I want to do, whereas Flickr makes it very easy for
me, and I'll do this right now, it's relatively painless for me
to take my camera phone and take a picture of you all
and upload it to Flickr.
So I do this in most of the talks I do -- why don't you
wave or smile or something.
And so with a couple clicks, I can post this up to
Flickr, even during another activity like a talk.
So the low barriers to entry both for consuming the
content, as well as producing the content, make it
perfectly easy to use.
I can then drill down into a specific friend, so if I want
to see more of Nathan's cat, I can click on what we call his
photostream and kind of dive into that.
All this stuff is RSS enabled, so again, toward that model of
syndication, almost every page, every user, every tag, it has
an RSS feed associated with it.
The Yahoo! tool bar auto discovers that --
it sniffs the page, discovers the feed, and with a click
I can actually drop that onto my Yahoo! page.
RSS is Really Simple Syndication, so it's a very
simple means for you to take content that might exist on
your website and inform others about it, and it's a great way
to drive awareness and traffic.
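That auto-discovery step is simple in principle: a page advertises its feeds with link tags in its head, and the toolbar just sniffs for them. Here's a minimal sketch using only Python's standard library (the feed URL below is Flickr's public-photos feed):

```python
from html.parser import HTMLParser

class FeedSniffer(HTMLParser):
    """Collect RSS/Atom feed URLs advertised via <link rel="alternate">."""
    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if (tag == "link" and a.get("rel") == "alternate"
                and a.get("type") in ("application/rss+xml",
                                      "application/atom+xml")):
            self.feeds.append(a.get("href"))

page = '''<html><head>
<link rel="alternate" type="application/rss+xml"
      href="https://www.flickr.com/services/feeds/photos_public.gne">
</head><body></body></html>'''

sniffer = FeedSniffer()
sniffer.feed(page)
# sniffer.feeds now holds the advertised feed URL(s)
```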
My Yahoo! is a big RSS aggregator, so in
addition to kind of having modules like the weather module
which you can personalize, our stock quotes which you can
personalize, you can also take content as it exists anywhere
on the web, and through an RSS integration you can put
it on your My Yahoo! page as well.
And what this means is that as Nathan pours new content into his photostream, it will appear here as links on the My Yahoo! page.
One thing we've done recently is change this to a visual
rendering, so here I've got these textual links, but it's
actually much better when you have the eye candy and can
actually see the photos. So now on My Yahoo!
you can actually put what we call a media RSS stream,
and that media RSS stream will render visually up
there, which is great.
So this phenomenon of tagging -- again, one of the things I think Flickr got right was to lower the barriers to entry, making it very, very simple for people to add metadata.
So, instead of annotating this through a structured taxonomy
where I'd have to type or navigate through animal,
mammal, feline, cat, you know, my cat, tabby, you can just go
into a free text field and basically free-associate.
So I can click on add tag and a text box pops up and whatever
comes to mind I can type in there, so it can be cat,
feline, kitty, tabby, whatever I want to type, and we
don't worry about the structure whatsoever.
And in fact, we create tools downstream on the back-end that
help organize and categorize that stuff, but keeping that
barrier to entry very low is part of the magic.
And so what I can do is navigate through these tags,
and here I typed in cats and I see recent cats as they're entered into the system, and so this is actually reverse chronological.
For a long time, for the first year, that's all Flickr had, kind of reverse chronological sorting so that I could see the most recent images tagged with cats, and that was actually very reasonable and still is very valuable.
What we've introduced that's new is clustering -- well, actually, before I talk about clustering, interestingness is the word we actually coined to describe what this is.
And I mentioned before that determining relevance in the
media space was difficult, and the same is true within Flickr,
so we had to figure out a way of determining what are
the most interesting photos in Flickr.
Now, one way we could have done that and given that it's a
community based site, we could have asked people -- we could
have had a voting mechanism, so rank this picture or rate
this picture or vote on it.
We chose not to do that and instead what we did is we
looked at the implicit behaviors around these photos,
so how many times has this photo been viewed, how many
times has it been made a favorite, how many times has it
been shared or syndicated, and all of those things kind of
factor into this equation.
We also look at the structure of the social relationships.
So, if your mother made your picture a favorite, it carries a different weight than if a complete stranger made your photo a favorite, and we kind of put it into a melting pot and we determine an interestingness score for each picture. And so now I'm looking at the
most interesting pictures of cats in the system, and so now it's ranked -- you've got that close-up -- and we also look at
the kind of discussion around the pictures, too, so there's
a capability for people to comment and talk about
pictures, so that can also influence the interestingness.
And again, there's no right answer, but if you look at this
interestingness metric and you actually click on that explore
link up there, we've got a calendar view where for every
day of the year you can look at the five most interesting
photos on Flickr and it's an amazing way to browse the
system, it's just quite staggering the quality of
photos that get in and the level of activity around
them, and so I recommend everybody does that.
But in some ways, because we don't have that link structure
to analyze, we chose to use other heuristics to determine
relevance and the ones we chose were implicit, so people can't
game the system by kind of getting people to vote for them.
It's only the natural activities -- and we don't
divulge the secret formula either, so that it's another
obstacle to people kind of gaming the system.
But the natural implicit activities give us clues
as to what's good.
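Since the real formula isn't disclosed, the signals and weights below are invented, but a toy sketch of the idea looks like a weighted sum of implicit activity, with favorites from closer social ties counting for more:

```python
# Hypothetical weights -- the real Flickr formula is not disclosed.
TIE_WEIGHT = {"family": 3.0, "friend": 2.0, "stranger": 1.0}

def interestingness(views, favorites, shares, comments):
    """Score a photo from the implicit activity around it.

    `favorites` is a list of relationship labels, so a favorite
    from family or a friend carries more weight than one from
    a stranger.
    """
    fav_score = sum(TIE_WEIGHT.get(rel, 1.0) for rel in favorites)
    return 0.1 * views + 5.0 * fav_score + 2.0 * shares + 1.0 * comments

score = interestingness(views=200, favorites=["family", "stranger"],
                        shares=3, comments=4)
```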
The other thing we did is clustering and there was a lot
of speculation that when Yahoo!
acquired Flickr, this concept of folksonomy would break down,
this kind of free tagging that we were doing, and it would
break down because of language barriers, it would break down
because of spam and people generally trying to
abuse our systems.
And shortly after the acquisition, the Flickr team
launched this clustering capability, which not only
proved that folksonomies wouldn't break down, but also
that there were ways we had on the back-end of applying
intelligence to help tame them.
So what this does is it looks at the co-occurrence of tags
to determine the clusters that they live in.
So a typical example is the query disambiguation problem.
So you type in a word like turkey -- there's turkey the
food, there's turkey the bird, there's turkey the country.
But it turns out that Turkey, Istanbul rarely occurs with the
food -- I'm sure they eat turkey in Turkey, but it's rare
-- not rare as in well-done rare, but there are cases where
this breaks down, but in aggregate we can cluster along
the co-occurrence of these tags to find a Turkey the country
cluster, a turkey the food cluster, and a turkey
the bird cluster. There are some surprises in there.
So it turns out that one of the things that people do when they
get a macro lens is they run and they begin to take
pictures of their pets.
And so you will find in the cats cluster that there's a
macro cluster within that that is really about the
lens and the photography.
There's kind of a cute cluster in there and there's a gato
cluster there, so we're kind of dealing with the multilingual
tagging that we talked about before, and then that's the
kind of canonical cat as pet cluster, and it's also very
addictive to go in there and look at some of the
queries that you can do. You can go to love and look at
how love gets clustered and there's kind of artifacts of
love, like hearts and things like that, there's kind of
familial love, and kind of people and settings with their
family and things like that, there's romantic love that's
clustered separately, and then there's all these
emerging clusters. So, London didn't have a
bombing cluster until one unfortunate day when we saw
this huge influx of photos and people using Flickr to upload
that content in real-time, and then suddenly there was a
bombing cluster associated with London, and these things
are computed on the fly dynamically.
So, these things in real-time will show up.
So, clustering has been a really valuable tool to help
kind of tame this folksonomy without raising the
barrier to entry. There is still a problem.
If people type in turkey alone in isolation with no other
context, we don't know today which of those clusters it
belongs to because the system depends on looking at the
co-occurrence of terms to determine the clustering.
So there are some other techniques that we could use.
But today this depends on rich tagging and the presence of multiple tags per image to do its job.
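A minimal sketch of that co-occurrence idea -- the production system is surely more elaborate, and the photos here are made up -- is just counting which tags appear together on the same photo, so that a tag like turkey splits along its strongest companions:

```python
from collections import Counter
from itertools import combinations

photos = [  # each photo's tag set (toy data)
    {"turkey", "istanbul", "travel"},
    {"turkey", "istanbul", "mosque"},
    {"turkey", "thanksgiving", "food"},
    {"turkey", "food", "dinner"},
]

# Count every pair of tags that appears on the same photo.
cooccur = Counter()
for tags in photos:
    for a, b in combinations(sorted(tags), 2):
        cooccur[(a, b)] += 1

def companions(tag):
    """Tags seen alongside `tag`, with co-occurrence counts."""
    pairs = Counter()
    for (a, b), n in cooccur.items():
        if a == tag:
            pairs[b] += n
        elif b == tag:
            pairs[a] += n
    return pairs

# "istanbul" and "food" each co-occur twice with "turkey",
# seeding a country cluster and a food cluster.
```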
So, speaking to that distribution strategy, here is something you'll see called the Flickr Flash widget -- you can drop a badge on your site, so here's a blogger, and that badge is animated and it's a live connection to the photos that are hosted on Flickr for that particular rendering of the badge.
So, I can see this guy's photos and they kind of move around in
tiles and it's great eye candy, and again, whenever you
click on that it drives you back to Flickr.
Here's what I was talking about before, just some sites that
are hosting Flickr, and if I click on that it's
back to Flickr. These are fun -- these are third party apps that have been written against the Flickr platform.
Some of these are pretty frivolous, like the one up here
is a color picker, and I can go in that color wheel and change
the brightness and luminosity and these Flickr pictures kind
of bounce around it and kind of here's some Flickr pictures
that match the color that I've selected.
Maybe there's a use for this among graphic designers, I'm
not sure, but it's fun and it's another kind of novel app.
There are things like this, which is a social network browser, and it's a way of visualizing the social network
and it's also very interactive, you can drag it and there's
kind of a physical modeling that they do where these things
kind of swim behind your cursor.
The one in the upper left is probably one of the more useful
ones and that's looking for the presence of geographical
information in the tagging stream and then mapping that
onto a map so that I can actually dive in and say show
me Flickr photos from my neighborhood or from this
vacation spot that I want to go to.
And so kind of combining geographical information
with tagging is something that's very powerful.
It also speaks to a more structured rendering, so right
now if I type in London, we don't know if you mean London,
Ontario or London, England.
It would be great to associate metadata, which is more
structured, as much as we believe in the folksonomy
and think it has its place.
There are also times where latitude and longitude would be
very helpful, and it would be great to kind of create means
of entering that data which still have these low
barriers to entry. So if I knew from your profile
you lived in London, England, maybe I won't even bother
asking and I'll just associate the lat/long for London,
England whenever you type it in.
So, I won't belabor this slide, but one of the other objections
that people had around Flickr was that we've got a small
group of enthusiasts right now, but when we take this to the Yahoo!
audience it'll break down, because not everyone is an
enthusiast or is motivated.
We find that there's these cultures of participation
that really have this order of magnitude relationship,
so in Yahoo! groups you might have 1% of
the population which starts a thread, they're the kind of
instigators that will go out there and say hey, what does
the world think about this.
You might have 10% that kind of are inspired to respond or
interact with that thread or reply, and then you've got the rest of us, the lookers, the 100% that benefit from the activity that the other 11% have done, so you
don't need necessarily to have all 100% of the people being
the kind of people that would start threads.
People participate at different levels and derive
different value from what each other does.
It is our goal, however, to kind of move away from this
orders of magnitude pyramid into more concentric circles
for each of these types here.
So, a good example of that is our LAUNCHcast radio music product.
Using that I can basically listen to music and rate it as
it's streamed to me on the fly, so it can send me U2 and I say
I like it, it can send me White Stripes and I can say I like
it, it can send me Britney and I say I don't like it, and the
more I rate and review the content, the better and better
it gets at playing music that I like.
So I'm basically building up a profile about the music I like
in a very natural interaction that makes my radio
station better. What's interesting is that that
natural act of consumption is also an act of publishing, so
as an artifact of what I've done, I can choose to publish
the Bradley station and my friends can choose to listen to
that and hear my musical tastes, but it didn't require
any extra steps for me to do that -- I didn't have to drag
music over -- I just listened, I said what I like from the
selfish motive of hearing more, and that publishing
aspect was an artifact.
So, as a consumer I became a creator with no additional
steps, and that's kind of what we're moving toward is
continuing to lower these barriers to entry so that it
becomes a very implicit and natural act of consumption.
So I'm going to diverge a little bit here and talk a
little bit about our strategy.
People ask us at Yahoo!
are we a media company or a technology company, and
it's clear that we're both.
We have strong roots in both, and I think more and more to be
a media company you have to be a technology company
in this world.
When I was a student at the Media Lab, we heard this premise, the kind
of migration of the world from atoms, physical things in the
universe that require a lot of effort to move around and ship
and keep track of to bits, which are easily distributed
and infinitely copied with no degradation and
that kind of thing. And when you think about the
current media, the state of media in the world, a lot of it
is really artifice and derived from the fact that we entered
this world of atoms that were hard to move around, and so
this economics of scarcity created this almost venture
capital-like system now where we have the financiers, which
are the studios or labels, kind of putting these high stake
bets on a large number of companies or films, and a few
of those take off and become the mega hits, the Titanics or
the Microsofts or the Cisco systems that pay for all of the
others, so it's this kind of high stakes game based on the
scarcity of distribution.
And that's changing as we move into the world
of the Long Tail.
The basic premise, and I'm sure you guys have seen this at some
point, is that the -- this is a typical power law distribution
that we see in many different places in our business.
If you look at query distribution and kind of the
things that people are searching for in search
engines, you see a couple of spikes as you look at the top
navigational queries and adult queries and things like that,
but you see this very, very long tail of everything else.
And the economics around that long tail are very
interesting when you look at them in aggregate.
If you look at the area under that curve, the tail extends
so long that if we can, as an industry, figure out a way to
take advantage of that and provide value to the user
around that and also make a buck around that,
the stakes are high.
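That area-under-the-curve claim is easy to check numerically. Under an illustrative Zipf-style distribution, where the popularity of the k-th most popular item falls off as 1/k (an assumption for the sketch, not Yahoo!'s actual query data), the head holds a surprisingly small share of the total:

```python
N = 1_000_000          # one million items, popularity ~ 1/rank (Zipf-like)
weights = [1 / k for k in range(1, N + 1)]
total = sum(weights)

head = sum(weights[:100]) / total      # share held by the top 100 items
tail = 1 - head                        # everything else

# With these toy numbers the top 100 items account for roughly a
# third of all demand; the long tail carries the remaining two thirds.
```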
So, there's kind of this dichotomy between mass media,
which is this artifact of the last century -- we've done this
because we've had to, because a channel like CBS exists,
because bandwidth is scarce, and they bid for that bandwidth
and built out affiliate networks in many cities, and by
definition in order to have a viable product they have to
have a product that appeals to large audiences, which
has created this high stakes economics model.
Content is expensive to produce, bandwidth is scarce and therefore valuable, and we've kind of had to pander to the least common denominator.
We're entering the world of micromedia where it's actually
simple and cheap to distribute the stuff, it's actually
getting cheaper by the day to produce the stuff, whether
that stuff is photographs like Flickr or even audio or video
content where the barriers to entry to create professional
sounding audio or video are dropping daily as well.
And the economic model, there's not a lot that I can say
definitively about it, except we know that it'll be different.
You won't have to necessarily have a blockbuster hit that
brings in hundreds of millions of people to have an economic
relationship to the content that you're publishing.
And in what ways will that content get monetized?
There's kind of the standard ways -- there'll be pay-per-view, there'll be subscription, there'll be kind of tethered downloads like we have in the Yahoo!
music engine. So there's all kinds of
different economics around this and it's unclear exactly
which models are gonna stick.
At Yahoo! we clearly believe in both parts of that curve.
We are not necessarily about liberating a long tail at the
expense of the head, so it's not necessarily about entirely
user-generated content across the curve.
What we really believe in is personalization.
So we think people will continue to enjoy popular stuff
and we will continue to have relationships with studios and
labels to bring that stuff to our users.
But this new micromedia is where there's a lot of
opportunity and discovery, and we want to create the platforms
and the tools that allow people to publish their own content
and discover stuff that's suitable for them but may
not have a wide audience.
So, in Yahoo!
video search we really do this in three different parts.
One is we have explicit feed relationships, and so we
approach the top providers of streaming video and
downloadable video on the internet, and we basically
ask them if they would like a bunch of traffic, a bunch
of well-qualified users.
And generally they're very skeptical, because we have a
part of our business that we used to call Overture,
and now Yahoo! search marketing, which
basically sells qualified traffic -- that's
part of how Yahoo! makes its money is to basically
help connect people who are looking for things with people
who provide those things and then we take a cut.
And so when we went to these major streamers, when we went
to the Viacoms of the world and the ESPNs and we said we'd like
to bring you qualified leads, they were a little bit
skeptical thinking that we wanted to charge them for that.
Given the state of video on the net today, we didn't feel it
was necessary to charge them.
We wanted to create a great user experience in Yahoo!
video search, we wanted to be comprehensive and have
all the content in there.
Later on some day we'll figure out as an industry a way to
monetize that, but we didn't feel that was a precursor to
having a great product of value to our users today.
And so once they kind of got over their disbelief, we've
been able to work with probably the top couple dozen providers
of streaming content on the web today.
What's nice about the feed relationship is, as I
described, metadata is hard to come by and websites are hard
to crawl, and so what we're able to do when we have this
explicit feed relationship is drop the whole effort of reverse engineering the URL and the deep media link and who's in the video, and we go right to the publisher who
created the content and they describe it for us in an XML
feed, Extensible Markup Language, just a description
they hand us -- I call it spoon-feeding -- they
spoon-feed us the content so that we know exactly where it
is, exactly what's in it, and we can help our users
get to that content.
What we do here in kind of the torso of this curve -- if you call that the head and that the tail, I guess this is the neck or the torso -- we do our comprehensive media crawl, so
we do send our spiders around the internet and we attempt to
find every bit of media on the internet and that's how we get
comprehensiveness, that's how we get tens of millions of
assets into the index as opposed to just the hundreds or
thousands that we get through the explicit feed
relationships. And then we created something
new for the Long Tail and we call that media RSS, and
understanding that we can go to a couple dozen, but we
don't have the purview or the manpower to go and cut a
deal with everyone putting video up on the internet.
And so media RSS is kind of a self-serve way for
an independent small publisher to alert Yahoo!
as to the existence of their content and get it into
our search engine. So what we did is we took the
popular RSS, Really Simple Syndication, format, we built on the podcasting movement, because podcasting is really the existence of a media enclosure in an RSS feed, and
what we said is let's dress up that media enclosure, let's put
some metadata to make it searchable in there, and we
created a name space, a canonical name space that
allows you to basically fill in the blanks and tell us the
title, the description, the key words, the copyright
associated with your content.
Our search engine can then exploit that and slurp that
up and deliver qualified users to your content as well.
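A minimal Media RSS item can be sketched with Python's standard library. The media: namespace URI below is the one Yahoo! published for the format; the title, keywords, and URL are placeholder values:

```python
import xml.etree.ElementTree as ET

# Namespace Yahoo! published for Media RSS.
MRSS = "http://search.yahoo.com/mrss/"
ET.register_namespace("media", MRSS)

item = ET.Element("item")
ET.SubElement(item, "title").text = "My first screencast"
# The media enclosure itself, plus the searchable metadata fields.
ET.SubElement(item, f"{{{MRSS}}}content",
              url="http://example.com/screencast.mp4",
              type="video/mp4")
ET.SubElement(item, f"{{{MRSS}}}title").text = "My first screencast"
ET.SubElement(item, f"{{{MRSS}}}description").text = "A short demo video."
ET.SubElement(item, f"{{{MRSS}}}keywords").text = "demo, screencast"
ET.SubElement(item, f"{{{MRSS}}}copyright").text = "2005 Example Author"

xml_text = ET.tostring(item, encoding="unicode")
```

An item like this sits inside an ordinary RSS channel, which is what lets a small publisher self-serve their way into the index.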
And we did it very much in the spirit of collaboration -- we
did it with message boards, we did it with our users, we're
working cooperatively with other companies.
This isn't something we think is proprietary or we want to
own, we did it just to get something done -- we did it
because we were first out there and wanted to create
the mechanism and it's been very successful.
So, on the one hand at Yahoo!
you've got, you know, content like -- my slide's not building
here -- there it is -- you've got content like -- oh, this is
jumping right to my punchline here -- you've got content like
Donald Trump, which is head content, you know, it's stuff
that's carried on a major network.
And then you've got tail content like this -- this is a
guy that lip-synced to a -- you actually don't want to hear
this, but I'll -- looks like Star Wars kid, you know.
Basically this kid who was doing this lip-syncing to some
Bulgarian disco hit of the '90s became an internet pop star for
a day and he actually was on Good Morning America and had
his 15 minutes of fame or embarrassment, and that's kind
of micromedia -- as compelling as the content is, it would not have made it onto one of the networks for you to see; it's stuff that you can only discover virally on the internet.
And then we have cases like JibJab.
JibJab was a vector animation, a flash animation done by a
couple guys -- started out through viral marketing, it was
political satire, it got very popular, and eventually it got
the attention of some senior media executives at Yahoo!
who said who are these guys, and we went and we cut a
deal with them and brought their next product
exclusively to Yahoo! users first.
And so that's an example of kind of discovery -- content
moving from the tail naturally up to the head, and we cut the
JibJab founders a nice fat check that actually made all
their efforts worthwhile. So, you know, that's an
example of how this stuff can get discovered.
I don't think that discovery is kind of the exciting economic
model that we're all looking for in terms of promoting the
content in the tail, I think that will happen, but what's
much more exciting is creating very natural ecosystems and
digital marketplaces for people to derive value from their
content without necessarily hitting the big time or getting
a deal with Yahoo!, and that's our intention, too.
So, I hope that this is not our future, by the way.
There's a lot of skateboarding dogs and karate chimpanzees
and stuff like that on Yahoo!
video search right now, and I think we're in this
bootstrapping cycle as an industry where kind of we're
waiting for content to kind of fill up and get to the next
level and we're just going to have to endure these growing
pains until we get there.
And we're also working with purveyors of the head content
to help get their content to our users, so we have deals
with The Apprentice and The Contender and Entertainment
Tonight and others to try to bring standard, what people
would think of as traditional television content onto
the internet as well.
With that I will put up my email address and invite anyone
who has further questions or an interest in Yahoo!
to contact me at your convenience, and we have
time for questions, if there's any questions.
BRADLEY HOROWITZ: Thank you.
MARTI HEARST: OK, we have plenty of time for questions.
AUDIENCE: My question was about the Flickr site and the
interestingness, and I guess one of the metrics you probably
use is how many people have clicked on a photo.
If you're driving people to say the top five for each day, do
you or how do you factor out those types of clicks, because
you're obviously going to have more of a bias towards those that have already been chosen.
BRADLEY HOROWITZ: Yeah, it's the same problem as before the
rich get richer kind of thing, this discovery process.
I'm actually not sure that we do factor that out.
So, it could be that there is this kind of rich get richer
phenomenon and we need a damping effect to make sure that that doesn't happen within Flickr.
I'm not sure today that the algorithm is sophisticated
enough to do that. The fact is that a lot of the magic in interestingness was fairly ad hoc. We played around
-- the engineers who built it played around with a bunch of
stuff until they found something that we all agreed
was interesting and then we launched it in the spirit of
Flickr, and I'm sure we'll continue to tweak that, and
I'm sure if we applied some economics to this, it would
be gamed a lot more than it is today.
But at this point we haven't seen a lot of abuse and we
haven't seen a problem with the images that are coming up as
the top five most interesting ones -- they're definitely
compelling and interesting and you can check it
out for yourself. One of the problems with this
is I'm sure if I put 100 photos in front of each of you and
asked you to rate them in terms of interestingness, we'd all
come out with a different answer -- there's no right
answer for this stuff. So in some ways the best you
can hope for is an answer that is compelling to a
large number of users.
In the long run, you could apply personalization to
interestingness and say photos that are interesting to you,
and in some ways Flickr manifests that already by
allowing you to look at photos from within your social space
and your social network, but you could apply that same
concept to interestingness, so we could watch your click
stream and if you seem to be clicking on a lot of cats,
you're more likely to be interested in cats and dogs
and we could factor that in. So there's all kinds of places
to go with the concept of interestingness, we're just
scratching at the surface to keep with the cats analogy.
AUDIENCE: There was an article which talked about how this
concept of Flickr is also influencing how Yahoo!
views itself as a company, so could you throw some more light
on how that's going to make you distinct versus, say, a Google?
BRADLEY HOROWITZ: Sure.
I think it's important, and I'm not sure the article got this
right, we do believe in what we call better search through
people, but that is not at the expense of scalable
automated technology. We've spent the last two years
building what we think is world class scalable automated
technology to harvest every document on the internet, index
it, get the right relevance, and that's kind of foundational
-- there's kind of no way around that, you need
that level of algorithmic automation to do a good job.
What we are investing in is kind of beyond that, what I
think is a very exciting proposition of social search.
So today when I -- and I know Marti's going to have someone come and speak about that specifically -- but today when I
do a search and you do a search, we're kind of stuck
with what the webmaster's decided was important, so it's
a voting system today where the only way you get to cast a vote
is by building a link to another site, and the only
people who get to cast votes are those webmasters, and
moreover, they vote for all of us.
So they vote by proxy for me and you, and if I type in IBM
and you type in IBM, we all get the results that the webmaster
has decided were right, and that's very limited.
And it's worked very well and it's a huge innovation
versus the old way of just looking at words on pages.
So to take a step back and look at the topology of the web is very valuable, but what's the next major breakthrough?
We think it's kind of democratizing that process away
from just the webmasters and allowing people to vote on
what's important to them, and allowing social networks to
define the expertise that you care to adhere to when
executing a query. So show me, you know -- it's
also important to understand where this technology
is relevant. So if I'm trying to find the
population of Idaho, you don't need social search, you know,
in fact Alta Vista circa 1998 did a very good job with
that kind of query. But if I'm trying to find media
to consume, a song to listen to or a blog to read or a plumber
in Berkeley that I trust or a restaurant I want to go to
tonight, all of these have a subjective component where
the expertise of my social networking friends is something
that's relevant to me, and I don't want the webmasters
voting on which restaurant I go to in Berkeley, I want my
friends to help me decide.
And so that phenomenon of social search allows me to
see that query rendered by my friends -- not, by the way, to the
exclusion of traditional organic
search results. So we have a very nice coupling,
similar to what I showed you on images, a
direct display, so I can type in a query like Berkeley
restaurant and if any of my friends have tagged or
annotated or saved Berkeley restaurants, that will show up
in a small region on my search results, then I'll get the
organic results underneath and I can choose to drill in
to that vertical social search, if I so choose.
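The direct-display idea just described -- a small social module of friend-tagged items shown above the organic results -- can be sketched roughly as follows. This is an illustrative sketch, not Yahoo!'s actual implementation; all names here (`FRIEND_TAGS`, `organic_search`) are hypothetical stand-ins.

```python
# Hypothetical store of what each friend has tagged or saved per query.
FRIEND_TAGS = {
    "alice": {"berkeley restaurant": ["Chez Panisse", "Great China"]},
    "bob":   {"berkeley restaurant": ["Cheese Board Pizza"]},
}

def organic_search(query):
    """Stand-in for the regular algorithmic web results."""
    return [f"web result {i} for '{query}'" for i in range(1, 4)]

def social_search(query, friends):
    """Collect items my friends have tagged or saved for this query."""
    hits = []
    for friend in friends:
        for item in FRIEND_TAGS.get(friend, {}).get(query, []):
            hits.append((friend, item))
    return hits

def blended_results(query, friends):
    """Social module first, organic results underneath."""
    social = social_search(query, friends)
    page = [f"[from {who}] {item}" for who, item in social]
    page += organic_search(query)
    return page
```

The point of the design is that the social layer augments rather than replaces the organic results: if no friend has tagged anything, the page degrades gracefully to ordinary web search.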
So, we think that the concept of Flickr has
relevance across almost all parts of our business.
If you look at the principles behind Flickr -- user-generated
content, user tagging and metadata, distribution through this
kind of syndication, platform orientation, so that as opposed to
building websites you're building platforms that the world can
build on top of -- all of these
things are kind of the Flickrization of Yahoo!
that that article was talking about.
AUDIENCE: Could you talk about what Yahoo!'s doing with
podcasting, in your podcasting search engine -- what it's doing
now compared to being able to search in iTunes, and what
you see as the future for that particular type of media?
BRADLEY HOROWITZ: Yup. I'm not particularly close to
that product so I can't speak like the product manager could,
and I can put you in touch -- if you're interested, email me
and I'll put you in touch with the product management of
podcasting in general.
But we have been seeing a huge rise in podcasts, and our
crawlers actually discover these, so as we're out combing
the web and staying fresh, we've noticed that there are a
lot of podcasts out there, and so it was relatively
straightforward, given that we have all this infrastructure,
for us to expose that as a vertical search within
audio search, and then a separate team at Yahoo!
actually built a product around that for aggregating podcasts.
So, the kind of vertical search that we have is usually driven
by consumer need, and we see a lot of that consumer
need in the search box. So if we see people searching
for, you know, Gilmore podcasts a lot, we see the spike in
demand that we pay attention to, and when we see that we
think that's a good candidate for vertical search.
So having the technical underpinnings to deliver
vertical search, as well as sensing the consumer demand and
seeing that manifest in the query logs, are kind of the two
things that put a particular vertical search on our radar
and podcasting definitely hit the radar and it seemed like a
good bet to build a search vertical focusing on podcasting
and that's why the product exists.
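The "sense the demand in the query logs" heuristic he describes can be sketched as a simple spike detector: flag a query as a vertical-search candidate when its recent volume jumps well above its historical baseline. This is an assumption-laden illustration -- the thresholds, window sizes, and the function name are all made up, not Yahoo!'s real criteria.

```python
def is_vertical_candidate(daily_counts, window=7, spike_factor=3.0, min_volume=1000):
    """daily_counts: chronological list of daily query counts.

    A query is a candidate when its average volume over the most recent
    `window` days is both large in absolute terms (min_volume) and a
    multiple (spike_factor) of its earlier baseline.
    """
    if len(daily_counts) <= window:
        return False  # not enough history to establish a baseline
    baseline_days = daily_counts[:-window]
    baseline = sum(baseline_days) / len(baseline_days)
    recent = sum(daily_counts[-window:]) / window
    return recent >= min_volume and recent >= spike_factor * baseline
```

For example, a query that jumps from roughly 200 searches a day to 1,500 a day for a week would trip this detector; steady background queries would not.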
AUDIENCE: I want to take it back to Flickr again, and I'm
curious about the rights of the image of the individual.
As you mentioned, it's a very public kind of thing that
everyone has access to, and one of the things you mentioned
before with the image search in Yahoo!
was that there are certain users who are upset about
advertising -- someone making a buck off of it.
So, for instance, for myself, I would be rather upset if some
of my photographs, whether they were of myself or of photos
that I'd taken, were used by some other source and someone
was making a buck off of them.
BRADLEY HOROWITZ: Yup.
So, within Flickr you can, first of all, set various
levels of sharing and privacy.
So you can put a photo up there that only you can see -- you
can have a digital shoebox in the sky that's just
private photos for you.
You can also share it with just your family or just
your friends or everyone in the Flickr community.
So you have a lot of control as to how you want to dial up or
down the use of your photos within the Flickr system.
If you do choose to share -- what's interesting is the terms of
service on Flickr basically say anything you put up there can
and should be shared with the world.
So if you make it public, you have basically said it's OK for
some blogger to put my image up on his blog site -- that's what
public means -- it means that you're tossing this photo into
the pool, and if you don't like that we've got the knob so that
you can dial it down to various levels of privacy right
down from your friends and acquaintances down to only you,
if you want to do that.
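The "knob" of sharing levels he describes can be sketched as a simple visibility check. Note this is a simplified illustration that treats the levels as a strict hierarchy; in the real Flickr system, friends and family are independent flags rather than ranked levels, and these names are mine, not Flickr's.

```python
# Higher rank = more restricted photo; higher viewer rank = closer relationship.
PHOTO_RANK = {"public": 0, "friends": 1, "family": 2, "private": 3}
VIEWER_RANK = {"stranger": 0, "friend": 1, "family": 2, "owner": 3}

def can_view(photo_level, relationship):
    """True when the viewer's relationship to the owner meets
    the photo's privacy level (simplified hierarchical model)."""
    return VIEWER_RANK[relationship] >= PHOTO_RANK[photo_level]
```

Under this model the owner always sees everything, a stranger sees only public photos, and dialing a photo from "public" down to "private" immediately cuts off everyone but the owner.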
What's interesting is when Flickr made the transition from
being an independent company, for a long time Flickr meant the
Flickr community as it existed, you know, the few
million people who had discovered Flickr
knew about Flickr. Now Flickr means Yahoo!, and
that's a much larger, more global international community,
and it means that content is being exposed to a different
audience that you may not have originally anticipated, and it
also could be monetized in various ways.
So, that decision as to how you want to dial up your photos in
terms of privacy is not locked in, you can always take a photo
that was public and dial it back to be private, and that
will mean that no one can host it outside of the Flickr domain
or even access it within Flickr.
So I think the best we can do is give the users the right
controls so that they can make informed decisions about how
and whether they want their photos used.
Ultimately, what I would like to see with Flickr is this
concept of a digital marketplace, so that you could
say yeah, you can use my photo to sell Cheerios, but I want to
see a quarter every time somebody clicks on that, or I
want to see 1% of all Cheerio sales, or whatever terms and
conditions you want to set associated with that commercial
use of your photo you should be able to define those
business rules within the system, and Yahoo!
should be cutting checks to the people that want to do that.
So that's kind of the long-term goal -- to create this digital
marketplace which sets up the business parameters
that, again, you decide.
You can say never sell Cheerios with my image or here's what it
would take to bribe me, and that's kind of
where we're headed.
AUDIENCE: You mentioned the photo or you showed the Flickr
photo mapping app as one of the more useful apps
people have built on that platform, but Yahoo!
has made some news recently around Yahoo!
local and all of the offerings that it's brought out recently,
which is, you know, it's still pretty much a map and
text-based interface, so I was wondering if there were any
plans to integrate some of this multimedia intelligence
and integration with the local service.
BRADLEY HOROWITZ: Yes, there are.
That's a great idea, we're doing it.
And it's really -- you know, the Flickrization of Yahoo!'s
partly about that Flickr spirit, but if you think of
the Flickr content, previous question notwithstanding,
there's a lot of places where, you know, you're on Yahoo!
local and you're going to a restaurant, show me images of
that restaurant before I go there, or you're on Yahoo!
news and you're interested in breaking news in Hong Kong,
show me images of the event in Hong Kong.
There's actually stories like the kind of citizen
journalism-type anecdotes where one guy was in Hawaii and he
got a call that said, dude, your apartment
building's on fire.
So he ran to the web and looked on his local news affiliates,
you know -- none of them had coverage, anything like that,
and then he remembered that there were a couple of members
in his apartment building that were Flickr users.
So he logged onto Flickr and actually discovered that the
part of the apartment building that was on fire was the other
side, so he went back to the beach and continued
his vacation in peace.
So like Flickr kind of being used in this citizen
journalism, this kind of microcontent that's kind of
below the radar of mass media, even a local news carrier
didn't have time for the one-alarm fire coverage, and if
they did it wasn't on the web in the moment.
Flickr users have gone out there, kind of acting as the
eyes of the world in some ways, covering this stuff,
and so I think it has bearing in local for places, I think
it has bearing for events, and we have Yahoo!
calendar, and we recently acquired Upcoming.org, so
wouldn't it be cool if every event automatically had Flickr
content associated with it.
I think it has bearing for products, Yahoo!
shopping, you know, wouldn't you like to see the product
you're buying, talk to other owners or connect with other
owners of a particular car or computer?
You look around Yahoo!, there's almost nowhere where this kind
of phenomenon of user-generated content isn't valuable, and
I think local is one of the first ones we're gonna hit.
AUDIENCE: I wonder how easy is it for a user to take their
photos out of Flickr right now, and I guess I'm thinking in the
future, like I used to have an account with Ofoto, which was
the Kodak thing, and recently they're telling me I have to
now pay or at least order so many photos a year if I want to
keep my photos there, and of course I can do that -- I
can take my photos down in a very manual sort of way, but
they pretty much have them. So what's Yahoo!'s take on
Flickr kind of owning these photos versus making it easy
for somebody to take them off of Flickr if you
change your terms?
BRADLEY HOROWITZ: We make it very easy for people
to take their photos out. We don't believe in the roach
motel, like where data goes in but it doesn't go out, and this
is not just Flickr, this is kind of across the new Yahoo!,
we want to make it very easy for people to get their content
out so that it's not kind of hijacked or held hostage there.
So in terms of how you do that, there are some methods -- right
now I know that we just did an integration with folks that
will burn a DVD, and you can get it programmatically
through the API.
The question is whether the places that would
receive your photos, like another photo service or a DVD
burning service or something like that, have done the
work to integrate with Flickr -- through Flickr you can get
your content out, it's just that not a lot of people have
created the other side of that equation which is where do
you put them when you do.
But we're very open to that, and we recently launched
printing services and the ability to burn a DVD of
your photos and all that kind of thing.
AUDIENCE: Question about video search.
So in the last year or so, the major search engine
players all launched a video search service.
Can you just speak to the consumer acceptance to date of
video search in terms of accessing video content online,
and as well the potential for the future in terms of
accessing more mainstream content like TV, movies online?
BRADLEY HOROWITZ: Well, the adoption has been exceeding
all of our expectations. As I said, those tabs across
the top really represent marketshare of the different
search verticals, so video right now is our
number three search.
More people are searching for video than for shopping search,
local search, travel search, which surprised us and is
interesting, given that the content is skateboarding dogs
and lip-syncing 14-year olds.
So, it's kind of ahead of our expectations, given the dearth
of content, it's actually pretty surprising and pretty
encouraging in the sense that that traffic allows us to get
more -- it's kind of breaking that log jam and that inertia
of where we were a year ago at a dead stop.
We can now approach publishers and say there's real
opportunity here, there's critical mass of users, we can
start thinking about the interesting ways to get more
of that mainstream content online.
And so I definitely see progress on that front, and
I think the publishers are getting more and more
comfortable with the fact that they actually can
reach an audience through the internet as a vehicle.
It's not moving as fast as I would like.
You would think that after the lessons in the music industry,
you could go grab these studio executives and say it's
the same thing, you know.
But it hasn't been that easy.
And I understand why, because these folks have multi-billion
dollar existing franchises and they're on the hook to deliver
revenue this quarter, and so they can't pause for a couple
of years while they move their businesses digital, they've got
to kind of make sure they don't cannibalize their existing
revenue streams, even as they see the future dawning on them.
So, it's moving, it's not moving as fast as I
would like, but Yahoo!
will be there. If you look at the company,
we've got all the right parts to be there for that
opportunity and we're working together with the other parts
of Yahoo!, folks like the broadband group, the mobile
group, the media group that's down in Santa Monica to make
sure that we're there at that next generation of high-quality
content that you will consume either through your computer or
on other devices, whether they're cell phones or set-top
boxes, IP delivered video.
AUDIENCE: If someone makes collages of content, either
video or Flickr photos, where the source is owned by multiple
different people, and creates a new piece of content and reposts
it on Flickr or Yahoo!, who maintains ownership of the
content of these samples constructed into new content?
BRADLEY HOROWITZ: Right. That's -- in terms of who
actually owns it is a question for Lawrence Lessig.
Let me tell you my understanding of our
relationship to that content.
So in Yahoo! video search, another big
objection we got from studios when we first approached
them a year ago was: video on the internet -- we're trying
to kill that, we don't want to help it, you know, that's
piracy and that's people abusing our content
and all that. And it turned out that our
tool ended up being one of the great tools that they had to
discover infringing content.
So, if I'm Warner Brothers and I want to see who has illegally
sampled my movie and put it up online, a year ago
there weren't a lot of great mechanisms for
discovering that. Today you can go into Yahoo!
video search and type in The Matrix and discover people that
may be using clips from the movie without express
permission from Warner Brothers, and they do that.
And our obligation, under the Digital Millennium
Copyright Act, is take-down, so that if Warner
Brothers contacts us and says this specific instance of
content is in violation and we believe we hold the copyright,
we will take it out of our index within 24 hours and then
they will litigate or discuss with the webmaster who's
hosting that content whether or not, you know, they
should take it down or not.
But we have what's called conduit status, so we are
actually a search engine and we are protected from being liable
in that case so long as we adhere to these
take-down policies. So that's kind of our role -- to
basically be reactive when a copyright owner does tell us
there's content infringing; we make it so that users can't get
to it through Yahoo!, but they have to be explicit with us and
tell us which specific piece of content they need taken down.
So they can't tell us that when a user types in The Matrix
they should get nothing, because, you know, besides the movie,
there's also that lecture on matrix algebra,
and so on. So every single individual
instance of infringement has to be exposed to us and then
we will take that down.
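The notice-and-take-down model he describes -- rights holders identify each specific infringing item, and the engine suppresses exactly those items, never a whole query -- can be sketched like this. This is an illustrative toy, not Yahoo!'s actual pipeline; the URLs and function names are invented.

```python
# Items the rights holder has explicitly identified in take-down notices.
TAKEDOWNS = set()

def process_takedown_notice(url):
    """Record one specific item identified by the copyright holder."""
    TAKEDOWNS.add(url)

def search(query, index):
    """index: dict mapping query -> list of result URLs.
    Suppresses only items named in take-down notices; the query itself
    is never blocked, so legitimate matches still surface."""
    return [u for u in index.get(query, []) if u not in TAKEDOWNS]

index = {"the matrix": [
    "http://example.com/pirated-clip.mov",        # hypothetical infringing clip
    "http://example.edu/matrix-algebra-lecture",  # legitimate match for the query
]}
process_takedown_notice("http://example.com/pirated-clip.mov")
```

The key design point is the granularity: the take-down set keys on specific items, so the matrix-algebra lecture still comes back for the same query.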
So, in terms of re-mix and things like that, I mean it
gets very complicated, and the beauty of it is due to Acts of
Congress, we don't have to be the policeman that's basically
deciding what is infringing, what is a derivative work.
All of these kinds of issues are really held by the legal
system and our role is somewhat protected in this ecosystem.
It is different, by the way, if we host the content, and in the
systems I've showed you today, apart from Flickr, Yahoo!
audio search, video search, and image search, we aren't
actually hosting the content itself, we're referential
and help steer people back to the external websites.
One more question.
No more questions?
AUDIENCE: I was just wondering are you familiar with a
website called YouTube.com?
BRADLEY HOROWITZ: Yes.
AUDIENCE: Because it has the same like style and same
tagging system as Flickr.com.
BRADLEY HOROWITZ: Yup.
AUDIENCE: What's the deal with that site?
BRADLEY HOROWITZ: There have been a huge rash of sites that
are the Flickr of blank, the Flickr of video, the Flickr
of audio, the Flickr of this, the Flickr of that.
My feeling is that Flickr's going to be the Flickr of
video and Flickr of audio and Flickr of everything.
You know, the idea is a great idea and there's going to be a
lot of attempts to kind of clone the magic, and it's also
non-patentable and I don't think it should be patentable
-- it's an idea that can be applied in many different
areas, and, you know, it's very flattering to see all of these
different start-ups and companies, basically, jumping
on the bandwagon and saying this is a good idea.
So, I think YouTube is an example of a Flickr for video
and I think it's flattering and I think that when we apply the
Flickr principles to all different kinds of domains,
Yahoo!'s gonna be in a great position with our userbase and
audience and traffic and products to be the Flickr
of anything we want to be.
So, thank you.