HCIR 2011: Human Computer Information Retrieval - Presentation II


Uploaded by GoogleTechTalks on 13.12.2011

Transcript:
>> FREUND: Thanks Ryan. So, my name is Luanne Freund. I'm from the University of British
Columbia, at the School of Library, Archival and Information Studies, now known as the iSchool at UBC. We're aware of it. Why not? Get on that bus. What I'm going to be talking
about today is--are three interrelated factors that play a role in the HCIR Dynamic. These
are document usefulness, document genre and information tasks. This work is motivated
in general by my interests in non-topical pragmatic approaches to Human Information
Interaction in Digital Domains. So, I'm interested in what motivates people to search and how
that affects their behavior and also in this case, so I'm going to be talking about their
assessments of documents in terms of usefulness. So, the questions here broadly, what does
it mean to say that a document is useful for a given user? More specifically what is the
effect of the motivating tasks and the document genre on usefulness? When I speak about document
genre I'm talking about categories of documents that are recognizable to searchers, to people,
based on differing communicative intents. So, documents that are designed to do different
things usually look different, and people can recognize those differences. So, things
like reports, memos, FAQs, guides, fact sheets, et cetera. I'm also interested--in this study
I'll talk a little bit about the extent to which these genres are recognizable and distinguishable
by searchers. One of the challenges in the genre domain is, as Gary was talking about, that it's one of the scruffier kinds of things you could look at in information environments, because different groups recognize and think about genre in different ways. It's very often domain dependent whether or not you would recognize a particular genre, so it tends to be a little bit difficult to capture. So, we're looking at that a little bit today as
well. I'm focusing on the E-Government domain and the reason that I think that search in
the E-Government context is important is that, first of all, we all need to access government information just to sort of run our lives. Pretty much everybody needs to find this information, and pretty much all of us are non-experts when we search this kind of content. So, I think there's a lot of room there for system designers to develop systems to support this kind of activity. And I assume that everyone from your grandmother to your 15-year-old is searching this kind of information and needs to find what they
need to find. The study I'm going to talk about today is an experimental user study.
And the study was a document assessment study. So, we didn't have people actually searching.
I was focusing on this issue of usefulness assessments of documents. Twenty-five participants, university students, Canadian residents; this was important because, as I said, genre recognition is domain specific, so I wouldn't expect people from another country to be able to navigate through a Canadian government information space. Each participant carried out five tasks and in total assessed 40 documents, so eight documents for each task scenario, and they did a usefulness assessment. This was a seven-point Likert scale from not at all useful to very useful, for each document. They also talked a little about how they made that assessment and they
did some genre labeling as well. An important direction in my research is the idea of different
task types. Here I'm not talking so much about search tasks, but rather about what I'm calling information tasks; it's a higher-level task category. So, what motivates you
to search? What brings you to the table? And the five types that I'm talking about are
fact finding which is actually the closest to what we think of as a search task but also
deciding, doing, learning and problem solving. So, in this study we created 20 situated work
task scenarios. You have one example here for a doing type task. You can see that searchers were given quite a bit of context to draw upon, in an effort to prompt purely naturalistic assessment behavior within the limitations, as always, of a laboratory study. And for each type I had four different examples of tasks: four fact finding, four doing, four deciding. That's important because I think that to really start talking about task type differences we have to have multiple examples of each type; to say that a single scenario represents the whole task type is problematic. So, I was
trying to get closer to that in my work. The documents are documents that were drawn from
the Canadian federal web domain, the gc.ca domain. We retrieved these documents by running simple Google searches: we derived a simple query from each scenario, searched on Google, limiting it to the government domain, and retrieved 20 documents per scenario. From those 20 documents we selected eight that were somewhat varied in genre and used those for the assessment task. I would have done more, but I couldn't keep people in their seats long enough to spend that much time assessing documents.
One of the challenges here again, if we go back to this issue of the challenges of doing
user studies is that you can't really control genre examples because they are kind of organic
things. They live on the web, people create them, people use them. So, if I had carefully, you know, taken content and morphed it into five different genres and fed it to people, it wouldn't be realistic; it wouldn't carry the weight that real genres do. So, our limitation here is that I have no way of knowing that there were equally useful documents for each scenario, but rather that there is a diverse set. And you can see some examples
here of the kind of things we're talking about. One example is clearly a guide. One is clearly
a press release. These things are common in the Canadian government web domain. The Canadian government even has a metadata taxonomy for document types or document genres, and there are 50 different genre types on that list. They don't use it very consistently, but it exists. Okay. So, there's some potential there to work with. And one of the interesting
characteristics of this study is that we had at least five assessments for each document,
for each scenario. So, one of the first things I want to talk about is the degree to which
usefulness assessments are consistent. That is not in the paper here but it was in another
paper, and I think it's interesting and I want to highlight it. We did an analysis of intraclass correlation coefficients to look at the consistency between judges, and they disagree; there is no consistency. People simply do not agree about the extent to which a document is useful, and you might think that giving them these more detailed scenarios would prompt more consistency. I suspect it's the opposite. The more context
scenarios would prompt more consistency. I suspect it's the opposite. The more contexts
you draw upon in making an assessment then less likely you're assessment is going to
be like your neighbors. So, overall 2.84, correlation very, very low and then we actually
have the opportunity here to look at the differences across tasks and I think this is even more
interesting that we see that fact finding is the only task that has some reasonable
kind of midlevel correlation. And for problem solving and learning there's almost no consistency
there. So, it really brings in to question I think some of--some of what we do. Agree?
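For readers who want to see what this kind of agreement analysis looks like, here is a minimal sketch in Python with a made-up ratings matrix; the talk does not state which ICC variant was used, so a one-way ICC(1,1) is shown purely as an illustration.

    import numpy as np

    # Hypothetical usefulness ratings: rows = documents, columns = judges (1-7 Likert scale).
    ratings = np.array([
        [6, 4, 7, 5, 3],
        [2, 5, 1, 4, 6],
        [7, 6, 5, 7, 4],
        [3, 2, 6, 1, 5],
    ], dtype=float)

    def icc_oneway(x):
        """One-way random-effects ICC(1,1): agreement among judges rating the same targets."""
        n, k = x.shape                      # n documents, k judges
        grand = x.mean()
        ms_between = k * ((x.mean(axis=1) - grand) ** 2).sum() / (n - 1)
        ms_within = ((x - x.mean(axis=1, keepdims=True)) ** 2).sum() / (n * (k - 1))
        return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

    print(f"ICC(1,1) = {icc_oneway(ratings):.3f}")   # values near zero mean the judges largely disagree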
Okay. Other effects that we saw: the effect of task. There is an effect of task on usefulness scores; they vary significantly by task. In general, for fact finding and learning, people find the documents to be more useful, or they're less critical if you like; they assign them higher scores. For problem solving, they're most critical of the documents' usefulness. There is also an effect of genre: there are certain genres that people generally find more useful, at least in this experiment. Home pages rate very high, which reinforces the high ranking that home pages get in web search. Things like news and reference materials rated lower; perhaps because of the particular scenarios here, people are probably less interested in the government's self-promotion efforts, so they gave those lower scores.
Here's where I think we have a useful result: the interaction effect between task and genre. We see that genre makes a difference for three of our five task types. Genre matters when you're doing a doing task, a deciding task or a learning task; we have significant variations in usefulness scores among the genre palette that we're working with. But for fact finding and problem solving tasks there was no significant difference. So, here you have the most complex kind of task, and I'm not sure exactly whether it's the complexity or the difficulty, but problem solving is known to be a complex task, and genre doesn't really help us there. In other words, if the task is really difficult, any document that helps you is good. Fact finding is very simple, usually discrete tasks, and in that situation any document that contains the fact you're looking for is good. It doesn't matter what the format is.
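As a rough sketch of how a system might act on this interaction effect, here is a small task-aware genre re-ranker in Python; the genre labels, task types and boost values are hypothetical illustrations, not measured values from the study.

    # Hypothetical task-conditioned genre preferences (not values reported in the talk).
    GENRE_BOOST = {
        "doing":    {"guide": 1.5, "faq": 1.3, "press_release": 0.8},
        "deciding": {"fact_sheet": 1.4, "report": 1.2},
        "learning": {"guide": 1.3, "reference": 1.2},
        # fact_finding and problem_solving get no genre adjustment, per the finding above.
    }

    def rerank(results, task_type):
        """Re-rank (score, genre, doc) tuples, boosting genres that matter for this task type."""
        boosts = GENRE_BOOST.get(task_type, {})
        return sorted(results, key=lambda r: r[0] * boosts.get(r[1], 1.0), reverse=True)

    results = [(0.82, "press_release", "doc1"), (0.78, "guide", "doc2"), (0.75, "report", "doc3")]
    print(rerank(results, "doing"))   # the guide overtakes the press release for a doing task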
If you turn to the genre identification part of the experiment, people found labeling the genres that they encountered to be very difficult. As one participant said, "I didn't really know how to describe them other than information. They were reports or website pages with info." So, they've got these really general vocabulary terms, but we saw on the other hand that genre does make a difference in how they search; they just don't know how to tell us that. So, it is an implicit form of knowledge rather than an explicit form of knowledge. In terms of labeling, they used multiple labels, around three per document on average, and they correctly labeled about 50% of them. But only about 25% of their labels corresponded to our expert assessment. So, there are challenges there. If we take the group labeling and ask, across all five people, what was the most common label assigned, it did match our expert assessments for most of the genres, so there may be some benefit in doing group labeling and using that as a way to assign labels. Implications: people don't agree about usefulness, surprise. As
searches become more difficult or complex, usefulness scores drop and agreement deteriorates. Genre does matter, but not for very simple or very complex tasks, so we need to think about ways to create task-based searching approaches that take genre into account. And genre knowledge is implicit and situational, so our common way of making use of genre, giving people a browsing structure with genre categories, well, I don't think that's going to be very useful, because I don't think people think that way; they're not aware of the terminology and they're not specific or consistent in their use of it. But a system that says, "Hey, you're trying to build a garden shed, let's give you documents about building garden sheds that have instructions," that may be a more effective approach. Some acknowledgements, and that's it. Thank you.
>> [APPLAUSE] >> [INDISTINCT] okay. May I [INDISTINCT] because
they might be able to [INDISTINCT] I just want to.
>> FREUND: Other studies have been done in their domain that found, well, let's work with some [INDISTINCT] sorting type thing, but most of my participants defaulted to putting multiple labels on things.
>> Was--was that the only [INDISTINCT]
>> FREUND: Yes, well, I'm not sure what the difference is between allowing multiple labels and putting a document in, or saying it had to go in, one pile. Perhaps it's that they have to only choose one. Right. So the experts were up, can we proceed? And we basically [INDISTINCT] on a decision, not very hard, not very hard. Especially when you have to choose a single document--sorry, a single genre label. I really think that multiple genre categorization is the way to go, but in order to have a study that made sense we opted for that. You have some query logs for me?
>> Oh. You know, I just, you know [INDISTINCT] all the things [INDISTINCT]. Yes, I'm curious if you would have any better luck if you made it in some way, in nominal choices less than binary; so for example, if you ask people, well, how violently do you disagree with the [INDISTINCT] in genre in this kind of area [INDISTINCT]
>> FREUND: Yes. >> Or so [INDISTINCT] just state that or be
more flexible given the [INDISTINCT] >> FREUND: First of all they found it difficult,
they just told me they do not think like this, so that's the first obstacle, when I asked
them, you know, how difficult is it to make these genre assessments versus, how difficult
is it to make usefulness assessments? It was easier for them to assess usefulness than
genre. But, yeah, I think there are--there are ways in which we could make it an easier
task for them if we were--if our goal is to kind of find labels and attach them to documents
but in real time, I'm not sure. >> Thanks [INDISTINCT]
>> Okay. So our next speaker--let me see, Gene. Here we go.
>> GENE: Hello. What I'm going to be talking to you about today is some work that was done at FXPAL over the last couple of years, work that built up over time, with Jeremy Pickens and my intern from the summer, Abdi, who is from University College London. What I'm going to be talking about is collaboration in information seeking, and this is a practice that we've heard about today and that many of us have engaged in, whether it's in your personal life during travel planning, or looking for medical advice with your family, or in a legal domain, in a medical domain, and all sorts of places where people work together to find information. And in most cases this activity is very poorly supported by existing tools; there are a few exceptions, but mostly people send email or do IM or exchange URLs in some way, or even hand out printed pieces of paper. And there
are several basic problems, which we can group into two main classes: there's lack of support for awareness of what your collaborators are doing, short of receiving their chat messages or their emails, and the system doesn't really understand that there's collaboration going on. And so we can think about a collaborative search system as consisting of several pieces, if you will. There are some interfaces through which you interact with the system that actually cause it to retrieve information; there is a set of tools that help you communicate with your collaborators and maintain awareness of what they're doing; and then there is some underlying set of algorithms that are in fact aware that you're collaborating and can do something with the various input streams that people are providing, to help the team be more effective than the individuals working separately. And
this is the kind of thing that Roberto and Gerard's poster was trying to assess; I think the poster is just around the corner here. They're looking at what happens when teams work together. So we can think about this whole human-machine system as two, but really it could be more than two, people working together through a system that consists of two main components: one that mediates their communication and one that mediates their search activity, the actual business of retrieving information. And so here's a quick list of some systems
that are out there; some of them are research systems, some of them are actually things you can use, roughly and very, very quickly rated by me in these four categories. One is, do they support exploration? And, you know, some do, some don't. How good are they at supporting various communication activities? Most of them are designed for that and do it pretty well. Awareness? There's usually some query history or some chat, so you get some of that. And algorithmic mediation, that is, how well does the underlying search engine represent the different streams of activity and help people actually find information? In most cases, not at all. The one outlier there is the system in the middle called search [INDISTINCT]; that's a system that Jeremy and I and Chirag built at FXPAL about four years ago which was focused on algorithmic mediation and neglected almost everything else, just to prove the point. So this chart illustrates some opportunities.
In short, all these systems do the easy thing, which is to add some chat and some query result sharing, but very few of them actually support information seeking directly: they don't have faceted browsing, they don't have any kind of significant history, or if they do it's kind of primitive, and certainly most of them don't do anything in terms of mediation of the search. So what can we do? First I'm going to talk about some really high-level, sort of theoretical considerations of what you could do, then I'll show you an
interface that takes a step in that direction. So one thing you could do: when you talk about this process of mediating communication, when you have two people communicating over some document or over some search result, with chat or with comments or with some other means of communication, the system can observe that and make inferences about it. And so you could have a side effect from this communication channel into the algorithmic mediation component, so that because you had a conversation around some document, the system might make an inference that that document is somehow special, different, useful, pertinent, not useful, whatever inference you can actually reasonably make, and use that in subsequent information retrieval steps. So you could use it for relevance feedback or you could use it for other things. An example here is, you have a long conversation about some document; the system might find other documents like it, or use it in conjunction with some other query that you ran to bias results toward things that are similar to that document.
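To make that side effect concrete, here is a small illustrative sketch in Python; the scoring heuristic, the chat-log shape and the weight are hypothetical, not the actual FXPAL algorithm.

    from collections import Counter

    def chat_attention(chat_log):
        """Count how often each document is referenced in a collaborative chat log."""
        return Counter(doc_id for _, doc_id in chat_log if doc_id is not None)

    def bias_scores(retrieval_scores, chat_log, weight=0.2):
        """Nudge retrieval scores toward documents the collaborators have been discussing."""
        mentions = chat_attention(chat_log)
        total = sum(mentions.values()) or 1
        return {doc: score + weight * mentions.get(doc, 0) / total
                for doc, score in retrieval_scores.items()}

    chat = [("alice", "d3"), ("bob", "d3"), ("alice", None), ("bob", "d7")]
    print(bias_scores({"d1": 0.9, "d3": 0.7, "d7": 0.6}, chat))   # d3 gets the biggest nudge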
You can go the other way as well: you have this algorithmic mediation stuff where you're interacting with a system to retrieve information, and you might make relevance judgements, you might run multiple queries, you might do some reading, and the system can observe this, either through your clicks or perhaps through an eye tracker or whatever, and can potentially make an inference from this that says, "Oh, this person is really interested in this document or this topic and we [INDISTINCT] a bunch of documents, ran a bunch of queries, maybe we should notify the collaborator and say something interesting happened there." Examples: you can flag documents for collaborators to
look at based on documents that somebody else had used as relevance feedback, or in fact you can do the whole query. So if I ran a query and started saying, "Oh, this is great, this is great, I found this document [INDISTINCT] this document for a long time, maybe my collaborators want to know that this query and this result set are somehow special compared to some of the other things that I have done." Now, I could tell them, but maybe it's easier if the system does it for me in some cases. So the idea is that, given these two communication channels between people and the data, can we couple the two in ways that are useful? So what we did in this project: we looked at
what was done before, I showed you the slide, and identified some opportunities. We built a system that actually tries to capture both the communication kind of mediation and the algorithmic mediation, similar to what we did in search [INDISTINCT], and we did some iterative design, and now we're using that platform to explore these kinds of issues. So this is a screenshot of the system, and I'll be demonstrating the system later on in the day in the search challenge, so you'll get a more hands-on, or eyes-on, sense of how it works. But what you should look at here: this
is a set of faceted filters. I have two kinds of filters here: I used the year of publication for the articles that are indexed here as one facet; you could obviously add others that are based on the metadata of the documents, right? What was the term that you used, Gary?
>> The surrogates.
>> GENE: The surrogates. So this is something of a surrogate for the documents. The upper part of the filter, where it starts with "normal" and then has all docs, useful docs, seen docs, that's a surrogate for the session, for the browsing session. It shows you, in this particular query result, how many documents you've actually looked at, how many you said, "Yes, these are useful," how many you've said, "No, they're not," and how many you haven't actually paid any attention to yet. And you can filter on those just as
you can with the document metadata, and in combination with it. So you and your colleagues can do some kind of triage on the result sets, and this may facilitate a divide-and-conquer strategy or a "What did we do? What have we seen? What's left to do?" kind of operation. That's a hypothesis behind this interface.
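A minimal sketch of that kind of session-aware filtering; the field names and the session structure here are made up for illustration, not the system's actual data model.

    # A toy session store; in the real system this state would be shared across collaborators.
    session = {
        "seen":   {"d1", "d2", "d5"},
        "useful": {"d2"},
    }

    def filter_results(results, state="unseen", year=None):
        """Filter a result list by session state (all/seen/useful/unseen) and a metadata facet."""
        def keep(doc):
            if state == "seen" and doc["id"] not in session["seen"]:
                return False
            if state == "useful" and doc["id"] not in session["useful"]:
                return False
            if state == "unseen" and doc["id"] in session["seen"]:
                return False
            if year is not None and doc["year"] != year:
                return False
            return True
        return [doc for doc in results if keep(doc)]

    docs = [{"id": "d1", "year": 2009}, {"id": "d4", "year": 2009}, {"id": "d5", "year": 2011}]
    print(filter_results(docs, state="unseen", year=2009))   # only d4 survives both filters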
Okay. Below that there's a set of query expansion techniques that use relevance feedback to take some set of documents; different kinds of queries use different sets of documents, and I'll talk about that in the afternoon. To do relevance feedback, you select some documents, either by giving them a thumbs up or just by doing ad hoc selection, and it takes those documents as feedback and finds new things for you to look at. And then that's a query result like anything else. All right. There's a query history that
shows the queries and might show whether relevance feedback was used or not, so you can see here: this says "with three documents," this says "with seven documents," and this one says "by Jeremy Pickens," so this is actually a collaborative tool, right? There were two people searching here, and it shows that that particular search was done by my partner at that point. And I can click on any one of those things and go back to the results over there. This
histogram is an interesting idea which we haven't evaluated properly yet, but hopefully it will make sense. This is a surrogate for the retrieval history of that document within this search task. Each bar corresponds to a query that you ran over time, and the higher the colored bar is, the higher the rank that document had with respect to each successive query. So looking at that first document, you can see it's been retrieved by almost every query we ran except that one in the middle, and the color represents who ran the query: that document was first retrieved by a query I ran, second by Jeremy, and then a bunch of queries that I ran after that. If you look at the document with rank number six here, you can tell at a glance this is the first time that's come up; nobody's seen that before. So that's, again, metadata about what's been going on. And I can look at this vertically and say, "Okay, been there, been there, been there, this one is new, I can just focus right on that." Or if I'm trying to review I say, "Well, that's not relevant," and move on, just depending on what task I have. And you can click on any of the bars to navigate to the corresponding query. This is where you make the up and down
assessments. I'm going to speed up because my time is ending. And this is the text of the document: if I had a PDF I show the PDF; if the PDF fails to load because the URL is gone, I just show the text that's been indexed, so that the person can see something. The data was noisy, you have to make do. Okay. Next. Okay. Great. So we added
the chat facility, and you can make comments on documents; I'll show some of that maybe later. We've done some algorithmic mediation here where you can do relevance feedback on other people's queries. You can also take the results of all the queries you ran in the session and fuse them together and produce one ranked list from all of the documents that we retrieved, to get us, again, a sense of what happened overall in the session. Right now the relevance feedback stuff is manual, but one of the things we're looking at is some of the theoretical stuff for deciding when inferences are legitimate. And there's
a whole bunch of questions as to whether those inferences are actually legitimate, or in what circumstances you can do this. Just because people talk about a document does not in fact mean that document is useful, necessarily; sometimes it is, sometimes it isn't. One of the things we're going to be looking at is how we can tell which is which. Perhaps you can do some sentiment analysis on the text; perhaps the volume of traffic is going to predict something; we don't know yet. The notion of sharing versus usefulness is an interesting one. We actually did some interviews. This data
is in the paper. And people had a clear distinction between when they share a document and when they mark it as useful. Now, it doesn't necessarily mean that we should treat those completely differently with respect to relevance feedback. That's an interesting question: just because people don't think that sharing something necessarily means the document is relevant, it might still be effective as a form of relevance feedback. And similarly, just because a person has marked something as useful, you may still want to share it even though there's been no overt act of sharing. Whether that kind of information is useful or not is an experimental thing that we need to validate. So, sometimes users' expectations with respect to these things, and their intentions, don't actually match up with what the system can do given this information, and whether that's possible or not is the subject of next year's presentation. Thank you.
>> [INDISTINCT]
>> GENE: Well, the two-word answer is of course "it depends." We've heard that one before. I think that some of the factors it depends on are the collection, the kinds of search tasks, and the kinds of metadata that are available, because in some cases you can make pretty good inferences; I mean, relevance feedback works effectively in some cases, if the vocabulary is focused and you can pull things together. If relevance feedback isn't working very well, then obviously you don't want to be making these inferences. But beyond that I don't know; we have to actually play around, get people to use it, get the data, and see what we can find by doing this kind of speculative analysis.
>> [INDISTINCT]
>> GENE: Well, my feeling is that some inference is appropriate and will lead to better performance, and so what we need to do is just collect enough data and test it. So how do you test this? Well, let's say people use the system on a particular topic, and it doesn't matter whether it's one person or more: they ran 10 queries and they saved some documents, or used some documents in some way, that were retrieved by those 10 queries. You can look at the entire set of documents in aggregate and say, "Okay, in the end these were the ones that people found useful." Now, it turns out that five of those ten were retrieved by the first query, but at such low ranks that they never saw them, and then some more came up in subsequent queries. So now, using that as the ground truth within that session, you can do speculative analysis to see what kinds of algorithms you can apply that would predict the retrieval of those documents at high ranks. And then you have to figure out how best to communicate that, whether it's through some recommendation thing, or whether there's some intelligent agent with whom you're collaborating, or some other means; there's some user interface treatment of this that makes the recommendations palatable without necessarily disrupting the organic results, which people might still expect and be able to understand more easily. But again, I think that's the kind of analysis I want to do, but I need some data first, and I'll be collecting this data in the next few months.
>> [INDISTINCT]
>> GENE: Yeah. >> [INDISTINCT]
>> GENE: Yeah. That's what the query fusion does. That is, we do a rank fusion on the results of all the queries that were run in the session, and documents that consistently come up, even if they ranked low, get a combined score that's higher than documents that are retrieved only sporadically. And [INDISTINCT] we weight the ranks of each query by how many useful documents, the saved documents, that query retrieved. So this is the cluster hypothesis in operation: if you ran a query and found a lot of the documents retrieved by that query to be useful, we expect it's likely that the other documents retrieved by that query would be more useful than documents retrieved by a query in which you didn't save anything. It's probabilistic, and it's just a little bit of a weighting, but this is something that we did in the search [INDISTINCT] system as well. It seemed to have worked okay.
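A minimal Python sketch of that weighted fusion idea; the reciprocal-rank scoring and the "1 + saved count" weight are illustrative choices, not the exact formula used in the FXPAL system.

    def fuse(session_queries, saved_docs):
        """
        Fuse the ranked lists of all queries run in a session.
        Each query's contribution is weighted by how many saved (useful) documents it retrieved,
        so documents surfaced by 'good' queries float up even if individually ranked low.
        """
        scores = {}
        for ranked_list in session_queries:
            weight = 1.0 + sum(1 for d in ranked_list if d in saved_docs)
            for rank, doc in enumerate(ranked_list, start=1):
                scores[doc] = scores.get(doc, 0.0) + weight / rank   # reciprocal-rank contribution
        return sorted(scores, key=scores.get, reverse=True)

    q1 = ["d1", "d2", "d3", "d4"]          # this query retrieved two saved docs
    q2 = ["d5", "d1", "d6"]                # this one retrieved none
    print(fuse([q1, q2], saved_docs={"d2", "d3"}))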
>> [INDISTINCT]
>> GENE: Right. So, this should scale just fine to small groups, where you in fact know everybody. Four, five, six people, it should be fine. Beyond that, I think the notion of relevance breaks down, because you have no guarantee that everybody's on the same page with respect to what they're actually looking for, even if they're running similar or identical queries. So you kind of need to back off a little bit. In fact, there was a poster, I'm blanking on the name of the person who presented it. Yeah, Scott, who talked about collective information seeking, which I think is more what you're talking about, where you can make some inferences. It's not just a recommendation system, but you can actually make some inference about the fact that somebody else is looking for the same information, kind of at the same time, and maybe you know that person through some social graph. Yeah, in the case of ancestry it might be that you have some shared trees or something like that, or maybe you're interested in some of the same surnames. You could establish these kinds of correlations in some way. And once you have that information, you might be able to use it as a catalyst for more explicit collaboration, but you need to treat the task sensitively, because judgments of usefulness and relevance are tricky: just because you found something doesn't mean that I would find it relevant to the same query if we're not sharing it explicitly. Thank you.
>> And no, I'm fine. Thank you. Oh, I can't see it [INDISTINCT] it's great. And did you
copy your slides here? >> MEDELYAN: Yes.
>> You did.
>> MEDELYAN: Hi. My name is Alyona Medelyan, and I lead the research at a New Zealand company called Pingar. Today I'm presenting joint work that we did with Anna Divoli. She was a postdoc at the University of Chicago. I actually found a screenshot of one of Anna's works in Marti's book and contacted her because it looked so similar to something that we had developed at Pingar, and I wanted to evaluate it, but I had never run a user study before. So, I suggested we collaborate, and luckily Anna said "Yes," and these are the results of our research. We decided to study
luckily Anna said, "Yes" and these are the results of our research. We decided to study
the features of a search interface and how useful they are? And we focused on bioscience
because this is one of the verticals of interest to Pingar and Anna's background as well. We
have identified five features that were of interest and here they are in a very typical
system that is used in for searching biomedical data, PubMed and one is order complete and
you guys are all familiar with this. Search expansions are ways of modifying the search
query to find more results. We also looked at facets and PubMed suggest to the facet
of refining search results based on their type but there maybe many other ways of using
facets as well. Related searches help to re-focus research and result previous shows, in this
case what's--what is the title of the document? Who were the authors? What's the journal?
And yeah, it depends. We edit a second dimension to our research by looking at the search task
similar to Luanne. In the paper by Kellar, they found that the majority of tasks are related to transactions, such as writing emails or doing online banking, and the remaining tasks are spread evenly between fact finding, information gathering and browsing. We actually found that in bioscience it's pretty hard to identify the type of the search task, but looking at the actual descriptions of search tasks by our participants, you can see that there's some similarity. So, "animal models of Huntington's disease" is quite specific, so we go with the fact finding search task. If a researcher wants to find information that relates to their own studies, it's information gathering. And if they're just browsing new publications in a given area, then it's browsing. So, our research questions
were: how useful are the five features of a search user interface for searching biomedical texts? And we hypothesized that it depends on the search task. In addition, we also wanted to find out what works best in terms of representing and computing facets, and our hypothesis was that it's better to display dynamically computed facets rather than a complete list of everything that's available in a facet hierarchy. We first ran an exploratory study where we interviewed and questioned several bioscientists and just asked them in general, explained the search tasks to them, and asked for typical queries that they would categorize under those types. We then used the results from that study to design our main study with 10 bioscientists. They were all experienced researchers and they were very keen to participate. [INDISTINCT]
>> I'm working on it.
>> We emailed the participants in advance and asked them to provide us [INDISTINCT]. And so, we used the responses by the participants, the actual search queries that they used in their work, to create screenshots of how various systems implement specific interface features. So here, we used a PowerPoint presentation to compile them all in one place and show it to participants. And here's this presentation for one of our baseline queries, connexin. Can anybody recognize where Google is, and where Bing is here, and where PubMed is? I can. So, we didn't tell the participants which system is which, but you can see how different systems treat such a common feature as auto-complete. These are from a commercial system, GoPubMed, and next is another commercial system, and the research prototype, the [INDISTINCT]. So, not all the systems implemented all of the features. For search expansions we had an internal implementation using Pingar technology, which presents check boxes, and if the users select them we build a Boolean query.
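A small sketch of the kind of Boolean query construction being described; the operator syntax and the example expansion terms are illustrative, not Pingar's actual API.

    def build_boolean_query(base_query, selected_expansions):
        """OR the user-selected expansion terms onto the original query, quoting phrases."""
        if not selected_expansions:
            return base_query
        quoted = [f'"{t}"' if " " in t else t for t in selected_expansions]
        return f"({base_query}) OR " + " OR ".join(quoted)

    print(build_boolean_query("Huntington's disease",
                              ["Huntington chorea", "HD"]))
    # -> (Huntington's disease) OR "Huntington chorea" OR HD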
And PubMed, which you've already seen in this example, has quite a specialized search expansion option. For faceted refinement, we tested several systems developed by Pingar using our API, and we also used Solr to generate facets based on existing metadata assigned to medical articles. At the bottom you see the PubMed facets, and E is a system with a full hierarchy of terms from a research prototype. We also tested the same presentation but with check boxes, and GoPubMed used check boxes with their facets as well. We also looked at related search queries, and you can see how different the suggestions are that users would get if they used these systems. So, we were focused
specifically on the content but also collected users' ratings on design. In terms of the search results, we compared only two systems, both developed at Pingar, and one is using standard tools available through the Solr search engine, which is E. We compared keywords that were derived from the metadata assigned to the articles with ones that we automatically identified using our keyword extraction tool, and for previewing the actual content of the document we compared snippets calculated by Solr with text summarization by Pingar, where you mouse over and the context of the sentence is displayed. We have collected a lot of data and put it all into a single spreadsheet. And I would
really welcome tools that make user studies less hard work. So, in the first column we compiled all the information we knew about each participant, and all the remaining lines represent ratings of various features. You don't have to read this, but this is one participant's rating of how useful auto-complete was for the query "animal models of Huntington's disease". We asked them, on a Likert scale, how useful they would consider this feature when searching, and below we have collected scores for individual systems, separately for looks and for the content. And we also asked them to rank the systems in order of their preference. We then compiled the results to address our research questions. And the first research question
So, and--and whether it depends on a search task. So, here we have a stroke representing
a single participant. And green denotes the positive ranking of a feature and if we look
at order complete. At the very top you can see that three people were performing a browsing
task and they all said that order complete is useful as a search feature. People who
were performing fact finding disagreed and people who were information gathering were
predominantly neutral with respect to order complete. Regarding query expansion, somewhat
more positive and responses on browsing but very mixed opinions for the other search tasks.
Whereas faceted refinement has been rated highly for all the features equally. Related
searches on the other hand was negatively ranked and this is participant commented on
this--they rarely use the related searches option because they know what they're looking
for, and they don't want to change the focus of their search. They are more interested
in refining and this is where facets are very useful. And the query of search result is
useful. It's kind of intuitive. You have to know what results are returned. So, participants
commented on why they found auto-complete less important: they say they feel pigeonholed by those suggestions, so they don't want to see them. With regard to facets, they commented that facets should be focused on their search queries; say, if they're looking for a particular disease, they would like to see the symptoms, and they really liked the category organisms. But when they saw one of the systems that had a very large number of facets they felt overwhelmed, and we had to force them to actually look at the system because of how colors were used there. Check boxes are something they all said were very useful; they want to select multiple choices. And in terms of how these findings relate to previous research, our findings on auto-complete are in agreement, but with regard to facets, other researchers have found that participants do like a full hierarchy, just not the bioscientists. So, in general, our participants told us that the looks of a system have to be good enough for them to use it, but what really matters is the relevance of the suggestions. And yes, I think this is an important finding. So, to conclude this
talk, we asked whether search interface features are useful for various search tasks in biomedical literature, and we found that facets and result previews are useful, whereas other features are more useful for browsing. So, if you are designing a search interface, do not waste your time coming up with a great algorithm for related queries. In terms of faceted navigation, a few query-oriented and specific values are what bioscientists prefer, and they like to use check boxes. Another piece of news is that Anna is actually not at the University of Chicago anymore; she's in Greece and applying for a visa to come to New Zealand. So, I'm looking forward to more experiments with her.
>> I have a question about why you think some party some of the touches, queries with their
[INDISTINCT]
>> MEDELYAN: Search expansion specifically? So, there were two comments that they made. In terms of the PubMed implementation, it was only understandable for experts, people who have experience searching library databases. And in terms of the suggestions that we generated using Pingar technology, we actually included various modifications of the search, which included not only synonyms but also alternative spellings, and they said that the spellings should be included automatically, as should the synonyms; they felt like the system should do it for them rather than them doing the selection.
>> [INDISTINCT] that is my question about I would like to tell you graphically [INDISTINCT]
>> MEDELYAN: Yes. >> Very logic, so. Instead of [INDISTINCT]
>> MEDELYAN: So, I will show you the related searches again, and here, these are the only suggestions that were generated by Pingar. These are the PubMed suggestions, and there are suggestions by Bing and Google and all the different systems, and the suggestions were generated for each participant's individual query. These are the actual queries that they entered, though, so I believe it represents correctly what the systems are capable of.
>> [INDISTINCT]
>> MEDELYAN: I think they did comment on related documents. We did not include this feature, though, in
our study.
>> Just to follow up with that: I'd be curious, when you have related searches, whether there's a distinction between when people see related searches and say, oh yes, that's what I want to find, versus when they actually go and click through them and there's a disconnect between what they expected they would get and what they actually find. Just different problems if you're able [INDISTINCT]
>> MEDELYAN: The only comment that I do remember is that they just want to get more specific with every single action, and when they looked at some of the related searches suggestions, they felt clicking on the link was going to open another window and then they'd have to deal with that. So, they weren't keen on clicking on any links.
>> They don't want to [INDISTINCT] they want
to... >> MEDELYAN: Yes.
>> ...regret. >> MEDELYAN: Yeah.
>> Nice work. >> MEDELYAN: Thank you.
>> And so we end. Now, we're going to have four short talks. Please hold off on questions until the end, and then I'll ask all of the [INDISTINCT] up here, if you know, anything [INDISTINCT] and--do you want this left or do you want to...
>> CAPRA: No, that's fine. Okay. So, I'm going to talk to you about some really early-stage work. I'm Rob Capra, by the way. I've been doing this work with some of my students, this project in particular with Jason Wright, one of our master's students, on faceted search on mobile devices. So, we got motivated by this, Marti, because of your quote here that mobile computing is growing in popularity but there's still sort of this open question about how well faceted navigation works on small-screen devices. And it turns out that this seems to be almost as true in 2011 as maybe it was in 2008. There are a lot of different things out in the wild in terms of how people are handling this. So, you can compare what people are doing in their mobile devices to
what they're doing on their desktop interfaces. Target, for example, here: this is sort of single-facet or single-hierarchy navigation on their mobile device, but on their desktop site they have full facets. JCPenney, for example, try to mirror both; they seem to support full facets on the mobile device and on the website. And obviously this depends a lot on what task you're trying to support and what user goals there are. But there doesn't seem to be good research behind this, and even the UX community has identified that there don't seem to be good design patterns for this. And I love this quote: "Designers of mobile applications don't have established user interface paradigms and they don't have abundant screen real estate either." So, we
got interested in this. And like I said, this is some very early-stage work, so I underscore the word explore because that's kind of what we're doing right now. And I hope that later today I'll get more feedback from you guys and you can help shape the directions we go with this. What we've done so far is to develop a system to help us prototype different types of faceted interaction on mobile platforms, and I'm going to show you part of that today. I'll talk a little about the architecture, and then I'm going to show three particular interaction styles that we have mocked up. So, the basic elements that we were trying to support are very simple: a search box, which is the S box there, where someone's going to type a query in if they want to. My student liked to call this the Cherry Picking Display; I think of it as the Breadcrumb Display. That's what the C box is. And then the F box
is where you're displaying the list of the facets themselves in some way. And then the R box is where the query results actually show up. The architecture that he built uses a Solr back-end and the Solr PHP client interface, so we've got a bunch of PHP libraries that help support each of those components: the C box, the R box and the F box.
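The prototype uses Solr's PHP client; as a rough equivalent, here is a sketch of the same kind of faceted request in Python against Solr's standard HTTP parameters. The host, core name and field names are hypothetical.

    import requests

    SOLR = "http://localhost:8983/solr/openvideo/select"   # hypothetical core name

    def faceted_search(query, active_filters, facet_fields=("genre", "duration", "color")):
        """Run a query, apply the active filters (the C box), and request facet counts (the F box)."""
        params = {
            "q": query or "*:*",
            "wt": "json",
            "facet": "true",
            "facet.field": list(facet_fields),                  # candidate refinements
            "fq": [f'{f}:"{v}"' for f, v in active_filters],    # filters already applied
            "rows": 10,                                         # the visible result page (R box)
        }
        return requests.get(SOLR, params=params).json()

    resp = faceted_search("lecture", active_filters=[("genre", "Ephemeral")])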
We implemented a number of interface styles with this, and we used the Open Video collection that Gary talked about earlier. So, let me run through these really quickly. One is called the accordion style. In this you've got a button to narrow the results; you press that button and the accordion expands out. If you want to look at the genre, that accordion expands out, and then if you choose ephemeral it adds that to your list of active filters. But that takes up a lot of space on the screen. The next one is kind of an interesting one; it's called the teaser. It's a teaser because you get a little teaser over on the right side: the facets are over there but you can't quite see them. And if you swipe over, then you look at this, and you know that the results are still back over there on the left side and you can swipe back. And then you can do the same things here; you can kind of go through and it gets added to your filters. The last one is called an overlay. This is sort of a modal overlay where, when you hit the refine results button, it pops up an overlay. You can expand that out and then it gets added to your filters. So, we've got a bunch of ideas
I think there's some interesting questions from sort of the HCI and ease of use side
about how these work. But I think there's really some interesting cognitive questions
about what you're able to see on the screen and how easily you're able to go back and
forth between the results and the facets. So, you can try the system out, that's the shortened URL, and I'd love to hear feedback from you about it.
>> BAGLEY: Hi, my name is Keith Bagley. I'm from IBM and also from UMass. So, I'm going to describe to you in very brief terms some of my early PhD research on conceptual mile markers. A question that has been asked, and that people sometimes ask, is, "How did I get where I am, and how do I get back?" And this question comes up not just from an information retrieval perspective but in everyday life. So, when my wife calls me up on the phone and says, "I'm lost in Boston, give me some landmarks on how to get back to 93," that question has to be answered in a way that has some useful value for her. So, landmarks, mile markers, are things that in the real world we use for navigation purposes, for waypoints, to help us get to where we want to be in terms of destination. Lessons from the travel industry that I'm trying
to apply here are things like this. We have general places that we want to go, and maybe we have some tour guide or some tour book that gives us a perspective on what we want to see. But then there are some interesting places that we would like to go: either there are some landmarks, maybe someone from your family has already been to France, I haven't, and there are certain things that are particularly interesting. But there are also some diversions; someone put it as a serendipitous event, where you actually take a scenic tour and you find something that maybe you weren't expecting but that had some intrinsic value to you. So, there's also that as well. So, we can think about maps and waypoints and points of interest, but we can also think about user-preferred routes as well, in terms of, well, this is a really nice side route that you might want to take to get some value out of it. So, we can put this whole notion of retrieval into that perspective, and we can talk about best practices from even our own vacations. So, you know,
if someone has had a successful vacation, then there's usually some planning behind it. But from an information retrieval perspective, what we find from many of the studies that a lot of the people in this room have done is that searches often end in frustration. They are sometimes unsuccessful, and there's often some overlap in the goals between users. Okay. So, what if we were to actually apply some of the best practices from the travel industry, or from our own best vacations, to this? We can think about the places we go with our queries, but we can also talk about things that I'd call road maps. Some people call them query trails or threads of the search. Within those there are potentially some points of interest. Now, the goal here is to give the user the ability to get not only the end point or the destination but also, possibly, that serendipitous event, because they actually can expand out this road map and see what those points of interest are. And before they commit to actually going down the path, they have some landmarks, some visual cues that pop up and say, "Hey, this is what you will see if you take this trip," thereby giving them the flexibility of either committing to that or backing up and doing something
else. In addition, we have this notion of basic road maps and user-applied scenic routes or preferences, which can take into account what we already know as well as what you bring to the table, so almost like bring-your-own-route. And if you think about what typical GPS devices do today, you can actually say, "I want the fastest route. I want the scenic route. I want something in between." Something like that is what we're also doing with this prototype. So it's a framework; it's a prototype bringing the wisdom of the masses, as well as letting you bring your own road map of success, which actually will allow for collaboration. The last piece here, real quick, I know we're running out of time, is looking at how we can expand this search paradigm to mobile devices. What happens with this prototype is that we encode the road maps into QR codes, so that they can actually be pushed over to mobile devices.
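A sketch of that hand-off step using the common Python qrcode package; the road-map format here is invented for illustration, not the prototype's actual encoding.

    import json
    import qrcode   # pip install qrcode[pil]

    road_map = {
        "queries": ["garden shed plans", "shed foundation"],
        "points_of_interest": ["doc:1182", "doc:2044"],   # hypothetical document ids
        "preferences": {"route": "scenic"},
    }

    # Serialize the search trail and encode it so a phone can scan it and resume the session.
    img = qrcode.make(json.dumps(road_map))
    img.save("roadmap.png")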
Thereby allowing this split notion of starting a search possibly at the desktop and then going to the mall and continuing based off of what you see there and where you want to go. Okay. And future work is here. Thank you.
>> Right here.
>> Okay. When I met Xingjin this morning, she told me that when she was preparing her slides on a flight, someone next to her told her that she's using the same template. Now, I know who he is. Is that you?
>> Yeah.
>> Yeah. That is a very good use of related results, then. This template has been preferred by a lot of HCIR researchers. The
study is about the effect of cognitive styles on user performance in an information system. Before I start my presentation, I want to mention that this study, unlike the two posters I introduced this morning, is derived from a University at Albany Faculty Research Grant which I received right after I started the job. So this data analysis is based on some existing data; the data were collected a couple of years ago. I want to give my thanks to HCIR and also to my university, which provided me a lot of research support during the past years. This data analysis is based on a larger experiment in which 16 subjects did several tasks using the information system which I'm going to introduce here, the Web of Science system, and another 16 subjects were using an information visualization system called science-based. When each subject came to the experiment, they first did a cognitive test. Then we gave them a tutorial of the system, which could be either the Web of Science or the science-based system. After the tutorial they did all the tasks, and before and after each task we collected some information using questionnaires; also, at the very end, we asked them about their experience with the whole experiment and all the tasks. So this is the Web of Science system. I'm not sure how many of you have
used it. By using this system, researchers can access a lot of leading citation databases, and you can see from the screen that the scientific literature is represented by a ranked list of results. And here is the cognitive test we chose. Cognitive style is a construct which can be used to describe the habitual mode of a person's perceiving, thinking, remembering and problem-solving processes. We chose Peterson's E-CSA-WA task, in which style preferences are measured by comparing the mean reaction times on the holistic questions with the mean reaction times on the analytic questions, so that we can compute the WA ratio, which is the holistic-to-analytic reaction time ratio. You can see the results in the next slides.
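A small Python sketch of the WA ratio and the median split described here, with made-up reaction times; the group labels are illustrative only.

    from statistics import mean, median

    # Hypothetical reaction times (seconds) per participant on the two question types.
    participants = {
        "p01": {"holistic": [1.2, 1.4, 1.1], "analytic": [1.8, 1.6, 1.7]},
        "p02": {"holistic": [2.0, 2.2, 1.9], "analytic": [1.1, 1.3, 1.2]},
        "p03": {"holistic": [1.5, 1.6, 1.4], "analytic": [1.5, 1.4, 1.6]},
    }

    # WA ratio = mean holistic reaction time / mean analytic reaction time.
    wa = {p: mean(t["holistic"]) / mean(t["analytic"]) for p, t in participants.items()}

    # Median split into two style groups, as in the analysis (group naming is illustrative).
    cut = median(wa.values())
    groups = {p: ("more holistic" if r <= cut else "more analytic") for p, r in wa.items()}
    print(wa, groups, sep="\n")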
These are the tasks we gave to each subject, and they are categorized into two types: analytical search and aspectual search. If you are interested in why we chose these tasks, we can discuss that afterwards. And also, these are some results. We measured task performance based on time of task completion, satisfaction, result correctness, number of mouse clicks and aspectual recall, which was specifically tailored to the aspectual
task. And as you can see from the results, we categorized the subjects into groups based on the median of their WA ratios. Across these two groups we didn't find any significant results in terms of performance. In the analysis of the other system, the information visualization system, we do find some interesting results on these measures, in terms of how cognitive styles have a significant impact on user performance. So if you are interested in the results, one is already in press in the Journal of Information Research; the other one is a poster we did to compare search effectiveness between the systems. And I want to thank all my collaborators on this project, and this is our contact information. Thank you.
>> And with this, we have one more talk, and we'll take questions after that. Did they configure
the slides for you? >> ZARRO: Hi everyone. I'm Mike Zarro. I'm
a PhD student at Drexel University. I'm going to talk about some work I'm doing with my adviser Xia Lin, who's here, on how we can help the roughly 140,000,000 health information searchers in this country find high-quality, useful health information, as the intermediaries, the doctors, nurses, hospitals and other sources, are somewhat being replaced or at least augmented, a process Eysenbach called apomediation, by information systems with search and browse techniques. Coming from a digital library background, the challenge as I see it is how we can use some sort of metadata or labels or terms to provide what have been called navigational signposts to these users. So if we see that, you
know, heart attack the term I might use as a lay user well really myocardial infarction
is what a doctor might say. Now, this is a very simple example I think mot of us in this
room at least would understand this but perhaps there are even users out there using the web
today who wouldn't understand this. So how can we help them and how can we help ourselves,
you know, more educated users with more complex task. So to guide our work with this, we look
at how we use MeSH which is a controlled vocabulary published by the national library of medicine
and social tags from delicious.com which has undergone some changes over the past few months.
So that's a little bit of a challenge for us. These really serve as filters while users are searching in an interface we designed--we'll see that in a second. And as we designed this interface, we saw that maybe we could provide the expert PubMed resources from the National Library of Medicine, which are medical journal articles--really high-level stuff that most lay users probably couldn't understand--alongside Yahoo results, which are the more general search engine results. So, in the experiment, we recruited 20 students from Drexel University: 10 lay users, students with no medical training at all, and 10 health or medical students with at least two terms of training. In one of the scenarios, we asked them to apply two filters before answering the question. In the other two scenarios, we just asked them to answer a question with no
additional instructions. So here's the prototype--fairly simple. A query is sent to two APIs, PubMed and the Yahoo [INDISTINCT], which return results [INDISTINCT], along with the MeSH terms for the PubMed results, to an interface intentionally designed to be simple so the user can understand it; our users commented that the simplicity was actually a benefit. On the left you'll see the combined [INDISTINCT] data we retrieved: we have MeSH and we have the delicious tags here. We didn't tell them which was which--we wanted to see which they selected on their own--and they were ordered by frequency.
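For the PubMed side of a prototype like this, the query can go to NCBI's public E-utilities service; a minimal sketch of that call is below. The search term is just an example, only the PubMed half is shown, and the transcript does not specify how either API was actually invoked.

```python
import requests

# NCBI E-utilities: esearch returns PubMed IDs matching a query.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def pubmed_search(query, retmax=20):
    """Return a list of PubMed IDs for the query (the example term below is hypothetical)."""
    params = {"db": "pubmed", "term": query, "retmax": retmax, "retmode": "json"}
    resp = requests.get(ESEARCH, params=params, timeout=10)
    resp.raise_for_status()
    return resp.json()["esearchresult"]["idlist"]

if __name__ == "__main__":
    print(pubmed_search("gastric bypass risks"))
```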
On the right we have the Yahoo results combined with the PubMed results. You can see it's really obvious which are which; some users liked that, some didn't, and they would even explore into the PubMed results, as we'll see in a second. Of course, in the lab with us there, the system is very helpful and satisfying--everybody loves it--which made me feel great. But how useful that is,
I'm not sure. Some of the observations were, I think, really useful: we saw that users could look at the filters, as we call them, and find terms--I actually saw one user highlighting terms with her mouse and then looking in the search results to try to find those terms somewhere in there. So they kind of had a feeling that this is a health search interface where, you know, I can really find new and interesting--let's call them signposts--here. As our users looked at the Yahoo and PubMed results, we see that they viewed a lot of Yahoo results and a lot of PubMed results, but really the answers came from the general search engine. Why is that? Well, even for the medically trained group--they're not necessarily experts, but they're more trained--when you go to PubMed, it's not the most user-friendly system, at least for the tasks we were using. So, here are the terms they clicked; you can
see that lay users and the medically trained were fairly similar. This is for finding the risks and benefits of gastric bypass surgery: "treatment outcome," "risks," "risk factors." Fairly useful.
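The filter list shown on the left of the prototype--MeSH headings from the PubMed records mixed with delicious tags, ordered by frequency--can be produced with a simple count. Here is a minimal sketch with made-up terms; the transcript does not describe the actual implementation.

```python
from collections import Counter

# Hypothetical metadata harvested for one query: MeSH headings from PubMed
# records and social tags from delicious bookmarks.
mesh_terms = ["Treatment Outcome", "Risk Factors", "Gastric Bypass", "Risk Factors"]
delicious_tags = ["health", "surgery", "research", "health", "risk factors"]

# Merge both sources (case-folded) and order by frequency, most common first,
# without telling the user which source each term came from.
counts = Counter(term.lower() for term in mesh_terms + delicious_tags)
filters = [term for term, _ in counts.most_common()]
print(filters)
```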
But how many of these are delicious tags and how many are MeSH? It turned out there were no delicious tags left, at least for this one task. Why is that? You would think the lay users would find terms from the wisdom of the crowds and off we'd go with these great signposts. In reality, tags like "health" and "research" are the most common, and they're not really useful for this. So we think that using the MeSH controlled vocabulary, assigning its terms to general health resources, could be useful. And at the very least, this student learned a lot, if nothing else.
>> Without further ado, I'm going to introduce the team from L3S, and the floor is all yours.
>> Hi, thanks. So, I'll start with the first
task, which is about MapReduce. I don't know if you can read it, so I'll just point out a little bit of what it's about. There was a blog post where David DeWitt said that MapReduce is not novel and was employed more than 20 years ago, and the idea is to find publications that support this--or, well, to suggest that it is not true by not finding them. What I'll present is our system, which is called phi-search in spite of the logo there--we are changing that. It's based on the faceted DBLP system that we had. The idea is that, starting with MapReduce, since we know it's about parallel data processing, we can input that query. I can point out at this stage that these challenge tasks actually helped us with future ideas as well, because, for example, there is this step of going from "MapReduce" to "parallel data processing," which we happened to know; really it should just be a matter of putting in "MapReduce" and having the system use something like--I don't know--an ontology or some synonym list. That's how we came up with this idea. All right. So, we input
this query. I'm also going to walk around the system a little bit. On the left side here we have the facets, as you know them from other systems; this system is based on Solr, so that was easy to do. What we have in this part here is something we are now working on: suggestions for which field your query terms belong in, basically--what type of thing you might want to search. It's also based on the facets, so for other queries it might be a bit more meaningful.
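Since the system is described as being built on Solr, facet counts like the ones shown on the left can be requested through Solr's standard faceting parameters. A rough sketch of such a request follows; the host, core name, and field names are assumptions, not taken from the talk.

```python
import requests

# Standard Solr select handler with faceting enabled.
# Host, core, and field names here are hypothetical.
SOLR_SELECT = "http://localhost:8983/solr/publications/select"

params = {
    "q": "parallel data processing",
    "wt": "json",
    "rows": 10,
    "facet": "true",
    "facet.field": ["author", "year", "venue"],  # facet on several fields at once
    "facet.mincount": 1,
}

resp = requests.get(SOLR_SELECT, params=params, timeout=10)
resp.raise_for_status()
data = resp.json()
print(data["response"]["numFound"])
print(data["facet_counts"]["facet_fields"])
```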
I just wanted to show that quickly. Then, going to the results, we have a ranked list as you are familiar with. What we've done is also make the fields a little simpler, by putting authors, editors, and contributors together in a "by" field, which is a bit more like natural language. We also have an "in" field, which can be the publication source--it can even be the year, the conference, and so on. When you actually open a result, it will go into the separate fields and tell you exactly what you are looking at, but for the overview it was easier to keep it simple. Right. And
the interesting thing for this task is that we know the blog post was from DeWitt, and we have his results pretty much on top. If we click on the author's name, another search is triggered with "by" and the author. We have also been told that the publication should be more than 20 years old, so we can filter on that as well. Another idea we will be working on is not to say "year less than something" but to say something like "20 years old," or "before 1990"--again to go more toward this natural-language kind of query. Right. So, coming back to the task, we can actually find this publication here; if we read the blog post we also see the relevant results in there, so we can check and see that the result in first place is actually the one we are looking for. All right. Then, going to another
search task, we took the "people who like this person also like" task, which is about collaborative filtering, and the question is: has anyone applied collaborative filtering to people search? What we're going to do here is create a query. We support the Lucene query syntax that normally comes with Solr, so we can have a phrase query--we put "collaborative filtering" in as a phrase--and we can have Boolean queries: it's "people search," but "people" can also be "person," so we can search for (person OR people) AND search. That works.
What I also want to show here is the topical clustering, which works on the results that are being shown. I'm going to go to 100 results; that means the clustering is performed online, using the Carrot2 clustering engine with the Lingo algorithm, and you get topics which are based on the titles of the results.
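The talk uses Carrot2's Lingo algorithm on the result titles. As a stand-in illustration of the general idea--grouping the top results by their titles and labeling each group--here is a small scikit-learn sketch; this is ordinary TF-IDF plus k-means, not Lingo, and the titles are made up.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

# Hypothetical titles from the first 100 results of a query.
titles = [
    "Collaborative filtering for social networks",
    "Link prediction in social networks",
    "Expert search in enterprise people search",
    "A people search engine based on profiles",
    "Item-based collaborative filtering recommendation",
    "Recommender systems with implicit feedback",
]

# Represent titles as TF-IDF vectors and cluster them.
vectorizer = TfidfVectorizer(stop_words="english")
X = vectorizer.fit_transform(titles)
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)

# Label each cluster with its highest-weight terms (a crude stand-in for Lingo's labels).
terms = vectorizer.get_feature_names_out()
for c in range(km.n_clusters):
    top = km.cluster_centers_[c].argsort()[::-1][:2]
    members = [t for t, label in zip(titles, km.labels_) if label == c]
    print(", ".join(terms[i] for i in top), "->", members)
```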
And if you look at them, one of them is "social networks." We can go to this because it goes in the direction of people search and narrows down the results. What happens when I click on it is that it's actually added to the query: if I look at the query, it's still the same one, but "social network" has been added at the end, and the terms here are combined with AND, so all of them have to appear. Right.
All right--some other functions: we don't have an advanced search box, because we want to keep it as simple as possible, but for now we have this query syntax button, which opens a short explanation of how you can search, some fields you can use, some example queries, and some operators that can be used. The idea would then be, in the future, to support this kind of--let's say--natural-language query: having some words we can detect that indicate the user intends to search with certain operators, and transforming the query accordingly.
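As a toy illustration of that kind of query transformation--detecting a phrase like "before 1990" or "more than 20 years old" and turning it into a range filter--a sketch might look like this. The field name and Solr-style range syntax are assumptions about how such a rule could be wired up, not a description of the actual system.

```python
import re
from datetime import date

def transform_year_constraints(query, current_year=None):
    """Rewrite simple temporal phrases into a Solr-style range filter on a 'year' field."""
    current_year = current_year or date.today().year
    fq = None

    m = re.search(r"before (\d{4})", query)
    if m:
        fq = f"year:[* TO {int(m.group(1)) - 1}]"
        query = query.replace(m.group(0), "").strip()

    m = re.search(r"more than (\d+) years old", query)
    if m and fq is None:
        fq = f"year:[* TO {current_year - int(m.group(1))}]"
        query = query.replace(m.group(0), "").strip()

    return query, fq

print(transform_year_constraints("parallel join algorithms before 1990"))
print(transform_year_constraints("hash join more than 20 years old", current_year=2011))
```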
We want to keep it as simple as possible--just this one search box--and not make it too complicated for the user. Right. And again, looking at the results, we find this ReferralWeb publication, which has to do with collaborative filtering, and we have "The Link Prediction Problem for Social Networks." So there we go--we find these kinds of publications about collaborative filtering for people search. Right. So, that was it for my part. If you have any more suggestions, ideas, anything, I would be glad to hear about them, and then we can do something about it.
>> Yes. I encourage...
>> I don't know if it's time for questions now.
>> ...I encourage people to ask questions.
>> Okay. Great. >> Yeah, ask questions.
>> Oh, please. All right. >> Okay. I wonder if [INDISTINCT]
>> Yeah. >> [INDISTINCT]
>> It doesn't know the concepts beforehand; it just creates them from the titles of the results. So in this case we have 100 results returned, and in the titles of those results "social networks" appears pretty frequently. The documents are being clustered, and "social networks" appears in the titles of the documents in that cluster, so it's used as the label of that cluster.
>> [INDISTINCT] >> Yeah. Yeah.
>> [INDISTINCT] >> Yeah.
>> [INDISTINCT] from this topic, I wanted both [INDISTINCT] more browsing and we got both, so can we then add it to the query?
>> Yeah, you can do that. What happens now is, if I click on "mining," then--I don't know if "browsing" will appear again, because the clusters are generated again from the results. So it's like...
>> Oh, it's like [INDISTINCT] access?
>> Yeah. But--I mean, the idea would be...
>> [INDISTINCT] that by hand [INDISTINCT]
>> Yeah. Yes. So you can do, like, "mining" or "browsing" in [INDISTINCT] yes?
>> [INDISTINCT]
>> You see, that's always the problem. I mean, you want...
>> [INDISTINCT]
>> That's the thing: you can never actually be sure that you didn't find what you are looking for. If the system is good enough, it will help you find what you are looking for in, let's say, 99% of cases. But there is still this uncertainty that you...
>> [INDISTINCT]
>> Well, in order to improve recall, one of the things that would improve the system is to try to use [INDISTINCT]. The problem there is that they have to be domain specific to help much, and if they are not, they might be too general. But let's see about that. We are also working on query translation, so we can input the query in English or in some other language and it is translated into the other languages that you specify--because we have documents in different languages, though not in the data we have here. So that is also a problem. Yeah?
>> A question for everyone, actually. Part of your success is due to the fact that you're able [INDISTINCT] that you actually start with a theory in the first stage. [INDISTINCT] How are we supposed to take that into consideration in [INDISTINCT]
>> I'll answer that. To me, a system that is flexible with respect to that is obviously better than one where we are taking [INDISTINCT] case. We tried to avoid queries where everyone would already know the answer, which wouldn't be interesting. The instruction to participants was that they were welcome to use their own domain knowledge and to recall a paper that [INDISTINCT] very well. Part of the point of a system is that [INDISTINCT]. So if your feeling is that the system only works for somebody with a very high level of domain knowledge, that's a negative. If your feeling is that the system is no better for somebody who has more domain knowledge, that would also be a negative. So it should be the degree to which they adapt, and it is up to the presenters to convince you of that flexibility in a limited time.
>> So that's why I was saying that this task
actually has [INDISTINCT] also for the future, to know how to improve the system. So, thank you.
>> Okay. Thank you.
>> I believe... >> Should I leave the laptop?
>> Anyway [INDISTINCT] everyone is going to have to remodel it.
>> Okay. That's... >> [INDISTINCT] is doing the--everybody is
in the local system. Thank you. >> [INDISTINCT] there.
>> Yeah. Well, I want to keep it plugged in.
>> [INDISTINCT]
>> Sir? >> Go ahead and plug...
>> Plug it.
>> TUNKELANG: [INDISTINCT] borrow it, since he didn't actually have a minute or two in his previous presentation to introduce the system. Now, okay.
>> [INDISTINCT]
>> Is there [INDISTINCT] here or [INDISTINCT] >> Behind the [INDISTINCT] on the wall.
>> Yeah. >> Okay. Go ahead and unplug it, Cathy.
>> TUNKELANG: While you wait: if you have laptops or netbooks or iPods, I do encourage you to look at the description of the challenge if you feel that will inform your [INDISTINCT]. One thing I want to say, as a corrective, to the previous and all the challenge participants: obviously, I could have given them a task that would have been completely impossible, just to see how many days or weeks they spent. Of course that's--it's very difficult to...
>> True, that's all we had.
>> TUNKELANG: That's true, actually. But it's very difficult to demonstrate the failure to find something. So you do have to use a little bit of imagination, I think, in watching this--seeing what the experience would be like and what support you get. And the questions are sufficiently subject to interpretation. But that's actually something to take into account as well.
>> With the [INDISTINCT] All yours.
>> All right. Thanks. So I'm going to do the MapReduce task first. I just started it and typed in the query: Bubba and Grace [INDISTINCT] and "map reduce" in quotes; those terms--Bubba, Grace, and Gamma--come directly from the blog post that motivates the task. So I did that. I apologize for the layout here; the screen really wants more pixels than this projector is giving it, but I'll just resize this thing a little to give a bit more space. Okay. So this is running under this interface; there's nothing magical about the search, it's just plain [INDISTINCT]. This is the system I talked about earlier. You can see we have an article here on this algorithm, and the URL of the PDF doesn't exist--this is a common problem with this collection because it's kind of stale. But I can display the text that was indexed, because we did get the text and it's cached. So we can look at this article, and it does, in fact, talk about the GRACE hash join, which is a parallel database technique. So I'm going to mark this thing as useful. And I'll slide this back
so that you can see more of the results. This next one doesn't appear to be relevant. We can look at it--again, it may or may not be--oh, it downloads as a compressed PostScript file, not very useful. So, again, we can just look at the text, and there it is. I could also open it up in a separate window, but I won't bother you with that. I've decided this is not, in fact, what I want to see, so I can just hide it; that means the document won't be shown by default in subsequent queries, but it's still available. Here's a paper that compares four different parallel join algorithms--kind of related to what we're talking about--so I'm going to say, yeah, that's good too. And here we have another document, an abstract logic flow execution model for parallel databases. We can look at that--not found, hello, et cetera--but it's also useful. Now what I'm going to do is a query expansion step.
I've taken these three documents that I decided are useful, and I'm going to find other documents that are similar to those three. The system runs, and you can see that it's building up a query history: using the terms I had typed plus the documents I had selected, it now finds some results. As it runs, it shows me that it's found documents I've already seen before--I can tell because, for one thing, I've already marked them as relevant. And then I can see that document number three is a new one. I haven't seen it--I mean, it was retrieved before, this is where the histogram comes in, but it was retrieved at a low rank. In fact, it was retrieved at rank 13; that was below the fold, so I never saw it. But now it's promoted to a higher rank, and
so, again, it's an interesting thing, and I can save it. These documents are, in fact, roughly about what was being talked about, although, again, maybe it's too easy just taking words directly from the blog post. But what I'm going to do, just as a sanity check, is [INDISTINCT] another query--yeah, and that's going to produce zero results, that's fine, or it's going to produce results for "parallel."
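The query-expansion step demonstrated here--using the documents already marked useful to pull in related documents--is not spelled out in the talk. A minimal Rocchio-style sketch of the general idea follows, with toy documents and simple term counting rather than whatever the actual system does.

```python
from collections import Counter
import re

STOPWORDS = {"the", "a", "an", "is", "of", "for", "and", "with", "to"}

def expand_query(query, useful_docs, k=5):
    """Append the k most frequent terms from saved documents that aren't stopwords or already in the query."""
    tokenize = lambda text: re.findall(r"[a-z]+", text.lower())
    seen = set(tokenize(query)) | STOPWORDS
    counts = Counter(t for doc in useful_docs for t in tokenize(doc) if t not in seen)
    # A real system would weight terms (e.g. TF-IDF); this only shows the shape of the step.
    return query + " " + " ".join(term for term, _ in counts.most_common(k))

# Hypothetical text of the three documents marked useful in the demo.
saved = [
    "The GRACE hash join is a parallel database join technique",
    "A comparison of four parallel join algorithms for shared-nothing databases",
    "An execution model for parallel database query processing",
]
print(expand_query('grace gamma "map reduce"', saved))
```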
And that's, you know, fine. I don't have spelling correction here, sorry. I'm going to rerun that query. And now you can see that it is producing completely new documents,
at least in the top ranks. In fact, looking at this, I can tell that I haven't seen any of these documents before. They might have been retrieved before--I'd need the data from a separate pass to check--but at least I have not looked at any of them. What I can now do is take the results of all of these queries, combine them, and do that rank fusion operation I talked about earlier. And now you can see that the documents that scored high across the queries get high ranks here.
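The talk does not say which fusion method the system uses; reciprocal rank fusion is one common way to combine ranked lists from several queries, and a small sketch of it (with hypothetical document IDs) is below.

```python
from collections import defaultdict

def reciprocal_rank_fusion(ranked_lists, k=60):
    """Combine several ranked lists of doc IDs; k=60 is the constant from the original RRF paper."""
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical result lists from the typed query, the expanded query, and the rerun query.
runs = [
    ["d12", "d07", "d33", "d05"],
    ["d07", "d05", "d41"],
    ["d05", "d99", "d12"],
]
print(reciprocal_rank_fusion(runs))
```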
And as I go down the list here, number five, for example, is a paper by DeWitt and Gray--we know from the previous presentation that DeWitt was the author of that blog post. Here again it's not found, but we can look at the PDF--I'm sorry, the text--and what was there was a fancy 404 page. Right? So I can say, well, actually it was a PostScript file that was archived. Anyway, I've saved these things and I can now look at my summary. This is what I've done in this session: starting with the query here, it went up and it saved some documents. And this gives me a sense of what's happened in the session. I'm now going to switch to the second task, which is the faceted search
query. And again, I started it with a query that was the title of the paper by Bilken et al., and the paper itself actually appears in the search results--here, number four. I don't know why it's number four and not number one; I'm not going to worry about it, but I'm going to mark it as useful. I can also just select it directly and say, "Okay, find the other things that are like this," because I want to get a sense of this space. It's going to run a query, and now--you see tweets mentioning what I'm doing up here.
It's not yet integrated into the tool. So I selected this document and it's found some other things. I'm going to look through this, but I don't see anything in here that catches my eye or that is, in fact, relevant to what I'm doing, so I'm going to try a different query. I'm going to clear this. That's the query history I have to date here; it shows that I saved one relevant document, but I'm not going to worry about that for this purpose. I'm going to type a query--"faceted search"--because I think that's kind of what I'm after. So now I've run the second query, and I get [INDISTINCT] a bunch of
documents. You can see that they're all new. I can scroll through this, and there's a Flamenco paper which we can look at--is it going to be live? Yes, there's the PDF. I can actually scroll through this and read it if I want. Yeah, this is good; it gives me some idea of things. I'm actually going to look at this one here, which is a paper by some people in this room. I recognize one of them, I think. No, it's not you. I know [INDISTINCT]. For this paper, I'm going to size this out larger so you guys can see the text better. Hello. The screen doesn't let me get any larger than this. Okay. I'm going to open this in a new window so that we can see it. So this paper talks about personalized interactive faceted search, and I can look at it and see what it suggests I do next--get some idea of the terms. What I'm going to do is go back and change my query to "faceted metadata," because that's what it was talking about, "information"--I can't type--"retrieval"--another typo--"interface." Just some terms that were mentioned in here that I think are reasonable to add to the query. I can now
look through these results. I'm going to size this back down so that you can see more of
the search results. Let's see--oh, look at this. And this is where Marcia Bates will be proud of me. This document has a reference six--I'm sorry, I need to open it again in a separate tab so you can read it. The first sentence here says that a faceted taxonomy is a set of taxonomies, each one describing the domain of interest from a different, preferably orthogonal, point of view [reference six]. Any guesses as to what reference six is? Those of you in library science? That's right. All right. And that is the next query I'm going to run. And when
I do that--oops--what I'm going to do actually is go to the references and copy it out so I don't make any typing errors. And here reference six is [INDISTINCT] the [INDISTINCT] classification. Copy that, paste it in here; I'm going to get rid of the [INDISTINCT] just because, you know, who knows, and I'm going to get rid of the quotes--I'm being generous about it. And this first result, as it turns out, mentions these contributions to facets and classification on the second page over here, and I believe that that is sort of the inspiration, if you will, for those in library science. So, that gives you a quick sense of what the system can do. How much time do I have? Okay. So, that gives
you a quick sense of what the system can do. >> [INDISTINCT] you to be able to take questions.
>> Yes, sir.
>> I have a question about [INDISTINCT] system [INDISTINCT]
[INDISTINCT] >> Did they in fact cited?
>> Yeah, on time. I don't know but the [INDISTINCT] about citation [INDISTINCT]
>> Yeah. Right. So absolutely, one of the things that limited my ability to search here was implementation time. The citation metadata was available--and should be available--and I think you might see other systems that actually take advantage of it. I chose not to use it. But yes, particularly for corpora where that metadata is available, you should just be able to [INDISTINCT] on that search, the metadata searches, and presumably find things more quickly. This just shows that you can use different techniques to get, in effect, to the same result. And now I recognize that name, having done the search. I should probably have read the paper more thoroughly, but I figured that would have taken longer than to poke around and try to [INDISTINCT] to understand it. Anyway, thank you. That's right. Well, not for this task. No, no, no. It's all about search. Yeah.
>> [INDISTINCT]
>> Okay. The answer is nobody can, because, as I said, when I built the system I concentrated on the user interface and on this exploratory stuff. The backend is just vanilla Lucene. Now, Lucene supports this--you can do spelling correction out of Lucene, and it's just a small matter of programming to get that working. I chose not to spend my time doing that for this first pass of the system. It's certainly something that should be done.
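As an illustration of the kind of lightweight spelling correction being described--suggesting a close match from the index's own vocabulary--here is a toy sketch using Python's standard library rather than Lucene's spell-checker module; the term list is made up.

```python
import difflib

# In a real system this vocabulary would come from the index's term dictionary.
indexed_terms = ["parallel", "database", "join", "algorithm", "mapreduce", "hash"]

def correct(term, vocabulary, cutoff=0.8):
    """Return the closest indexed term if the query term isn't in the vocabulary."""
    if term in vocabulary:
        return term
    matches = difflib.get_close_matches(term, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else term

print(correct("paralell", indexed_terms))   # -> "parallel"
print(correct("database", indexed_terms))   # -> unchanged
```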
Spelling correction, the ability to handle synonyms--all of that stuff can be put in the index to improve the system. It was just a matter of the limited resources I had. Yeah.
>> [INDISTINCT]
>> Yes, I... >> [INDISTINCT]
>> That's right, so you just... >> [INDISTINCT]
>> Yeah. So I could compensate for that by running multiple variants and doing fusion of the results, as Mark suggests. I mean, that's a perfectly reasonable strategy. It's probably better not to force the person to do that and to have the system do it automatically, but I could certainly compensate for it this way. Yes? Are we on time? Any more questions?
>> I have a question. [INDISTINCT]
>> Yes. Yes. It's completely stable. You can log off and come back 10 years later, and if the server's up, it will be there. Okay. Thank you.
>> [INDISTINCT] to ask.
>> Hi, everyone. It's a pleasure to be here. I am very impressed with the quality of the talent here--where were you guys 18 months ago? That's what I want to know. In fact, I would...
>> [INDISTINCT]
>> Okay. We couldn't even spell HCIR a year ago. It's magic. Okay. I'm here [INDISTINCT] with Visual Purple. All the difficult questions you would ask should be directed to this gentleman, Brian Olson, our CTO--he's the technical brains. The fellow responsible for all the magic sauce you're going to see today is a fellow named David Ostby, and he's not here. He's our chief scientist and we keep him in a locked room in an undisclosed location. David and I have been working together for about 20 years. To frame
what you're going to see today--where it came from and why we're up here--we have a bit of a jaded past. We had been in the entertainment software world for about 15 years, and for the last 12 or 13 years we've been in the simulation world doing training, mostly for the government. Through those efforts we got the opportunity to work with a lot of really interesting folks on some really interesting problems. We've done a lot of training for analysts. So the tool you're going to see today is a tool that really came from user demand--the whole process has been driven by the user. We haven't had the luxury of doing much study; we take--oh, what's the term? There are many terms out there, but we take an incremental development approach. So we did rapid prototyping: if we try something and it works great, we head in that direction; if it doesn't, we stop there, turn around out of that cul-de-sac, and head where we need to go. Yeah, I know that's the buzzword. Yeah. Exactly. I know it's had many different monikers over the years; I call it the pragmatic approach. And since we're not a venture-[INDISTINCT] company and we have clients, we have to be very mindful of that. In fact, what you're
going to see today--we didn't do anything really cool or special with respect to the presentation of the XML data, so I apologize for that upfront. But it is the core system. We basically configured Jester to look at a database instead of looking at the web, because Jester lives out there today in the hands of analysts for what they call OSINT, open source intelligence. There's this little globe you see up here and the [INDISTINCT] the [INDISTINCT] walls, and you can go to the website and play the video and all that, but I don't want to waste a minute on that--it's a minute and 20 seconds, and I don't have time to waste. The point is, we stripped out the thumbnails because we're displaying this database--okay, this discrete database. We did that in conjunction with some other work where we're piloting the system in classified environments, so the timing on this was really apropos. I'll also say upfront that there's no worry about [INDISTINCT] domain expertise involved in this, because I'm the guy who did the searches and I didn't have a clue what the heck you guys were talking about, okay--I couldn't even spell "ontology" today. Even though you see this globe--we call it a globe, you guys would call it a sphere.
In fact, we agonized over what to call this, sphere or globe. We agonized over what this is. Is this search? Is this Google on steroids? I mean, people would ask me, "What are you guys doing?" and I would try to explain it and always sort of devolve to, "Oh yeah, it's Google on steroids." In reality, exploratory search, I guess, is the right term. We used "concept search" to begin with, but the reality is we're working with analysts, and oftentimes these folks are presented with extremely difficult problems to solve in very short periods of time. So with that--well, Visual Purple; you get the purple shirt--being visual, we should somehow be able to let people lens into the data in a way that they can quickly walk through it and do something they call actionable intelligence with it. Okay. So that was our whole mission on this.
The core technology came about for some unusual reasons some time ago, and the reason Jester can even exist today is something we call Looking Glass, which is a LAC-like approach to the data. We have an automated, high-speed capability with Looking Glass, and that's what Jester does with unstructured data--it just chews it up. So because we're LAC-like, we don't have to worry about the computational overhead, and we don't have to worry about having a large corpus to play with; low-data environments--it loves that--and it lives for unstructured data. So what better way than to turn it loose [INDISTINCT] on the web, and then, of course, this time on a database. So--I can't resist; I did two--let's talk about the indexing.
We are not indexing experts by any stretch of the imagination. The indexing became a side project that went down about three levels of priority until I finally beat up on some people because the competition was coming up. The problem with that is we basically had [INDISTINCT] a day and a half or two days total, between the server crashes, et cetera. So we didn't do anything fancy with the data. We didn't clean it up. We just got it up there and figured we'd see if Jester can deal with it in a meaningful way, and you can be the judge of that. So, nothing fancy there. Let's see--is there anything else that I'm missing? Well, let's just get into it here. Let's go look at the first question. I did two questions, and I had one hour to do this; in fact, I did these searches while the database was still being indexed. So, the first one--I was a little intrigued [INDISTINCT] how will I graph and draw it on a sphere. Okay. The biggest challenge for an analyst, especially in an unfamiliar domain--which I was in--is asking the right question, and the question is always changing, right? That's one of the big problems out there.
So we wanted to come up with a way of working through the data. So basically I typed "graph layout algorithms, drawing graphs, surface sphere." We don't support Boolean operators; we have a concept we call loose correlation. And what does that mean? It means we're looking at all the relationships among all the terms across the corpus, okay, in real time--in your real time.
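The talk does not describe how this "loose correlation" is computed. As a toy illustration of relating all terms across a corpus, here is a simple term co-occurrence count over a few made-up documents; a real system would normalize, weight, and scale this far more carefully.

```python
from collections import Counter, defaultdict
from itertools import combinations
import re

# Hypothetical unstructured documents.
docs = [
    "graph layout algorithms on the surface of a sphere",
    "drawing large graphs with force-directed layout",
    "spherical layout for visualizing graph data",
]

# Count how often each pair of terms appears in the same document.
cooccur = defaultdict(Counter)
for doc in docs:
    terms = set(re.findall(r"[a-z]+", doc.lower()))
    for a, b in combinations(sorted(terms), 2):
        cooccur[a][b] += 1
        cooccur[b][a] += 1

# Terms most related to "layout" by simple co-occurrence.
print(cooccur["layout"].most_common(5))
```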
So, let's see here. We're loading down here. And actually, I've got a saved search, but I'm going to actually do...