Configuring URL Parameters in Webmaster Tools


Uploaded by GoogleWebmasterHelp on 14.08.2012

Transcript:
>>Maile Ohye: Hi. I'm Maile Ohye. I'm a member of Google's webmaster support team. And I'd
like to help you better understand how to use the URL parameters feature in Webmaster Tools.
URL parameters is a fairly advanced feature. So some of this information might be more
complex than one would expect. Before you watch this video further, please check out
the URL parameters page to see if you have a message from Google explaining that we already
believe we have high crawl-coverage of your site. And therefore no adjustments to this
feature are necessary. The message would say "Currently Googlebot isn't experiencing problems
with coverage of your site. So you don't need to configure URL parameters. Incorrectly configuring
parameters can result in pages from your site being dropped from our index. So we don't
recommend you use this tool unless necessary."
For those of you who have that message, you're good to go. And no further viewing is even
necessary. But for those of you who don't have that message please keep watching. And
one of the main takeaways is that improper actions on the URL parameters feature can
result in pages no longer appearing in search. Again, it's an advanced feature.
The settings in the URL parameters are used by Google as a helpful hint in our crawling
process. For stronger directives, you want to use things like a robots.txt disallow or a meta noindex. But using the URL parameters hint is still very helpful.
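To make that hint-versus-directive distinction concrete, here is a minimal Python sketch using the standard library's urllib.robotparser. The example.com domain and the robots.txt rule are hypothetical; the point is only that a Disallow blocks crawling of matching URLs outright, which is stronger than any URL parameters hint.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules blocking a faceted-search path on example.com.
rules = [
    "User-agent: *",
    "Disallow: /search/",
]

parser = RobotFileParser()
parser.parse(rules)

# A Disallow is a directive: matching URLs are simply off-limits to the crawler,
# unlike a URL parameters setting, which is only a hint.
print(parser.can_fetch("Googlebot", "https://example.com/search/?sort=price"))  # False
print(parser.can_fetch("Googlebot", "https://example.com/store?sort=price"))    # True
```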
In 2010, the Google store only sold about 160 products. And that seems fine and fairly
easy to crawl. But the thing is, those 160-ish products actually created 380,000
URLs. These URLs were created by things like different types of navigation. So Googlebot,
in terms of crawling your site, doesn't just look at, say, 200 unique URLs. But actually
has to determine which URLs to crawl of the 380,000 that were created. You can see how
Googlebot might want to be more efficient in crawling by looking at these two URLs.
Now, the first one says, "category equals YouTube." Let's say that URL leads to 20 unique items on the page. But on the second URL, it's "category equals YouTube and size equals medium." So it's the exact same items, but now, say, filtered down to five because of the "size equals medium" parameter. So in this way Google would rather just crawl
the first URL and reach all 20 of the items. Rather than crawling both URLs and seeing
a redundant five items.
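To see how a handful of filters can multiply a site's URL count like this, here is a small Python sketch. The store URL, the facet names, and their values are all hypothetical, made up just to show the combinatorics.

```python
from itertools import product
from urllib.parse import urlencode

# Hypothetical facets on one category page; None means the facet is absent.
base = "https://www.example.com/store?category=youtube"
facets = {
    "size":  [None, "small", "medium", "large"],
    "color": [None, "blue", "red"],
    "sort":  [None, "price-asc", "price-desc"],
}

urls = []
for combo in product(*facets.values()):
    params = [(k, v) for k, v in zip(facets, combo) if v is not None]
    urls.append(base + ("&" + urlencode(params) if params else ""))

# One category page becomes 4 * 3 * 3 = 36 crawlable URLs, even though the
# unfiltered page already lists every item in the category.
print(len(urls))  # 36
```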
Essentially, your input in URL parameters helps us to understand your site better. So we can crawl more efficiently. By crawling more efficiently, we don't crawl as many duplicates. That saves you bandwidth and helps us to focus on your unique content, rather than crawling the duplicate information repeatedly. But if you want URLs removed,
you can go to URL Removals in Webmaster Tools. Again, the URL parameters feature is to crawl
more efficiently. It's not about removals or explicit robots.txt disallowing.
Another background piece of information that I'd like to mention is that page-level markup
is still taken into consideration in tandem with URL parameters. So if you have page-level markup like rel="canonical", rel="next"/"prev", or rel="hreflang", that's fine and can still be used by Google, even if you are using URL parameters. Just make sure that we can still crawl your page, meaning that it's not robots.txt disallowed and you haven't
set it to not be crawled in URL parameters. As long as we can crawl your page, we can
still use the page-level markup.
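As a rough illustration of the kind of page-level markup meant here, this Python sketch uses the standard library's html.parser to pull a rel="canonical" link out of a page's HTML. The HTML snippet and URLs are hypothetical; the takeaway is that such markup on a filtered page can point back to its unfiltered parent, and Google can only use it if the page itself is crawlable.

```python
from html.parser import HTMLParser

class CanonicalFinder(HTMLParser):
    """Collect the href of a <link rel="canonical"> tag, if present."""
    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "link" and a.get("rel") == "canonical":
            self.canonical = a.get("href")

# Hypothetical filtered page declaring its unfiltered parent as canonical.
html = ('<html><head><link rel="canonical" '
        'href="https://www.example.com/store?category=youtube"></head></html>')
finder = CanonicalFinder()
finder.feed(html)
print(finder.canonical)  # https://www.example.com/store?category=youtube
```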
Since we've covered the background information, let's now talk about the types of URLs that
are eligible for this feature. Here's one from the Google store. It says, "category
equals office." Other URLs would be things like "category equals wearables" or "category
equals wearables and size equals medium." These URLs are eligible for the feature because
they come in key value pairs or name value pairs. What it looks like is "key equals value"
and then perhaps an ampersand, and then "key two equals value two." And Google, when we see these parameters, will treat one such URL as equivalent to another with the same parameters in a different order. Because the ordering of the parameters doesn't matter.
URLs that are ineligible for this feature are those that don't use the key value configuration.
So if a site uses a bunch of plus signs to separate their parameters. Or they just use
a directory structure. Or they use their own type of encoding. None of these types of URLs
can actually be used. Because this feature requires the name value pairs.
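A short Python sketch of that distinction, using urllib.parse and hypothetical store URLs: parameter order doesn't change the parsed key-value mapping, while a plus-sign or directory scheme exposes no named parameters for the feature to act on.

```python
from urllib.parse import urlsplit, parse_qs

# Two hypothetical URLs whose parameters differ only in order: the parsed
# key-value mapping is identical, so they can be treated as equivalent.
url_a = "https://www.example.com/store?category=wearables&size=medium"
url_b = "https://www.example.com/store?size=medium&category=wearables"
print(parse_qs(urlsplit(url_a).query) == parse_qs(urlsplit(url_b).query))  # True

# An ineligible style: values packed into the path or joined with plus signs
# expose no parameter names for the feature to act on.
url_c = "https://www.example.com/store/wearables+medium"
print(parse_qs(urlsplit(url_c).query))  # {} -- nothing to configure
```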
Alright. I know that was a long intro. But now let's get started with the feature. Step
one is to specify parameters that don't change the page's content. So you can ask yourself,
"Do I have parameters that don't affect page content?" Things like a session ID. An affiliate
ID. Or a tracking ID. These types of parameters don't change page content. And so in the feature,
you can actually mark them as "does not change content." And once you've said that, Webmaster
Tools will put one representative URL as the setting. And then Googlebot will act accordingly.
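The "one representative URL" idea can be pictured with a small Python sketch. The parameter names and the URL are hypothetical, and this only illustrates the concept of collapsing duplicates, not what Webmaster Tools does internally.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical parameters that do not change page content.
NON_CONTENT_PARAMS = {"sessionid", "affiliateid", "trackingid"}

def representative_url(url: str) -> str:
    """Drop non-content parameters so all the duplicates collapse to one URL."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query)
            if k.lower() not in NON_CONTENT_PARAMS]
    return urlunsplit(parts._replace(query=urlencode(kept)))

print(representative_url(
    "https://www.example.com/store?category=office&sessionid=abc123&trackingid=xyz"
))
# https://www.example.com/store?category=office
```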
Once step one is completed for all the parameters that don't actually change page content, then let's move on to step two. Which comes in two parts. The first part is to specify the parameters that change page content. So you'll select "Yes, this changes, reorders, or narrows page content." And then you can choose the type of effect on page content. Whether that sorts, narrows, specifies, etcetera. And we'll cover more of this in depth. Then the next part is step 2B, which is to specify Googlebot's preferred behavior. So given that parameter, how would you like Googlebot to actually crawl those URLs? And we'll talk
more about this as well.
[music in background]
The first parameter I'd like to cover is the sort parameter. We're covering this first, although it's a fairly complicated parameter and setting. The sort parameter is something like "sort equals price ascending" or "rank by equals bestselling." Any parameters of this type, ones that just change the order in which the content is presented, are sort parameters.
Once you identify a sort parameter or sort parameters on your site, the next part is
to specify Googlebot's preferred behavior for when they see URLs with this parameter.
Now this can get pretty tricky. So I have two scenarios here. Let's go through the first
scenario. You could ask "Is the sort parameter optional throughout my entire site?" Meaning
that the sort parameter is never displayed by default. But only with manual selection.
If you can answer "Yes" to that question, that "Yes, it's optional throughout my entire
site." Then go on to question two. "Can Googlebot discover everything useful when the sort parameter
isn't displayed?" Meaning that we can actually crawl all of your items even if no sort parameter
is present on the URLs. If that answer is "Yes" and the first answer is "Yes" then it's
likely that with your sort parameter you could specify "Crawl No URLs." Once you've applied
this setting, please verify that the sample URLs displayed in Webmaster Tools are in fact
not canonicals. So they're just duplicates. And that the canonical URLs, the URLs that
you really want crawled and indexed, can be reached by regular user navigation.
If the first sort parameter recommendation didn't apply to your site, then hopefully
the second recommendation will. The second recommendation is, if the same sort values
are used consistently site-wide. The questions to ask yourself are, "Are the same sort values
used consistently across my entire site?" So a negative example of this, where you would answer "No," is if you're a webmaster selling things like coins and coin albums. So for
your coins, you might have "sort equals" with the value "year issued." But "sort equals
year issued" doesn't apply to the selling of your coin albums. So it's not used consistently.
If the answer to the first question is "Yes," then you can ask yourself the second question. Which is, "When a user changes the sort value, is the total number of items unchanged?" If that answer is also "Yes," then it's likely that with your sort parameter you can specify "Only crawl URLs with value x," where x is one of the sorting values that's used site-wide.
If neither of those recommendations apply to your sort parameter, then perhaps select
"Let Googlebot decide."
The second parameter that I'd like to cover is the "narrows" parameter. "Narrows" filters the content on the page by showing a subset of the total items. So you probably see this on an e-commerce site, where in the navigation, a user is able to select whether they only want to see items that are less than 25 dollars, or 25 dollars to 49.99. All of this is narrowing the content of the total items.
Examples of the "narrows" parameter are "size equals medium." "Less than equals 25." Or
"Color equals blue." If the "narrows" parameter shows less useful content. Shows content that's
just a subset of the content from the more useful URL, which doesn't include the "narrows"
parameter. Then you might be able to specify "Crawl No URLs." For example, a useful URL
is "category equals YouTube." And the less useful URL is "category equals YouTube and
size equals medium."
Here, I might specify "size" as a "narrows" parameter. And then because it has less useful content, I can say "Crawl No URLs." But before I specify "Crawl No URLs" it's good to verify
a few things first. First, be sure that the "narrows" parameter won't also filter out
useful pages that you'd like crawled and surfaced in search results. So if you have brand or
category pages that you'd like to show to users, be sure that when you select "Crawl
No URLs" that those brand and category pages won't be affected. Second, verify that example
URLs that might be displayed in Webmaster Tools are really URLs that provide non-useful
content when compared to the parent URL. So again, you see content like "category equals YouTube and size equals medium." And you know that the "size equals medium" narrows parameter just isn't useful.
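Before choosing "Crawl No URLs" for a "narrows" parameter, it can help to scan the URLs you care about for that parameter. A minimal Python sketch of that double-check follows; the URLs are hypothetical, and here the brand page happens to carry the size parameter.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical URLs you want crawled and surfaced in search results.
important_urls = [
    "https://www.example.com/store?category=youtube",
    "https://www.example.com/store?brand=android&size=medium",
]

def affected_by_crawl_no_urls(urls, narrows_param):
    """Return the important URLs that contain the narrows parameter and would
    therefore be excluded by a "Crawl No URLs" setting on it."""
    return [u for u in urls if narrows_param in parse_qs(urlsplit(u).query)]

# The brand page carries size=medium, so "Crawl No URLs" on "size" would affect it.
print(affected_by_crawl_no_urls(important_urls, "size"))
```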
If the behavior of "Crawl No URLs" just isn't optimal for your site because it affects too many important brand or category pages, then perhaps select "Let Googlebot decide."
The next parameter is "Specifies." "Specifies" determines the content displayed on a page.
For example, "item id equals android t-shirt" or "sku equals 495." The "Specifies" parameter
is responsible for the actual content. So you'll likely select "Crawl Every URL." After
"specifies" is "translates."
Unless you want to exclude certain languages from being crawled and available in search
results, like auto-generated translations. Unless that's the case, then it's likely you'll
select "Crawl Every URL."
As an aside, one best practice that I'd like to mention, and this is by no means a requirement,
is to put your translated original content not in a URL parameter, but actually in a
sub-folder or a sub-directory. The reason why is that a sub-directory or sub-folder helps Google to better understand your site's structure, and to see that this part of the site applies to translated or regional content.
The last parameter is "Paginates." "Paginates" displays one component page of a multi-page
sequence. Examples are "page equals three," "view items equals ten through 30," or "start-index equals 20." With paginates, because you want us to crawl every page to reach all of your items,
it's nearly always "Crawl Every URL."
Congratulations. You've finally gotten through the discussion about the different parameters
and the desired Googlebot behavior. And now you can go back to your site and repeat this
process for the different parameters that you have. You don't have to do this for every
single parameter. But just including a few, and specifying the right Googlebot behavior
can really help with your crawl efficiency.
We're almost done. But you might be asking one last question. Which is, "What about multiple
parameters in one URL?" For example, "sku equals 234, page equals three, and sort by equals price," etcetera. How does URL parameters work when there are multiple parameter settings?
The answer is to remember that URL parameters is about crawling your site more efficiently.
So you can imagine that all the URLs we know about for your site, so if you're the Google store, all 380,000 URLs, begin as eligible for crawling. And then we work as a process of elimination, not inclusion. So we take our knowledge of your site, combined with each of your settings in URL parameters, to slowly weed away the URLs that shouldn't be crawled.
Until at the very end, we have a smaller subset of good URLs.
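Here is a rough Python sketch of that process of elimination. The per-parameter settings are hypothetical; every known URL starts as eligible, and each setting can only remove URLs from the crawl set.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical per-parameter settings, mirroring the kinds of choices above.
settings = {
    "sort": ("only value", "price-asc"),    # only crawl URLs with value x
    "size": "no urls",                      # narrows parameter judged not useful
    "page": "every url",                    # paginates
}

def eligible_for_crawl(url: str) -> bool:
    """Start from 'crawl everything' and eliminate URLs that violate any setting."""
    params = parse_qs(urlsplit(url).query)
    for name, values in params.items():
        rule = settings.get(name)
        if rule == "no urls":
            return False
        if isinstance(rule, tuple) and rule[0] == "only value" and values != [rule[1]]:
            return False
    return True

urls = [
    "https://www.example.com/store?category=youtube&page=3",
    "https://www.example.com/store?category=youtube&sort=price-desc",
    "https://www.example.com/store?category=youtube&size=medium",
]
print([u for u in urls if eligible_for_crawl(u)])  # only the first URL survives
```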
To recap. If you're now more comfortable with the URL Parameters, please utilize this feature
for more efficient crawling of your site. You can start with specifying the parameters
that do not change page content. Those should be easier to do. And then after you have done
that step, the next step is to specify parameters that change the page content. And remember, if you can't determine the right setting, don't guess. Please select "Let Googlebot decide."
One more time, the sorts parameter. If the sorts parameter for your site never exists
in a URL by default, then "Crawl No URLs." Or if site-wide, the same sort value is always
used, then "Crawl URLs with value x." For "Narrows" if your "narrows" parameter causes
non-useful filtering for searchers, like size or price, then you might select "Crawl No
URLs." Just be sure to double-check that none of your important pages will be affected.
For "specifies" it's usually "Crawl Every URL." And the same applies to "Translates"
and "Paginates."
Yay! I think we've covered a lot of information about URL parameters. Here are links to more
resources if you want further help. Thank you and best of luck.