Build & Submit a Sitemap

This page describes how to build a sitemap and make it available to Google.
Learn more about sitemaps here.

  1. Decide which sitemap format you want to use.
  2. Create the sitemap, either automatically or manually.
  3. Make your sitemap available to Google by adding it to your
    robots.txt file or directly submitting it to Search Console.

Sitemap formats

Google supports several sitemap formats: XML; RSS, mRSS, and Atom 1.0; and plain text.

Google expects the
standard sitemap protocol in all formats. Google does
not currently consume the <priority> element in sitemaps.

All formats limit a single sitemap to 50MB (uncompressed) and 50,000 URLs. If you have a
larger file or more URLs, you will have to break your list into multiple sitemaps. You can
optionally create a sitemap index
file (a file that points to a list of sitemaps) and submit that single index file to Google.
You can submit multiple sitemaps and/or sitemap index files to Google.
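
As a rough illustration of the splitting described above, the following Python sketch chunks a URL list at the 50,000-URL limit and builds a sitemap index that points at the resulting files. The function names and the sitemap-N.xml naming scheme are assumptions made for this example, and the sketch does not check the 50MB size limit.

```python
from xml.etree import ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50_000  # per-file URL limit stated above


def chunk_urls(urls, size=MAX_URLS_PER_SITEMAP):
    """Split a list of URLs into consecutive chunks of at most `size` items."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap_index(sitemap_urls):
    """Return a sitemap index document (bytes) listing the given sitemap URLs."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for sitemap_url in sitemap_urls:
        entry = ET.SubElement(root, "sitemap")
        ET.SubElement(entry, "loc").text = sitemap_url
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


# Example: 120,000 URLs end up in three sitemaps listed by one index file.
urls = [f"https://www.example.com/page{i}.html" for i in range(120_000)]
chunks = chunk_urls(urls)
index_xml = build_sitemap_index(
    f"https://www.example.com/sitemap-{n}.xml" for n in range(1, len(chunks) + 1)
)
```

Each chunk would then be written out as its own sitemap file, and only the index file would need to be submitted.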

XML

Here is a very basic XML sitemap that includes the location of a single URL:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/foo.html</loc>
    <lastmod>2018-06-04</lastmod>
  </url>
</urlset>

You can find more complex examples and full documentation at
sitemaps.org.
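
A sitemap like the one above could also be produced programmatically. The Python sketch below uses the standard library's ElementTree module; the build_sitemap function and its (loc, lastmod) input shape are assumptions made for this example, not part of any Google tooling.

```python
from xml.etree import ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Build a sitemap (bytes) from (loc, lastmod) pairs; lastmod may be None."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, "url")
        # ElementTree entity-escapes characters such as & when serializing.
        ET.SubElement(url, "loc").text = loc
        if lastmod:
            ET.SubElement(url, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


sitemap_xml = build_sitemap([("http://www.example.com/foo.html", "2018-06-04")])
```

Building the document through an XML library rather than string concatenation means the entity escaping is handled by the serializer.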

You can see examples of
sitemaps that specify alternate language pages
and sitemaps for news, image, or video files.

RSS, mRSS, and Atom 1.0

If you have a blog with an RSS or Atom feed, you can submit the feed’s URL as a sitemap.
Most blog software is able to create a feed for you, but note that this feed only
provides information on recent URLs.

  • Google accepts RSS 2.0 and Atom 1.0 feeds.
  • You can use an
    mRSS (media RSS) feed to
    provide Google details about video content on your site.

Text

If your sitemap includes only web page URLs, you can provide Google with a simple text
file that contains one URL per line. For example:

http://www.example.com/file1.html
http://www.example.com/file2.html

Guidelines for text file sitemaps

  • Encode your file using UTF-8 encoding.
  • Don’t put anything other than URLs in the sitemap file.
  • You can name the text file anything you wish, provided it has a .txt extension (for
    instance, sitemap.txt).
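
The guidelines above can be sketched in a few lines of Python. The build_text_sitemap function is a hypothetical helper written for this example: it rejects anything that is not a fully qualified http or https URL and returns UTF-8 bytes with one URL per line.

```python
from urllib.parse import urlsplit


def build_text_sitemap(urls):
    """Return UTF-8 bytes for a text sitemap: one absolute URL per line."""
    lines = []
    for url in urls:
        url = url.strip()
        parts = urlsplit(url)
        # Only fully qualified http(s) URLs belong in the file.
        if parts.scheme not in ("http", "https") or not parts.netloc:
            raise ValueError(f"not a fully qualified URL: {url!r}")
        lines.append(url)
    return ("\n".join(lines) + "\n").encode("utf-8")


data = build_text_sitemap([
    "http://www.example.com/file1.html",
    "http://www.example.com/file2.html",
])
```

The returned bytes could then be written to a file such as sitemap.txt.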

Sitemap extensions for additional media types

Google supports extended sitemap syntax for additional media types such as video, images, and
news. Use these extensions to describe video files, images, and other hard-to-parse content on
your site to improve indexing.

General sitemap guidelines

  • Use consistent, fully-qualified URLs. Google will crawl your URLs exactly as listed.
    For instance, if your site is at https://www.example.com/, don’t specify a URL
    as https://example.com/ (missing www) or
    ./mypage.html (a relative URL).
  • A sitemap can be posted anywhere on your site, but a sitemap affects only
    descendants of the parent directory. Therefore, a sitemap posted at the site root
    can affect all files on the site, which is where we recommend posting your sitemaps.
  • Don’t include session IDs in the URLs in your sitemap. This reduces duplicate crawling of
    those URLs.
  • Tell Google about alternate language versions of a URL using
    hreflang annotations.
  • Sitemap files must be UTF-8 encoded, and
    URLs escaped appropriately.
  • Break up large sitemaps into smaller sitemaps: a sitemap can contain up to 50,000
    URLs and must not exceed 50MB uncompressed. Use a
    sitemap index file to list all
    the individual sitemaps and submit this single file to Google rather than submitting
    individual sitemaps.
  • List only canonical URLs in your sitemaps. If you have two versions of a page, list in the
    sitemap only the one you prefer to appear in search results. If you have two versions of
    your site (for example, www and non-www), decide which is your preferred site, put the
    sitemap there, and add rel=canonical or redirects on the other site.
  • If you have different URLs for mobile and desktop versions of a page, we
    recommend pointing to only one version in a sitemap. However, if you want to point to both
    URLs, annotate your URLs to indicate the desktop and mobile versions.
  • Use sitemap extensions for pointing to additional media
    types
    such as video, images, and news.
  • If you have alternate pages for different languages or regions, you can use
    hreflang in either a sitemap or html tags
    to indicate the alternate URLs.
  • Non-alphanumeric and non-Latin characters.
    We require your sitemap file to be UTF-8 encoded (you can generally do this when
    you save the file). As with all XML files, any data values (including URLs) must use entity
    escape codes for the characters listed in the table below. A sitemap can contain only ASCII
    characters; it can’t contain extended ASCII characters or certain control codes or special
    characters such as * and {}. If your sitemap URL contains these
    characters, you’ll receive an error when you try to add it.

    Character      Symbol   Escape Code
    Ampersand      &        &amp;
    Single Quote   '        &apos;
    Double Quote   "        &quot;
    Greater Than   >        &gt;
    Less Than      <        &lt;

    In addition, all URLs (including the URL of your sitemap) must be encoded for readability by
    the web server on which they are located and URL-escaped. However, if you are using any sort
    of script, tool, or log file to generate your URLs (anything except typing them in by hand),
    this is usually already done for you. If you submit your sitemap and you receive an error
    that Google is unable to find some of your URLs, check to make sure that your URLs follow
    the RFC-3986 standard for URIs, the
    RFC-3987 standard for IRIs, and the
    XML standard.

    Here is an example of a URL that uses a non-ASCII character (ü), as well as a
    character that requires entity escaping (&):

    http://www.example.com/ümlat.html&q=name

    Here is that same URL, ISO-8859-1 encoded (for hosting on a server that uses that encoding)
    and URL escaped:

    http://www.example.com/%FCmlat.html&q=name

    Here is that same URL, UTF-8 encoded (for hosting on a server that uses that encoding) and
    URL escaped:

    http://www.example.com/%C3%BCmlat.html&q=name

    Here is that same URL, entity escaped:

    http://www.example.com/%C3%BCmlat.html&amp;q=name
  • Remember that sitemaps are a recommendation to Google about which pages you think are
    important; Google does not pledge to crawl every URL in a sitemap.
  • Google ignores <priority> and <changefreq> values.
  • Google uses the <lastmod> value if it’s consistently and verifiably (for
    example by comparing to the last modification of the page) accurate.
  • The position of a URL in a sitemap is not important; Google does not crawl URLs in the order in
    which they appear in your sitemap.
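
The two escaping steps from the guideline on non-alphanumeric and non-Latin characters can be reproduced with Python's standard library. This is a minimal sketch: url_escape_path and entity_escape are names invented for this example, and keeping "/", "&", and "=" unescaped in the first step is a choice made here to match the sample URLs above.

```python
from urllib.parse import quote
from xml.sax.saxutils import escape


def url_escape_path(path, encoding="utf-8"):
    """Percent-encode non-ASCII characters in a path, keeping '/', '&' and '='."""
    return quote(path, safe="/&=", encoding=encoding)


def entity_escape(url):
    """Replace the five characters from the table with XML entity escape codes."""
    return escape(url, {"'": "&apos;", '"': "&quot;"})


path = "/ümlat.html&q=name"
utf8_url = "http://www.example.com" + url_escape_path(path)
latin1_url = "http://www.example.com" + url_escape_path(path, encoding="iso-8859-1")
sitemap_value = entity_escape(utf8_url)
```

Here utf8_url and latin1_url correspond to the UTF-8 and ISO-8859-1 variants shown in the guideline, and sitemap_value is the entity-escaped form that would go inside a <loc> element.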

Create a sitemap

When creating a sitemap, you’re telling search engines about which URLs you prefer to show in
search results. These are the
canonical URLs. If you
have the same content accessible under different URLs, choose the URL you prefer
and include that in the sitemap instead of all URLs that lead to the same content.

Once you’ve decided which URLs to include in the sitemap, pick one of the following ways to
create a sitemap, depending on your site architecture and size:

Let your CMS generate a sitemap for you

If you’re using a CMS such as WordPress, Wix, or Blogger, it’s likely that your CMS has
already made a sitemap available to search engines. Try searching for information about how
your CMS generates sitemaps, or how to create a sitemap if your CMS doesn’t generate a sitemap
automatically. For example, in the case of Wix, search for “wix sitemap”.

For all other site setups, you will need to generate the sitemap yourself.

Manually create a sitemap

For sitemaps with fewer than a few dozen URLs, you may be able to manually create a sitemap.
For this, open a text editor such as
Windows Notepad or
Nano (Linux, macOS), and follow the
syntax described in the Sitemap formats section.

You can manually create larger sitemaps, but it’s a tedious process.

Automatically generate a sitemap

For sitemaps with more than a few dozen URLs, you will need to generate the sitemap. There are
various tools that can generate
a sitemap.
However, the best way is to have your website software generate it for you. For example, you
can extract your site’s URLs from your website’s database and then export the URLs to either
the screen or an actual file on your web server. Talk to your developers or server manager about
this solution. If you need inspiration for the code, check out our old collection of
third-party sitemap generators.
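
To make the database approach more concrete, here is a small Python sketch that reads URLs from a database and renders an XML sitemap. The pages table and its url and updated_on columns are hypothetical, and an in-memory SQLite database stands in for whatever database your site actually uses.

```python
import sqlite3
from xml.sax.saxutils import escape

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def sitemap_from_database(conn):
    """Render an XML sitemap from a hypothetical pages(url, updated_on) table."""
    rows = conn.execute("SELECT url, updated_on FROM pages ORDER BY url")
    parts = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        f'<urlset xmlns="{SITEMAP_NS}">',
    ]
    for url, updated_on in rows:
        parts.append("  <url>")
        parts.append(f"    <loc>{escape(url)}</loc>")  # entity-escape &, <, >
        parts.append(f"    <lastmod>{updated_on}</lastmod>")
        parts.append("  </url>")
    parts.append("</urlset>")
    return "\n".join(parts) + "\n"


# Demo: an in-memory database standing in for the real site database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pages (url TEXT, updated_on TEXT)")
conn.execute(
    "INSERT INTO pages VALUES ('http://www.example.com/foo.html', '2018-06-04')"
)
print(sitemap_from_database(conn))
```

The returned string could be written to a file on the web server or served directly from a URL.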

Keep in mind that sitemaps can’t be larger than 50 MB.
Learn more about
managing large sitemaps.

Submit your sitemap to Google

Google doesn’t check a sitemap every time a site is crawled; a sitemap is checked only the
first time that we notice it, and thereafter only when you ping us to let us know that it’s
changed. Alert Google about a sitemap only when it’s new or updated; don’t submit
or ping unchanged sitemaps multiple times.

If you have updated pages in the sitemap, mark
them with the <lastmod> field.
Other XML files have a similar field, such as <updated> for Atom XML.
You can also learn how to compute this date.
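
As one possible way to produce a <lastmod> value, the Python sketch below formats a modification timestamp in the W3C Datetime format used by the sitemap protocol. The lastmod_from_timestamp function is a name chosen for this example, and where the timestamp comes from (a file's modification time, a database column) depends on your site.

```python
from datetime import datetime, timezone


def lastmod_from_timestamp(timestamp):
    """Format a POSIX timestamp as a W3C Datetime string in UTC."""
    moment = datetime.fromtimestamp(timestamp, tz=timezone.utc)
    return moment.isoformat(timespec="seconds")


# Example: a page last modified at midnight UTC on 2018-06-04.
modified = datetime(2018, 6, 4, tzinfo=timezone.utc).timestamp()
print(lastmod_from_timestamp(modified))
```

For a static file, os.path.getmtime(path) would be one source for the timestamp.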

There are a few different ways to make your sitemap available to Google:

  • Submit a sitemap using the
    Sitemaps report.
  • Use the ping tool. Send a GET request from your browser or the command line
    to this address, specifying the full URL of the sitemap. Be sure that the sitemap file is
    accessible:

    https://www.google.com/ping?sitemap=FULL_URL_OF_SITEMAP

    Example:

    https://www.google.com/ping?sitemap=https://example.com/sitemap.xml
  • Insert the following line anywhere in your robots.txt file, specifying the
    path to your sitemap. We will find it the next time we crawl your robots.txt file:

    Sitemap: https://example.com/my_sitemap.xml
  • Use WebSub
    if you use Atom/RSS for your sitemap and want to broadcast your changes to other search
    engines in addition to Google.
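
Two of the options above, the ping address and the robots.txt line, are easy to script. The Python sketch below only constructs the ping URL and the updated robots.txt text; it does not send any request. The function names are assumptions made for this example, and percent-encoding the sitemap parameter is a choice made here, whereas the example above passes the sitemap URL unencoded.

```python
from urllib.parse import urlencode

PING_ENDPOINT = "https://www.google.com/ping"  # address given in the list above


def build_ping_url(sitemap_url):
    """Return the ping URL for a sitemap, percent-encoding the query value."""
    return f"{PING_ENDPOINT}?{urlencode({'sitemap': sitemap_url})}"


def add_sitemap_line(robots_txt, sitemap_url):
    """Append a 'Sitemap:' line to robots.txt text unless it is already there."""
    line = f"Sitemap: {sitemap_url}"
    existing = [entry.strip() for entry in robots_txt.splitlines()]
    if line in existing:
        return robots_txt
    separator = "" if robots_txt == "" or robots_txt.endswith("\n") else "\n"
    return f"{robots_txt}{separator}{line}\n"


ping_url = build_ping_url("https://example.com/sitemap.xml")
robots = add_sitemap_line("User-agent: *\nDisallow:", "https://example.com/my_sitemap.xml")
```

Calling add_sitemap_line a second time with the same sitemap URL leaves the text unchanged, so it is safe to run on every deploy.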

Troubleshooting sitemaps

See the
sitemaps troubleshooting guide.
