Sitemap URL Extractor
Extract all URLs from any sitemap.xml — including sitemap indexes that reference multiple child sitemaps.
How the URL Extractor Works
1. Fetches the sitemap URL you provide and decodes XML entities (&amp; → &).
2. Detects whether it is a sitemap index and, if so, fetches up to 10 child sitemaps concurrently.
3. Parses all <url> blocks for loc, lastmod, changefreq, and priority values.
4. Returns up to 5,000 URLs with all metadata. Download as CSV for full analysis.
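The parsing steps above can be sketched with Python's standard library. This is a minimal illustration, not the tool's actual implementation: the function name parse_sitemap and the row layout are assumptions, and the namespace is the one defined by the sitemaps.org protocol. The XML parser decodes entities such as &amp; on its own.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"
URL_FIELDS = ("loc", "lastmod", "changefreq", "priority")


def parse_sitemap(xml_text):
    """Return ("index", [child sitemap URLs]) or ("urlset", [row dicts]).

    Hypothetical helper: the name and return shape are illustrative only.
    """
    root = ET.fromstring(xml_text)

    # A sitemap index lists child sitemaps instead of pages.
    if root.tag == SITEMAP_NS + "sitemapindex":
        locs = [el.text.strip() for el in root.iter(SITEMAP_NS + "loc") if el.text]
        return "index", locs

    rows = []
    for url in root.iter(SITEMAP_NS + "url"):
        row = {}
        for field in URL_FIELDS:
            # findtext returns None when an optional tag is absent.
            value = url.findtext(SITEMAP_NS + field)
            row[field] = value.strip() if value else ""
        if row["loc"]:  # skip entries without a usable <loc>
            rows.append(row)
    return "urlset", rows
```

Returning the kind alongside the data lets a caller decide whether to fetch child sitemaps or use the rows directly.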
What is a Sitemap URL Extractor?
A sitemap URL extractor reads an XML sitemap file and returns a structured list of every URL it contains, along with metadata like lastmod, changefreq, and priority. It differs from a sitemap finder: the finder tells you where your sitemap lives, while the extractor tells you what is inside it.
When your sitemap is a sitemap index (a file that points to multiple child sitemaps), this tool automatically fetches and merges up to 10 child sitemaps, so you get one comprehensive list without manually opening each XML file.
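One way to fetch a capped number of child sitemaps in parallel is a thread pool. The sketch below is an assumption about how such a step could look, not a description of this tool's internals: fetch_text and fetch_children are hypothetical names, and the timeout value is arbitrary.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

MAX_CHILD_SITEMAPS = 10  # cap taken from the text above


def fetch_text(url, timeout=10):
    """Download one sitemap and decode it as UTF-8 (hypothetical helper)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def fetch_children(child_urls, fetch=fetch_text, limit=MAX_CHILD_SITEMAPS):
    """Fetch at most `limit` child sitemaps in parallel.

    Returns a list of (url, body) pairs in the same order as the input.
    """
    selected = list(child_urls)[:limit]
    if not selected:
        return []
    with ThreadPoolExecutor(max_workers=len(selected)) as pool:
        # pool.map yields results in input order, not completion order.
        return list(zip(selected, pool.map(fetch, selected)))
```

Passing the fetch function as a parameter makes it easy to swap in a stub for testing or a client with retries.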
Use Cases for Sitemap URL Extraction
Content audits: Export all URLs listed in the sitemap, then reconcile them against your live site to spot orphaned or deprecated pages.
Indexing checks: Compare your sitemap URLs against Google Search Console coverage reports to identify pages submitted but not indexed.
Migration planning: Extract every URL before a domain migration or CMS switch to build a complete redirect map.
Competitor research: Extract URLs from competitor sitemaps (if publicly accessible) to analyse their content structure and publishing velocity.
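Several of these use cases come down to comparing two URL lists. A minimal sketch, assuming both lists are already loaded as strings (for example from the exported CSV and from a crawl or coverage export): the names normalize and compare_urls are hypothetical, and treating URLs that differ only by a trailing slash as equal is an assumption you may not want.

```python
def normalize(url):
    """Trim whitespace and one trailing slash (an assumed normalization)."""
    return url.strip().rstrip("/")


def compare_urls(sitemap_urls, reference_urls):
    """Return URLs found in only one of the two lists."""
    sitemap = {normalize(u) for u in sitemap_urls}
    reference = {normalize(u) for u in reference_urls}
    return {
        "only_in_sitemap": sorted(sitemap - reference),
        "only_in_reference": sorted(reference - sitemap),
    }
```

For an indexing check, "only_in_sitemap" would hold the pages submitted but missing from the reference list.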
Frequently Asked Questions
What is the difference between a sitemap extractor and a sitemap finder?
A sitemap finder locates where your sitemap file is hosted (e.g. /sitemap.xml). A sitemap URL extractor goes one step further and reads the file, returning every URL inside it along with its metadata.
How do I extract URLs from a sitemap index?
Paste the sitemap index URL into this tool. It automatically detects the <sitemapindex> element, fetches up to 10 child sitemaps concurrently, and merges all URLs into a single list.
Can I export sitemap URLs to Excel or Google Sheets?
Yes — click "Download CSV" once the extraction is complete. Open the CSV in Microsoft Excel or import it into Google Sheets for filtering, sorting, and further analysis.
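If you are producing a similar CSV yourself, Python's csv module is enough. This is a sketch under assumptions: the column order and the function name rows_to_csv are illustrative and may not match the file this tool generates.

```python
import csv
import io

FIELDNAMES = ["loc", "lastmod", "changefreq", "priority"]  # assumed column order


def rows_to_csv(rows):
    """Serialize row dicts to CSV text with a header line."""
    buffer = io.StringIO(newline="")  # let the csv module control line endings
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES)
    writer.writeheader()
    for row in rows:
        # Missing metadata becomes an empty cell rather than an error.
        writer.writerow({name: row.get(name, "") for name in FIELDNAMES})
    return buffer.getvalue()
```

Both Excel and Google Sheets open comma-separated files with a header row like this without extra configuration.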
What does lastmod mean in an XML sitemap?
The <lastmod> tag tells search engines when the page was last modified. Search engines can use it as a hint when deciding which pages to recrawl first. A stale or missing lastmod means Google may not recrawl updated content as quickly.
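To find stale or missing lastmod values in extracted rows, the dates need to be parsed first. The sketch below is illustrative only: parse_lastmod and missing_or_older are hypothetical names, date-only values are assumed to be UTC, and anything datetime.fromisoformat cannot read (such as year-only values) is treated as missing.

```python
from datetime import datetime, timezone


def parse_lastmod(value):
    """Parse a lastmod string into an aware datetime, or None if unusable."""
    if not value:
        return None
    text = value.strip()
    if text.endswith("Z"):
        # Older Python versions do not accept a trailing "Z".
        text = text[:-1] + "+00:00"
    try:
        parsed = datetime.fromisoformat(text)
    except ValueError:
        return None
    if parsed.tzinfo is None:
        # Assumption: values without an offset are treated as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def missing_or_older(rows, cutoff):
    """Return loc values whose lastmod is absent, unparseable, or before cutoff.

    `cutoff` must be a timezone-aware datetime.
    """
    flagged = []
    for row in rows:
        parsed = parse_lastmod(row.get("lastmod", ""))
        if parsed is None or parsed < cutoff:
            flagged.append(row["loc"])
    return flagged
```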
How many URLs can this sitemap extractor handle?
The tool returns up to 5,000 URLs per request and processes up to 10 child sitemaps from a sitemap index. For sites with more URLs, download the CSV in batches by extracting each child sitemap individually.
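When you extract child sitemaps individually, the batches have to be merged again afterwards. A minimal sketch of such a merge follows; the name merge_rows is hypothetical, and dropping duplicate loc values is an assumption rather than documented behaviour of this tool. The default limit mirrors the 5,000 figure above, and you can raise it for your own merge.

```python
MAX_URLS = 5000  # per-request limit quoted in the text above


def merge_rows(batches, limit=MAX_URLS):
    """Merge batches of row dicts, skipping duplicate loc values.

    Returns (rows, truncated), where truncated is True only if at least
    one further unique row was dropped because the limit was reached.
    """
    merged = []
    seen = set()
    for rows in batches:
        for row in rows:
            if row["loc"] in seen:
                continue
            if len(merged) >= limit:
                return merged, True
            seen.add(row["loc"])
            merged.append(row)
    return merged, False
```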
Extract URLs from WordPress, Shopify & Other CMS Sitemaps
Paste the sitemap URL into the extractor. The default sitemap locations for the most common platforms are:
WordPress (Yoast SEO)
/sitemap_index.xml
Yoast generates a sitemap index. Extract the index first, then extract URLs from each sub-sitemap (posts, pages, products).
WordPress (Rank Math)
/sitemap_index.xml
Same path as Yoast. Rank Math also supports /sitemap.xml as a fallback.
Shopify
/sitemap.xml
Shopify generates a sitemap index covering products, collections, pages, and blogs. This extractor supports sitemap indexes automatically.
Webflow
/sitemap.xml
Webflow outputs a flat sitemap.xml listing all published CMS items and static pages.
Next.js (next-sitemap)
/sitemap.xml or /sitemap-0.xml
next-sitemap generates an index plus numbered sub-sitemaps. Extract the index URL to get all pages.
Squarespace
/sitemap.xml
Squarespace generates a single sitemap.xml. Paste it directly to extract every URL.
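The default locations listed above can be turned into candidate URLs for a given site. This sketch only builds the URLs and does not check that they exist. The platform keys and the function name candidate_sitemap_urls are hypothetical labels chosen for illustration; the paths are the ones listed in this section.

```python
from urllib.parse import urljoin

# Paths copied from the list above; the dictionary keys are illustrative labels.
DEFAULT_SITEMAP_PATHS = {
    "wordpress-yoast": ["/sitemap_index.xml"],
    "wordpress-rankmath": ["/sitemap_index.xml", "/sitemap.xml"],
    "shopify": ["/sitemap.xml"],
    "webflow": ["/sitemap.xml"],
    "next-sitemap": ["/sitemap.xml", "/sitemap-0.xml"],
    "squarespace": ["/sitemap.xml"],
}


def candidate_sitemap_urls(site_root, platform=None):
    """Build absolute sitemap URLs to try for a site.

    With no platform given, every known path is returned once, in the
    order it first appears in DEFAULT_SITEMAP_PATHS.
    """
    if platform is not None:
        paths = DEFAULT_SITEMAP_PATHS[platform]
    else:
        all_paths = [p for group in DEFAULT_SITEMAP_PATHS.values() for p in group]
        paths = list(dict.fromkeys(all_paths))  # de-duplicate, keep order
    return [urljoin(site_root, path) for path in paths]
```

Because every path starts with a slash, urljoin resolves it against the site root even when site_root includes a subdirectory.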