Tutorial · Beginner Friendly · 15 min read

Website Crawler Tutorial

Learn how to use the SEO by the Hour website crawler to analyze any website for SEO insights, technical issues, and content opportunities.


1. What is the Website Crawler?

The SEO by the Hour website crawler is a tool that automatically visits and analyzes web pages on any website. It extracts valuable SEO data including page titles, meta descriptions, headings, content length, and technical information (see the code sketch after the feature list below).

SEO Analysis

Analyze titles, meta descriptions, headings, and content structure

Technical Insights

Check status codes, response times, and technical issues

Content Audit

Review content length, word counts, and content gaps
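To make the extraction concrete, here is a minimal sketch of the per-page analysis described above, written in Python with the third-party requests and beautifulsoup4 packages. The SEO by the Hour crawler's actual implementation is not public, so the function name and returned fields here are illustrative assumptions.

```python
# Illustrative sketch only -- not the SEO by the Hour crawler's actual code.
import requests
from bs4 import BeautifulSoup

def analyze_page(url: str) -> dict:
    """Fetch one page and extract the SEO fields the tutorial describes."""
    response = requests.get(url, timeout=10)
    soup = BeautifulSoup(response.text, "html.parser")

    title = soup.title.get_text(strip=True) if soup.title else "No title"
    meta = soup.find("meta", attrs={"name": "description"})
    h1 = soup.find("h1")

    return {
        "url": url,
        "status": response.status_code,              # technical insight
        "response_time": response.elapsed.total_seconds(),
        "title": title,                              # SEO analysis
        "meta_description": meta["content"] if meta and meta.has_attr("content") else None,
        "h1": h1.get_text(strip=True) if h1 else None,
        "word_count": len(soup.get_text().split()),  # content audit
    }

print(analyze_page("https://example.com"))
```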

2. Getting Started

Accessing the Crawler

Navigate to the crawler tool using the sidebar navigation or by visiting /crawler directly.

Navigation Path

Sidebar → Crawler → Website Crawler Dashboard

System Requirements

  • Modern web browser (Chrome, Firefox, Safari, Edge)
  • Stable internet connection
  • JavaScript enabled

3. Running Your First Crawl

Step 1: Enter the Website URL

In the URL input field, enter the website you want to crawl. You can enter:

  • Full URL: https://example.com
  • Domain only: example.com
  • Subdomain: blog.example.com
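Whichever form you enter, crawlers typically normalize the input to a full URL before fetching. A hedged sketch of that normalization (the tool's exact rules are not documented, so normalize_url and its behavior are assumptions):

```python
from urllib.parse import urlparse

def normalize_url(raw: str) -> str:
    """Turn 'example.com' or 'blog.example.com' into a full https:// URL.
    Illustrative only; the crawler's real normalization rules may differ."""
    raw = raw.strip()
    if not raw.startswith(("http://", "https://")):
        raw = "https://" + raw  # assume HTTPS when no scheme is given
    if not urlparse(raw).netloc:
        raise ValueError(f"Not a usable URL: {raw!r}")
    return raw

for candidate in ["https://example.com", "example.com", "blog.example.com"]:
    print(normalize_url(candidate))
```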

Step 2: Set Page Limit

Choose how many pages you want to crawl (1-500 pages). For your first crawl, we recommend starting with 10-20 pages.

Recommended Limits:

  • Small sites: 10-50 pages
  • Medium sites: 50-150 pages
  • Large sites: 150-500 pages

Step 3: Start the Crawl

Click the "Start Crawl" button to begin the analysis. The crawler will:

  1. Connect to the website
  2. Discover pages to crawl
  3. Extract content and metadata
  4. Analyze SEO elements
  5. Present results
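Conceptually, those five steps amount to a breadth-first walk of the site's internal links, capped at your page limit. A simplified, self-contained sketch in Python (again using requests and beautifulsoup4; this illustrates the idea, not the tool's code):

```python
from collections import deque
from urllib.parse import urljoin, urlparse

import requests
from bs4 import BeautifulSoup

def crawl(start_url: str, page_limit: int = 20) -> list[dict]:
    """Breadth-first crawl of internal links, capped at page_limit.
    Illustrative sketch only -- not the tool's actual implementation."""
    domain = urlparse(start_url).netloc
    queue, seen, results = deque([start_url]), {start_url}, []

    while queue and len(results) < page_limit:
        url = queue.popleft()
        try:
            response = requests.get(url, timeout=10)   # 1. connect
        except requests.RequestException:
            continue
        soup = BeautifulSoup(response.text, "html.parser")
        results.append({                               # 3-4. extract and analyze
            "url": url,
            "status": response.status_code,
            "title": soup.title.get_text(strip=True) if soup.title else "No title",
        })
        for link in soup.find_all("a", href=True):     # 2. discover pages
            absolute = urljoin(url, link["href"]).split("#")[0]
            if urlparse(absolute).netloc == domain and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return results                                     # 5. present results

for row in crawl("https://example.com", page_limit=10):
    print(row["status"], row["url"], "-", row["title"])
```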

4. Advanced Settings

Crawl Type Options

URL Crawl (Default)

Starts from the entered URL and follows internal links to discover pages.

Recommended for most sites

Sitemap Crawl

Uses the website's XML sitemap to find pages to crawl.

Best for large sites
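A sitemap crawl generally means fetching the site's XML sitemap and reading page URLs from its <loc> entries. A minimal sketch, assuming a standard sitemap lives at /sitemap.xml (real sitemaps may be nested index files or gzipped, which this ignores):

```python
import xml.etree.ElementTree as ET

import requests

def urls_from_sitemap(site: str) -> list[str]:
    """Fetch /sitemap.xml and return the page URLs it lists.
    Sketch only; assumes a single flat sitemap at the default path."""
    response = requests.get(site.rstrip("/") + "/sitemap.xml", timeout=10)
    root = ET.fromstring(response.content)
    # Sitemap tags are namespaced, so match on the local tag name instead.
    return [el.text.strip() for el in root.iter() if el.tag.endswith("loc") and el.text]

print(urls_from_sitemap("https://example.com"))
```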
Respect Robots.txt

When enabled, the crawler will check and follow the website's robots.txt file rules.
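Python's standard library ships a robots.txt parser, which shows what respecting these rules means in practice: before fetching a URL, a crawler asks whether the site's rules permit it. The /admin/ path below is just an example:

```python
from urllib.robotparser import RobotFileParser

robots = RobotFileParser("https://example.com/robots.txt")
robots.read()  # download and parse the rules

# can_fetch() answers: may this user agent request this URL?
for path in ["https://example.com/", "https://example.com/admin/"]:
    print(path, "->", "allowed" if robots.can_fetch("*", path) else "blocked")
```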

5. Analyzing Results

Understanding the Results Table

Once the crawl completes, you'll see a comprehensive table with the following columns:

URL Column
  • Full page URL
  • Clickable external link
  • Truncated for readability

Title Column
  • Page title tag content
  • Shows "No title" if missing
  • Critical for SEO ranking

H1 Column
  • Main heading content
  • Should match title theme
  • Important for content structure

Status Column
  • HTTP response codes
  • Green: 200 (OK)
  • Red: 400/500 (Errors)

Word Count
  • Total words on page
  • Indicates content depth
  • Helps identify thin content
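If you post-process crawl rows yourself, the same checks the table surfaces are easy to script. A hedged sketch; the 300-word thin-content cutoff is an assumed threshold, not a rule from the tool:

```python
def flag_issues(row: dict, thin_words: int = 300) -> list[str]:
    """Flag the problems the results table surfaces.
    thin_words is an assumed cutoff, not the tool's own threshold."""
    issues = []
    if row["status"] >= 400:
        issues.append(f"HTTP error {row['status']}")
    if not row.get("title"):
        issues.append("missing title tag")
    if not row.get("h1"):
        issues.append("missing H1")
    if row.get("word_count", 0) < thin_words:
        issues.append("possible thin content")
    return issues

sample = {"url": "https://example.com/about", "status": 200,
          "title": "About Us", "h1": None, "word_count": 120}
print(flag_issues(sample))  # ['missing H1', 'possible thin content']
```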

6. Saving and Exporting Data

Save to Project

Click the "Save Results" button to save your crawl data to a project for future reference.

  1. Click "Save Results" button
  2. Enter a descriptive name
  3. Select or create a project
  4. Confirm to save

Project Benefits

  • Access saved crawls anytime
  • Compare crawls over time
  • Share results with team members
  • Build a knowledge base
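If you also want a local copy outside a project, rows shaped like the results table export naturally to CSV. A sketch, assuming the field names used in the earlier examples:

```python
import csv

def export_csv(rows: list[dict], path: str = "crawl_results.csv") -> None:
    """Write crawl rows (like those in the results table) to a CSV file."""
    fields = ["url", "status", "title", "h1", "word_count"]
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fields, extrasaction="ignore")
        writer.writeheader()
        writer.writerows(rows)

export_csv([{"url": "https://example.com", "status": 200,
             "title": "Example Domain", "h1": "Example Domain", "word_count": 30}])
```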

7. Best Practices

Do's

  • Start with smaller page limits for testing
  • Keep "Respect Robots" enabled
  • Save important crawls to projects
  • Use descriptive names when saving
  • Crawl during off-peak hours for large sites
  • Review results systematically

Don'ts

  • Don't crawl the same site repeatedly in short periods
  • Don't set extremely high page limits unnecessarily
  • Don't ignore robots.txt restrictions
  • Don't crawl sites you don't have permission to analyze
  • Don't rely solely on automated analysis
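Several of the points above come down to rate limiting. If you script follow-up checks of your own, a fixed pause between requests is the simplest way to stay polite; the one-second delay below is an assumed courtesy value, not a documented requirement:

```python
import time

import requests

def polite_fetch(urls: list[str], delay: float = 1.0) -> list[int]:
    """Fetch URLs one at a time, pausing between requests."""
    statuses = []
    for url in urls:
        statuses.append(requests.get(url, timeout=10).status_code)
        time.sleep(delay)  # give the server breathing room
    return statuses
```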

8. Troubleshooting

Common Issues & Solutions

Crawl Fails to Start

  • Check if the URL is valid and accessible
  • Ensure the website is online
  • Try with "https://" prefix
  • Check your internet connection

No Pages Found

  • Website may block crawlers
  • Check robots.txt restrictions
  • Try switching to sitemap crawl mode
  • Verify the site has internal links

Crawl Takes Too Long

  • Reduce the page limit
  • Website may have slow response times
  • Try crawling during off-peak hours
  • Cancel and retry with fewer pages

Missing Data in Results

  • Some pages may lack title tags or H1s
  • JavaScript-heavy sites may not render fully
  • Check if pages require authentication
  • Verify page structure is standard HTML

Ready to Start Crawling?

Now that you've worked through the detailed instructions, try the crawler on your own website or a competitor's site to discover SEO opportunities and technical issues.
