Structured data is standardized markup added to HTML that helps search engines understand the content and context of a page. By labeling specific pieces of information (like product prices, review ratings, or event dates), structured data enables search engines to display rich results such as star ratings, recipe cards, and event listings directly in search results.
Without structured data, search engines must infer what a page contains from the HTML structure and text alone. With structured data, you explicitly tell the search engine: “this number is a price”, “this text is a review rating”, or “this date is an event start time.” This explicit labeling reduces ambiguity and unlocks enhanced search features.
Common structured data formats
- JSON-LD - JavaScript Object Notation for Linked Data, the format recommended by Google. It is embedded in a `<script>` tag in the page `<head>` or `<body>`. JSON-LD is preferred because it keeps the markup separate from the visible HTML content, making it easier to maintain and less likely to break the page layout.
- Microdata - HTML5 attributes added directly to existing HTML elements. Uses `itemscope`, `itemtype`, and `itemprop` attributes to annotate content.
- RDFa - Resource Description Framework in Attributes, an older W3C standard. Still supported but rarely used for new implementations.
Google can read all three formats for rich results but recommends JSON-LD where possible. Many modern CMS plugins and SEO tools generate JSON-LD automatically.
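Because JSON-LD lives in its own `<script>` tag, it can be pulled out of a page without touching the rest of the HTML. The following is a minimal sketch using only Python's standard library; the class and function names (`JsonLdExtractor`, `extract_jsonld`) are illustrative and not part of any tool mentioned here.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects the contents of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []  # successfully parsed JSON-LD values
        self.errors = []  # raw text of blocks that failed to parse

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            raw = "".join(self._buffer)
            try:
                self.blocks.append(json.loads(raw))
            except json.JSONDecodeError:
                self.errors.append(raw)


def extract_jsonld(html):
    """Return (parsed_blocks, unparseable_raw_blocks) for one HTML document."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.blocks, parser.errors
```

Ordinary `<script>` tags are ignored, and blocks that are not valid JSON are kept separately so they can be reported rather than silently dropped.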
Common schema types
| Type | Used for | Rich result example |
|---|---|---|
| Organization | Company name, logo, contact info | Knowledge panel with logo and social links |
| LocalBusiness | Business address, hours, phone | Local pack with map, hours, and directions |
| Product | Price, availability, reviews | Product snippet with price and star rating |
| Article | Headline, author, publish date | Top stories carousel with image and date |
| FAQPage | Questions and answers | Expandable FAQ accordion in search results (Google now shows this mainly for well-known government and health sites) |
| HowTo | Step-by-step instructions | Numbered steps with images and time estimates (Google retired this rich result in 2023) |
| Event | Date, location, ticket info | Event snippet with date, location, and tickets |
| BreadcrumbList | Navigation hierarchy | Breadcrumb trail below the page title |
| Review | Product or service ratings | Star rating and review count |
| Person | Author or individual info | Author knowledge panel |
| VideoObject | Video metadata | Video thumbnail with duration and upload date |
| JobPosting | Employment opportunities | Job listing with salary and location |
Benefits of structured data
- Rich snippets - Enhanced search results with images, ratings, prices, and other details that make your result more prominent and clickable
- Knowledge panels - Information boxes, shown beside Google results on desktop, that display key facts about an organization or person
- Voice search - Better compatibility with voice assistants that need precise, structured answers to questions like “what time does Example Cafe close?”
- Entity understanding - Helps search engines connect content to real-world concepts, improving relevance for semantic searches
- Carousel eligibility - Articles, recipes, and courses can appear in horizontal carousels at the top of results
- Higher CTR - Rich results typically have higher click-through rates than plain blue links because they provide more information at a glance
JSON-LD example
Here is a simple Product schema in JSON-LD:

    <script type="application/ld+json">
    {
      "@context": "https://schema.org",
      "@type": "Product",
      "name": "Wireless Noise-Canceling Headphones",
      "image": "https://example.com/images/headphones.jpg",
      "description": "Premium over-ear headphones with 30-hour battery life.",
      "brand": {
        "@type": "Brand",
        "name": "AudioTech"
      },
      "offers": {
        "@type": "Offer",
        "url": "https://example.com/products/headphones",
        "priceCurrency": "USD",
        "price": "249.99",
        "availability": "https://schema.org/InStock"
      },
      "aggregateRating": {
        "@type": "AggregateRating",
        "ratingValue": "4.5",
        "reviewCount": "128"
      }
    }
    </script>

This markup explicitly labels the product name, image, price, availability, and rating, enabling Google to display a rich product snippet.
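On a real site this markup is usually generated from product data rather than written by hand. As a rough sketch, assuming each product is available as a plain dictionary (the key names such as `in_stock` and `review_count` are hypothetical), the same Product schema could be rendered like this:

```python
import json


def product_jsonld(product):
    """Render one product (a dict with hypothetical keys) as a JSON-LD script tag."""
    data = {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": product["name"],
        "image": product["image"],
        "description": product["description"],
        "brand": {"@type": "Brand", "name": product["brand"]},
        "offers": {
            "@type": "Offer",
            "url": product["url"],
            "priceCurrency": product["currency"],
            # Format the price from the same value the page template displays,
            # so markup and visible content do not drift apart.
            "price": f"{product['price']:.2f}",
            "availability": "https://schema.org/InStock"
            if product["in_stock"]
            else "https://schema.org/OutOfStock",
        },
    }
    # Only emit a rating when reviews actually exist.
    if product.get("review_count"):
        data["aggregateRating"] = {
            "@type": "AggregateRating",
            "ratingValue": str(product["rating"]),
            "reviewCount": str(product["review_count"]),
        }
    # Escape "</" so text such as "</script>" cannot close the tag early;
    # "<\/" is a valid JSON escape and parses back to "</".
    payload = json.dumps(data, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'
```

Building the markup with `json.dumps` instead of string concatenation avoids the malformed-JSON problems that hand-edited templates tend to produce.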
Validating structured data
Structured data must follow the Schema.org vocabulary and Google’s guidelines. Common validation tools include:
- Google’s Rich Results Test - Enter a URL or paste markup to see which rich results are eligible
- Schema Markup Validator - Checks syntax and vocabulary against Schema.org definitions, independent of Google-specific rich result requirements
- Google Search Console - Shows structured data errors and warnings across your entire site
- crawler.sh - Extracts JSON-LD blocks from crawled pages for bulk analysis
Validation catches common mistakes like missing required properties, incorrect data types (putting text where a number is expected), and malformed JSON that prevents parsing.
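The three kinds of mistake named above (malformed JSON, missing properties, wrong data types) can be illustrated with a small checker. This is only a sketch: the `REQUIRED` and `NUMERIC` tables are hypothetical stand-ins for illustration, and the authoritative rules are in Google's documentation for each rich result type.

```python
import json

# Hypothetical, deliberately tiny rule tables for illustration only.
REQUIRED = {
    "Product": ["name"],
    "Offer": ["price", "priceCurrency"],
    "AggregateRating": ["ratingValue"],
}
NUMERIC = {"price", "ratingValue", "reviewCount"}


def validate_jsonld(raw):
    """Return a list of problem descriptions for one JSON-LD string."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"malformed JSON: {exc.msg}"]
    problems = []
    _check(data, "$", problems)
    return problems


def _check(node, path, problems):
    """Walk nested objects and lists, recording problems with their path."""
    if isinstance(node, list):
        for i, item in enumerate(node):
            _check(item, f"{path}[{i}]", problems)
        return
    if not isinstance(node, dict):
        return
    node_type = node.get("@type")
    if isinstance(node_type, str):
        for prop in REQUIRED.get(node_type, []):
            if prop not in node:
                problems.append(f"{path}: {node_type} is missing '{prop}'")
    for key, value in node.items():
        if key in NUMERIC and not _is_number(value):
            problems.append(f"{path}.{key}: expected a number, got {value!r}")
        _check(value, f"{path}.{key}", problems)


def _is_number(value):
    """Accept ints, floats, and numeric strings such as "249.99"."""
    if isinstance(value, bool):
        return False
    if isinstance(value, (int, float)):
        return True
    if isinstance(value, str):
        try:
            float(value)
            return True
        except ValueError:
            return False
    return False
```

A check like this is useful as an early warning in a build pipeline, but it does not replace the Rich Results Test, which applies Google's actual eligibility rules.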
Common structured data mistakes
- Marking up content that is not visible to users, which violates Google’s guidelines
- Using the wrong schema type, such as `Article` for a product page
- Missing required or recommended properties, like `author` for `Article` or `price` for a `Product` offer
- Inconsistent data between structured data and visible content, such as different prices
- Nesting schemas incorrectly or using `@type` values that do not exist in Schema.org
- Adding structured data to pages where it is not appropriate, like stuffing `Review` markup on every page regardless of content
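Of the mistakes listed above, a mismatch between the marked-up price and the visible price is one of the easier ones to detect automatically. A rough sketch, assuming the page's visible text has already been extracted as a string and that `offers` is a single object rather than a list (both function names are illustrative):

```python
import re
from decimal import Decimal, InvalidOperation


def visible_prices(text):
    """Find amounts such as 249.99 or 1,299.00 in the page's visible text."""
    matches = re.findall(r"\d[\d,]*\.\d{2}", text)
    return {Decimal(m.replace(",", "")) for m in matches}


def price_mismatch(jsonld, visible_text):
    """Return a message if the marked-up price is not visible on the page, else None."""
    offers = jsonld.get("offers") or {}
    if not isinstance(offers, dict):
        return "offers is not a single object; this sketch does not handle lists"
    try:
        marked_up = Decimal(str(offers["price"]))
    except (KeyError, InvalidOperation):
        return "no usable price in structured data"
    if marked_up not in visible_prices(visible_text):
        return f"structured data says {marked_up}, but that amount is not visible on the page"
    return None
```

Comparing as `Decimal` rather than as strings means `249.9` in the markup and `249.90` on the page are treated as the same amount.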
How crawler.sh handles structured data
crawler.sh extracts JSON-LD <script> blocks from every crawled page and includes them in the page output. During SEO analysis, the presence and validity of structured data can be checked across an entire site. This is useful for:
- Verifying all product pages include `Product` schema
- Checking that `Organization` schema exists on the homepage
- Identifying pages with invalid or missing structured data
- Auditing `BreadcrumbList` markup for consistency across the site
- Finding pages where structured data is present but the visible content does not match
- Detecting malformed JSON that would prevent search engines from parsing the markup
Because crawler.sh runs locally and processes pages in bulk, you can audit structured data across thousands of pages without API rate limits or cloud service costs.
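As an illustration of the first two checks, the sketch below works on a plain mapping from URL to the list of parsed JSON-LD blocks found on that page. That mapping is an assumed intermediate structure for this example, not the actual crawler.sh output format, and the function names are hypothetical.

```python
def schema_types(blocks):
    """Collect every @type in a page's JSON-LD, including nested and @graph nodes."""
    types = set()
    stack = list(blocks)
    while stack:
        node = stack.pop()
        if isinstance(node, list):
            stack.extend(node)
        elif isinstance(node, dict):
            declared = node.get("@type")
            if isinstance(declared, str):
                types.add(declared)
            elif isinstance(declared, list):
                types.update(t for t in declared if isinstance(t, str))
            stack.extend(node.values())
    return types


def pages_missing(pages, required_type, url_filter=lambda url: True):
    """Return the sorted URLs that pass url_filter but lack required_type."""
    return sorted(
        url
        for url, blocks in pages.items()
        if url_filter(url) and required_type not in schema_types(blocks)
    )


# Example: which product pages have no Product schema?
pages = {
    "https://example.com/": [{"@type": "Organization", "name": "Example"}],
    "https://example.com/products/headphones": [
        {"@type": "Product", "name": "Headphones",
         "offers": {"@type": "Offer", "price": "249.99"}}
    ],
    "https://example.com/products/speaker": [
        {"@type": "BreadcrumbList", "itemListElement": []}
    ],
}
missing = pages_missing(pages, "Product", lambda url: "/products/" in url)
```

The same helper covers the homepage check by passing `"Organization"` and a filter that matches only the root URL.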