What it is
crawler.sh flags a page as having a content freshness issue when its datePublished or dateModified metadata is missing, stale, inconsistent across sources, or self-contradictory. This is check #24 in the SEO analysis.
The check inspects up to five date sources per page and flags six distinct sub-issues:
Date sources inspected:
- JSON-LD - `datePublished` and `dateModified` inside `Article`, `BlogPosting`, `NewsArticle`, `WebPage`, `Report`, or related types (including `@graph` entries).
- Open Graph - `<meta property="article:published_time">` and `<meta property="article:modified_time">`.
- HTTP - the `Last-Modified` response header.
- Readability-extracted dates (when content extraction is enabled).
Sub-issues flagged:
- Missing freshness signals - HTML page has no date in any of the five sources.
- Stale content - the most recent valid date is older than 730 days (configurable with `--stale-after-days`).
- Inconsistent freshness signals - two same-kind dates (e.g. two modified dates) disagree by more than 7 days.
- Invalid date format - a non-empty date string fails to parse as RFC 3339, RFC 2822, or ISO 8601.
- dateModified before datePublished - a logical impossibility, usually a CMS bug.
- Missing structured data dates - the page has Open Graph or HTTP `Last-Modified` dates but no JSON-LD `datePublished`/`dateModified`, weakening rich-result eligibility.
Why it matters for SEO
Search engines weight recency. Google’s freshness algorithm (a refinement of QDF, “query deserves freshness”) boosts up-to-date content for time-sensitive queries: news, product reviews, how-to guides, statistics, and anything where users expect current information. A page with no date signals at all forces search engines to guess from the URL, content patterns, or sitemap lastmod. That guess is often wrong.
Concrete impacts:
- Rich results - Article and BlogPosting rich-result eligibility requires JSON-LD with `datePublished`. Pages missing it cannot show the published-date sitelink under a search snippet.
- Sitemap trust - if your sitemap claims `lastmod: 2025-12-01` but the page has no on-page date or shows `2018-04-12` in JSON-LD, search engines distrust the sitemap signal across the whole site.
- Discover and News - both Google Discover and Google News require a clear, recent date to surface a page. Missing or invalid dates exclude content from these surfaces entirely.
Why it matters for AEO
AI answer engines like ChatGPT, Perplexity, and Google AI Overviews favor recent sources when synthesizing answers. They use datePublished and dateModified to:
- Decide whether to cite a page at all (an undated 2017 page rarely makes the cut).
- Order multiple sources by recency when listing them.
- Disclose the freshness of cited information to the user.
Pages without date metadata are systematically demoted as candidates. Worse, a page with conflicting dates (OG says 2024-01, JSON-LD says 2019-03) can be skipped entirely because the engine cannot trust either signal.
How to fix it
1. Add JSON-LD with both dates (the highest-trust signal):
```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "Your article title",
  "datePublished": "2026-01-15T10:00:00Z",
  "dateModified": "2026-05-04T14:30:00Z",
  "author": { "@type": "Person", "name": "Author Name" }
}
</script>
```

2. Add Open Graph article date tags as a complementary signal:

```html
<meta property="article:published_time" content="2026-01-15T10:00:00Z" />
<meta property="article:modified_time" content="2026-05-04T14:30:00Z" />
```

3. Configure your server to send `Last-Modified` for static and CDN-cached pages. Most static-site generators and CDNs do this automatically based on the file's modification time.
Consistency rules:
- `dateModified` must be greater than or equal to `datePublished`. If the page has not been updated since publishing, set `dateModified` equal to `datePublished`.
- All sources should agree within a few days. If your CMS updates JSON-LD on every save but only writes the OG tag on initial publish, the two will drift. Update both together.
- Use ISO 8601 with timezone (RFC 3339). `2026-05-04` works but `2026-05-04T10:00:00Z` is unambiguous.
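The three accepted formats (RFC 3339, ISO 8601, RFC 2822) can all be parsed with the Python standard library. A sketch of such a parser, assuming dates are tried ISO-first (note that `datetime.fromisoformat` only accepts a trailing `Z` from Python 3.11 onward, hence the normalization):

```python
from datetime import datetime
from email.utils import parsedate_to_datetime

def parse_page_date(value: str):
    """Return a datetime for RFC 3339 / ISO 8601 / RFC 2822 input, else None."""
    value = value.strip()
    # Python < 3.11 fromisoformat rejects a trailing "Z"; normalize it.
    iso_value = value[:-1] + "+00:00" if value.endswith("Z") else value
    try:
        return datetime.fromisoformat(iso_value)
    except ValueError:
        pass
    try:
        # RFC 2822, e.g. the shape used by Last-Modified headers.
        return parsedate_to_datetime(value)
    except (ValueError, TypeError):
        return None
```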
What crawler.sh reports
In the CLI, freshness issues appear across six sections of `crawler seo` output: Missing freshness signals, Stale content (>N days), Inconsistent freshness signals, Invalid date format, dateModified before datePublished, and Missing structured data dates. Configure the staleness threshold with `--stale-after-days N` (default 730).
In the desktop app, the same six issues appear in the SEO Issues card. A dedicated Content Freshness dashboard card shows the median page age, the percentage of pages updated in the last 90 days and the last year, and a per-page list with color-coded “updated X ago” badges.
Tip: start by fixing the Missing freshness signals category. Adding JSON-LD `datePublished`/`dateModified` to undated pages is the highest-leverage change, especially for blog posts, news articles, and documentation.