0.003     2026-06-11 02:22:02Z

 - Detect: captcha signal no longer false-positives on cookie-banner /
   privacy-policy mentions of reCAPTCHA. A content-rich page now only walls when
   a markdown captcha marker co-occurs with captcha-PROMPT language ("complete
   the captcha to continue", "I'm not a robot", "verify you are human", …); thin
   pages still wall on any marker (JS-rendered gate) and html-only markers on
   rich pages still never wall

 - Detect: signals now also flags WAF / bot-management gates that REDIRECT to a
   challenge URL instead of embedding a widget. When the page's final_url
   matches a known challenge endpoint, blocked is raised for Cloudflare /
   DataDome / PerimeterX (/cdn-cgi/challenge, __cf_chl, /challenge-platform/,
   datadome, geo.captcha-delivery.com, /px/captcha, perimeterx) and captcha for
   provider verification endpoints (google.com/recaptcha, /recaptcha/api,
   hcaptcha.com). Additive and OR-ed in; cosmetic redirects (http->https, www,
   trailing slash) and an absent final_url never trigger

0.002     2026-05-30 22:57:39Z

 - Result: expose response_headers (lowercased keys) from the origin HTTP fetch,
   round-trips through to_hash/from_hash JSON persistence; empty hash default

0.001     2026-05-29 23:36:25Z

 - Initial release
 - WWW::Crawl4AI: Perl client and fallback orchestrator for Crawl4AI
 - WWW::Crawl4AI::Client: UA-agnostic REST client (/crawl, /md, /crawl/job,
   /crawl/job/{task_id}, /health) with request/parse/convenience flavours
 - WWW::Crawl4AI::Request: BrowserConfig/CrawlerRunConfig payload builder
 - Visible strategy chain: plain, browser, stealth, cloakbrowser (CDP), proxy,
   callback — escalated in cost/complexity order
 - CloakBrowser strategy: per-domain fingerprint seed is now a deterministic
   32-bit FNV-1a hash of the host (CloakBrowser requires a numeric seed and
   rejects raw host strings with HTTP 400)
 - WWW::Crawl4AI::Result with attempt history, signals, backend and cost_class
 - Result link accessors: urls (deduped, absolute, fragment-stripped),
   internal_links, external_links, links — no reaching into raw
 - deep_crawl: breadth-first crawl following each page's links through the full
   strategy chain (max_pages / max_depth / same_host / url_filter / on_page)
 - Single-URL action endpoints on the Client (and delegated from WWW::Crawl4AI):
   screenshot / pdf (raw bytes), html (preprocessed), execute_js (page +
   js_result), llm (LLM Q&A), token (JWT) — each with request/parse/convenience
   flavours like the rest
 - WWW::Crawl4AI::Detect: service detection + content-quality classification
   (js_required / blocked / captcha / thin_html)
 - WWW::Crawl4AI::Error structured error model (transport/api/job/content)
 - bin/www-crawl4ai-doctor and bin/www-crawl4ai-test-url
 - examples/docker-compose.yml (+ proxy escalation variant)
