# BEBECHIC storefront — AI-fetch readiness (handoff)

> **Scope:** this is guidance for the **storefront** repo `bebechic.vn` (Next.js 15 App Router + Supabase + Cloudflare Pages), NOT this CDN repo. It distills the 02/07/2026 audit into an actionable checklist. Goal: the storefront is both *fetchable* (AI bots get in) and *readable* (they can parse product, price, and stock), so that answer engines cite it and shopping agents can transact on it.

Priority order for a commerce site (different from a docs site): **access → SSR render → structured data** carry ~90% of the value. `llms.txt` and markdown negotiation are forward investments; do them last.

## P0 — Access layer (blocker; do first)

- **Cloudflare → AI Crawl Control** (was AI Audit) → Crawlers tab → **Agent = Allow, Search = Allow**. Note the 15/09/2026 default: new/Free-tier domains block Agent+Training on ad-bearing pages — a DTC store usually isn't ad-bearing, but verify, don't assume.
- **Bot Fight Mode / Super Bot Fight Mode:** ensure it doesn't challenge `ChatGPT-User` / `Claude-User` fetches. Add verified-bot exceptions if on.
- **WAF/Managed rules:** no rule blocking UAs containing `bot`/`GPT`/`Claude`.
- **`public/robots.txt`** (permission layer; `llms.txt` is NOT permission): allow `ChatGPT-User`, `OAI-SearchBot`, `Claude-User`, `Claude-SearchBot`, `PerplexityBot`, `Google-Extended`, `GPTBot`, `ClaudeBot`. Disallow the private paths with one `Disallow:` line per path: `/checkout`, `/cart`, `/account`, `/api/`. Add a `Sitemap:` line.
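The robots.txt rules above can be sketched as a small generator. This is a minimal illustration, not the storefront's actual build step: `buildRobotsTxt` is a hypothetical helper, and the sitemap URL `https://bebechic.vn/sitemap.xml` is an assumption. The private paths are repeated in both groups because a crawler that matches a named group ignores the `*` group.

```typescript
// Bot tokens and private paths as listed in the checklist.
const AI_BOTS = [
  "ChatGPT-User", "OAI-SearchBot", "Claude-User", "Claude-SearchBot",
  "PerplexityBot", "Google-Extended", "GPTBot", "ClaudeBot",
];
const PRIVATE_PATHS = ["/checkout", "/cart", "/account", "/api/"];

// Hypothetical helper: renders robots.txt text with one Disallow line per path.
function buildRobotsTxt(bots: string[], disallow: string[], sitemapUrl: string): string {
  const disallowLines = disallow.map((path) => `Disallow: ${path}`);
  const lines = [
    // Several User-agent lines may share the rules that follow them.
    ...bots.map((bot) => `User-agent: ${bot}`),
    "Allow: /",
    ...disallowLines,
    "",
    // Everyone else: crawlable by default, private paths still excluded.
    "User-agent: *",
    ...disallowLines,
    "",
    `Sitemap: ${sitemapUrl}`,
  ];
  return lines.join("\n") + "\n";
}

// Assumed sitemap location; write the result to public/robots.txt.
const robotsTxt = buildRobotsTxt(AI_BOTS, PRIVATE_PATHS, "https://bebechic.vn/sitemap.xml");
```

The longest matching rule wins, so `Disallow: /checkout` overrides `Allow: /` for that path.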

## P1 — Render product data in HTML (SSR/SSG)

Most AI crawlers don't run JS. If a product page is client-rendered, bots see an empty shell → no price/stock/description.

- Product pages: **SSG + ISR** (`revalidate`) or SSR. Price / `availability` / name / description must be in the returned HTML, not behind `useEffect` or a client Supabase fetch.
- Fetch data in **Server Components**; don't `createClient` on the client for core content.
- Test: `curl -sA "Claude-User" https://bebechic.vn/san-pham/abc | grep -Ei "299|instock|<name>"`. If the output has no price or name, the bot can't see them either.
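The same check can be scripted for CI. The sketch below assumes you already have the raw HTML as a string (for example, saved from the `curl` call above); `missingMarkers` is a hypothetical helper name, and the marker values are placeholders for a real product's name and price.

```typescript
// Returns the markers that do NOT appear in the HTML (case-insensitive).
// An empty result means every expected value is present in the server response.
function missingMarkers(html: string, markers: string[]): string[] {
  const haystack = html.toLowerCase();
  return markers.filter((marker) => !haystack.includes(marker.toLowerCase()));
}

// Example: an empty client-rendered shell fails, a server-rendered page passes.
const emptyShell = '<div id="__next"></div>';
const renderedPage = "<h1>Sample Dress</h1><span>299.000 ₫</span><link href='https://schema.org/InStock'>";

const shellGaps = missingMarkers(emptyShell, ["Sample Dress", "299", "instock"]);
const renderedGaps = missingMarkers(renderedPage, ["Sample Dress", "299", "instock"]);
```

Here `shellGaps` lists all three markers and `renderedGaps` is empty, which mirrors what the `grep` test shows on the command line.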

## P1 — Structured data (highest ROI for commerce)

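Server-render JSON-LD so bots can read it. In the App Router, emit it as a `<script type="application/ld+json">` tag from the page or layout Server Component; `generateMetadata` produces `<head>` metadata tags, not script tags: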

- **`Product` + `Offer`** on every product page: `priceCurrency: "VND"`, `price`, `availability` (`InStock`/`OutOfStock`), `sku`, `brand`. Pull live from Supabase at build/revalidate — a wrong price in schema is worse than none.
- **`AggregateRating`** if reviews exist.
- **`Organization`** on the homepage (name, logo → use `https://cdn.bebechic.vn/images/logos/…`, `sameAs` socials).
- **`FAQPage`** for size guide / policies.
- Validate at validator.schema.org + Google Rich Results Test.
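A minimal sketch of the `Product` + `Offer` payload, assuming a hypothetical `ProductRow` shape (the real Supabase columns will differ) and the brand name `BEBECHIC`. It uses the full `https://schema.org/...` form for `availability`. `serializeJsonLd` escapes `<` so that product text containing `</script>` cannot close the tag early.

```typescript
// Hypothetical row shape; map the real Supabase columns onto it.
interface ProductRow {
  name: string;
  description: string;
  sku: string;
  priceVnd: number;
  inStock: boolean;
  url: string;
}

// Builds the schema.org Product object with a nested Offer.
function buildProductJsonLd(p: ProductRow): Record<string, unknown> {
  return {
    "@context": "https://schema.org",
    "@type": "Product",
    name: p.name,
    description: p.description,
    sku: p.sku,
    brand: { "@type": "Brand", name: "BEBECHIC" },
    offers: {
      "@type": "Offer",
      url: p.url,
      priceCurrency: "VND",
      price: p.priceVnd,
      availability: p.inStock
        ? "https://schema.org/InStock"
        : "https://schema.org/OutOfStock",
    },
  };
}

// Serializes for embedding inside <script type="application/ld+json">.
function serializeJsonLd(data: unknown): string {
  return JSON.stringify(data).replace(/</g, "\\u003c");
}
```

In a Server Component, the serialized string would go into the script tag via `dangerouslySetInnerHTML={{ __html: serializeJsonLd(jsonLd) }}`, built from the same row that renders the visible price so the two cannot drift apart.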

## P2 — Discovery (lower priority)

- `sitemap.xml` including product URLs.
- `llms.txt` as a **catalog pointer** (for agents, not search): link canonical category + policy pages (returns and exchanges, shipping, size guide). Don't mass-generate per-page `.md` (duplicate content).
- Optional markdown content-negotiation: serve `.md` to AI UAs / `Accept: text/markdown` via a Pages `_middleware.ts`, with `x-robots-tag: noindex`.
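The negotiation decision in the last bullet can be sketched as a pure function that a middleware would call. This is an illustration under assumptions: `wantsMarkdown` and `markdownResponseHeaders` are hypothetical names, the token list is a subset of the P0 robots.txt list, the `Accept` check is a plain substring match that ignores `q` values, and the `Vary` header is a suggested addition so caches keep the HTML and markdown variants apart.

```typescript
// AI user agents that should receive markdown (assumed subset; adjust as needed).
const AI_FETCHER_TOKENS = [
  "ChatGPT-User", "OAI-SearchBot", "Claude-User", "Claude-SearchBot", "PerplexityBot",
];

// Decides whether to serve the .md variant for a request.
function wantsMarkdown(accept: string | null, userAgent: string | null): boolean {
  // An explicit Accept: text/markdown wins regardless of user agent.
  if (accept !== null && accept.toLowerCase().includes("text/markdown")) return true;
  if (userAgent === null) return false;
  const ua = userAgent.toLowerCase();
  return AI_FETCHER_TOKENS.some((token) => ua.includes(token.toLowerCase()));
}

// Headers for the markdown variant; noindex keeps it out of search indexes.
function markdownResponseHeaders(): Record<string, string> {
  return {
    "content-type": "text/markdown; charset=utf-8",
    "x-robots-tag": "noindex",
    vary: "Accept, User-Agent",
  };
}
```

In the middleware, read the two values with `request.headers.get("accept")` and `request.headers.get("user-agent")`, and fall through to the normal HTML response when `wantsMarkdown` returns `false`.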

## Verify (ground truth, not config-guessing)

```bash
curl -sI -A "ChatGPT-User" https://bebechic.vn | head -1                       # access: expect 200
curl -sA  "Claude-User"  https://bebechic.vn/san-pham/abc | grep -Ei "price|instock|299"   # render
curl -sA  "GPTBot"       https://bebechic.vn/san-pham/abc | grep -A2 "application/ld+json"  # schema
```
Then check **AI Crawl Control → Crawlers** to see which bots actually hit and whether any got 403.
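The schema `grep` only shows that a JSON-LD tag exists, not that it parses. A rough sketch of a stricter check, assuming the HTML is already in a string: `extractJsonLd` is a hypothetical helper that uses a simple regex rather than a full HTML parser, which is adequate for a smoke test but not for arbitrary markup.

```typescript
// Pulls every <script type="application/ld+json"> body out of the HTML
// and returns the ones that parse as JSON.
function extractJsonLd(html: string): unknown[] {
  const pattern = /<script[^>]*type=["']application\/ld\+json["'][^>]*>([\s\S]*?)<\/script>/gi;
  const blocks: unknown[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(html)) !== null) {
    try {
      blocks.push(JSON.parse(match[1]));
    } catch {
      // Malformed JSON-LD is skipped here; treat a shortfall as a failure upstream.
    }
  }
  return blocks;
}

// Example input with one valid block and one broken block.
const sampleHtml =
  '<script type="application/ld+json">{"@type":"Product","offers":{"price":299000}}</script>' +
  '<script type="application/ld+json">{not json}</script>';
const parsedBlocks = extractJsonLd(sampleHtml);
```

`parsedBlocks` holds only the valid `Product` block, so a count lower than the number of script tags signals broken structured data.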

---

*Brand assets (logo, mascots, tokens) for the storefront come from this CDN — see [/docs/context.md](https://cdn.bebechic.vn/docs/context.md). This checklist covers plumbing only, not brand rules.*
