Technical Documentation — LightSite AI Platform

Complete technical documentation for LightSite AI's Generative Engine Optimization platform. This page covers integration methods, what LightSite deploys on your domain, how AI crawlers use your machine-readable layer, endpoint categories, verification tools, and implementation FAQs.

Overview

LightSite AI deploys a machine-readable discovery layer on your domain that helps AI assistants like ChatGPT, Gemini, Claude, and Perplexity understand, extract, and cite your business. This layer includes structured data (JSON-LD), an AI-specific sitemap, machine-readable endpoints, and a Skills API manifest — all published alongside your existing website without changing its design or content.

Integration Methods

LightSite supports multiple integration paths depending on your tech stack:

  • Direct script install — add a single script tag to your site's <head> section. No build step required.
  • Tag manager — deploy through Google Tag Manager or any tag management platform.
  • CMS integration — works with Shopify, WordPress, Webflow, Wix, and any CMS that allows head injection.
  • Custom integration — for headless CMS setups, SPAs, or custom-built sites. Full API access available.

Most teams are live within minutes. No developer sprints, no migration, no waiting for deploys.
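
After any of the install paths above, a quick sanity check is to confirm the tag actually landed inside your page's <head>. The sketch below does this with Python's standard html.parser; the script URL (cdn.lightsite.ai/ls.js) is a hypothetical placeholder — use the exact snippet from your own setup.

```python
from html.parser import HTMLParser

# Hypothetical script URL -- substitute the snippet from your LightSite setup.
SCRIPT_SRC = "https://cdn.lightsite.ai/ls.js"

class HeadScriptFinder(HTMLParser):
    """Records the src of every <script> that appears inside <head>."""
    def __init__(self):
        super().__init__()
        self.in_head = False
        self.head_scripts = []

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "script" and self.in_head:
            src = dict(attrs).get("src")
            if src:
                self.head_scripts.append(src)

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False

def tag_installed(html: str) -> bool:
    finder = HeadScriptFinder()
    finder.feed(html)
    return SCRIPT_SRC in finder.head_scripts

# In production you would fetch your homepage HTML; here we use a fixture.
sample = ("<html><head><script src='https://cdn.lightsite.ai/ls.js'>"
          "</script></head><body></body></html>")
print(tag_installed(sample))  # True when the tag is present in <head>
```

A tag placed in <body> by mistake would fail this check, which is the point: only head placement counts as a correct install.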

What Gets Created on Your Domain

After integration, LightSite publishes the following assets on your domain:

  • AI sitemap — a specialized sitemap that points AI crawlers to your structured, machine-readable endpoints rather than raw HTML pages.
  • JSON-LD structured data — schema.org markup injected into your pages covering Organization, Product, FAQ, Review, and other entity types relevant to your business.
  • Machine-readable endpoints — clean, parseable URLs on your domain that return structured business data AI systems can consume directly.
  • Skills manifest (skills.json) — a machine-readable file that declares the skills your website supports, allowing AI agents to discover and invoke them.
  • Discovery endpoints — including ai-plugin.json, openapi.json, and WACP protocol files for agent-to-website communication.
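
To make the JSON-LD piece concrete, here is a minimal Organization snippet of the kind described above, built and serialized in Python. The field values are illustrative placeholders, not output from the platform.

```python
import json

# Illustrative schema.org Organization markup -- all values are placeholders.
organization = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "Example Co.",
    "description": "Example Co. sells example widgets.",
    "foundingDate": "2015-01-01",
    "url": "https://www.example.com",
    "contactPoint": {
        "@type": "ContactPoint",
        "contactType": "customer support",
        "email": "support@example.com",
    },
}

# JSON-LD ships inside a <script type="application/ld+json"> tag in the page.
snippet = (
    '<script type="application/ld+json">\n'
    + json.dumps(organization, indent=2)
    + "\n</script>"
)
print(snippet)
```

Product, FAQ, and Review entities follow the same pattern with their own schema.org types and properties.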

Endpoint Categories

LightSite creates structured endpoints for the following data categories:

  • business.profile — company name, description, founding date, location, contact information, and trust signals.
  • products.search — product catalog with descriptions, categories, and specifications that AI assistants can query.
  • faq.answer — structured FAQ data that AI systems can use to answer questions about your business directly.
  • testimonials.list — customer reviews and testimonials in structured format for trust signal extraction.
  • qa.search — natural-language question answering that lets AI agents search your content semantically.
  • pages.list — structured index of your key pages with metadata for discovery.
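
The category names above can be read as a routing table. The sketch below pairs each category with a hypothetical URL path and shows the kind of flat JSON a consumer might expect back; both the paths and the payload shape are assumptions, since the exact wire format is not specified here.

```python
# Hypothetical mapping from endpoint category to a path on your domain.
ENDPOINT_PATHS = {
    "business.profile": "/ai/business/profile.json",
    "products.search": "/ai/products/search.json",
    "faq.answer": "/ai/faq/answers.json",
    "testimonials.list": "/ai/testimonials.json",
    "qa.search": "/ai/qa/search.json",
    "pages.list": "/ai/pages.json",
}

# Illustrative business.profile response -- shape is an assumption.
sample_profile = {
    "name": "Example Co.",
    "description": "Example Co. sells example widgets.",
    "foundingDate": "2015-01-01",
    "location": "Austin, TX",
    "contact": {"email": "support@example.com"},
}

def url_for(category: str, domain: str = "https://www.example.com") -> str:
    """Resolve an endpoint category to a full URL on the domain."""
    return domain + ENDPOINT_PATHS[category]

print(url_for("business.profile"))
# An agent consuming the profile reads fields directly -- no HTML parsing:
print(sample_profile["name"], "-", sample_profile["foundingDate"])
```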

Skills API — Agentic Web Infrastructure

The Skills API allows your website to actively communicate with AI agents through structured, callable endpoints. Instead of relying on AI systems to scrape and interpret your HTML, you declare what your website can do — and agents invoke those capabilities directly.

LightSite monitors which agents call which skills, what natural-language queries they send, and what data they extract. This gives marketing teams a new source of demand intelligence: real questions from real AI assistants, asked on behalf of real users.
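
A declare-then-invoke interaction might look like the sketch below: the site publishes a manifest of callable skills, and an agent dispatches a call by name. The manifest fields and handler signatures are hypothetical — this is a mental model, not LightSite's actual skills.json format.

```python
import json

# Hypothetical skills.json content -- field names are assumptions.
SKILLS_MANIFEST = json.loads("""
{
  "skills": [
    {"name": "faq.answer", "description": "Answer a question from the FAQ",
     "input": {"question": "string"}},
    {"name": "products.search", "description": "Search the product catalog",
     "input": {"query": "string"}}
  ]
}
""")

# Server-side handlers keyed by skill name (stand-ins for real logic).
HANDLERS = {
    "faq.answer": lambda args: {"answer": f"FAQ answer for: {args['question']}"},
    "products.search": lambda args: {"results": [f"widget matching {args['query']}"]},
}

def invoke(skill_name: str, args: dict) -> dict:
    """Invoke a declared skill; reject anything not in the manifest."""
    declared = {s["name"] for s in SKILLS_MANIFEST["skills"]}
    if skill_name not in declared:
        raise ValueError(f"skill not declared: {skill_name}")
    return HANDLERS[skill_name](args)

print(invoke("faq.answer", {"question": "Do you ship internationally?"}))
```

The manifest doubles as the monitoring surface: every invocation carries a skill name and a natural-language argument, which is exactly the demand-intelligence signal described above.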

Early adoption data shows 90.5% of Meta AI interactions use the Skills API when available, with over 633 skill invocations tracked across production deployments.

How AI Crawlers Use Your Machine-Readable Layer

When an AI system encounters your domain, the discovery flow works as follows:

  1. The crawler checks your AI sitemap and discovery endpoints (ai-plugin.json, skills.json).
  2. It identifies available structured data endpoints and their categories.
  3. It fetches structured business data from your machine-readable endpoints — clean JSON rather than parsing raw HTML.
  4. The extracted facts enter the AI system's context window or knowledge base, making your brand citable in AI-generated answers.
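
The four steps above can be sketched end to end. This simulation runs against an in-memory map of paths to documents instead of real HTTP, and every path and payload is illustrative.

```python
import json

# In-memory stand-in for a domain's machine-readable layer (illustrative).
SITE = {
    "/ai-sitemap.xml": "<urlset><url><loc>/ai/business/profile.json</loc></url></urlset>",
    "/.well-known/ai-plugin.json": json.dumps({"name": "Example Co."}),
    "/skills.json": json.dumps({"skills": [{"name": "business.profile"}]}),
    "/ai/business/profile.json": json.dumps(
        {"name": "Example Co.", "foundingDate": "2015-01-01"}
    ),
}

def crawl(site: dict) -> dict:
    # 1. Check the AI sitemap and discovery endpoints.
    assert "/ai/business/profile.json" in site["/ai-sitemap.xml"]
    plugin = json.loads(site["/.well-known/ai-plugin.json"])
    skills = json.loads(site["/skills.json"])
    # 2. Identify available structured-data categories.
    categories = [s["name"] for s in skills["skills"]]
    # 3. Fetch clean JSON from the machine-readable endpoints.
    profile = json.loads(site["/ai/business/profile.json"])
    # 4. Extracted facts become citable context for the AI system.
    return {"brand": plugin["name"], "categories": categories, "facts": profile}

context = crawl(SITE)
print(context["brand"], context["facts"]["foundingDate"])
```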

LightSite's empirical research across 5 million bot requests showed that machine-readable pages achieve +12% extraction success, +17% crawl depth, and +13% crawl rate from AI bots, including the OpenAI, Anthropic, and Perplexity crawlers.

Verification and Monitoring

LightSite provides built-in tools to verify that your machine-readable layer is working:

  • Extraction proof — see exactly what data AI systems extracted from your endpoints.
  • Bot analytics — track which AI crawlers visit your site, how often, and which endpoints they access.
  • Skill usage monitoring — see which skills agents invoke, what queries they send, and how they use your data.
  • Crawl monitoring — real-time visibility into AI bot crawl patterns, frequency, and depth across your domain.
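
Alongside the built-in dashboards, a simple external check can confirm the layer is live: each endpoint should return parseable JSON containing the keys you expect. The paths and required keys below are assumptions, and in production you would fetch the responses over HTTP rather than from fixtures.

```python
import json

# Fixture responses keyed by path -- in production, fetch these over HTTP.
RESPONSES = {
    "/skills.json": '{"skills": []}',
    "/ai/business/profile.json": '{"name": "Example Co."}',
    "/ai/faq/answers.json": 'not json at all',  # deliberately broken
}

# Keys each endpoint is expected to expose (assumed, not documented).
REQUIRED_KEYS = {
    "/skills.json": ["skills"],
    "/ai/business/profile.json": ["name"],
    "/ai/faq/answers.json": ["questions"],
}

def verify(responses: dict, required: dict) -> dict:
    """Return {path: 'ok' | error message} for each endpoint."""
    report = {}
    for path, body in responses.items():
        try:
            doc = json.loads(body)
        except json.JSONDecodeError:
            report[path] = "invalid JSON"
            continue
        missing = [k for k in required.get(path, []) if k not in doc]
        report[path] = f"missing keys: {missing}" if missing else "ok"
    return report

for path, status in verify(RESPONSES, REQUIRED_KEYS).items():
    print(path, "->", status)
```

Running a check like this on a schedule catches the common failure mode: an endpoint that silently starts returning an error page instead of JSON.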

Implementation FAQ

Do I need developer support to install LightSite?

Usually no. If you have access to your site's <head> section or a tag manager, you can deploy in minutes. For complex setups (headless CMS, custom middleware), the LightSite team provides implementation support.

Does LightSite change my website's design or page layout?

No. The machine-readable layer is published separately from your visitor-facing pages. Your design, content, and user experience remain unchanged.

How long does deployment take?

Most teams are live within 5 to 15 minutes. The structured data, AI sitemap, and machine-readable endpoints are published immediately after configuration.

Does LightSite work with Shopify, WordPress, or custom-built sites?

Yes. LightSite works alongside any platform — Shopify, WordPress, Webflow, Wix, custom-built sites, or headless CMS setups. It does not require a plugin or CMS-specific integration.

Does it conflict with my existing schema markup or SEO plugins?

No. LightSite's machine-readable layer is additive. It publishes separate AI-optimized endpoints and an AI sitemap alongside whatever structured data you already have.

Can I control what data is exposed to AI systems?

Yes. You configure your business context — which information to publish, which endpoints to enable, and which skills to make available. Nothing is published without your explicit configuration.
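
A rough mental model for this configuration is an allow-list: endpoints and skills are off unless explicitly enabled. The structure below is a hypothetical illustration, not LightSite's actual configuration schema.

```python
# Hypothetical exposure config -- everything defaults to off.
config = {
    "endpoints": {
        "business.profile": True,
        "products.search": True,
        "testimonials.list": False,  # keep reviews out of the AI layer
    },
    "skills": {"faq.answer": True},
}

def is_published(kind: str, name: str, cfg: dict) -> bool:
    """An asset is exposed only with explicit opt-in."""
    return cfg.get(kind, {}).get(name, False)

print(is_published("endpoints", "business.profile", config))   # True
print(is_published("endpoints", "testimonials.list", config))  # False
print(is_published("endpoints", "qa.search", config))          # False (never configured)
```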
