What Is llms.txt, and How to Check Yours

llms.txt is a proposed standard that gives AI models a clean, curated map of your most important content. Here is what it is, what goes in it, and how to check and create one.

StackOptic Research Team | 10 Apr 2026 | 7 min read

If robots.txt and sitemap.xml are how websites talk to search crawlers, llms.txt is an emerging attempt to talk to AI models. It is a proposed standard — a simple file that hands large language models a clean, curated map of your most important content, so the material they draw on to answer questions is the material you would choose for them. This guide explains what llms.txt is, how it differs from the files you already know, what goes inside it, whether it actually helps, and how to check and create your own.

It is a natural companion to the broader question of whether AI crawlers can access your site, and to generative engine optimisation (GEO) generally.

What llms.txt is

llms.txt is a plain-text file, written in Markdown, placed at the root of your domain at yourdomain.com/llms.txt. Its job is to give AI models a concise, curated overview of your site: what your site is about, and where the content that matters most actually lives. The reasoning is simple. When an AI crawler visits a large site, it does not inherently know which pages are important and which are noise; if it spends its attention on thin or outdated pages, any answer it generates about you is built on weaker material. llms.txt is meant to address that by letting you point models at your best, most authoritative content directly: a curator's note rather than a free-for-all.

Markdown was chosen deliberately: it is human-readable, trivially machine-parseable, and the same lightweight format models are extensively trained on (it is what powers countless README files). That makes an llms.txt file easy for both a person to maintain and a model to ingest.

How it differs from robots.txt and sitemap.xml

It is easy to confuse the three files, but they do different jobs. robots.txt is about permission — it tells crawlers which paths they may and may not fetch. sitemap.xml is about coverage — it lists your URLs so crawlers can discover them all. llms.txt is about understanding and prioritisation — it describes, in plain language, what your site is and which content is most worth using. robots.txt and sitemap.xml are written for search engines and are well established; llms.txt is written specifically for AI models and is still a proposal. They are complementary: a complete setup has all three, each doing its own job. None of them replaces another, and llms.txt in particular is additive — a layer of guidance on top of the access and discovery that robots.txt and sitemap.xml already provide.
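To make the "permission" role concrete, here is a minimal sketch using Python's standard-library robots.txt parser. The rules and the `ExampleAIBot` user agent are hypothetical and are parsed from a string rather than fetched. The point is only that robots.txt answers "may this crawler fetch this URL?", a question llms.txt does not address.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules; "ExampleAIBot" is an invented user agent.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/

User-agent: ExampleAIBot
Disallow: /drafts/
"""


def allowed(user_agent: str, url: str) -> bool:
    """Return True if the rules above let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(allowed("ExampleAIBot", "https://acme.example/drafts/post"))
    print(allowed("ExampleAIBot", "https://acme.example/docs/start"))
```

Nothing in this check says which pages are worth reading; that is the gap llms.txt is proposed to fill.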

What goes inside an llms.txt file

The proposed structure is deliberately minimal. At the top sits a single H1 with the name of your site or product. Beneath it, an optional blockquote gives a short summary of what the site is and who it serves. Then come one or more sections (H2 headings) containing lists of links to your most important pages, each link followed by a brief description of what it covers. A common pattern is to group links under headings like "Docs", "Guides", "Products" or "About", and many sites add an "Optional" section for secondary material a model can skip if it is short on context. The companion llms-full.txt convention goes further by inlining the full text of key pages into a single file, so a model can consume everything in one fetch rather than following links. The guiding principle throughout is curation: include what represents you best, describe it clearly, and leave out the clutter.
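The structure described above is simple enough to read with a few lines of code. The following is a rough sketch, not a reference parser: the `LlmsTxt` class and `parse_llms_txt` function are names invented for illustration, and the link pattern assumes the common `- [Title](url): description` form with one link per line.

```python
import re
from dataclasses import dataclass, field

# Matches "- [Title](https://example.com/page): optional description"
LINK_RE = re.compile(
    r"^[-*]\s+\[(?P<title>[^\]]+)\]\((?P<url>[^)\s]+)\)(?::\s*(?P<desc>.*))?$"
)


@dataclass
class LlmsTxt:
    title: str = ""
    summary: str = ""
    # section heading -> list of (link title, url, description)
    sections: dict = field(default_factory=dict)


def parse_llms_txt(text: str) -> LlmsTxt:
    """Parse the H1, the blockquote summary and the H2 link sections."""
    doc = LlmsTxt()
    current = None  # name of the H2 section being read, if any
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# ") and not doc.title:
            doc.title = line[2:].strip()
        elif line.startswith("> ") and current is None and not doc.summary:
            doc.summary = line[2:].strip()
        elif line.startswith("## "):
            current = line[3:].strip()
            doc.sections[current] = []
        elif current is not None:
            match = LINK_RE.match(line)
            if match:
                doc.sections[current].append(
                    (match["title"], match["url"], match["desc"] or "")
                )
    return doc
```

Calling `parse_llms_txt` on a file's contents gives you the title, the summary and a dictionary of sections in file order, which is enough to inspect or lint a file. Only the first blockquote line is kept as the summary, a simplification for the sketch.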

Is it a real standard yet?

Be honest with yourself about the status. llms.txt is a proposed standard with growing grassroots adoption (many documentation sites, SaaS products and CMS plugins now generate one), but the major AI providers have not all formally committed to reading or honouring it. That means you should treat any direct ranking or citation benefit as unproven rather than guaranteed. So why bother? Because it is cheap to add, a well-maintained file is unlikely to hurt, and it aligns with the wider GEO principle of making your best content easy for machines to find, understand and trust. Adopting it now is low-risk future-proofing: if and when the engines lean on it, you are ready; if they do not, you have lost little more than the time it took to write. Plenty of forward-looking teams make exactly that bet.

How to check yours

Checking is the easy part. Type yourdomain.com/llms.txt into a browser. If a Markdown document loads — starting with a single H1 and listing your key pages — you have one, and you should confirm it is valid Markdown, that the links resolve, and that it reflects your current important content rather than a stale snapshot. If you get a 404, you do not have one yet. Because the file is just text at a known path, this check takes seconds, and it is worth repeating periodically, since a file written a year ago may point at pages you have since moved or retired. Tools that assess AI/GEO readiness — StackOptic among them — will check for the file and flag whether it is present and sensible as part of a broader review.
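The same check can be scripted. The sketch below uses only the Python standard library; `fetch_llms_txt` and `find_problems` are illustrative names, and the three checks are a deliberately small subset of what "valid and sensible" might mean. Checking the first line matters because some servers answer unknown paths with an HTML page and a 200 status instead of a 404.

```python
import re
import urllib.error
import urllib.request
from typing import List, Optional


def fetch_llms_txt(domain: str, timeout: float = 10.0) -> Optional[str]:
    """Return the body of https://<domain>/llms.txt, or None on a 404."""
    url = f"https://{domain}/llms.txt"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return None
        raise


def find_problems(text: str) -> List[str]:
    """Run a few basic structural checks on the file's contents."""
    problems = []
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines or not lines[0].startswith("# "):
        problems.append("file does not start with an H1 title")
    if sum(1 for ln in lines if ln.startswith("# ")) > 1:
        problems.append("more than one H1 heading")
    if not re.search(r"\[[^\]]+\]\(https?://[^)\s]+\)", text):
        problems.append("no Markdown links found")
    return problems


if __name__ == "__main__":
    body = fetch_llms_txt("yourdomain.com")  # replace with your own domain
    if body is None:
        print("No llms.txt found (404)")
    else:
        print(find_problems(body) or "Looks structurally fine")
```

An empty list from `find_problems` only means the file has the expected shape; whether it still reflects your most important content is an editorial judgement.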

How to create one

You have two routes. The manual route is to write the Markdown yourself: an H1 with your site name, a short blockquote summary, and grouped lists of your most important URLs with short descriptions, saved as llms.txt at your site root. For most sites this takes well under an hour and gives you full editorial control. The automated route is to use a generator or a CMS plugin that builds the file from your existing content and structure. That is convenient for large or frequently changing sites, though you should still review what it produces so the file genuinely highlights your best material rather than dumping everything. Whichever route you choose, treat the file as living documentation: update it when your key pages change, just as you would a sitemap.
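For a sense of what the automated route does under the hood, here is a minimal generator sketch. It assumes you already hold a curated list of pages; `build_llms_txt` and its argument shapes are invented for illustration and do not reflect any particular plugin.

```python
def build_llms_txt(title, summary, sections):
    """Render an llms.txt document as a string.

    sections maps a heading to a list of (name, url, description) tuples,
    in the order they should appear in the file.
    """
    lines = [f"# {title}", ""]
    if summary:
        lines += [f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        for name, url, description in links:
            lines.append(f"- [{name}]({url}): {description}")
        lines.append("")
    # End with exactly one trailing newline.
    return "\n".join(lines).rstrip() + "\n"


if __name__ == "__main__":
    content = build_llms_txt(
        "Acme Analytics",
        "Acme Analytics is a privacy-first web analytics platform for small teams.",
        {
            "Docs": [
                ("Getting started", "https://acme.example/docs/start",
                 "install and first dashboard"),
            ],
        },
    )
    # Write the result to the directory your web server serves as the site root.
    with open("llms.txt", "w", encoding="utf-8") as fh:
        fh.write(content)
```

The curation still happens in the data you pass in: the function only formats what you have chosen to include.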

llms.txt for different kinds of sites

What belongs in your llms.txt depends on what your site is. A documentation site or developer tool benefits most — it can point models straight at its reference docs, guides and API pages, which is exactly the high-value content people ask AI about, and where llms.txt adoption is currently strongest. A SaaS product site should surface its core product pages, key feature explanations, pricing and support content, so an AI describing the product gets it right. A blog or publisher can list pillar articles and category hubs rather than every post. An e-commerce store might highlight its top categories, buying guides and policies. The common thread is the same in every case: include the pages that best represent what you do and answer what people actually ask, and leave out the thin, transactional or duplicate pages that would only dilute a model's understanding of you.

A simple llms.txt example

A minimal file is short and readable. It opens with a title, a one-line summary, and grouped links:

# Acme Analytics

> Acme Analytics is a privacy-first web analytics platform for small teams.

## Docs
- [Getting started](https://acme.example/docs/start): install and first dashboard
- [Tracking API](https://acme.example/docs/api): events, properties and limits

## Guides
- [GDPR-compliant analytics](https://acme.example/guides/gdpr): how Acme stays compliant

## About
- [Pricing](https://acme.example/pricing): plans and limits
- [Security](https://acme.example/security): data handling and certifications

That is the whole idea: a curated, annotated table of contents a model can read in one pass. Keep the descriptions short and factual, and order the sections from most to least important.

How llms.txt fits your wider AI strategy

It is worth keeping llms.txt in proportion. On its own it is a small, optional file with unproven direct impact; as part of a coherent GEO approach it is a sensible finishing touch. The heavy lifting is still done by allowing AI crawlers, structuring content for extraction, and earning trust through sourcing and authority. Think of llms.txt as the "table of contents" layer that sits on top of all that: once your content is accessible, well-structured and credible, llms.txt simply makes it easier for a model to find the best of it quickly. Add it after the fundamentals are in place, not instead of them, and revisit it whenever your most important pages change.

Common mistakes

  • Treating it as enforcement. Like robots.txt, llms.txt is guidance; it does not force any model to do anything.
  • Listing everything. The value is curation; a file that includes every URL defeats the purpose.
  • Letting it go stale. Dead links and retired pages undermine the file's usefulness.
  • Expecting guaranteed gains. It is an emerging standard — adopt it as sensible hygiene, not a silver bullet.
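
The "going stale" mistake is the easiest one to catch automatically. As a hedged sketch, the helper below extracts every linked URL from an llms.txt file and reports those whose HTTP status is 400 or above. The function names are illustrative, and the status lookup is passed in as a parameter so the logic can be tried without network access.

```python
import re
import urllib.error
import urllib.request
from typing import Callable, List

# Captures the URL part of Markdown links such as [Title](https://...)
LINK_RE = re.compile(r"\[[^\]]+\]\((https?://[^)\s]+)\)")


def extract_urls(text: str) -> List[str]:
    """Return every http(s) URL that appears inside a Markdown link."""
    return LINK_RE.findall(text)


def stale_links(text: str, get_status: Callable[[str], int]) -> List[str]:
    """Return the linked URLs whose status code is 400 or above."""
    return [url for url in extract_urls(text) if get_status(url) >= 400]


def http_status(url: str, timeout: float = 10.0) -> int:
    """Look up a URL's status with a HEAD request (one possible get_status)."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code
```

Running `stale_links(text, http_status)` on a schedule would flag moved or retired pages. Some servers reject HEAD requests, so treat an unexpected error code as a prompt to check by hand rather than as proof the page is gone.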

Frequently asked questions

What is llms.txt?

llms.txt is a proposed web standard: a plain-text file, written in Markdown and placed at your site's root (yourdomain.com/llms.txt), that gives large language models a concise, curated map of your most important content. Rather than letting AI crawlers guess what matters on your site, it points them to your best pages with short descriptions, so the content they use to answer questions is the content you would choose.

How is llms.txt different from robots.txt and sitemap.xml?

robots.txt tells crawlers what they may and may not access; sitemap.xml lists all your URLs for discovery; llms.txt curates and describes your most important content specifically for AI models, in human- and machine-readable Markdown. robots.txt controls access, sitemap.xml aids coverage, and llms.txt aids understanding and prioritisation — they are complementary, not replacements.

How do I check if my site has an llms.txt file?

Visit yourdomain.com/llms.txt in a browser. If you see a Markdown document starting with a single H1 title, you have one; if you get a 404, you do not. Check that it is valid Markdown, that the links work, and that it reflects your current important content. Some GEO and SEO tools, including StackOptic, check for and assess llms.txt as part of an AI-readiness review.

Does llms.txt actually improve AI visibility?

It is an emerging, proposed standard that the major AI providers have not all formally committed to honouring yet, so treat any visibility benefit as unproven rather than guaranteed. That said, it is cheap to add, does no harm, and fits the broader GEO principle of making your best content easy for machines to find and understand. Many teams adopt it as low-risk future-proofing.

How do I create an llms.txt file?

Write a Markdown file with a single H1 title (your site or product name), an optional blockquote summary, and one or more sections of links to your most important pages, each with a short description. Save it as llms.txt at your site root. You can write it by hand, or use a generator or CMS plugin that builds it from your content — then keep it updated as your key pages change.
