---
title: "Designing Your Site for AI Agents: llms.txt, RSS, and Semantic HTML"
description: "How to make your personal site readable by AI agents and LLMs, implementing llms.txt, structured RSS, semantic HTML, and machine-readable metadata"
date: 2026-03-29
category: Engineering
readingTime: "3 min read"
---


The web is changing. Browsers are not the only things reading your site anymore. AI agents, LLMs, and automated tools are consuming web content at scale. If your site is not designed for machine readability alongside human readability, you are invisible to a growing segment of the web.

This site, guden.tr, is built with AI agents in mind. This is how and why.

## The Problem: AI Agents Cannot Read Your Landing Page

Traditional websites are designed for humans. Animations, hero sections, gradient backgrounds: they look great in Chrome but mean nothing to an LLM. When an AI agent visits your site to answer a question like "What does Umut Güden work on?", it needs structured content, not visual design.

Many AI crawlers and agents do not execute JavaScript or interpret CSS. They process text and structure. If your content is buried in client-rendered components behind loading spinners, it effectively does not exist to them.

## llms.txt: A Standard for AI Readability

The `llms.txt` proposal (inspired by `robots.txt`) defines a structured, plain-text overview of your site written specifically for LLMs and AI agents.

This site serves `llms.txt` at `/llms.txt`:

```
# Umut Güden

> 18-year-old Founder of HMD Corporation, a multinational holding company
> with 6+ years of experience and 25+ projects reaching 17.5M+ end users.
## Site: https://guden.tr

## Pages
- [Home](https://guden.tr/)
- [Projects](https://guden.tr/projects)
- [Blog](https://guden.tr/blog)
- [Contact](https://guden.tr/contact)

## Blog Posts
- [Building Discord Bots at Scale](https://guden.tr/blog/building-discord-bots-at-scale) - ...
  Raw markdown: https://guden.tr/api/blog/building-discord-bots-at-scale/md

## Feeds
- RSS: https://guden.tr/rss.xml
- llms.txt: https://guden.tr/llms.txt

## Contact
- Email: contact [at] umutguden [dot] tr
- GitHub: https://github.com/umutguden
```

The key features:

1. **Site structure:** every page listed with its URL
2. **Content inventory:** every blog post with description and direct link
3. **Raw content access:** each blog post has a raw markdown endpoint that returns the original `.md` file with no HTML rendering
4. **Contact information:** structured for machine parsing (email obfuscated for spam protection)
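
One way to keep a file like this in sync with the site is to generate it from the same metadata that drives the blog. The sketch below is a minimal illustration under stated assumptions: the `LlmsSite`, `LlmsPage`, and `LlmsPost` shapes and the `buildLlmsTxt` helper are hypothetical names, not taken from this site's code. Only the URL patterns mirror the example above.

```typescript
// Hypothetical data shapes, used for illustration only.
interface LlmsPage {
  title: string;
  url: string;
}

interface LlmsPost {
  title: string;
  slug: string;
  description: string;
}

interface LlmsSite {
  name: string;
  summary: string;
  baseUrl: string;
  pages: LlmsPage[];
  posts: LlmsPost[];
}

// Builds the llms.txt body as plain text from structured site data.
function buildLlmsTxt(site: LlmsSite): string {
  const pageLines = site.pages.map((page) => `- [${page.title}](${page.url})`);

  // Each post gets a link line plus an indented pointer to its raw markdown.
  const postLines = site.posts.map((post) => {
    const postUrl = `${site.baseUrl}/blog/${post.slug}`;
    const rawUrl = `${site.baseUrl}/api/blog/${post.slug}/md`;
    return `- [${post.title}](${postUrl}) - ${post.description}\n  Raw markdown: ${rawUrl}`;
  });

  return [
    `# ${site.name}`,
    '',
    `> ${site.summary}`,
    '',
    '## Pages',
    ...pageLines,
    '',
    '## Blog Posts',
    ...postLines,
    '',
  ].join('\n');
}
```

Served from a route or written at build time, the output stays consistent with the post list because both come from one source of truth.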

## Raw Markdown Endpoints

The most distinctive feature: `/api/blog/[slug]/md` returns the raw markdown content of any blog post. No HTML wrapping, no navigation, no CSS, just the frontmatter and content.

```typescript
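// src/app/api/blog/[slug]/md/route.ts
import { NextResponse } from 'next/server';
// Assumed import path; point this at wherever getPostBySlug lives in your project.
import { getPostBySlug } from '@/lib/posts';

// Note: newer Next.js versions (15+) pass params as a Promise that must be awaited.
export async function GET(
  _request: Request,
  { params }: { params: { slug: string } }
) {
  // Look up the post; unknown slugs get a plain-text 404.
  const post = getPostBySlug(params.slug);
  if (!post) return new NextResponse('Not found', { status: 404 });

  // Serve the original markdown untouched, with an explicit markdown MIME type.
  return new NextResponse(post.rawContent, {
    headers: { 'Content-Type': 'text/markdown; charset=utf-8' },
  });
}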
```

This is purpose-built for AI consumption. An LLM can fetch the raw markdown, parse it directly, and use the content without any HTML extraction or cleanup. The response is all content and no markup noise.
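
On the consuming side, a client that fetches one of these endpoints only has to separate the frontmatter from the body. The following is a rough sketch of that step, assuming flat `key: value` frontmatter like this post's own. `splitFrontmatter` and `ParsedMarkdown` are hypothetical names, and anything beyond simple scalar fields would need a real YAML parser.

```typescript
interface ParsedMarkdown {
  meta: { [key: string]: string };
  body: string;
}

// Splits a raw markdown document into flat frontmatter fields and body text.
// Only simple `key: value` lines are handled; nested YAML is out of scope here.
function splitFrontmatter(raw: string): ParsedMarkdown {
  const match = /^---\r?\n([\s\S]*?)\r?\n---\r?\n?/.exec(raw);
  if (!match) return { meta: {}, body: raw };

  const meta: { [key: string]: string } = {};
  const lines = match[1].split(/\r?\n/);
  for (let i = 0; i < lines.length; i++) {
    const line = lines[i];
    // Split on the first colon so values may contain colons themselves.
    const colon = line.indexOf(':');
    if (colon === -1) continue;
    const key = line.slice(0, colon).trim();
    // Strip one pair of surrounding double quotes, if present.
    const value = line.slice(colon + 1).trim().replace(/^"(.*)"$/, '$1');
    if (key) meta[key] = value;
  }

  return { meta, body: raw.slice(match[0].length) };
}
```

Documents without a frontmatter block pass through unchanged, with an empty `meta` object.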

## Semantic HTML: Structure as Meaning

Every page on this site uses semantic HTML elements:

```html
<article>
  <header>
    <h1>Post Title</h1>
    <time datetime="2026-03-28">March 28, 2026</time>
  </header>
  <section class="article-content">
    <h2>Section Heading</h2>
    <p>Content...</p>
  </section>
  <footer>
    <nav aria-label="Post tags">...</nav>
  </footer>
</article>
```

- `<article>` tells machines "this is a self-contained piece of content"
- `<header>` / `<footer>` provide structural boundaries
- `<time datetime="...">` provides machine-parseable dates
- `<nav aria-label="...">` describes navigation purpose
- `<section>` groups related content

For AI agents that do process HTML, semantic elements provide structure that `<div>` soup cannot.
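
To illustrate why this helps, here is a deliberately naive sketch of how a tool might pull a title, publish date, and section headings out of markup shaped like the example above. `extractArticleSummary` is a hypothetical helper, and the regular expressions are an assumption made for brevity: a real agent would use a proper HTML parser, and nested `<article>` elements would confuse this version.

```typescript
interface ArticleSummary {
  title: string | null;
  published: string | null;
  headings: string[];
}

// Naive extraction that leans on semantic tags to find the relevant pieces.
function extractArticleSummary(html: string): ArticleSummary {
  // Limit the search to the first <article> element, if there is one.
  const articleMatch = /<article[^>]*>([\s\S]*?)<\/article>/i.exec(html);
  const scope = articleMatch ? articleMatch[1] : html;

  const titleMatch = /<h1[^>]*>([\s\S]*?)<\/h1>/i.exec(scope);
  // The datetime attribute carries the machine-readable date.
  const timeMatch = /<time[^>]*\bdatetime="([^"]*)"/i.exec(scope);

  // Collect every <h2> as a rough outline of the post.
  const headings: string[] = [];
  const h2Pattern = /<h2[^>]*>([\s\S]*?)<\/h2>/gi;
  let h2Match = h2Pattern.exec(scope);
  while (h2Match !== null) {
    headings.push(h2Match[1].trim());
    h2Match = h2Pattern.exec(scope);
  }

  return {
    title: titleMatch ? titleMatch[1].trim() : null,
    published: timeMatch ? timeMatch[1] : null,
    headings: headings,
  };
}
```

With `<div>`-only markup there would be nothing reliable to anchor on, which is the practical cost of skipping semantic elements.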

## Structured Data: JSON-LD

Every page includes JSON-LD structured data:

```json
{
  "@context": "https://schema.org",
  "@type": "BlogPosting",
  "headline": "Designing Your Site for AI Agents",
  "author": {
    "@type": "Person",
    "name": "Umut Güden",
    "url": "https://guden.tr"
  },
  "datePublished": "2026-03-28",
  "description": "...",
  "mainEntityOfPage": "https://guden.tr/blog/ai-agents-and-personal-sites"
}
```

Google, Bing, and AI search products use structured data to understand content. JSON-LD makes a post's metadata explicit rather than inferred, which plausibly improves its chances of being represented accurately in AI-generated summaries.
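
A block like the one above can be generated from post metadata instead of being written by hand. This is a minimal sketch, assuming hypothetical `JsonLdPost` and `JsonLdAuthor` shapes and a `buildBlogPostingJsonLd` helper that are not part of this site's code. The field names in the output follow the schema.org `BlogPosting` example shown earlier.

```typescript
interface JsonLdPost {
  title: string;
  description: string;
  slug: string;
  datePublished: string; // ISO 8601 date, e.g. "2026-03-28"
}

interface JsonLdAuthor {
  name: string;
  url: string;
}

// Returns a <script> element containing BlogPosting JSON-LD for one post.
function buildBlogPostingJsonLd(
  post: JsonLdPost,
  author: JsonLdAuthor,
  baseUrl: string
): string {
  const data = {
    '@context': 'https://schema.org',
    '@type': 'BlogPosting',
    headline: post.title,
    author: { '@type': 'Person', name: author.name, url: author.url },
    datePublished: post.datePublished,
    description: post.description,
    mainEntityOfPage: `${baseUrl}/blog/${post.slug}`,
  };

  // Escape "<" so a stray "</script>" inside a field cannot end the element early.
  const json = JSON.stringify(data).replace(/</g, '\\u003c');
  return `<script type="application/ld+json">${json}</script>`;
}
```

The `<` escaping matters whenever titles or descriptions come from user-editable content, since the JSON is embedded directly in HTML.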

## RSS: The Original Machine-Readable Feed

RSS predates AI hype by decades, but it has become newly relevant. AI agents use RSS feeds to:

- **Discover content** without crawling every page
- **Track updates** with standardised publish dates
- **Consume full content** when the feed includes full text (not just excerpts)

This site's RSS feed at `/rss.xml` includes the full HTML content of every post. An AI agent can subscribe to the feed, receive new posts automatically, and process them without visiting the site at all.
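
One common way to carry full HTML in a feed is a `content:encoded` element wrapped in CDATA. The sketch below assumes that approach; the source does not say how this site's feed embeds its content. `FeedPost`, `escapeXml`, `wrapCdata`, and `buildRssItem` are hypothetical names, and the enclosing `<rss>` element is assumed to declare the `content` namespace (`xmlns:content="http://purl.org/rss/1.0/modules/content/"`).

```typescript
interface FeedPost {
  title: string;
  url: string;
  publishedAt: Date;
  html: string;
}

// Escapes the characters that are unsafe in XML text and attribute values.
function escapeXml(text: string): string {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Wraps HTML in CDATA, splitting any "]]>" so it cannot close the section early.
function wrapCdata(html: string): string {
  return '<![CDATA[' + html.replace(/]]>/g, ']]]]><![CDATA[>') + ']]>';
}

// Builds one <item> with the full post HTML in content:encoded.
function buildRssItem(post: FeedPost): string {
  return [
    '<item>',
    `  <title>${escapeXml(post.title)}</title>`,
    `  <link>${escapeXml(post.url)}</link>`,
    `  <guid isPermaLink="true">${escapeXml(post.url)}</guid>`,
    `  <pubDate>${post.publishedAt.toUTCString()}</pubDate>`,
    `  <content:encoded>${wrapCdata(post.html)}</content:encoded>`,
    '</item>',
  ].join('\n');
}
```

Using the post URL as a permalink `guid` gives feed readers a stable identifier for deduplicating items across fetches.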

## The Meta-Strategy: Be Findable, Be Readable, Be Useful

Designing for AI agents is not about optimising for one specific AI. It is about making content accessible to any machine that processes text:

1. **Be findable:** `llms.txt`, sitemap, RSS, semantic HTML
2. **Be readable:** raw markdown endpoints, clean HTML, structured data
3. **Be useful:** real content, accurate metadata, working links

The same principles that make a site good for accessibility (semantic HTML, proper headings, descriptive links) also make it good for AI. The overlap is nearly complete.

## What I Would Add Next

- **Embeddings endpoint:** pre-computed vector embeddings of each post for RAG applications
- **Content API:** a proper JSON API returning structured post data for programmatic access
- **Update feeds:** a changelog section in `llms.txt` showing what is new since the last visit

The web is becoming multi-audience. Designing for humans only leaves value on the table. Design for machines too. The effort is minimal and the reach is expanding.