---
title: "Designing Your Site for AI Agents: llms.txt, RSS, and Semantic HTML"
description: "How to make your personal site readable by AI agents and LLMs - implementing llms.txt, structured RSS, semantic HTML, and machine-readable metadata."
date: 2026-03-29T00:00:00.000Z
category: Engineering
readingTime: "3 min read"
---


The web is changing. Browsers aren't the only things reading your site anymore. AI agents, LLMs, and automated tools are consuming web content at scale. If your site isn't designed for machine readability alongside human readability, you're invisible to a growing segment of the web.

This site - guden.tr - is built with AI agents in mind. Here's how and why.

## The Problem: AI Agents Can't Read Your Landing Page

Traditional websites are designed for humans. Animations, hero sections, gradient backgrounds - they look great in Chrome but mean nothing to an LLM. When an AI agent visits your site to answer a question like "What does Umut Güden work on?", it needs structured content, not visual design.

Most AI agents don't render JavaScript. They don't interpret CSS. They process text and structure. If your content is buried in client-rendered components behind loading spinners, it doesn't exist to AI.

## llms.txt: A Standard for AI Readability

The `llms.txt` specification (inspired by `robots.txt`) provides a structured, plain-text overview of your site specifically for LLMs and AI agents.

This site serves `llms.txt` at `/llms.txt`:

```
# Umut Güden

> 18-year-old Founder & Chairman of HMD Corporation, a multinational holding company
> with 6+ years of experience and 25+ projects reaching 17.5M+ end users.

## Site: https://guden.tr

## Pages
- [Home](https://guden.tr/)
- [Projects](https://guden.tr/projects)
- [Blog](https://guden.tr/blog)
- [Contact](https://guden.tr/contact)

## Blog Posts
- [Building Discord Bots at Scale](https://guden.tr/blog/building-discord-bots-at-scale) - ...
  Raw markdown: https://guden.tr/api/blog/building-discord-bots-at-scale/md

## Feeds
- RSS: https://guden.tr/rss.xml
- llms.txt: https://guden.tr/llms.txt

## Contact
- Email: contact [at] umutguden [dot] tr
- GitHub: https://github.com/umutguden
```

The key features:

1. **Site structure** - every page listed with its URL
2. **Content inventory** - every blog post with description and direct link
3. **Raw content access** - each blog post has a raw markdown endpoint that returns the original `.md` file, stripping all HTML rendering
4. **Contact information** - structured for machine parsing (email obfuscated for spam protection)
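
A file with this structure is easy to generate from post metadata at build time. Here's a minimal sketch of that idea; the `PostEntry` shape and `SITE` constant are illustrative, not this site's actual code:

```typescript
// Sketch: generate the llms.txt body from post metadata at build time.
// PostEntry and SITE are assumed shapes for illustration.
interface PostEntry {
  title: string;
  slug: string;
  description: string;
}

const SITE = 'https://guden.tr';

function buildLlmsTxt(posts: PostEntry[]): string {
  const lines = [
    '# Umut Güden',
    '',
    `## Site: ${SITE}`,
    '',
    '## Blog Posts',
    // Each post gets a human-readable link plus a raw markdown endpoint
    ...posts.map(
      (p) =>
        `- [${p.title}](${SITE}/blog/${p.slug}) - ${p.description}\n` +
        `  Raw markdown: ${SITE}/api/blog/${p.slug}/md`
    ),
  ];
  return lines.join('\n') + '\n';
}
```

Regenerating the file on every deploy keeps the content inventory in sync with the actual posts, with no manual maintenance.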

## Raw Markdown Endpoints

The most distinctive feature: `/api/blog/[slug]/md` returns the raw markdown content of any blog post. No HTML wrapping, no navigation, no CSS - just the frontmatter and content.

```typescript
// src/app/api/blog/[slug]/md/route.ts
import { NextResponse } from 'next/server';
import { getPostBySlug } from '@/lib/posts'; // adjust to your project layout

export async function GET(
  _request: Request,
  { params }: { params: { slug: string } }
) {
  const post = getPostBySlug(params.slug);
  if (!post) return new NextResponse('Not found', { status: 404 });

  return new NextResponse(post.rawContent, {
    headers: { 'Content-Type': 'text/markdown; charset=utf-8' },
  });
}
```

This is purpose-built for AI consumption. An LLM can fetch the raw markdown, parse it directly, and use the content without any HTML extraction or cleanup. The content-to-noise ratio is 100%.
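
The agent side is just as simple. As an illustration, here is one way a consumer might split the response into frontmatter and body; the helper is a sketch that handles only simple `key: value` frontmatter lines, not full YAML:

```typescript
// Sketch: split a raw markdown response into frontmatter and body.
// Handles only flat `key: value` frontmatter, which is all this
// site's posts use - not a general YAML parser.
interface RawPost {
  frontmatter: Record<string, string>;
  body: string;
}

function splitFrontmatter(raw: string): RawPost {
  // Match a leading `---` block delimited by `---` lines
  const match = raw.match(/^---\n([\s\S]*?)\n---\n?/);
  if (!match) return { frontmatter: {}, body: raw };

  const frontmatter: Record<string, string> = {};
  for (const line of match[1].split('\n')) {
    const idx = line.indexOf(':');
    if (idx === -1) continue;
    const key = line.slice(0, idx).trim();
    // Strip optional surrounding quotes from the value
    const value = line.slice(idx + 1).trim().replace(/^"|"$/g, '');
    frontmatter[key] = value;
  }
  return { frontmatter, body: raw.slice(match[0].length) };
}
```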

## Semantic HTML: Structure as Meaning

Every page on this site uses semantic HTML elements:

```html
<article>
  <header>
    <h1>Post Title</h1>
    <time datetime="2026-03-28">March 28, 2026</time>
  </header>
  <section class="article-content">
    <h2>Section Heading</h2>
    <p>Content...</p>
  </section>
  <footer>
    <nav aria-label="Post tags">...</nav>
  </footer>
</article>
```

- `<article>` tells machines "this is a self-contained piece of content"
- `<header>` / `<footer>` provide structural boundaries
- `<time datetime="...">` provides machine-parseable dates
- `<nav aria-label="...">` describes navigation purpose
- `<section>` groups related content

For AI agents that do process HTML, semantic elements provide structure that `<div>` soup cannot.
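
To see why structure matters, consider how little work an agent needs to pull machine-readable fields out of this markup. A deliberately crude sketch - real agents would use a proper HTML parser, and these regexes are illustrative only:

```typescript
// Sketch: extract machine-readable fields from semantic HTML.
// Regex-based for brevity; a real consumer would use an HTML parser.
// The point: <h1> and <time datetime> make the fields findable at all.
function extractArticleMeta(html: string): {
  title?: string;
  published?: string;
} {
  // First <h1> text content
  const title = html.match(/<h1[^>]*>([^<]*)<\/h1>/)?.[1]?.trim();
  // Machine-parseable date from <time datetime="...">
  const published = html.match(/<time[^>]*\bdatetime="([^"]+)"/)?.[1];
  return { title, published };
}
```

With `<div>` soup, there is nothing to anchor on: the date is just text, and the title is whichever element happens to be styled largest.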

## Structured Data: JSON-LD

Every page includes JSON-LD structured data:

```json
{
  "@context": "https://schema.org",
  "@type": "BlogPosting",
  "headline": "Designing Your Site for AI Agents",
  "author": {
    "@type": "Person",
    "name": "Umut Güden",
    "url": "https://guden.tr"
  },
  "datePublished": "2026-03-28",
  "description": "...",
  "mainEntityOfPage": "https://guden.tr/blog/ai-agents-and-personal-sites"
}
```

Google, Bing, and AI search products use structured data to understand content. A blog post with JSON-LD markup is more likely to appear in AI-generated summaries because the metadata is explicit, not inferred.
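
Generating this markup can be a pure function of post metadata. A sketch, assuming a hypothetical `PostMeta` shape - only the schema.org field names are fixed by the spec:

```typescript
// Sketch: build BlogPosting JSON-LD from post metadata.
// PostMeta is an assumed shape; the @context/@type/field names
// come from schema.org.
interface PostMeta {
  title: string;
  description: string;
  date: string; // ISO 8601
  slug: string;
}

function blogPostingJsonLd(post: PostMeta): string {
  return JSON.stringify({
    '@context': 'https://schema.org',
    '@type': 'BlogPosting',
    headline: post.title,
    author: { '@type': 'Person', name: 'Umut Güden', url: 'https://guden.tr' },
    datePublished: post.date,
    description: post.description,
    mainEntityOfPage: `https://guden.tr/blog/${post.slug}`,
  });
}
```

In Next.js the result is typically embedded in the page via a `<script type="application/ld+json">` tag rendered from the layout or page component.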

## RSS: The Original Machine-Readable Feed

RSS predates AI hype by decades, but it's become newly relevant. AI agents use RSS feeds to:

- **Discover content** without crawling every page
- **Track updates** with standardized publish dates
- **Consume full content** when the feed includes full text (not just excerpts)

This site's RSS feed at `/rss.xml` includes the full HTML content of every post. An AI agent can subscribe to the feed, receive new posts automatically, and process them without visiting the site at all.
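
Producing a full-content item is mostly a matter of wrapping the post HTML in CDATA. A sketch, assuming a hypothetical `ItemInput` shape; note that `<content:encoded>` requires declaring the `content` module namespace on the feed's root element:

```typescript
// Sketch: render one full-content RSS <item>. ItemInput is an
// assumed shape. The full post HTML goes in <content:encoded>
// (inside CDATA) so agents never need to fetch the page itself.
interface ItemInput {
  title: string;
  url: string;
  pubDate: Date;
  html: string;
}

// Escape the characters XML treats specially in text nodes
const escapeXml = (s: string) =>
  s.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');

function rssItem(item: ItemInput): string {
  return [
    '<item>',
    `  <title>${escapeXml(item.title)}</title>`,
    `  <link>${item.url}</link>`,
    `  <guid>${item.url}</guid>`,
    // RFC 822 date format, as RSS 2.0 expects
    `  <pubDate>${item.pubDate.toUTCString()}</pubDate>`,
    `  <content:encoded><![CDATA[${item.html}]]></content:encoded>`,
    '</item>',
  ].join('\n');
}
```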

## The Meta-Strategy: Be Findable, Be Readable, Be Useful

Designing for AI agents isn't about optimizing for one specific AI. It's about making content accessible to any machine that processes text:

1. **Be findable** - `llms.txt`, sitemap, RSS, semantic HTML
2. **Be readable** - raw markdown endpoints, clean HTML, structured data
3. **Be useful** - real content, accurate metadata, working links

The same principles that make a site good for accessibility (semantic HTML, proper headings, descriptive links) also make it good for AI. The overlap is nearly complete.

## What I'd Add Next

- **Embeddings endpoint** - pre-computed vector embeddings of each post for RAG applications
- **Content API** - a proper JSON API returning structured post data for programmatic access
- **Update feeds** - `llms.txt` with a changelog section showing what's new since last visit

The web is becoming multi-audience. Designing only for humans leaves value on the table. Design for machines too - the effort is minimal and the reach is expanding.

---

*Want your site to be AI-friendly? Start with three things: `llms.txt`, RSS with full content, and semantic HTML. That covers 90% of the value.*
