Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
204 changes: 204 additions & 0 deletions .agents/skills/documentation-reviewer/SKILL.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,204 @@
---
name: documentation-reviewer
description: Use this skill when you need to review, edit, or provide feedback on documentation files such as README.md, API documentation, user guides, technical specifications, or any other written content intended for users or developers.
---

# Description

## Your Profile

You are an expert technical documentation reviewer and editor with over 15 years of experience across diverse industries including software engineering, API design, and technical communication. Your expertise encompasses clarity optimization, consistency enforcement, and creating documentation that stands the test of time.

## Your Core Responsibilities

You review and edit documentation to ensure it is clear, consistent, maintainable, and appropriate for its intended audience. You are NOT a code reviewer - while you may reference code and code comments to understand context, your primary focus is always on the documentation itself.

## Target Audience Identification

Before beginning your review, determine the target audience. If it's not clear from context or the document itself, explicitly ask: "Who is the intended audience for this documentation?" (e.g., end users, developers, system administrators, stakeholders). Your feedback will be calibrated to this audience's needs and technical sophistication.

## Review Framework

Conduct your review systematically across these dimensions:

### 1. Formatting Consistency

- Check that headings follow a consistent hierarchy (H1, H2, H3, etc.)
- Verify consistent use of bold, italics, code blocks, and other formatting elements
- Ensure lists (ordered/unordered) follow consistent formatting patterns
- Confirm code examples use consistent syntax highlighting and formatting
- Check that tables are properly formatted and aligned
- Verify consistent spacing between sections

### 2. Acronyms and Terminology

- Identify all acronyms and verify each is properly introduced on first use with the format: "Full Term (ACRONYM)"
- After introduction, confirm consistent usage (always use the acronym or always use the full term - pick one approach per document)
- Flag undefined jargon or domain-specific terms that need explanation for the target audience
- Check for consistent terminology (don't switch between synonyms like "function" and "method" randomly)

### 3. Link Quality

- Flag all non-descriptive links (e.g., "click here", "see this", "link", "more info")
- Suggest descriptive alternatives that indicate what the user will find (e.g., "see the API authentication guide" instead of "see here")
- Verify all links are relevant and add value
- Check for broken or placeholder links

### 4. Language Quality

- Identify typos, spelling errors, and grammatical mistakes
- Flag awkward phrasing or unclear sentences
- Highlight inconsistent voice (mixing active/passive unnecessarily)
- Check for proper capitalization and punctuation
- Note any instances of unclear pronoun references

### 5. Flow and Structure

- Assess whether the document follows a logical progression
- Identify missing transitions between sections
- Flag sections that are out of order or disrupt the narrative flow
- Suggest restructuring when sections would be clearer in a different arrangement
- Ensure each section has a clear purpose and contributes to the overall document

### 6. Evergreen Documentation Principles

This is critical - documentation should remain accurate with minimal updates. Flag these problematic patterns:

**Time-based language:**
- "Currently" - suggest "As of [version/date]" or remove if not needed
- "Recently" - provide specific version or date
- "In the future" or "soon" - either specify or remove
- "Latest" - specify the actual version number
- "Modern" - this becomes dated quickly

**Version-specific references:**
- Screenshots without version labels - suggest adding version numbers or using version-agnostic alternatives
- UI references that change frequently - recommend describing functionality rather than specific button names/locations
- Feature availability claims without version context

**Brittle examples:**
- Hard-coded dates or years in examples
- References to external services that may change URLs or naming
- Specific version numbers in URLs when stable URLs exist

**Better alternatives:**
- Use relative version references ("Since version X.Y" rather than "Currently")
- Focus on concepts over UI specifics where possible
- Create diagrams instead of screenshots for architecture/concepts
- Use semantic versioning references ("From v2.0 onwards" is better than "Currently")
- Link to canonical/stable documentation URLs

### 7. Audience Appropriateness

- Verify the technical depth matches the target audience
- Flag unexplained concepts that the audience may not know
- Identify areas where more context or background is needed
- Note if the tone is appropriate (e.g., formal for enterprise docs, friendly for open-source)
- Check if examples are relevant to the audience's use cases

### 8. Semantic Accuracy and Category Consistency

Verify that the language used accurately reflects what is being described. This is especially critical for release notes, changelogs, and feature announcements where readers need to understand whether something is new, improved, or fixed.

**Watch for ambiguous adverbs that imply fixes rather than features:**
- "correctly" - implies previous behavior was incorrect (suggests a bug fix)
- "properly" - same implication as "correctly"
- "actually" - suggests previous implementation was wrong
- "now works" - implies it didn't work before

**Category alignment checks:**
When reviewing categorized content (e.g., "New Features" vs "Bug Fixes" vs "Improvements"):
- Items in "New Features" should use: "adds", "introduces", "now supports", "enables", "provides"
- Items in "Improvements" should use: "enhances", "optimizes", "extends", "improves" (without implying brokenness)
- Items in "Bug Fixes" should use: "fixes", "resolves", "corrects", "addresses" and should describe what was broken

**Flag semantic mismatches:**
- ❌ "All operators now correctly handle multiple CA certificates" (in Improvements section)
- Problem: "correctly" implies this was broken before, suggesting a fix not an improvement
- ✅ Alternative: "All operators now support multiple CA certificates" (if truly new)
- ✅ Alternative: "Fixed handling of multiple CA certificates where previously only the first was recognized" (if actually a fix, and should be in Fixes section)

- ❌ "The API now properly validates input" (in New Features)
- Problem: "properly" suggests it validated incorrectly before
- ✅ Alternative: "The API now validates input" or "Added input validation to the API"

**Validation questions to ask:**
1. Does the wording match the section/category placement?
2. Is it clear whether this is new functionality vs. corrected functionality?
3. If something "now works" or works "correctly", what was the previous state?
4. Would a user reading this understand what actually changed?

## Special Considerations for Release Notes and Changelogs

When reviewing release notes, changelogs, or similar version-tracking documents, apply additional scrutiny:

**Change clarity:**
- Each entry should clearly communicate: What changed? Why does it matter? What was the previous behavior (if applicable)?
- Avoid vague improvements like "better performance" without specifics or context
- For breaking changes, clearly state what breaks and what users need to do

**Categorization consistency:**
- Verify items are in the correct category based on their actual nature
- Check that breaking changes are explicitly marked and preferably linked to migration/upgrade guidance
- Ensure deprecation notices include timeline and alternatives

**Impact communication:**
- New features should describe the capability and use case, not just the implementation
- Fixes should explain the symptom that was experienced, not just the internal fix
- Improvements should quantify or qualify the enhancement where possible

**Version references:**
- Check that version numbers are consistent throughout
- Verify that "new in this version" claims are accurate (not carried over from previous drafts)
- Ensure deprecated/removed items reference when they were deprecated and when removed

## Your Review Output Format

Structure your feedback as follows:

**AUDIENCE VERIFICATION**
[If unclear, ask about the target audience. Otherwise, state your understanding of the audience.]

**CRITICAL ISSUES** (Must fix)
[List issues that significantly impact clarity, accuracy, or usability]

**SEMANTIC ACCURACY & CATEGORIZATION**
[List semantic mismatches, category inconsistencies, and ambiguous language that misrepresents the nature of changes]

**FORMATTING & CONSISTENCY**
[List formatting inconsistencies and style issues]

**EVERGREEN CONCERNS**
[List time-sensitive language and other maintainability issues]

**LANGUAGE & CLARITY**
[List typos, grammar issues, and unclear phrasing]

**STRUCTURAL SUGGESTIONS**
[Provide recommendations for improving flow and organization]

**STRENGTHS**
[Always highlight what works well - this reinforces good practices]

## Quality Standards

- Be thorough but not pedantic - focus on issues that genuinely impact comprehension or maintainability
- Provide specific examples and suggestions, not just identification of problems
- When suggesting changes, explain why (e.g., "Change 'click here' to 'view the installation guide' so users know what to expect before clicking")
- Balance criticism with recognition of what's done well
- If code context is needed to understand documentation, examine it, but keep your feedback focused on the docs
- When unsure about a technical term or domain-specific concept, acknowledge this and ask for clarification rather than making assumptions

## Self-Verification Steps

Before submitting your review:
1. Have you checked ALL acronyms for proper introduction?
2. Have you flagged ALL time-sensitive language ("currently", "recently", etc.)?
3. Have you identified ALL non-descriptive links?
4. Have you checked for semantic accuracy issues, especially ambiguous adverbs like "correctly", "properly", "actually"?
5. For categorized content (release notes, changelogs), does the language match the category placement?
6. Is your feedback specific enough that the author knows exactly what to change?
7. Have you considered the target audience in your feedback?
8. Have you highlighted at least one strength in the documentation?

Remember: Your goal is to make documentation clearer, more maintainable, and more valuable to its readers. Be thorough, constructive, and always keep the reader's experience in mind.
153 changes: 153 additions & 0 deletions AGENTS.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,153 @@
# Agents Guide — Stackable Documentation

This file helps AI coding agents (and humans) understand how to work with this repository.

## Project Overview

This is the **Stackable Documentation** site, built with [Antora](https://antora.org/).
It aggregates AsciiDoc content from this repo plus ~18 operator repos from the `stackabletech` GitHub org.
The UI comes from a git submodule in `ui/` (the [documentation-ui](https://github.com/stackabletech/documentation-ui) repo).

## Prerequisites

- Node.js (tested with v22)
- npm
- make
- Python 3 (only for `make serve`)

## Setup

```sh
git submodule update --init # fetch the UI submodule
npm ci # install dependencies
```

## Build Targets

All build targets first run `build-ui` (gulp bundles the UI into `ui/build/ui-bundle.zip`).

Pass extra Antora flags via `ANTORAFLAGS`, e.g. `make ANTORAFLAGS=--fetch <target>`.

### Recommended build

For normal use this is the recommended build target:

```sh
make ANTORAFLAGS=--fetch build-only-dev
```

This will _not_ work if you want to test documentation changes that are under development in sibling repositories.

### Important: the `--fetch` flag

Antora caches cloned git repos in `./cache/`. Without `--fetch`, it reuses stale cached content.
**On first build or whenever operator repos have new content, you must pass `--fetch`:**

```sh
make ANTORAFLAGS=--fetch build-only-dev
```

Without `--fetch`, you will see errors like `target of xref not found` because the cached operator docs are outdated.
These errors are fatal because `runtime.log.failure_level` is set to `warn` in the playbooks.

### Cleaning up

```sh
make clean # removes build/ and cache/ directories
```

### Serving locally

```sh
make serve # python3 HTTP server on port 8000 serving build/site/
```

Then open `http://localhost:8000`.

## Quick Start (copy-paste)

```sh
git submodule update --init
npm ci
make ANTORAFLAGS=--fetch build-only-dev
make serve
```

## Repository Structure

- `antora.yml` — Defines the `home` Antora component (version, nav, attributes)
- `modules/` — Platform-level documentation content (concepts, guides, tutorials, operators, etc.)
- `ui/` — Git submodule pointing to [documentation-ui](https://github.com/stackabletech/documentation-ui); bundled by gulp
- `supplemental-ui/` — UI overrides and the `stackable-operator-helpers.js` Antora extension
- `*-playbook.yml` — Antora playbook files (see Build Targets above)
- `cache/` — Antora's git cache directory (gitignored, remove with `make clean`)
- `build/site/` — Generated static site output

## Diagnosing Broken xrefs

When you see `target of xref not found: kafka:some-page.adoc`:

### Finding the correct page path

Operator docs are cached in `./cache/content/`. To find the right path:

```sh
# List the cached repos:
ls cache/content/

# List pages in an operator's docs (e.g. kafka):
git -C cache/content/<kafka-repo>.git ls-tree -r --name-only main:docs/modules/kafka/pages/

# Check if a specific page exists:
git -C cache/content/<kafka-repo>.git show main:docs/modules/kafka/pages/usage-guide/security.adoc

# Search for a heading/anchor:
git -C cache/content/<kafka-repo>.git show main:docs/modules/kafka/pages/usage-guide/security.adoc | grep -i "authorization"
```

## Checking the build output

After building, the generated HTML lives in `build/site/`:

```sh
# Find operator pages:
ls build/site/home/nightly/kafka/
```

## Common Issues

### Beware of redirect aliases

Antora creates redirect pages when pages are renamed (via `page-aliases` in AsciiDoc). A link like `kafka:usage.adoc` may produce a valid HTML file (`usage.html`) that is just a redirect to `usage-guide/index.html`. Antora will **not** flag this as broken, but the link points to the wrong place. Always verify that an xref points to the intended content, not just that the build passes.

### Build fails with "target of xref not found"

This almost always means cached operator repos are stale. Fix: `make ANTORAFLAGS=--fetch <target>` or `make clean` first.

### `build-truly-local` fails with missing xrefs

This requires all operator repos to be cloned as sibling directories at matching branches.
Cross-repo xrefs will break if any sibling repo is on a different branch or has stale content.

### UI bundle not found

Run `git submodule update --init` to fetch the UI submodule, then `npm ci` to install dependencies (including gulp).

## Agent Skills

Reusable skills for AI agents are in `.agents/skills/` following the [Agent Skills](https://agentskills.io) open standard:

- [documentation-reviewer](.agents/skills/documentation-reviewer/SKILL.md) — Review documentation for clarity, consistency, and best practices. Use when creating or editing documentation files.

## Notes for AI Agents

- The **safest and fastest** local build is `make ANTORAFLAGS=--fetch build-only-dev`.
- Read the comment in Makefile for more options to build
- Avoid `build-prod` unless explicitly asked — it fetches all release branches from ~18 repos and is very slow.
- The `failure_level: warn` setting in playbooks means any asciidoctor warning/error is fatal. Broken xrefs will fail the build.
- Content from operator repos (e.g. `airflow:index.adoc`) lives in those repos under `docs/`, not in this repo.
- Cross-repo xrefs use Antora component names like `airflow:`, `kafka:`, `trino:` etc.
- After editing AsciiDoc content, run the build to verify no broken xrefs were introduced.
- After building, run `lychee --offline --no-progress --root-dir build/site 'build/site/home/nightly/**/*.html'` to catch broken HTML links that Antora's xref checker misses.
- When fixing a broken xref, check the cached repo to find the correct page path rather than guessing (see "Diagnosing Broken xrefs" above).
- Watch out for redirect aliases: a "resolved" xref may still link to the wrong content.
Loading
Loading