docs: add comprehensive MkDocs Material documentation site #69
eywalker commented on Mar 4, 2026
- Set up MkDocs Material with mkdocstrings for auto-generated API docs
- Add Getting Started section: installation, quickstart, first pipeline guide
- Add Concepts section: architecture, datagrams, streams, identity/hashing, provenance, schema/column configuration
- Add User Guide section: sources, function pods, operators, pipelines, caching/persistence, execution models
- Add API Reference section with mkdocstrings directives for all public modules
- Rewrite README.md as a proper project landing page with badges and quick example
- Add docs dependency group to pyproject.toml (mkdocs, mkdocs-material, mkdocstrings)
- Add "orcapod" lowercase-p naming convention to CLAUDE.md and .zed/rules
Pull request overview
This PR introduces a full MkDocs Material documentation site (with mkdocstrings-based API reference) and updates project metadata/README so the docs are discoverable and the repository has a clearer landing page.
Changes:
- Add MkDocs Material site configuration (`mkdocs.yml`) and a structured docs tree (Getting Started / Concepts / User Guide / API Reference).
- Add a `docs` dependency group in `pyproject.toml` and publish the documentation URL in project metadata.
- Rewrite `README.md` and document the “orcapod” lowercase‑p naming convention in `CLAUDE.md` and `.zed/rules`.
Reviewed changes
Copilot reviewed 35 out of 35 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| pyproject.toml | Adds documentation URL and a docs dependency group (MkDocs tooling). |
| mkdocs.yml | Introduces MkDocs Material configuration, plugins, and navigation. |
| docs/index.md | Adds docs landing page with feature overview and next-step links. |
| docs/getting-started/installation.md | Adds installation and optional dependency guidance. |
| docs/getting-started/quickstart.md | Adds a hands-on quickstart walkthrough. |
| docs/getting-started/first-pipeline.md | Adds an end-to-end “first pipeline” tutorial. |
| docs/concepts/architecture.md | Documents high-level architecture and core abstractions. |
| docs/concepts/datagrams.md | Explains datagrams/tags/packets and naming conventions. |
| docs/concepts/streams.md | Documents stream abstraction, schema, and materialization. |
| docs/concepts/identity.md | Explains content hash vs pipeline hash identity chains. |
| docs/concepts/provenance.md | Documents source-info and system-tag provenance semantics. |
| docs/concepts/schema.md | Documents Schema/ColumnConfig and column prefix conventions. |
| docs/user-guide/sources.md | User guide for source types, identity, registration, validation. |
| docs/user-guide/operators.md | User guide for operators and convenience methods. |
| docs/user-guide/function-pods.md | User guide for packet functions, pods, and execution models. |
| docs/user-guide/pipelines.md | User guide for Pipeline lifecycle, labeling, composition. |
| docs/user-guide/caching.md | Deep dive into caching/persistence tiers and cache modes. |
| docs/user-guide/execution.md | Documents sync/async execution models and executors. |
| docs/api/index.md | Adds API reference landing page linking to generated sections. |
| docs/api/types.md | mkdocstrings directives for core types. |
| docs/api/sources.md | mkdocstrings directives for source APIs. |
| docs/api/streams.md | mkdocstrings directives for stream APIs. |
| docs/api/datagrams.md | mkdocstrings directives for datagram/tag/packet APIs. |
| docs/api/function-pods.md | mkdocstrings directives for FunctionPod APIs. |
| docs/api/packet-functions.md | mkdocstrings directives for packet function APIs. |
| docs/api/operators.md | mkdocstrings directives for operator APIs and base classes. |
| docs/api/nodes.md | mkdocstrings directives for operator/function node APIs. |
| docs/api/pipeline.md | mkdocstrings directives for Pipeline and source node APIs. |
| docs/api/databases.md | mkdocstrings directives for database backends. |
| docs/api/errors.md | mkdocstrings directives for public error types. |
| docs/api/configuration.md | mkdocstrings directives for global configuration. |
| README.md | Rewrites README into a project landing page with example + docs links. |
| CLAUDE.md | Adds the “orcapod lowercase‑p” naming convention guidance. |
| .zed/rules | Adds the same naming convention guidance for editor rules. |
with pipeline:
    joined = source_a.join(source_b, label="join_data")
    risk_pod(joined, label="compute_risk")
    cat_pod(risk_stream, label="categorize")
The example uses `risk_stream`, but it’s never defined in the snippet. This makes the labeling example non-runnable and confusing; it likely should capture the output of `risk_pod(...)` (or reference `pipeline.compute_risk` after compilation) before passing it to `cat_pod`.
# Rebuild pipeline with updated source
pipeline2 = Pipeline(name="risk_pipeline", pipeline_database=db)
with pipeline2:
    joined = patients_v2.join(labs_v2, label="join_data")
In the incremental compilation example, `labs_v2` is referenced but never defined. Either define `labs_v2` (analogous to `patients_v2`) or keep using the original `labs` source so the snippet is complete and runnable.
Suggested change:
-     joined = patients_v2.join(labs_v2, label="join_data")
+     joined = patients_v2.join(labs, label="join_data")
```python
from orcapod import DerivedSource

# After running a pipeline node
derived = function_node.as_source()

# Use as input to a downstream pipeline
downstream_joined = Join()(derived, other_source)
```
The `DerivedSource` example calls `Join()(derived, other_source)`, but `Join` isn’t imported in the snippet. Add the missing import (or use the stream/source convenience method) so readers can copy and paste the example without a `NameError`.
```bash
# Install all development dependencies
pip install -e ".[test,dev]"
# or
pip install -r requirements-dev.txt
# Install dev dependencies
uv sync --group dev

# Run tests
uv run pytest tests/

# Run tests with coverage
uv run pytest tests/ --cov=src --cov-report=term-missing

# Build documentation locally
uv run mkdocs serve
```
`uv sync --group dev` won’t install MkDocs because MkDocs is declared under the separate `docs` dependency group in `pyproject.toml`. As written, `uv run mkdocs serve` will fail unless the user also installs the `docs` group (e.g., `uv sync --group dev --group docs`), or the groups are adjusted so MkDocs is included where expected.
markdown_extensions:
  - admonition
  - pymdownx.details
  - pymdownx.superfences:
      custom_fences:
        - name: mermaid
          class: mermaid
          format: !!python/name:pymdownx.superfences.fence_code_format
  - pymdownx.highlight:
Docs include Mermaid diagrams (`` ```mermaid `` fences in several pages), but the MkDocs config only defines a custom fence and doesn’t load Mermaid’s JS runtime. Without adding Mermaid support (e.g., `extra_javascript` to load Mermaid, or a Mermaid plugin), those diagrams will render as plain code blocks.
Codecov Report: ✅ All modified and coverable lines are covered by tests.
306b630 to ad23ac5
- Fix undefined `risk_stream` variable in pipelines.md labels example
- Define `labs_v2` in first-pipeline.md incremental computation example
- Add missing `Join` import in sources.md DerivedSource example
- Add `uv sync --group docs` before `mkdocs serve` in README.md
- Fix Mermaid fence format to use fence_mermaid_format for diagram rendering

https://claude.ai/code/session_01SGish2hvMbyPkhCorDCFmX
Pull request overview
Copilot reviewed 37 out of 38 changed files in this pull request and generated 3 comments.
markdown_extensions:
  - admonition
  - pymdownx.details
  - pymdownx.superfences:
      custom_fences:
`docs/index.md` uses Material icon shortcodes like `:material-download:`. Those only render if `pymdownx.emoji` is enabled (commonly with the Material emoji index/generator); otherwise they appear as literal text in the built site. Consider adding `pymdownx.emoji` to `markdown_extensions` (with the Material-recommended config) so the homepage cards render correctly.
## Documentation

For development, you can install all optional dependencies:
Full documentation is available at the [orcapod docs site](https://walkerlab.github.io/orcapod-python/).

- [Getting Started](https://walkerlab.github.io/orcapod-python/getting-started/installation/) — Installation and quickstart
- [Concepts](https://walkerlab.github.io/orcapod-python/concepts/architecture/) — Architecture and design principles
- [User Guide](https://walkerlab.github.io/orcapod-python/user-guide/sources/) — Detailed guides for each component
- [API Reference](https://walkerlab.github.io/orcapod-python/api/) — Auto-generated API documentation
The README documentation links still point to the GitHub Pages URL (`walkerlab.github.io/orcapod-python`), but this PR also adds `Documentation = https://orcapod.org/` and configures `site_url: https://orcapod.org/` plus `docs/CNAME`. To avoid confusing users (and to keep `pyproject.toml`, MkDocs, and the README consistent), update these README links to the custom domain (optionally keeping the GitHub Pages URL as a fallback).
- uses: astral-sh/setup-uv@v4

- name: Install docs dependencies
  run: uv sync --group docs
The docs deployment workflow installs the full project (not just MkDocs), so it likely needs the same CI prerequisites as `run-tests.yml` (notably `graphviz`/`libgraphviz-dev` for `pygraphviz`, and an explicit Python version via `actions/setup-python`). Also consider using `uv sync --locked --group docs` to ensure the build fails if `uv.lock` is out of date, matching the reproducibility guarantees used in the test workflow.
Suggested change:
- - uses: astral-sh/setup-uv@v4
- - name: Install docs dependencies
-   run: uv sync --group docs
+ - uses: actions/setup-python@v5
+   with:
+     python-version: "3.11"
+ - uses: astral-sh/setup-uv@v4
+ - name: Install system dependencies
+   run: |
+     sudo apt-get update
+     sudo apt-get install -y graphviz libgraphviz-dev
+ - name: Install docs dependencies
+   run: uv sync --locked --group docs
- Set up MkDocs Material with mkdocstrings for auto-generated API docs
- Add Getting Started section: installation, quickstart, first pipeline guide
- Add Concepts section: architecture, datagrams, streams, identity/hashing, provenance, schema/column configuration
- Add User Guide section: sources, function pods, operators, pipelines, caching/persistence, execution models
- Add API Reference section with mkdocstrings directives for all public modules
- Rewrite README.md as a proper project landing page with badges and quick example
- Add docs dependency group to pyproject.toml (mkdocs, mkdocs-material, mkdocstrings)
- Add "orcapod" lowercase-p naming convention to CLAUDE.md and .zed/rules

https://claude.ai/code/session_01SGish2hvMbyPkhCorDCFmX
Covers GitHub Pages setup, DNS configuration for orcapod.org, local development, troubleshooting, and how to extend the docs. https://claude.ai/code/session_01SGish2hvMbyPkhCorDCFmX
- Fix undefined `risk_stream` variable in pipelines.md labels example
- Define `labs_v2` in first-pipeline.md incremental computation example
- Add missing `Join` import in sources.md DerivedSource example
- Add `uv sync --group docs` before `mkdocs serve` in README.md
- Fix Mermaid fence format to use fence_mermaid_format for diagram rendering

https://claude.ai/code/session_01SGish2hvMbyPkhCorDCFmX
- Add pytest-codeblocks to dev dependencies
- Add docs/conftest.py to patch exec namespace with `__name__` (fixes orcapod's get_function_signature when functions are defined in exec)
- Annotate all 83 Python code blocks across 13 doc files with continuation (`<!--pytest-codeblocks:cont-->`) or skip markers
- Fix DictSource examples to use row-oriented list[dict] (not column-oriented dict, which is not the actual API)
- Fix ListSource examples to use actual API (name + tag_function)
- Fix Datagram.select/drop to use variadic args, not list
- Fix Schema.select/drop to use variadic args, not list
- Add missing imports in standalone code blocks
- Skip bash blocks (installation/build commands) and illustrative examples that reference undefined variables

Result: 20 passed, 57 skipped, 0 failed

https://claude.ai/code/session_01SGish2hvMbyPkhCorDCFmX
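The `docs/conftest.py` change mentioned in this commit relies on a general Python behavior that can be shown without pytest-codeblocks or orcapod. The sketch below only demonstrates that a function defined through `exec` takes its `__module__` from the `__name__` entry of the namespace it runs in. Whether this is the exact cause of the `get_function_signature` failure is an assumption, and the module name `docs_example` is an arbitrary placeholder.

```python
# Source for a function that will be defined through exec, as a doc test runner might do.
SOURCE = "def add(a, b):\n    return a + b\n"

# Namespace without __name__: the resulting function has no module name.
bare_ns = {}
exec(SOURCE, bare_ns)
bare_add = bare_ns["add"]

# Namespace seeded with __name__ ("docs_example" is an arbitrary placeholder).
named_ns = {"__name__": "docs_example"}
exec(SOURCE, named_ns)
named_add = named_ns["add"]

print(bare_add.__module__)   # None
print(named_add.__module__)  # docs_example
```

Code that inspects `func.__module__` to build an identifier or signature would see `None` in the first case, so seeding the namespace with `__name__` is one way to keep such introspection working inside executed documentation snippets.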
Installs system dependencies (graphviz, libgraphviz-dev) and runs uv sync --group dev to ensure linters and tests work in remote sessions. https://claude.ai/code/session_01SGish2hvMbyPkhCorDCFmX
Comprehensive update to reflect all recent codebase changes:

- Add channels.py, pipeline/, core/nodes/, core/executors/, execution_engines/
- Document async execution model (AsyncExecutableProtocol, channels, orchestrator)
- Update core abstractions (Node hierarchy, PersistentSource, @function_pod decorator)
- Expand project layout with all protocol files, hashing files, database files, utils
- Add new test directories (test_channels/, test_pipeline/)
- Update class names (StaticOutputOperatorPod, GraphTracker, BasicTrackerManager)

https://claude.ai/code/session_01SGish2hvMbyPkhCorDCFmX
91442be to bd7c048