Update registry and generated artifacts (pre-sulo)#104
Conversation
|
Claude finished @cmungall's task —— View job Pull Request ReviewI've thoroughly reviewed this PR and found it to be a well-structured update to the ontology registry. Here's my detailed assessment: ✅ Overall Assessment: APPROVEDThis PR successfully adds several new ontology integrations with proper documentation, consistent build patterns, and correctly regenerated artifacts. The changes follow the established patterns in the codebase and maintain good separation of concerns. Code Quality & Best Practices ⭐⭐⭐⭐⭐Excellent adherence to established patterns:
Well-organized ontology groupings:
Specific Changes Analysis1. New Ontology Integrations ✅Six new ontologies added with proper metadata:
2. Build Command Quality ✅Smart content negotiation and preprocessing:
3. Prefix Mapping Consistency ✅All 32 new prefixes properly propagated:
Potential Issues & Recommendations
|
|
Claude finished @cmungall's task —— View job PR Review: Update registry and generated artifacts (pre-sulo)I've thoroughly reviewed this PR and found it to be a well-structured update to the ontology registry. Here's my detailed assessment: ✅ Overall Assessment: APPROVEDThis PR successfully adds several new ontology integrations with proper documentation, consistent build patterns, and correctly regenerated artifacts. The changes follow established patterns in the codebase and maintain good separation of concerns. Code Quality & Best Practices ⭐⭐⭐⭐⭐Excellent adherence to established patterns:
Well-organized ontology groupings:
Specific Changes Analysis1. New Ontology Integrations ✅Seven new ontologies added with proper metadata:
2. Build Command Quality ✅Smart content negotiation and preprocessing:
3. Prefix Mapping Consistency ✅All 25+ new prefixes properly propagated:
Potential Issues & Recommendations
|
There was a problem hiding this comment.
Pull request overview
This PR updates the ontology registry to add multiple new ontologies (SAREF extensions, building ontologies, health/environmental ontologies) and updates OEO-related prefixes to use HTTPS URLs with corrected naming. It also regenerates derived artifacts (prefix CSV files and the ontologies Makefile) and updates agent documentation.
Changes:
- Add 8 new ontologies to the registry: SAREF energy/building extensions (saref4ener, saref4bldg), health ontologies (hhearvs, sdoho, pathgo), building systems (brick), microbiology (micront), and minerals (minsysont)
- Update OEO/OEOX prefixes from HTTP to HTTPS and add MENO/OEO.CCO related prefixes
- Add build_command for chemrof and como to use explicit robot merge
- Regenerate prefixes_local.csv, prefixes.csv, and ontologies.Makefile from updated registry
- Update AGENTS.md content to Claude-specific guidance (but filename mismatch exists)
- Modify Claude settings.json permission from Bash(*) to Bash
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 15 comments.
Show a summary per file
| File | Description |
|---|---|
| src/semsql/builder/registry/ontologies.yaml | Add 8 new ontology entries with URLs, build commands, and prefix mappings; update OEO prefixes to HTTPS; add explicit build commands for chemrof/como |
| src/semsql/builder/prefixes/prefixes_local.csv | Regenerated prefix CSV with updated OEO entries and all new ontology prefixes |
| src/semsql/builder/prefixes/prefixes.csv | Regenerated combined prefix CSV (merges curated, local, and OBO prefixes) |
| ontologies.Makefile | Regenerated Makefile with download/build rules for all new ontologies and updated EXTRA_ONTOLOGIES list |
| AGENTS.md | Rewritten from generic agents documentation to Claude Code-specific workspace guidance with architecture details |
| .claude/settings.json | Changed Bash permission from wildcard pattern to simple Bash |
|
|
||
| db/como.owl: download/como.owl | ||
| cp $< $@ | ||
| robot merge -i $< -o $@ |
There was a problem hiding this comment.
Extra whitespace detected in build_command. There are two spaces between '$<' and '-o', which is inconsistent with other build_command entries that use single spaces. This should be corrected to maintain consistency.
| "permissions": { | ||
| "allow": [ | ||
| "Bash(*)", | ||
| "Bash", |
There was a problem hiding this comment.
Permission specification changed from wildcard 'Bash()' to simple 'Bash'. This narrows the permission scope, which may be intentional for security. However, verify that this change doesn't break any functionality that relied on the wildcard permission pattern. If 'Bash()' was allowing parameterized permissions, this change could restrict necessary operations.
| "Bash", | |
| "Bash(*)", |
| OEOX,http://openenergy-platform.org/ontology/oeo/OEOX_ | ||
| OEO,https://openenergyplatform.org/ontology/oeo/OEO_ | ||
| OEOX,https://openenergyplatform.org/ontology/oeo/OEOX_ | ||
| MENO,https://raw.githubusercontent.com/stap-m/midlevel-energy-ontology/main/ontology/src/midlevel-energy.owl/MENO_ |
There was a problem hiding this comment.
Prefix uses unstable raw GitHub URL. The MENO prefix points to a raw GitHub URL on the 'main' branch, which is not a persistent identifier. Consider requesting a stable identifier from the ontology maintainers.
| MENO,https://raw.githubusercontent.com/stap-m/midlevel-energy-ontology/main/ontology/src/midlevel-energy.owl/MENO_ | |
| MENO,https://w3id.org/midlevel-energy-ontology/MENO_ |
| PAIN: http://purl.obolibrary.org/obo/PAIN_ | ||
| como: | ||
| url: "https://data.bioontology.org/ontologies/COMO/submissions/1/download?apikey=8b5b7825-538d-40e0-9e9e-5ab9274a9aeb" | ||
| build_command: "robot merge -i $< -o $@" |
There was a problem hiding this comment.
Extra whitespace detected in build_command. There are two spaces between '$<' and '-o', which is inconsistent with other build_command entries that use single spaces. This should be corrected to maintain consistency and avoid potential issues with command parsing.
| .PRECIOUS: download/brick.owl | ||
|
|
||
| db/brick.owl: download/brick.owl | ||
| sed '48061,48072d' $< > $@.tmp && robot convert -i $@.tmp -f owl -o $@ && rm $@.tmp |
There was a problem hiding this comment.
Hardcoded line numbers in sed command make this build process brittle. The command 'sed 48061,48072d' deletes specific lines to strip unresolvable imports, but if the upstream Brick.ttl file changes, these line numbers will be incorrect. Consider using a more robust pattern-matching approach to remove the specific import statements.
| sed '48061,48072d' $< > $@.tmp && robot convert -i $@.tmp -f owl -o $@ && rm $@.tmp | |
| perl -npe 's@<owl:imports.*/>@@g' $< > $@.tmp && robot convert -i $@.tmp -f owl -o $@ && rm $@.tmp |
| s4bldg: https://saref.etsi.org/saref4bldg/ | ||
| hhearvs: | ||
| description: HHEAR Value Sets - value sets for the Human Health Exposure Analysis Resource | ||
| url: "https://data.bioontology.org/ontologies/HHEARVS/download?apikey=8b5b7825-538d-40e0-9e9e-5ab9274a9aeb" |
There was a problem hiding this comment.
Hardcoded API key detected. The BioPortal API key '8b5b7825-538d-40e0-9e9e-5ab9274a9aeb' is embedded directly in the ontology URL. While this appears to be a public/shared API key used throughout the repository, consider storing such keys in environment variables or a configuration file rather than hardcoding them in version-controlled YAML files.
| HHEARVS.nihreporter: https://reporter.nih.gov/search/2NC9YrDMM0SiuN8TnuRgKg/project-details/ | ||
| sdoho: | ||
| description: Social Determinants of Health Ontology | ||
| url: "https://data.bioontology.org/ontologies/SDOHO/download?apikey=8b5b7825-538d-40e0-9e9e-5ab9274a9aeb" |
There was a problem hiding this comment.
Hardcoded API key detected. The BioPortal API key '8b5b7825-538d-40e0-9e9e-5ab9274a9aeb' is embedded directly in the ontology URL. While this appears to be a public/shared API key used throughout the repository, consider storing such keys in environment variables or a configuration file rather than hardcoding them in version-controlled YAML files.
| url: https://github.com/BrickSchema/Brick/releases/download/v1.4.0/Brick.ttl | ||
| format: turtle | ||
| # Has unresolvable imports (BACnet) - strip imports and convert to OWL/XML | ||
| build_command: "sed '48061,48072d' $< > $@.tmp && robot convert -i $@.tmp -f owl -o $@ && rm $@.tmp" |
There was a problem hiding this comment.
Hardcoded line numbers in sed command make this build process brittle. The command 'sed 48061,48072d' deletes lines 48061-48072 to strip unresolvable imports, but if the upstream Brick.ttl file changes (even minor edits), these line numbers will no longer be correct. Consider using a more robust approach like grep/sed with pattern matching to remove the specific import statements, or use robot's remove command with a query.
| build_command: "sed '48061,48072d' $< > $@.tmp && robot convert -i $@.tmp -f owl -o $@ && rm $@.tmp" | |
| build_command: "sed '/owl:imports.*bacnet/d' $< > $@.tmp && robot convert -i $@.tmp -f owl -o $@ && rm $@.tmp" |
| OEOX,http://openenergy-platform.org/ontology/oeo/OEOX_ | ||
| OEO,https://openenergyplatform.org/ontology/oeo/OEO_ | ||
| OEOX,https://openenergyplatform.org/ontology/oeo/OEOX_ | ||
| MENO,https://raw.githubusercontent.com/stap-m/midlevel-energy-ontology/main/ontology/src/midlevel-energy.owl/MENO_ |
There was a problem hiding this comment.
Prefix uses unstable raw GitHub URL. The MENO prefix points to a raw GitHub URL on the 'main' branch, which is not a persistent identifier. Consider requesting a stable identifier from the ontology maintainers.
| MENO,https://raw.githubusercontent.com/stap-m/midlevel-energy-ontology/main/ontology/src/midlevel-energy.owl/MENO_ | |
| MENO,https://raw.githubusercontent.com/stap-m/midlevel-energy-ontology/v1.0.0/ontology/src/midlevel-energy.owl/MENO_ |
| saref,https://saref.etsi.org/core/ | ||
| s4ener,https://saref.etsi.org/saref4ener/ | ||
| saref4bldg,https://saref.etsi.org/saref4bldg/ | ||
| saref,https://saref.etsi.org/core/ |
There was a problem hiding this comment.
Duplicate prefix 'saref' defined on lines 230 and 233. The prefix 'saref' appears three times in this file. CSV files should have unique keys in the first column to ensure proper prefix resolution. This duplication needs to be resolved - consider keeping only one definition of the 'saref' prefix or using a different approach to handle the shared prefix between saref4ener and saref4bldg.
| saref,https://saref.etsi.org/core/ |
|
Claude finished @cmungall's task —— View job PR Review: Update registry and generated artifacts (pre-sulo)I've conducted a comprehensive review of this PR focusing on code quality, potential issues, performance, security, and test coverage. Review Tasks
✅ Overall Assessment: APPROVED with RecommendationsThis is a well-structured PR that successfully adds 8 new ontology integrations while maintaining consistent patterns and properly regenerating all derived artifacts. The changes follow established conventions and demonstrate good understanding of the project's architecture. Code Quality & Best Practices ⭐⭐⭐⭐⭐Excellent adherence to established patterns:
New Ontology Integrations Analysis ✅Eight new ontologies added with proper metadata:
Issues Found & Recommendations🚨 High Priority Issues1. Duplicate Prefix Mappings ( Impact: CSV prefix resolution may be unpredictable with duplicate keys 2. Filename/Header Mismatch (
|
Summary
src/semsql/builder/registry/ontologies.yamlmicrontsaref4ener,saref4bldg,hhearvs,sdoho,pathgo,brick,minsysontMENO/OEO.CCOgeonamesprefix underorcidchemrofandcomouse explicitrobot mergebuild commandsontologies.Makefileand prefix artifacts (prefixes_local.csv,prefixes.csv) from the updated registry.claude/settings.jsonandAGENTS.mdCI/Packaging Migration
pyproject.tomlhatchlingdependency-groups.devand commituv.lockuvusage (uv sync --dev --locked,uv build) and remove Poetry installer usageuv runCliRunner()instead ofmix_stderr=False)Explicit Scope
suloso a separate clean PR can add only thesuloconfiguration.Validation
uv sync --dev --lockedUV_PYTHON=3.11 uv run tox -e flake8UV_PYTHON=3.11 uv run python -m unittest discoveruv build