10 Architecture Design Description
The architecture of this system is driven by a single constraint: context windows are scarce. Modern LLMs can accept 100K+ tokens of context, but effective reasoning still requires focused input. Per [1], “Context Engineering” is the discipline of optimizing LLM inputs for quality outputs.
10.1 Context Management as Architectural Driver
Per [2, Sec. 2.3.5.4], architecture decisions should trace to stakeholder concerns. The primary concern driving this architecture is context efficiency: enabling AI agents to reason effectively about SysML v2 models within constrained token budgets.
10.1.1 The Problem
A typical systems engineering project context consumes 40K+ tokens for project awareness (requirements, architecture decisions, constraints, current task state). This leaves limited tokens for model reasoning. When an AI agent must also understand a SysML v2 model, naive approaches (loading entire files) quickly exhaust available context.
10.1.2 Four Context Management Strategies
Literature analysis (Section 3.1) identified four proven strategies:
| Strategy | Description | Example Pattern |
|---|---|---|
| Avoidance | Minimize context needs through tiny, focused prompts | Sentence-level LLM calls [3] |
| Staged Decomposition | Pipeline with intermediate representations | JSON between stages [4] |
| Progressive Narrowing | Similarity + reachability-based pruning | Subgraph extraction [5] |
| Multi-Agent Partitioning | Divide context across specialized agents | Pipeline agents |
10.1.3 Architectural Response
This system enables all four strategies:
- Fine-grained query tools support avoidance (get exactly what you need)
- JSON intermediate representation enables staged decomposition
- Subgraph extraction implements progressive narrowing
- Tool design supports multi-agent workflows via stateless calls
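The progressive-narrowing strategy above can be sketched as a reachability-based pruning pass over a model's relationship graph: keep only elements within a few hops of a seed element instead of loading the whole model. The element names and the `extract_subgraph` helper below are illustrative, not part of the implemented tool surface:

```python
from collections import deque

def extract_subgraph(edges, seed, max_hops):
    """Progressive narrowing: keep only elements reachable from a seed
    element within max_hops, instead of loading the entire model."""
    adjacency = {}
    for src, dst in edges:
        adjacency.setdefault(src, set()).add(dst)
        adjacency.setdefault(dst, set()).add(src)  # treat edges as undirected
    kept = {seed}
    frontier = deque([(seed, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == max_hops:
            continue  # stop expanding past the hop budget
        for neighbor in adjacency.get(node, ()):
            if neighbor not in kept:
                kept.add(neighbor)
                frontier.append((neighbor, depth + 1))
    return kept

# Toy relationship graph from a hypothetical vehicle model
edges = [
    ("Vehicle", "PowerSubsystem"),
    ("Vehicle", "Chassis"),
    ("PowerSubsystem", "Battery"),
    ("Battery", "CellModule"),
    ("Chassis", "Frame"),
]
print(sorted(extract_subgraph(edges, "PowerSubsystem", 1)))
# → ['Battery', 'PowerSubsystem', 'Vehicle']
```

Only the one-hop neighborhood of the seed is returned, so the context cost scales with the subgraph, not the model.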
10.2 System Context Diagram
Per [2, Sec. 2.3.5.4], the context diagram defines the system boundary and external interfaces.
+-----------------------------------------+
| External Systems |
+-----------------------------------------+
|
+------------------------------+------------------------------+
| | |
v v v
+-------------+ +-------------+ +-------------+
| MCP Client | |Git Provider | | SysML v2 |
| (Claude, | | (GitLab, | | API Server |
| VS Code) | | GitHub,etc) | | |
+------+------+ +-------------+ +-------------+
| ^ ^
| MCP Protocol | REST API | REST API
| (stdio/HTTP) | (HTTPS) | (HTTPS)
v | |
+----------------------------------------------------------------------+
| open-mcp-sysml |
| +----------------------------------------------------------------+ |
| | System Boundary | |
| | | |
| | Tools: sysml_parse, sysml_validate, sysml_list_definitions, | |
| | repo_list_files, repo_get_file | |
| | Resources: sysml://examples/*, repo://{project}/{path} | |
| | | |
| +----------------------------------------------------------------+ |
+----------------------------------------------------------------------+
External Interfaces:
| Interface | Protocol | Direction | Description |
|---|---|---|---|
| MCP Client | MCP 2024-11-05 (stdio/HTTP) | Bidirectional | AI tool integration |
| Git Provider API | REST (HTTPS) | Outbound | Repository file access (provider-agnostic) |
| SysML v2 API | REST (HTTPS) | Outbound | Model query and validation (planned) |
10.2.1 Operational Modes
| Mode | Transport | Use Case | Authentication |
|---|---|---|---|
| Local Development | stdio | Individual engineer with Claude/VS Code | Git provider PAT |
| Team Server | HTTP (planned) | Shared server for team access | PAT per request |
| CI/CD Pipeline | HTTP (planned) | Automated validation | CI job token |
10.2.2 Use Cases
UC-1: AI-Assisted Model Review
- Engineer opens Claude Desktop with MCP server configured
- Asks: “List all requirement definitions in the vehicle model”
- MCP server fetches the model via repo_get_file and parses it via sysml_parse
- Claude presents findings with sysml_list_definitions and suggests improvements
- Engineer requests changes via iterative tool calls
UC-2: Model Validation in CI/CD
- Developer commits SysML v2 model changes
- CI pipeline calls sysml_validate via stdio
- Validation results reported in merge request
UC-3: Exploratory Model Query
- New team member asks natural language questions about model structure
- MCP server uses sysml_query to search elements
- AI explains model architecture and relationships
10.3 Novel Contributions
This project addresses gaps in the open-source MBSE ecosystem. While proprietary solutions exist, the community lacks accessible building blocks for AI-augmented systems engineering.
| Contribution | Gap Addressed | Community Benefit |
|---|---|---|
| tree-sitter-sysml | No open-source SysML v2 grammar in tree-sitter ecosystem | Syntax highlighting, IDE support, MCP integration |
| kebnf-to-tree-sitter | Manual grammar maintenance as OMG spec evolves | Automated updates, formal traceability |
| open-mcp-sysml | No MBSE MCP server in 7,364+ public repos | AI-augmented MBSE workflows |
10.3.1 tree-sitter-sysml
The first open-source SysML v2 grammar for tree-sitter, enabling:
- Syntax highlighting in any tree-sitter-compatible editor
- Incremental parsing for responsive IDE experiences
- WASM compilation for browser-based tooling
- Contribution path to tree-sitter GitHub organization
- GitLab contribution path via vendor/grammars/
Status: Production ready. 125 corpus tests passing, 99.6% coverage across 275 external files (OMG Training, OMG Examples, GfSE, Advent of SysML v2). Context-sensitive definition bodies implemented. Technical debt documented.
10.3.2 kebnf-to-tree-sitter
Automated grammar generation from OMG KEBNF specifications:
- One-shot conversion preserves spec traceability
- Semantic mapping document captures conversion decisions
- Enables automated updates when OMG publishes new specs
- Research contribution on grammar transposition for MBSE languages
Status: Parser and emitter complete (640/640 KEBNF rules). The generated grammar has 335+ conflicts requiring resolution. An iterative conflict-resolution strategy was selected: conflicts are fixed one at a time, with rationale documented for the INCOSE paper.
10.3.3 open-mcp-sysml
MCP server enabling AI agents to work with SysML v2 models:
- Fine-grained query tools for context-efficient access
- Provider-agnostic repository access (GitLab as reference implementation)
- Tree-sitter integration for reliable parsing
- Stateless tool design supporting multi-agent workflows
Status: Phase 1 complete (Feb 13, 2026). Tree-sitter integration is working with 5 MCP tools: sysml_parse (L0/L1/L2 detail levels), sysml_validate, sysml_list_definitions, repo_list_files, repo_get_file. 22 tests passing. A Phase 2 PRD covering 7 token-reduction strategies is ready, and the server is ready for benchmark vignette testing.
10.3.4 sysml-grammar-benchmark (NEW)
Automated grammar benchmark dashboard comparing SysML v2 parser implementations:
- Objective comparison across standardized test corpora
- GitLab CI automation with scheduled runs
- Quarto + D3 static dashboard on GitLab Pages
- Community contribution path for new parsers and corpora
Status: Repository scaffolded with PRD. Dashboard and CI pipeline pending implementation.
10.4 Architecture Alternatives
Per [2, Sec. 2.3.5.4], candidate architectures were evaluated before selection.
| Alternative | Evaluation |
|---|---|
| Python + FastMCP | Rejected: additional runtime dependency |
| TypeScript + official SDK | Rejected: heavier deployment footprint |
| Go + go-sdk | Initially selected, then superseded |
| Rust + rmcp + tree-sitter | Selected: GKG alignment, community grammar |
Parser Architecture Pivot: The planned hand-rolled parser was replaced with a tree-sitter grammar. Key factors:
| Factor | Hand-rolled | Tree-sitter |
|---|---|---|
| Error recovery | Must implement | Built-in |
| Incremental parsing | Must implement | Built-in |
| Multi-platform | Rust only | C, WASM, Rust, Go, Python, Swift |
| Community reuse | Project-specific | tree-sitter org contribution |
| GitLab integration | N/A | vendor/grammars/ path |
10.5 Technology Stack
| Component | Technology | Rationale |
|---|---|---|
| Language | Rust 1.85+ | GKG alignment, memory safety, single static binary |
| MCP SDK | rmcp (official) | Official Rust SDK, tokio async runtime |
| Git Client | Provider-agnostic trait | Future GKG integration; GitLab as first implementation |
| Parser | tree-sitter-sysml | Community grammar, multi-platform |
| Transport | stdio (HTTP planned) | stdio for local dev; HTTP transport planned for Phase 2 |
| Container | Buildah/Podman | OCI-compliant, rootless, CI-friendly |
| Documentation | Quarto | Markdown-native, GitLab Pages compatible |
10.6 Repository Structure
gitlab.com/dunn.dev/open-mcp-sysml/ # GitLab Group (monorepo root)
+-- capstone/ # SE Documentation (Quarto Book)
+-- open-mcp-sysml/ # Rust MCP Server
| +-- crates/
| +-- sysml-parser/ # tree-sitter wrapper (parse, validate, list)
| +-- repo-client/ # Provider-agnostic Git interface
| +-- mcp-server/ # MCP server binary
+-- tree-sitter-sysml/ # Brute-Force Grammar (99.6% coverage)
+-- kebnf-to-tree-sitter/ # Spec-Driven Grammar Converter
+-- sysml-grammar-benchmark/ # Grammar Comparison Dashboard (NEW)
Key Design Decisions:
- Dual-path grammar strategy: Brute-force for immediate value, spec-driven for research
- open-mcp-sysml consumes the grammar via Rust bindings
- Dual CI: GitHub Actions for tree-sitter ecosystem, GitLab CI for coverage tracking
10.7 Component Architecture
+-------------------------------------------------------+
| MCP Client (Claude, etc.) |
+-------------------------------------------------------+
|
v
+-------------------------------------------------------+
| Transport Layer |
| (stdio / HTTP) |
+-------------------------------------------------------+
|
v
+-------------------------------------------------------+
| mcp-server crate |
| +-----------+ +-----------+ +-----------+ |
| | Tools | | Resources | | Prompts | |
| +-----------+ +-----------+ +-----------+ |
+-------------------------------------------------------+
|
+-------------------+-------------------+
v v v
+-------------+ +---------------+ +-------------+
| repo-client | | tree-sitter | | SysML v2 |
| crate | | Rust bindings | | API Client |
+-------------+ +---------------+ +-------------+
| | |
v v v
+-------------+ +---------------+ +-------------+
|Git Provider | |tree-sitter- | | SysML v2 |
| API | |sysml grammar | | API Server |
+-------------+ +---------------+ +-------------+
Component Responsibilities:
| Component | Responsibility |
|---|---|
| mcp-server | MCP protocol handling, tool dispatch |
| sysml-parser | Rust API wrapping tree-sitter for parse, validate, list_definitions |
| repo-client | Provider-agnostic Git operations (GitLab reference implementation) |
| tree-sitter-sysml | Grammar definition, generated parser |
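The repo-client component's provider-agnostic design can be sketched with an abstract interface plus a swappable backend. The Python ABC below is a stand-in for the Rust trait in the repo-client crate; the in-memory fake and its file contents are hypothetical, and a GitLab implementation would call the REST API instead:

```python
from abc import ABC, abstractmethod

class RepoClient(ABC):
    """Provider-agnostic repository interface, mirroring the repo-client
    crate's read_file / list_files operations."""

    @abstractmethod
    def read_file(self, project: str, path: str, ref: str = "main") -> bytes: ...

    @abstractmethod
    def list_files(self, project: str, path: str = "", ref: str = "main") -> list: ...

class InMemoryRepoClient(RepoClient):
    """Fake provider used for tests; real providers (GitLab, GitHub)
    would issue HTTPS requests behind the same interface."""

    def __init__(self, files):
        self.files = files  # path -> bytes

    def read_file(self, project, path, ref="main"):
        return self.files[path]

    def list_files(self, project, path="", ref="main"):
        return sorted(p for p in self.files if p.startswith(path))

repo = InMemoryRepoClient({"models/vehicle.sysml": b"part def Vehicle;"})
print(repo.list_files("demo", "models/"))
# → ['models/vehicle.sysml']
```

Because the MCP tools depend only on the interface, swapping providers (or plugging in a future GKG backend) requires no changes to tool code.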
10.8 Context-Aware Tool Design
Per literature recommendations, tools are organized by context cost. The implemented tools (Phase 1) prioritize low-context operations; future phases will add higher-context analysis and generation tools.
10.8.1 Implemented Tools (Phase 1)
sysml_parse:
description: Parse SysML v2 text and extract elements
input:
source: string
detail_level: "L0" | "L1" | "L2" # L0: names only, L1: +types, L2: full AST
output: Element[]
context_cost: L0 ~100 tokens, L1 ~300 tokens, L2 ~2000 tokens
sysml_validate:
description: Validate SysML v2 syntax via tree-sitter parse diagnostics
input:
source: string
output: ValidationResult (errors, warnings)
context_cost: ~300 tokens
sysml_list_definitions:
description: List all definitions in SysML v2 text
input:
source: string
output: Definition[] (name, type)
context_cost: ~200 tokens
repo_list_files:
description: List files in Git repository
input:
project: string
path: string?
ref: string?
sysml_only: boolean?
output: FileEntry[]
context_cost: ~300 tokens
repo_get_file:
description: Read file content from Git repository
input:
project: string
path: string
ref: string?
output: FileContent
context_cost: varies by file size
10.8.2 Planned Tools (Phase 2+)
sysml_query:
description: Query elements by type/properties
status: Planned
repo_commit:
description: Commit file changes to repository
status: Planned
Design Principles:
- Stateless calls: No session pollution between tool invocations
- Server-side pruning: Return minimal sufficient context
- JSON intermediate representation: Enable staged decomposition
- Syntax validation: Catch errors early on all generated content
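The JSON-intermediate-representation principle can be illustrated with a two-stage toy pipeline: stage 1 reduces source text to a compact JSON element list, and stage 2 reasons only over that JSON, never over the original source. The regex is a stand-in for the real tree-sitter parse, and both stage functions are hypothetical:

```python
import json
import re

def stage1_extract(source):
    """Stage 1: reduce SysML v2 text to a JSON intermediate representation.
    (A toy regex stands in for the real tree-sitter parse.)"""
    defs = re.findall(r"(\w+) def (\w+)", source)
    elements = [{"kind": kind, "name": name} for kind, name in defs]
    return json.dumps({"elements": elements})

def stage2_summarize(ir):
    """Stage 2: consumes only the JSON IR, so its context cost is
    bounded by the IR size, not the source file size."""
    elements = json.loads(ir)["elements"]
    return f"{len(elements)} definitions: " + ", ".join(e["name"] for e in elements)

ir = stage1_extract("part def Vehicle; requirement def MaxMass;")
print(stage2_summarize(ir))
# → 2 definitions: Vehicle, MaxMass
```

Because each stage is stateless and communicates through JSON, the stages can also be split across separate agents (the multi-agent partitioning strategy from Section 10.1.2).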
10.9 Implementation Patterns for Progressive Context Budgeting
The tool design principles in Section 10.8 establish what to build; this section addresses how to implement these patterns using proven techniques from the rapidly evolving ecosystem of MCP servers and AI agents. These patterns emerge primarily from practitioner implementations rather than academic literature; many creators are building rapidly without publishing formal papers.
OpenAI’s “Harness Engineering” report [6] provides production-scale validation of these patterns: a team built a million-line product with zero manually-written code by treating repository-local knowledge as the system of record, enforcing architectural invariants mechanically, and using progressive disclosure (a short AGENTS.md as map, not encyclopedia). Their experience confirms that context management patterns are not academic abstractions but production necessities.
This survey reflects the ecosystem as of February 2026. Given the pace of change, we recommend revisiting these patterns monthly.
10.9.1 MCP Server Patterns
Three implementation patterns have emerged for managing context budgets within MCP tool servers, each offering 90%+ token reduction from naive implementations.
10.9.1.1 Two Meta-Tool Architecture
Instead of exposing all tools directly to the agent (consuming 15,000+ tokens), expose only two meta-tools [7]:
┌─────────────────────────────────────────────────────┐
│ Client Context Window │
├─────────────────────────────────────────────────────┤
│ Tools available: 2 (~800 tokens) │
│ ┌─────────────────────────────────────────────────┐│
│ │ get_tools_in_category(category_path) ││
│ │ → Returns tool names + descriptions in category ││
│ └─────────────────────────────────────────────────┘│
│ ┌─────────────────────────────────────────────────┐│
│ │ execute_tool(tool_path, parameters) ││
│ │ → Executes any tool by hierarchical path ││
│ └─────────────────────────────────────────────────┘│
└─────────────────────────────────────────────────────┘
│
▼ Agent navigates tool tree on demand
┌─────────────────────────────────────────────────────┐
│ Server Tool Registry (~15,000 tokens if exposed) │
│ └─ sysml/ │
│ ├─ query/ (get_element, search, ...) │
│ ├─ analysis/ (extract_subgraph, trace, ...) │
│ └─ generate/ (suggest_completion, ...) │
└─────────────────────────────────────────────────────┘
Token savings: ~95% (15,000 → 800 tokens)
Trade-off: Adds one round-trip for tool discovery per category
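A minimal sketch of the two meta-tool pattern: the full registry stays server-side, and the client sees only the two entry points. The registry layout and tool implementations below are hypothetical:

```python
# Hypothetical hierarchical registry; only two meta-tools are exposed
# to the client, while the full tree stays server-side.
REGISTRY = {
    "sysml/query": {
        "get_element": lambda name: f"element {name}",
        "search": lambda text: f"results for {text}",
    },
    "sysml/analysis": {
        "extract_subgraph": lambda root: f"subgraph at {root}",
    },
}

def get_tools_in_category(category_path):
    """Meta-tool 1: list tool names under a category (cheap to expose)."""
    return sorted(REGISTRY[category_path])

def execute_tool(tool_path, **params):
    """Meta-tool 2: run any tool by its hierarchical path."""
    category, _, name = tool_path.rpartition("/")
    return REGISTRY[category][name](**params)

print(get_tools_in_category("sysml/query"))
# → ['get_element', 'search']
print(execute_tool("sysml/query/get_element", name="Vehicle"))
# → element Vehicle
```

The agent pays the small fixed cost of two tool descriptions up front and only loads a category's details when it actually navigates there.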
10.9.1.2 Cache ID + Summary Pattern
Return concise summaries with cache identifiers; provide details on demand [8]:
Phase 1: Query returns summary + cache ID
─────────────────────────────────────────
sysml_list_elements(package="Vehicle")
→ {
cacheId: "elem-abc123",
summary: { totalElements: 47, parts: 12, ports: 8, ... },
quickAccess: [ "Vehicle::PowerSubsystem", "Vehicle::Chassis", ... ]
}
(~500 tokens)
Phase 2: Details fetched only when needed
─────────────────────────────────────────
sysml_get_details(cacheId: "elem-abc123", filter: "parts-only")
→ { elements: [ ... full element definitions ... ] }
(~2000 tokens, only when required)
Token savings: ~97% for list operations (18,700 → 540 tokens baseline)
Complementary patterns:
- RTFM documentation: Tool descriptions minimal (<10 words); agents call rtfm(toolName) for full documentation on demand
- defer_loading flag: Platform-native lazy discovery (Claude-specific)
10.9.1.3 L0/L1/L2 Tiered Loading
Filesystem-inspired progressive context loading [9]:
| Tier | Content | Tokens | Use Case |
|---|---|---|---|
| L0 (Abstract) | Names + one-line descriptions | ~100 | Quick relevance check |
| L1 (Overview) | Structure, relationships, key properties | ~2,000 | Understand organization |
| L2 (Details) | Full element definitions, constraints | Full | Deep analysis/editing |
Application to SysML: L0 returns qualified names; L1 returns element skeletons with relationship counts; L2 returns full textual notation with documentation.
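The tiered response described above can be sketched as a single rendering function keyed on detail_level; the element record and field names are hypothetical:

```python
def render_element(element, detail_level="L1"):
    """Tiered loading: L0 name only, L1 skeleton, L2 full record."""
    if detail_level == "L0":
        return {"name": element["qualifiedName"]}
    if detail_level == "L1":
        return {
            "name": element["qualifiedName"],
            "kind": element["kind"],
            "relationshipCount": len(element["relationships"]),
        }
    return element  # L2: everything, including full textual notation

battery = {
    "qualifiedName": "Vehicle::Battery",
    "kind": "part",
    "relationships": ["Vehicle", "CellModule"],
    "text": "part def Battery { ... }",
}
print(render_element(battery, "L0"))
# → {'name': 'Vehicle::Battery'}
print(render_element(battery, "L1"))
```

Defaulting to L1 lets an agent check relevance cheaply and escalate to L2 only for the handful of elements it actually edits.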
10.9.2 Complementary Techniques
Beyond MCP-specific patterns, several techniques optimize context at other layers.
10.9.2.1 Prompt Compression
LLMLingua [10] uses small language models (GPT-2, LLaMA-7B) to identify and remove non-essential tokens before the main LLM call.
- Performance: Up to 20x compression with minimal quality loss
- RAG improvement: +21.4% accuracy using only 1/4 of the tokens
- Application: Client-side preprocessing of large retrieved documents
10.9.2.2 KV-Cache-First Design
Manus’s production experience [11] identifies KV-cache hit rate as “the single most important metric for a production AI agent”:
- Append-only context: Never modify previous actions/observations
- Stable prefixes: Avoid timestamps or dynamic content at prompt start
- Tool masking: Constrain action space via logit masking, not tool removal
Cost impact: 10x difference between cached ($0.30/MTok) and uncached ($3.00/MTok)
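The cost impact of cache hit rate is easy to quantify. Using the illustrative per-million-token rates from [11], the sketch below compares a stable, append-only prompt (high cache hit rate) against one whose prefix changes every turn (zero cache hits):

```python
def turn_cost(prompt_tokens, cached_tokens, cached_rate=0.30, uncached_rate=3.00):
    """Dollar cost of one turn's prompt, given per-million-token rates
    (illustrative prices from [11])."""
    uncached = prompt_tokens - cached_tokens
    return (cached_tokens * cached_rate + uncached * uncached_rate) / 1_000_000

# Append-only context with a stable prefix: nearly the whole prompt
# hits the KV cache on later turns.
stable = turn_cost(prompt_tokens=100_000, cached_tokens=95_000)
# A timestamp at the top of the prompt invalidates the cache every turn.
busted = turn_cost(prompt_tokens=100_000, cached_tokens=0)
print(f"${stable:.4f} vs ${busted:.4f} per turn")
```

For a long-running agent making hundreds of tool calls, the gap compounds quickly, which is why [11] treats cache hit rate as the primary production metric.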
10.9.2.3 Frequent Intentional Compaction
HumanLayer’s workflow [12] structures development around context management:
- Research: Subagent explores codebase, returns compressed findings
- Plan: Outline steps with verification criteria (human reviews here)
- Implement: Execute plan phase-by-phase, compacting state between phases
Key insight: “A bad line of research could lead to thousands of bad lines of code”; focus human review on research and plans, not just on generated code.
Context utilization: Maintain 40-60% window utilization for optimal reasoning.
10.9.3 Performance Metrics Framework
The following metrics enable comparison across implementations:
| Category | Metric | Baseline Examples | Source |
|---|---|---|---|
| Token efficiency | Compression ratio | 95-97% (MCP patterns) | [7], [8] |
| Cost impact | Cached vs uncached tokens | 10x difference | [11] |
| Latency | Tool response time | 120ms (semantic) vs 2000ms (visual) | [8] |
| Quality | Task completion rate | 24%→51% with optimization | DSPy MIPROv2 |
| Context utilization | % window used effectively | 40-60% sweet spot | [12] |
| Multi-agent gain | Single vs multi-agent success | 90.2% improvement | [13] |
10.9.4 Implementation Options for open-mcp-sysml
Based on the patterns surveyed, the following implementation options are available, each addressing different aspects of context efficiency:
Cache ID + Summary
All list/query tools return summaries with cache IDs. Add sysml_get_details(cacheId, detailType) for on-demand expansion. Expected savings: ~90% on common workflows.
L0/L1/L2 Tiered Responses
Add detail_level parameter to existing tools. Default to L1; agents request L2 only when editing.
RTFM Documentation Pattern
Minimize tool descriptions to <10 words. Add sysml_rtfm(toolName) for full documentation.
Two Meta-Tool Wrapper
Useful if tool count exceeds ~20. Consider as enhancement once tool surface stabilizes.
KV-Cache Optimization
Design response formats for append-only workflows. Ensure deterministic JSON serialization.
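Deterministic serialization is a one-line discipline in most languages. A sketch in Python: sorted keys and fixed separators mean the same payload always yields byte-identical text, so repeated responses never perturb a cached prefix:

```python
import json

def serialize_response(payload):
    """Deterministic JSON: identical payloads always serialize to
    identical bytes, keeping KV-cache prefixes stable."""
    return json.dumps(payload, sort_keys=True, separators=(",", ":"))

# Same payload, different insertion order: output is identical.
a = serialize_response({"kind": "part", "name": "Battery"})
b = serialize_response({"name": "Battery", "kind": "part"})
print(a == b)  # → True
print(a)       # → {"kind":"part","name":"Battery"}
```

The equivalent discipline in the Rust server would be a serializer configured for stable field ordering; the Python form here is only illustrative.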
10.9.5 Reference Implementations
| Repository | Stars | Pattern | Language | URL |
|---|---|---|---|---|
| mcp-proxy | 3 | Two meta-tool | Go | github.com/IAMSamuelRodda/mcp-proxy |
| xc-mcp | 59 | Cache ID + RTFM + defer_loading | TypeScript | github.com/conorluddy/xc-mcp |
| OpenViking | 1,100 | L0/L1/L2 tiered loading | Python | github.com/volcengine/OpenViking |
| LLMLingua | 5,800 | Prompt compression | Python | github.com/microsoft/LLMLingua |
| 12-factor-agents | 18,200 | Workflow patterns | Docs | github.com/humanlayer/12-factor-agents |
| Beads | 16,000 | Git-backed agent memory | Go | github.com/steveyegge/beads |
| OpenAI Harness | — | Progressive disclosure + golden principles | Internal | [6] |
10.10 Interface Definitions
10.10.1 MCP Protocol Interface
The server implements MCP 2024-11-05:
- initialize: Protocol handshake
- tools/list: Enumerate available tools
- tools/call: Execute a tool
- resources/list: Enumerate resources
- resources/read: Read a resource
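As a concrete sketch, a tools/call request over stdio is a JSON-RPC 2.0 message; the tool name matches this server's surface, but the id and arguments below are illustrative:

```python
import json

# Illustrative MCP tools/call request (JSON-RPC 2.0 over stdio).
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "sysml_list_definitions",
        "arguments": {"source": "part def Vehicle;"},
    },
}
print(json.dumps(request))
```

The server replies with a matching-id JSON-RPC response carrying the tool's result content.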
10.10.2 Repository Interface Schema
{
"RepoClient": {
"methods": {
"read_file": {
"params": ["project", "path", "ref"],
"returns": "bytes"
},
"list_files": {
"params": ["project", "path", "ref"],
"returns": "FileEntry[]"
}
}
}
}
10.10.3 Parser Interface Schema
{
"sysml_parse": {
"input": {
"content": "string"
},
"output": {
"success": "boolean",
"tree": "ParseTree?",
"errors": "ParseError[]"
}
}
}
10.11 Dual-Path Grammar Strategy
Both grammar development paths are actively maintained:
| Path | Repository | Purpose | Status |
|---|---|---|---|
| Brute Force | tree-sitter-sysml | Practical parsing, MCP server | Tier 1 complete |
| Spec-Driven | kebnf-to-tree-sitter | Formal compliance, INCOSE paper | Tool complete, grammar in progress |
Cross-validation: Comparing outputs from the two grammars on the same corpus surfaces spec-interpretation errors in the brute-force grammar and practical parsing issues in the spec-driven one.
Why both matter:
- Brute-force provides immediate practical value
- Spec-driven enables automated updates and formal traceability
- Comparison validates both approaches
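The cross-validation step can be sketched as a set comparison over the definitions each grammar extracts from the same file; the parser outputs below are hypothetical:

```python
def cross_validate(brute_force_defs, spec_driven_defs):
    """Compare definitions extracted from one file by both grammars;
    disagreements flag spec-interpretation or practical parsing issues."""
    return {
        "agreed": sorted(brute_force_defs & spec_driven_defs),
        "brute_force_only": sorted(brute_force_defs - spec_driven_defs),
        "spec_driven_only": sorted(spec_driven_defs - brute_force_defs),
    }

# Hypothetical outputs for one corpus file
report = cross_validate(
    {"Vehicle", "Battery", "MaxMass"},
    {"Vehicle", "Battery", "Chassis"},
)
print(report["brute_force_only"])  # → ['MaxMass']
print(report["spec_driven_only"])  # → ['Chassis']
```

Running this over the full corpus yields a disagreement list that can be triaged file by file, with each resolution attributed to one grammar or the other.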
10.12 Requirements Allocation
Per [2, Sec. 2.3.5.4], requirements are allocated to architecture elements.
| Requirement | Architecture Element | Component |
|---|---|---|
| FR-MCP-001, FR-MCP-002 | MCP Server | mcp-server crate |
| FR-MCP-003 | HTTP Transport | mcp-server crate |
| FR-REPO-001, FR-REPO-002 | Repo Client | repo-client crate |
| FR-SYS-001, FR-SYS-002 | SysML Parser | tree-sitter-sysml |
| NFR-DEP-001 | Build Configuration | Cargo.toml |
| NFR-DEP-002 | Container Image | Containerfile |
10.13 Deployment Architecture
10.13.1 Deployment Modes
| Mode | Transport | Use Case | Configuration |
|---|---|---|---|
| Local Development | stdio | Claude Desktop, VS Code | --transport stdio |
| CI/CD Integration | HTTP | GitLab CI services | --transport http --port 8080 |
| Container | HTTP | Production deployment | Docker/Podman with port mapping |
10.13.2 Development Constraints
Constraint: No local container builds on macOS (no podman machine).
Mitigation:
- Local development uses
cargo buildandcargo testdirectly - MCP protocol testing via stdio (no containers required)
- Container builds run exclusively in GitLab CI
10.14 CI/CD Pipeline
GitLab Ultimate features leveraged:
| Feature | Purpose |
|---|---|
| SAST | Static Application Security Testing for Rust |
| Dependency Scanning | Scan Cargo.lock for vulnerabilities |
| Secret Detection | Prevent accidental credential commits |
| License Compliance | Track crate licenses |
| Code Quality | Clippy integration for Rust linting |
10.14.1 tree-sitter-sysml CI/CD
Dual CI for ecosystem compatibility:
| Platform | Purpose |
|---|---|
| GitHub Actions | Tree-sitter org compatibility |
| GitLab CI | Coverage tracking, security scanning |