10  Architecture Design Description

The architecture of this system is driven by a single constraint: context windows are scarce. Modern LLMs process 100K+ tokens, but effective reasoning requires focused context. Per [1], “Context Engineering” is the discipline of optimizing LLM inputs for quality outputs.

10.1 Context Management as Architectural Driver

Per [2, Sec. 2.3.5.4], architecture decisions should trace to stakeholder concerns. The primary concern driving this architecture is context efficiency: enabling AI agents to reason effectively about SysML v2 models within constrained token budgets.

10.1.1 The Problem

A typical systems engineering project context consumes 40K+ tokens for project awareness (requirements, architecture decisions, constraints, current task state). This leaves limited tokens for model reasoning. When an AI agent must also understand a SysML v2 model, naive approaches (loading entire files) quickly exhaust available context.

10.1.2 Four Context Management Strategies

Literature analysis (Section 3.1) identified four proven strategies:

| Strategy | Description | Example Pattern |
|---|---|---|
| Avoidance | Minimize context needs through tiny, focused prompts | Sentence-level LLM calls [3] |
| Staged Decomposition | Pipeline with intermediate representations | JSON between stages [4] |
| Progressive Narrowing | Similarity + reachability-based pruning | Subgraph extraction [5] |
| Multi-Agent Partitioning | Divide context across specialized agents | Pipeline agents |

10.1.3 Architectural Response

This system enables all four strategies:

  1. Fine-grained query tools support avoidance (get exactly what you need)
  2. JSON intermediate representation enables staged decomposition
  3. Subgraph extraction implements progressive narrowing
  4. Tool design supports multi-agent workflows via stateless calls
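Progressive narrowing (strategy 3) can be sketched in a few lines: given the model's relationship graph, a breadth-first walk bounded by a hop budget returns only the elements near a seed element. This is an illustrative sketch, not the server's implementation; the element names and graph shape are hypothetical.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Illustrative sketch of progressive narrowing by reachability:
/// keep only elements within `max_hops` of the seed element, so the
/// agent receives a focused subgraph instead of the whole model.
fn extract_subgraph(
    edges: &HashMap<&'static str, Vec<&'static str>>,
    seed: &'static str,
    max_hops: usize,
) -> HashSet<&'static str> {
    let mut visited = HashSet::from([seed]);
    let mut frontier = VecDeque::from([(seed, 0)]);
    while let Some((node, depth)) = frontier.pop_front() {
        if depth == max_hops {
            continue; // prune: beyond the hop budget
        }
        for &next in edges.get(node).into_iter().flatten() {
            if visited.insert(next) {
                frontier.push_back((next, depth + 1));
            }
        }
    }
    visited
}
```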

10.2 System Context Diagram

Per [2, Sec. 2.3.5.4], the context diagram defines the system boundary and external interfaces.

                     +-----------------------------------------+
                     |           External Systems              |
                     +-----------------------------------------+
                                      |
       +------------------------------+------------------------------+
       |                              |                              |
       v                              v                              v
+-------------+              +-------------+              +-------------+
| MCP Client  |              |Git Provider |              |  SysML v2   |
| (Claude,    |              |  (GitLab,   |              | API Server  |
|  VS Code)   |              | GitHub,etc) |              |             |
+------+------+              +-------------+              +-------------+
       |                              ^                          ^
       | MCP Protocol                 | REST API                 | REST API
       | (stdio/HTTP)                 | (HTTPS)                  | (HTTPS)
       v                              |                          |
+----------------------------------------------------------------------+
|                           open-mcp-sysml                             |
|  +----------------------------------------------------------------+  |
|  |                        System Boundary                         |  |
|  |                                                                |  |
|  |  Tools: sysml_parse, sysml_validate, sysml_list_definitions,   |  |
|  |         repo_list_files, repo_get_file                         |  |
|  |  Resources: sysml://examples/*, repo://{project}/{path}        |  |
|  |                                                                |  |
|  +----------------------------------------------------------------+  |
+----------------------------------------------------------------------+

External Interfaces:

| Interface | Protocol | Direction | Description |
|---|---|---|---|
| MCP Client | MCP 2024-11-05 (stdio/HTTP) | Bidirectional | AI tool integration |
| Git Provider API | REST (HTTPS) | Outbound | Repository file access (provider-agnostic) |
| SysML v2 API | REST (HTTPS) | Outbound | Model query and validation (planned) |

10.2.1 Operational Modes

| Mode | Transport | Use Case | Authentication |
|---|---|---|---|
| Local Development | stdio | Individual engineer with Claude/VS Code | Git provider PAT |
| Team Server | HTTP (planned) | Shared server for team access | PAT per request |
| CI/CD Pipeline | HTTP (planned) | Automated validation | CI job token |

10.2.2 Use Cases

UC-1: AI-Assisted Model Review

  1. Engineer opens Claude Desktop with MCP server configured
  2. Asks: “List all requirement definitions in the vehicle model”
  3. MCP server fetches model via repo_get_file, parses via sysml_parse
  4. Claude presents findings with sysml_list_definitions and suggests improvements
  5. Engineer requests changes via iterative tool calls

UC-2: Model Validation in CI/CD

  1. Developer commits SysML v2 model changes
  2. CI pipeline calls sysml_validate via stdio
  3. Validation results reported in merge request
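A minimal GitLab CI job for UC-2 might look like the following sketch. The image name, binary name, and validation CLI are assumptions for illustration; only the overall flow (validate changed `.sysml` files in a merge request pipeline) comes from the use case above.

```yaml
# Illustrative sketch only: the image, binary path, and "validate"
# subcommand are assumptions, not the project's published interface.
validate-sysml:
  image: registry.example.com/open-mcp-sysml:latest
  script:
    # Validate each model file changed by this merge request
    - |
      for f in $(git diff --name-only "$CI_MERGE_REQUEST_DIFF_BASE_SHA" -- '*.sysml'); do
        open-mcp-sysml validate "$f"
      done
  rules:
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
```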

UC-3: Exploratory Model Query

  1. New team member asks natural language questions about model structure
  2. MCP server uses sysml_query to search elements
  3. AI explains model architecture and relationships

10.3 Novel Contributions

This project addresses gaps in the open-source MBSE ecosystem. While proprietary solutions exist, the community lacks accessible building blocks for AI-augmented systems engineering.

| Contribution | Gap Addressed | Community Benefit |
|---|---|---|
| tree-sitter-sysml | No open-source SysML v2 grammar in tree-sitter ecosystem | Syntax highlighting, IDE support, MCP integration |
| kebnf-to-tree-sitter | Manual grammar maintenance as OMG spec evolves | Automated updates, formal traceability |
| open-mcp-sysml | No MBSE MCP server in 7,364+ public repos | AI-augmented MBSE workflows |

10.3.1 tree-sitter-sysml

The first open-source SysML v2 grammar for tree-sitter, enabling:

  • Syntax highlighting in any tree-sitter-compatible editor
  • Incremental parsing for responsive IDE experiences
  • WASM compilation for browser-based tooling
  • Contribution path to tree-sitter GitHub organization
  • GitLab contribution path via vendor/grammars/

Status: Production ready. 125 corpus tests passing, 99.6% coverage across 275 external files (OMG Training, OMG Examples, GfSE, Advent of SysML v2). Context-sensitive definition bodies implemented. Technical debt documented.

10.3.2 kebnf-to-tree-sitter

Automated grammar generation from OMG KEBNF specifications:

  • One-shot conversion preserves spec traceability
  • Semantic mapping document captures conversion decisions
  • Enables automated updates when OMG publishes new specs
  • Research contribution on grammar transposition for MBSE languages

Status: Parser and emitter complete (640/640 KEBNF rules). The generated grammar has 335+ conflicts requiring resolution. An iterative conflict-resolution strategy was selected: conflicts are fixed one at a time, with the rationale documented for the INCOSE paper.

10.3.3 open-mcp-sysml

MCP server enabling AI agents to work with SysML v2 models:

  • Fine-grained query tools for context-efficient access
  • Provider-agnostic repository access (GitLab as reference implementation)
  • Tree-sitter integration for reliable parsing
  • Stateless tool design supporting multi-agent workflows

Status: Phase 1 complete (Feb 13, 2026). Tree-sitter integration is working with 5 MCP tools: sysml_parse (L0/L1/L2 detail levels), sysml_validate, sysml_list_definitions, repo_list_files, and repo_get_file. 22 tests passing. A Phase 2 PRD covering 7 token-reduction strategies is ready, and the server is ready for benchmark vignette testing.

10.3.4 sysml-grammar-benchmark (NEW)

Automated grammar benchmark dashboard comparing SysML v2 parser implementations:

  • Objective comparison across standardized test corpora
  • GitLab CI automation with scheduled runs
  • Quarto + D3 static dashboard on GitLab Pages
  • Community contribution path for new parsers and corpora

Status: Repository scaffolded with PRD. Dashboard and CI pipeline pending implementation.

10.4 Architecture Alternatives

Per [2, Sec. 2.3.5.4], candidate architectures were evaluated before selection.

| Alternative | Evaluation |
|---|---|
| Python + FastMCP | Rejected: additional runtime dependency |
| TypeScript + official SDK | Rejected: heavier deployment footprint |
| Go + go-sdk | Initially selected, then superseded |
| Rust + rmcp + tree-sitter | Selected: GKG alignment, community grammar |

Parser Architecture Pivot: Replaced planned hand-rolled parser with tree-sitter grammar. Key factors:

| Factor | Hand-rolled | Tree-sitter |
|---|---|---|
| Error recovery | Must implement | Built-in |
| Incremental parsing | Must implement | Built-in |
| Multi-platform | Rust only | C, WASM, Rust, Go, Python, Swift |
| Community reuse | Project-specific | tree-sitter org contribution |
| GitLab integration | N/A | vendor/grammars/ path |

10.5 Technology Stack

| Component | Technology | Rationale |
|---|---|---|
| Language | Rust 1.85+ | GKG alignment, memory safety, single static binary |
| MCP SDK | rmcp (official) | Official Rust SDK, tokio async runtime |
| Git Client | Provider-agnostic trait | Future GKG integration; GitLab as first implementation |
| Parser | tree-sitter-sysml | Community grammar, multi-platform |
| Transport | stdio (HTTP planned) | stdio for local dev; HTTP transport planned for Phase 2 |
| Container | Buildah/Podman | OCI-compliant, rootless, CI-friendly |
| Documentation | Quarto | Markdown-native, GitLab Pages compatible |

10.6 Repository Structure

gitlab.com/dunn.dev/open-mcp-sysml/    # GitLab Group (monorepo root)
+-- capstone/                           # SE Documentation (Quarto Book)
+-- open-mcp-sysml/                     # Rust MCP Server
|   +-- crates/
|       +-- sysml-parser/              # tree-sitter wrapper (parse, validate, list)
|       +-- repo-client/                # Provider-agnostic Git interface
|       +-- mcp-server/                 # MCP server binary
+-- tree-sitter-sysml/                  # Brute-Force Grammar (99.6% coverage)
+-- kebnf-to-tree-sitter/               # Spec-Driven Grammar Converter
+-- sysml-grammar-benchmark/            # Grammar Comparison Dashboard (NEW)

Key Design Decisions:

  • Dual-path grammar strategy: Brute-force for immediate value, spec-driven for research
  • open-mcp-sysml consumes grammar via Rust bindings
  • Dual CI: GitHub Actions for tree-sitter ecosystem, GitLab CI for coverage tracking

10.7 Component Architecture

+-------------------------------------------------------+
|               MCP Client (Claude, etc.)               |
+-------------------------------------------------------+
                            |
                            v
+-------------------------------------------------------+
|                   Transport Layer                     |
|                   (stdio / HTTP)                      |
+-------------------------------------------------------+
                            |
                            v
+-------------------------------------------------------+
|                 mcp-server crate                      |
|  +-----------+   +-----------+   +-----------+       |
|  |   Tools   |   | Resources |   |  Prompts  |       |
|  +-----------+   +-----------+   +-----------+       |
+-------------------------------------------------------+
                            |
        +-------------------+-------------------+
        v                   v                   v
+-------------+    +---------------+      +-------------+
| repo-client |    | tree-sitter   |      |  SysML v2   |
|   crate     |    | Rust bindings |      | API Client  |
+-------------+    +---------------+      +-------------+
        |                   |                   |
        v                   v                   v
+-------------+    +---------------+      +-------------+
|Git Provider |    |tree-sitter-   |      | SysML v2    |
| API         |    |sysml grammar  |      | API Server  |
+-------------+    +---------------+      +-------------+

Component Responsibilities:

| Component | Responsibility |
|---|---|
| mcp-server | MCP protocol handling, tool dispatch |
| sysml-parser | Rust API wrapping tree-sitter for parse, validate, list_definitions |
| repo-client | Provider-agnostic Git operations (GitLab reference implementation) |
| tree-sitter-sysml | Grammar definition, generated parser |

10.8 Context-Aware Tool Design

Per literature recommendations, tools are organized by context cost. The implemented tools (Phase 1) prioritize low-context operations; future phases will add higher-context analysis and generation tools.

10.8.1 Implemented Tools (Phase 1)

sysml_parse:
  description: Parse SysML v2 text and extract elements
  input:
    source: string
    detail_level: "L0" | "L1" | "L2"  # L0: names only, L1: +types, L2: full AST
  output: Element[]
  context_cost: L0 ~100 tokens, L1 ~300 tokens, L2 ~2000 tokens

sysml_validate:
  description: Validate SysML v2 syntax via tree-sitter parse diagnostics
  input:
    source: string
  output: ValidationResult (errors, warnings)
  context_cost: ~300 tokens

sysml_list_definitions:
  description: List all definitions in SysML v2 text
  input:
    source: string
  output: Definition[] (name, type)
  context_cost: ~200 tokens

repo_list_files:
  description: List files in Git repository
  input:
    project: string
    path: string?
    ref: string?
    sysml_only: boolean?
  output: FileEntry[]
  context_cost: ~300 tokens

repo_get_file:
  description: Read file content from Git repository
  input:
    project: string
    path: string
    ref: string?
  output: FileContent
  context_cost: varies by file size

10.8.2 Planned Tools (Phase 2+)

sysml_query:
  description: Query elements by type/properties
  status: Planned

repo_commit:
  description: Commit file changes to repository
  status: Planned

Design Principles:

  • Stateless calls: No session pollution between tool invocations
  • Server-side pruning: Return minimal sufficient context
  • JSON intermediate representation: Enable staged decomposition
  • Syntax validation: Catch errors early on all generated content

10.9 Implementation Patterns for Progressive Context Budgeting

The tool design principles in Section 10.8 establish what to build; this section addresses how to implement these patterns using proven techniques from the rapidly evolving ecosystem of MCP servers and AI agents. These patterns emerge primarily from practitioner implementations rather than academic literature—many creators are building rapidly without publishing formal papers.

OpenAI’s “Harness Engineering” report [6] provides production-scale validation of these patterns: a team built a million-line product with zero manually-written code by treating repository-local knowledge as the system of record, enforcing architectural invariants mechanically, and using progressive disclosure (a short AGENTS.md as map, not encyclopedia). Their experience confirms that context management patterns are not academic abstractions but production necessities.

This survey reflects the ecosystem as of February 2026. Given the pace of change, we recommend revisiting these patterns monthly.

10.9.1 MCP Server Patterns

Three implementation patterns have emerged for managing context budgets within MCP tool servers, each offering 90%+ token reduction relative to a naive implementation.

10.9.1.1 Two Meta-Tool Architecture

Instead of exposing all tools directly to the agent (consuming 15,000+ tokens), expose only two meta-tools [7]:

┌─────────────────────────────────────────────────────┐
│  Client Context Window                               │
├─────────────────────────────────────────────────────┤
│  Tools available: 2 (~800 tokens)                   │
│  ┌─────────────────────────────────────────────────┐│
│  │ get_tools_in_category(category_path)            ││
│  │ → Returns tool names + descriptions in category ││
│  └─────────────────────────────────────────────────┘│
│  ┌─────────────────────────────────────────────────┐│
│  │ execute_tool(tool_path, parameters)             ││
│  │ → Executes any tool by hierarchical path        ││
│  └─────────────────────────────────────────────────┘│
└─────────────────────────────────────────────────────┘
         │
         ▼ Agent navigates tool tree on demand
┌─────────────────────────────────────────────────────┐
│  Server Tool Registry (~15,000 tokens if exposed)   │
│  └─ sysml/                                          │
│     ├─ query/     (get_element, search, ...)        │
│     ├─ analysis/  (extract_subgraph, trace, ...)    │
│     └─ generate/  (suggest_completion, ...)         │
└─────────────────────────────────────────────────────┘

Token savings: ~95% (15,000 → 800 tokens)
Trade-off: Adds one round-trip for tool discovery per category
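The two meta-tools can be sketched over a server-side registry keyed by category path; the categories and tool names below are illustrative, not the server's actual registry.

```rust
use std::collections::BTreeMap;

/// Illustrative sketch of the two meta-tool pattern: the full tool
/// registry lives server-side; the client only ever sees
/// get_tools_in_category and execute_tool.
struct ToolRegistry {
    // category path -> (tool name, one-line description)
    tools: BTreeMap<&'static str, Vec<(&'static str, &'static str)>>,
}

impl ToolRegistry {
    fn demo() -> Self {
        let mut tools = BTreeMap::new();
        tools.insert("sysml/query", vec![
            ("get_element", "Fetch one element by qualified name"),
            ("search", "Search elements by keyword"),
        ]);
        tools.insert("sysml/analysis", vec![
            ("extract_subgraph", "Reachability-pruned subgraph"),
        ]);
        Self { tools }
    }

    /// Meta-tool 1: list tool names + descriptions in one category.
    fn get_tools_in_category(&self, category: &str) -> Vec<String> {
        self.tools
            .get(category)
            .map(|ts| ts.iter().map(|(n, d)| format!("{n}: {d}")).collect())
            .unwrap_or_default()
    }

    /// Meta-tool 2: execute any tool by hierarchical path.
    fn execute_tool(&self, path: &str, _params: &str) -> Result<String, String> {
        let (category, name) = path.rsplit_once('/').ok_or("bad path")?;
        let known = self
            .tools
            .get(category)
            .map_or(false, |ts| ts.iter().any(|(n, _)| *n == name));
        if known {
            Ok(format!("dispatched {name}")) // a real server would run the tool here
        } else {
            Err(format!("unknown tool: {path}"))
        }
    }
}
```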

10.9.1.2 Cache ID + Summary Pattern

Return concise summaries with cache identifiers; provide details on demand [8]:

Phase 1: Query returns summary + cache ID
─────────────────────────────────────────
sysml_list_elements(package="Vehicle")
→ { 
    cacheId: "elem-abc123",
    summary: { totalElements: 47, parts: 12, ports: 8, ... },
    quickAccess: [ "Vehicle::PowerSubsystem", "Vehicle::Chassis", ... ]
  }
  (~500 tokens)

Phase 2: Details fetched only when needed
─────────────────────────────────────────
sysml_get_details(cacheId: "elem-abc123", filter: "parts-only")
→ { elements: [ ... full element definitions ... ] }
  (~2000 tokens, only when required)

Token savings: ~97% for list operations (18,700 → 540 tokens baseline)

Complementary patterns:

  • RTFM documentation: Tool descriptions minimal (<10 words); agents call rtfm(toolName) for full documentation on demand
  • defer_loading flag: Platform-native lazy discovery (Claude-specific)
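The two-phase flow above can be sketched as a small server-side cache; the identifier format and field names are hypothetical.

```rust
use std::collections::HashMap;

/// Illustrative sketch of the cache-ID + summary pattern: the server
/// stores the full result, hands the client a small summary plus an
/// opaque id, and serves details only on demand.
#[derive(Default)]
struct ResultCache {
    next_id: u32,
    full: HashMap<String, Vec<String>>, // cacheId -> full element list
}

impl ResultCache {
    /// Phase 1: store the full result, return (cacheId, summary).
    fn list_elements(&mut self, elements: Vec<String>) -> (String, String) {
        self.next_id += 1;
        let id = format!("elem-{:06}", self.next_id);
        let summary = format!("{} elements (details via cacheId)", elements.len());
        self.full.insert(id.clone(), elements);
        (id, summary)
    }

    /// Phase 2: fetch details only when the agent actually needs them.
    fn get_details(&self, cache_id: &str) -> Option<&[String]> {
        self.full.get(cache_id).map(Vec::as_slice)
    }
}
```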

10.9.1.3 L0/L1/L2 Tiered Loading

Filesystem-inspired progressive context loading [9]:

| Tier | Content | Tokens | Use Case |
|---|---|---|---|
| L0 (Abstract) | Names + one-line descriptions | ~100 | Quick relevance check |
| L1 (Overview) | Structure, relationships, key properties | ~2,000 | Understand organization |
| L2 (Details) | Full element definitions, constraints | Full | Deep analysis/editing |

Application to SysML: L0 returns qualified names; L1 returns element skeletons with relationship counts; L2 returns full textual notation with documentation.
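One way to realize the tiers is a single render function keyed by detail level; each tier strictly adds detail, so the agent can stop at the cheapest level. The rendered strings below are illustrative, not the server's actual output format.

```rust
/// Illustrative sketch of L0/L1/L2 tiered responses for one element.
#[derive(Clone, Copy, PartialEq)]
enum DetailLevel { L0, L1, L2 }

struct Element {
    qualified_name: &'static str,
    kind: &'static str,
    relationship_count: usize,
    body: &'static str, // full textual notation (L2 only)
}

fn render(e: &Element, level: DetailLevel) -> String {
    match level {
        // L0: qualified name only (~100 tokens for a listing)
        DetailLevel::L0 => e.qualified_name.to_string(),
        // L1: skeleton with relationship counts
        DetailLevel::L1 => format!(
            "{} [{}] ({} relationships)",
            e.qualified_name, e.kind, e.relationship_count
        ),
        // L2: full textual notation
        DetailLevel::L2 => format!(
            "{} [{}] ({} relationships)\n{}",
            e.qualified_name, e.kind, e.relationship_count, e.body
        ),
    }
}
```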

10.9.2 Complementary Techniques

Beyond MCP-specific patterns, several techniques optimize context at other layers.

10.9.2.1 Prompt Compression

LLMLingua [10] uses small language models (GPT-2, LLaMA-7B) to identify and remove non-essential tokens before the main LLM call.

Performance: Up to 20x compression with minimal quality loss
RAG improvement: +21.4% accuracy using only 1/4 tokens
Application: Client-side preprocessing of large retrieved documents

10.9.2.2 KV-Cache-First Design

Manus’s production experience [11] identifies KV-cache hit rate as “the single most important metric for a production AI agent”:

  • Append-only context: Never modify previous actions/observations
  • Stable prefixes: Avoid timestamps or dynamic content at prompt start
  • Tool masking: Constrain action space via logit masking, not tool removal

Cost impact: 10x difference between cached ($0.30/MTok) and uncached ($3.00/MTok)
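The append-only and stable-prefix rules can be enforced structurally: if the context type only ever appends, each rendered prompt is a byte-exact extension of the previous one, which is what keeps the KV cache warm. A minimal sketch, with hypothetical type and method names:

```rust
/// Illustrative sketch of KV-cache-first context assembly.
struct AgentContext {
    prefix: String,       // stable: no timestamps, no dynamic content
    history: Vec<String>, // append-only actions/observations
}

impl AgentContext {
    fn new(system_prompt: &str) -> Self {
        Self { prefix: system_prompt.to_string(), history: Vec::new() }
    }

    fn append(&mut self, event: &str) {
        self.history.push(event.to_string()); // never rewrite past turns
    }

    fn render(&self) -> String {
        let mut out = self.prefix.clone();
        for event in &self.history {
            out.push('\n');
            out.push_str(event);
        }
        out
    }
}
```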

10.9.2.3 Frequent Intentional Compaction

HumanLayer’s workflow [12] structures development around context management:

  1. Research: Subagent explores codebase, returns compressed findings
  2. Plan: Outline steps with verification criteria (human reviews here)
  3. Implement: Execute plan phase-by-phase, compacting state between phases

Key insight: “A bad line of research could lead to thousands of bad lines of code”—focus human review on research and plans, not just generated code.

Context utilization: Maintain 40-60% window utilization for optimal reasoning.

10.9.3 Performance Metrics Framework

The following metrics enable comparison across implementations:

| Category | Metric | Baseline Examples | Source |
|---|---|---|---|
| Token efficiency | Compression ratio | 95-97% (MCP patterns) | [7], [8] |
| Cost impact | Cached vs uncached tokens | 10x difference | [11] |
| Latency | Tool response time | 120ms (semantic) vs 2000ms (visual) | [8] |
| Quality | Task completion rate | 24%→51% with optimization | DSPy MIPROv2 |
| Context utilization | % window used effectively | 40-60% sweet spot | [12] |
| Multi-agent gain | Single vs multi-agent success | 90.2% improvement | [13] |

10.9.4 Implementation Options for open-mcp-sysml

Based on the patterns surveyed, the following implementation options are available, each addressing different aspects of context efficiency:

Cache ID + Summary

All list/query tools return summaries with cache IDs. Add sysml_get_details(cacheId, detailType) for on-demand expansion. Expected savings: ~90% on common workflows.

L0/L1/L2 Tiered Responses

Add detail_level parameter to existing tools. Default to L1; agents request L2 only when editing.

RTFM Documentation Pattern

Minimize tool descriptions to <10 words. Add sysml_rtfm(toolName) for full documentation.

Two Meta-Tool Wrapper

Useful if tool count exceeds ~20. Consider as enhancement once tool surface stabilizes.

KV-Cache Optimization

Design response formats for append-only workflows. Ensure deterministic JSON serialization.

10.9.5 Reference Implementations

| Repository | Stars | Pattern | Language | URL |
|---|---|---|---|---|
| mcp-proxy | 3 | Two meta-tool | Go | github.com/IAMSamuelRodda/mcp-proxy |
| xc-mcp | 59 | Cache ID + RTFM + defer_loading | TypeScript | github.com/conorluddy/xc-mcp |
| OpenViking | 1,100 | L0/L1/L2 tiered loading | Python | github.com/volcengine/OpenViking |
| LLMLingua | 5,800 | Prompt compression | Python | github.com/microsoft/LLMLingua |
| 12-factor-agents | 18,200 | Workflow patterns | Docs | github.com/humanlayer/12-factor-agents |
| Beads | 16,000 | Git-backed agent memory | Go | github.com/steveyegge/beads |
| OpenAI Harness | n/a | Progressive disclosure + golden principles | Internal | [6] |

10.10 Interface Definitions

10.10.1 MCP Protocol Interface

The server implements MCP 2024-11-05:

  • initialize - Protocol handshake
  • tools/list - Enumerate available tools
  • tools/call - Execute a tool
  • resources/list - Enumerate resources
  • resources/read - Read a resource
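MCP messages are JSON-RPC 2.0; an illustrative tools/call request against one of the Phase 1 tools might look like the following (the argument shape follows the tool schemas in Section 10.8.1; the id value is arbitrary):

```json
{
  "jsonrpc": "2.0",
  "id": 7,
  "method": "tools/call",
  "params": {
    "name": "sysml_list_definitions",
    "arguments": { "source": "part def Vehicle;" }
  }
}
```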

10.10.2 Repository Interface Schema

{
  "RepoClient": {
    "methods": {
      "read_file": {
        "params": ["project", "path", "ref"],
        "returns": "bytes"
      },
      "list_files": {
        "params": ["project", "path", "ref"],
        "returns": "FileEntry[]"
      }
    }
  }
}
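The schema above maps naturally onto a Rust trait, which is how the provider-agnostic design would typically be expressed. The trait and the in-memory stand-in below are an illustrative sketch, not the crate's actual API.

```rust
use std::collections::BTreeMap;

// Sketch of the RepoClient schema as a provider-agnostic trait,
// with a hypothetical in-memory provider standing in for GitLab.
struct FileEntry { path: String }

trait RepoClient {
    fn read_file(&self, project: &str, path: &str, r#ref: &str) -> Option<Vec<u8>>;
    fn list_files(&self, project: &str, path: &str, r#ref: &str) -> Vec<FileEntry>;
}

/// In-memory provider: (project, ref) -> { path -> bytes }
struct InMemoryRepo {
    files: BTreeMap<(String, String), BTreeMap<String, Vec<u8>>>,
}

impl RepoClient for InMemoryRepo {
    fn read_file(&self, project: &str, path: &str, r#ref: &str) -> Option<Vec<u8>> {
        self.files
            .get(&(project.to_string(), r#ref.to_string()))?
            .get(path)
            .cloned()
    }

    fn list_files(&self, project: &str, path: &str, r#ref: &str) -> Vec<FileEntry> {
        self.files
            .get(&(project.to_string(), r#ref.to_string()))
            .map(|tree| {
                tree.keys()
                    .filter(|p| p.starts_with(path))
                    .map(|p| FileEntry { path: p.clone() })
                    .collect()
            })
            .unwrap_or_default()
    }
}
```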

10.10.3 Parser Interface Schema

{
  "sysml_parse": {
    "input": {
      "content": "string"
    },
    "output": {
      "success": "boolean",
      "tree": "ParseTree?",
      "errors": "ParseError[]"
    }
  }
}

10.11 Dual-Path Grammar Strategy

Both grammar development paths are actively maintained:

| Path | Repository | Purpose | Status |
|---|---|---|---|
| Brute Force | tree-sitter-sysml | Practical parsing, MCP server | Tier 1 complete |
| Spec-Driven | kebnf-to-tree-sitter | Formal compliance, INCOSE paper | Tool complete, grammar in progress |

Cross-validation: Comparing outputs identifies spec interpretation errors in brute-force and practical parsing issues in spec-driven output.

Why both matter:

  • Brute-force provides immediate practical value
  • Spec-driven enables automated updates and formal traceability
  • Comparison validates both approaches

10.12 Requirements Allocation

Per [2, Sec. 2.3.5.4], requirements are allocated to architecture elements.

| Requirement | Architecture Element | Component |
|---|---|---|
| FR-MCP-001, FR-MCP-002 | MCP Server | mcp-server crate |
| FR-MCP-003 | HTTP Transport | mcp-server crate |
| FR-REPO-001, FR-REPO-002 | Repo Client | repo-client crate |
| FR-SYS-001, FR-SYS-002 | SysML Parser | tree-sitter-sysml |
| NFR-DEP-001 | Build Configuration | Cargo.toml |
| NFR-DEP-002 | Container Image | Containerfile |

10.13 Deployment Architecture

10.13.1 Deployment Modes

| Mode | Transport | Use Case | Configuration |
|---|---|---|---|
| Local Development | stdio | Claude Desktop, VS Code | --transport stdio |
| CI/CD Integration | HTTP | GitLab CI services | --transport http --port 8080 |
| Container | HTTP | Production deployment | Docker/Podman with port mapping |
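For the local development mode, a Claude Desktop configuration entry might look like the following. The binary path and environment variable name are assumptions; the mcpServers shape is the standard Claude Desktop MCP configuration format.

```json
{
  "mcpServers": {
    "open-mcp-sysml": {
      "command": "/usr/local/bin/open-mcp-sysml",
      "args": ["--transport", "stdio"],
      "env": { "GITLAB_TOKEN": "<personal-access-token>" }
    }
  }
}
```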

10.13.2 Development Constraints

Constraint: No local container builds on macOS (no podman machine).

Mitigation:

  • Local development uses cargo build and cargo test directly
  • MCP protocol testing via stdio (no containers required)
  • Container builds run exclusively in GitLab CI

10.14 CI/CD Pipeline

GitLab Ultimate features leveraged:

| Feature | Purpose |
|---|---|
| SAST | Static Application Security Testing for Rust |
| Dependency Scanning | Scan Cargo.lock for vulnerabilities |
| Secret Detection | Prevent accidental credential commits |
| License Compliance | Track crate licenses |
| Code Quality | Clippy integration for Rust linting |

10.14.1 tree-sitter-sysml CI/CD

Dual CI for ecosystem compatibility:

| Platform | Purpose |
|---|---|
| GitHub Actions | Tree-sitter org compatibility |
| GitLab CI | Coverage tracking, security scanning |