10  Architecture Design Description

The architecture of this system is driven by a single constraint: context windows are scarce. Modern LLMs process 100K+ tokens, but effective reasoning requires focused context. Per [1], “Context Engineering” is the discipline of optimizing LLM inputs for quality outputs.

10.1 Context Management as Architectural Driver

Per [2, Sec. 2.3.5.4], architecture decisions should trace to stakeholder concerns. The primary concern driving this architecture is context efficiency: enabling AI agents to reason effectively about SysML v2 models within constrained token budgets.

10.1.1 The Problem

A typical systems engineering project context consumes 40K+ tokens for project awareness (requirements, architecture decisions, constraints, current task state). This leaves limited tokens for model reasoning. When an AI agent must also understand a SysML v2 model, naive approaches (loading entire files) quickly exhaust available context.

10.1.2 Four Context Management Strategies

Literature analysis (Section 3.1) identified four proven strategies:

| Strategy | Description | Example Pattern |
|---|---|---|
| Avoidance | Minimize context needs through tiny, focused prompts | Sentence-level LLM calls [3] |
| Staged Decomposition | Pipeline with intermediate representations | JSON between stages [4] |
| Progressive Narrowing | Similarity + reachability-based pruning | Subgraph extraction [5] |
| Multi-Agent Partitioning | Divide context across specialized agents | Pipeline agents |

10.1.3 Architectural Response

This system enables all four strategies:

  1. Fine-grained query tools support avoidance (get exactly what you need)
  2. JSON intermediate representation enables staged decomposition
  3. Subgraph extraction implements progressive narrowing
  4. Tool design supports multi-agent workflows via stateless calls
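Progressive narrowing (strategy 3) can be sketched in a few lines: given the model's relationship graph, a breadth-first walk bounded by a hop budget returns only the elements near a seed element. This is an illustrative sketch, not the server's implementation; the element names and graph shape are hypothetical.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Illustrative sketch of progressive narrowing by reachability:
/// keep only elements within `max_hops` of the seed element, so the
/// agent receives a focused subgraph instead of the whole model.
fn extract_subgraph(
    edges: &HashMap<&'static str, Vec<&'static str>>,
    seed: &'static str,
    max_hops: usize,
) -> HashSet<&'static str> {
    let mut visited = HashSet::from([seed]);
    let mut frontier = VecDeque::from([(seed, 0)]);
    while let Some((node, depth)) = frontier.pop_front() {
        if depth == max_hops {
            continue; // prune: beyond the hop budget
        }
        for &next in edges.get(node).into_iter().flatten() {
            if visited.insert(next) {
                frontier.push_back((next, depth + 1));
            }
        }
    }
    visited
}
```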

10.2 System Context Diagram

Per [2, Sec. 2.3.5.4], the context diagram defines the system boundary and external interfaces.

                     +-----------------------------------------+
                     |           External Systems              |
                     +-----------------------------------------+
                                      |
       +------------------------------+------------------------------+
       |                              |                              |
       v                              v                              v
+-------------+              +-------------+              +-------------+
| MCP Client  |              |Git Provider |              |  SysML v2   |
| (Claude,    |              |  (GitLab,   |              | API Server  |
|  VS Code)   |              | GitHub,etc) |              |             |
+------+------+              +-------------+              +-------------+
       |                              ^                          ^
       | MCP Protocol                 | REST API                 | REST API
       | (stdio/HTTP)                 | (HTTPS)                  | (HTTPS)
       v                              |                          |
+----------------------------------------------------------------------+
|                           open-mcp-sysml                             |
|  +----------------------------------------------------------------+  |
|  |                        System Boundary                         |  |
|  |                                                                |  |
|  |  Tools: sysml_parse, sysml_validate, sysml_list_definitions,   |  |
|  |         repo_list_files, repo_get_file                         |  |
|  |  Resources: sysml://examples/*, repo://{project}/{path}        |  |
|  |                                                                |  |
|  +----------------------------------------------------------------+  |
+----------------------------------------------------------------------+

External Interfaces:

| Interface | Protocol | Direction | Description |
|---|---|---|---|
| MCP Client | MCP 2024-11-05 (stdio/HTTP) | Bidirectional | AI tool integration |
| Git Provider API | REST (HTTPS) | Outbound | Repository file access (provider-agnostic) |
| SysML v2 API | REST (HTTPS) | Outbound | Model query and validation (planned) |

10.2.1 Operational Modes

| Mode | Transport | Use Case | Authentication |
|---|---|---|---|
| Local Development | stdio | Individual engineer with Claude/VS Code | Git provider PAT |
| Team Server | HTTP (planned) | Shared server for team access | PAT per request |
| CI/CD Pipeline | HTTP (planned) | Automated validation | CI job token |

10.2.2 Use Cases

UC-1: AI-Assisted Model Review

  1. Engineer opens Claude Desktop with MCP server configured
  2. Asks: “List all requirement definitions in the vehicle model”
  3. MCP server fetches model via repo_get_file, parses via sysml_parse
  4. Claude presents findings with sysml_list_definitions and suggests improvements
  5. Engineer requests changes via iterative tool calls

UC-2: Model Validation in CI/CD

  1. Developer commits SysML v2 model changes
  2. CI pipeline calls sysml_validate via stdio
  3. Validation results reported in merge request
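A minimal GitLab CI job for UC-2 might look like the following sketch. The image name, binary name, and validation CLI are assumptions for illustration; only the overall flow (validate changed `.sysml` files in a merge request pipeline) comes from the use case above.

```yaml
# Illustrative sketch only: the image, binary path, and "validate"
# subcommand are assumptions, not the project's published interface.
validate-sysml:
  image: registry.example.com/open-mcp-sysml:latest
  script:
    # Validate each model file changed by this merge request
    - |
      for f in $(git diff --name-only "$CI_MERGE_REQUEST_DIFF_BASE_SHA" -- '*.sysml'); do
        open-mcp-sysml validate "$f"
      done
  rules:
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
```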

UC-3: Exploratory Model Query

  1. New team member asks natural language questions about model structure
  2. MCP server uses sysml_query to search elements
  3. AI explains model architecture and relationships

10.3 Novel Contributions

This project addresses gaps in the open-source MBSE ecosystem. While proprietary solutions exist, the community lacks accessible building blocks for AI-augmented systems engineering.

| Contribution | Gap Addressed | Community Benefit |
|---|---|---|
| tree-sitter-sysml | No open-source SysML v2 grammar in tree-sitter ecosystem | Syntax highlighting, IDE support, MCP integration |
| kebnf-to-tree-sitter | Manual grammar maintenance as OMG spec evolves | Automated updates, formal traceability |
| open-mcp-sysml | No MBSE MCP server in 7,364+ public repos | AI-augmented MBSE workflows |

10.3.1 tree-sitter-sysml

The first open-source SysML v2 grammar for tree-sitter, enabling:

  • Syntax highlighting in any tree-sitter-compatible editor
  • Incremental parsing for responsive IDE experiences
  • WASM compilation for browser-based tooling
  • Contribution path to tree-sitter GitHub organization
  • GitLab contribution path via vendor/grammars/

Status: Production ready. 125 corpus tests passing, 99.6% coverage across 275 external files (OMG Training, OMG Examples, GfSE, Advent of SysML v2). Context-sensitive definition bodies implemented. Technical debt documented.

10.3.2 kebnf-to-tree-sitter

Automated grammar generation from OMG KEBNF specifications:

  • One-shot conversion preserves spec traceability
  • Semantic mapping document captures conversion decisions
  • Enables automated updates when OMG publishes new specs
  • Research contribution on grammar transposition for MBSE languages

Status: Parser and emitter complete (640/640 KEBNF rules). The generated grammar has 335+ conflicts requiring resolution. An iterative conflict-resolution strategy was selected: conflicts are fixed one at a time, with the rationale documented for the INCOSE paper.

10.3.3 open-mcp-sysml

MCP server enabling AI agents to work with SysML v2 models:

  • Fine-grained query tools for context-efficient access
  • Provider-agnostic repository access (GitLab as reference implementation)
  • Tree-sitter integration for reliable parsing
  • Stateless tool design supporting multi-agent workflows

Status: Phase 1 complete (Feb 13, 2026). Tree-sitter integration is working with 5 MCP tools: sysml_parse (L0/L1/L2 detail levels), sysml_validate, sysml_list_definitions, repo_list_files, and repo_get_file. 22 tests passing. A Phase 2 PRD covering 7 token-reduction strategies is ready, and the server is ready for benchmark vignette testing.

10.3.4 sysml-grammar-benchmark (NEW)

Automated grammar benchmark dashboard comparing SysML v2 parser implementations:

  • Objective comparison across standardized test corpora
  • GitLab CI automation with scheduled runs
  • Quarto + D3 static dashboard on GitLab Pages
  • Community contribution path for new parsers and corpora

Status: Repository scaffolded with PRD. Dashboard and CI pipeline pending implementation.

10.4 Architecture Alternatives

Per [2, Sec. 2.3.5.4], candidate architectures were evaluated before selection.

| Alternative | Evaluation |
|---|---|
| Python + FastMCP | Rejected: additional runtime dependency |
| TypeScript + official SDK | Rejected: heavier deployment footprint |
| Go + go-sdk | Initially selected, then superseded |
| Rust + rmcp + tree-sitter | Selected: GKG alignment, community grammar |

Parser Architecture Pivot: Replaced planned hand-rolled parser with tree-sitter grammar. Key factors:

| Factor | Hand-rolled | Tree-sitter |
|---|---|---|
| Error recovery | Must implement | Built-in |
| Incremental parsing | Must implement | Built-in |
| Multi-platform | Rust only | C, WASM, Rust, Go, Python, Swift |
| Community reuse | Project-specific | tree-sitter org contribution |
| GitLab integration | N/A | vendor/grammars/ path |

10.5 Technology Stack

| Component | Technology | Rationale |
|---|---|---|
| Language | Rust 1.85+ | GKG alignment, memory safety, single static binary |
| MCP SDK | rmcp (official) | Official Rust SDK, tokio async runtime |
| Git Client | Provider-agnostic trait | Future GKG integration; GitLab as first implementation |
| Parser | tree-sitter-sysml | Community grammar, multi-platform |
| Transport | stdio (HTTP planned) | stdio for local dev; HTTP transport planned for Phase 2 |
| Container | Buildah/Podman | OCI-compliant, rootless, CI-friendly |
| Documentation | Quarto | Markdown-native, GitLab Pages compatible |

10.6 Repository Structure

gitlab.com/dunn.dev/open-mcp-sysml/    # GitLab Group (monorepo root)
+-- capstone/                           # SE Documentation (Quarto Book)
+-- open-mcp-sysml/                     # Rust MCP Server
|   +-- crates/
|       +-- sysml-parser/              # tree-sitter wrapper (parse, validate, list)
|       +-- repo-client/                # Provider-agnostic Git interface
|       +-- mcp-server/                 # MCP server binary
+-- tree-sitter-sysml/                  # Brute-Force Grammar (99.6% coverage)
+-- kebnf-to-tree-sitter/               # Spec-Driven Grammar Converter
+-- sysml-grammar-benchmark/            # Grammar Comparison Dashboard (NEW)

Key Design Decisions:

  • Dual-path grammar strategy: Brute-force for immediate value, spec-driven for research
  • open-mcp-sysml consumes grammar via Rust bindings
  • Dual CI: GitHub Actions for tree-sitter ecosystem, GitLab CI for coverage tracking

10.7 Component Architecture

+-------------------------------------------------------+
|               MCP Client (Claude, etc.)               |
+-------------------------------------------------------+
                            |
                            v
+-------------------------------------------------------+
|                   Transport Layer                     |
|                   (stdio / HTTP)                      |
+-------------------------------------------------------+
                            |
                            v
+-------------------------------------------------------+
|                 mcp-server crate                      |
|  +-----------+   +-----------+   +-----------+       |
|  |   Tools   |   | Resources |   |  Prompts  |       |
|  +-----------+   +-----------+   +-----------+       |
+-------------------------------------------------------+
                            |
        +-------------------+-------------------+
        v                   v                   v
+-------------+    +---------------+      +-------------+
| repo-client |    | tree-sitter   |      |  SysML v2   |
|   crate     |    | Rust bindings |      | API Client  |
+-------------+    +---------------+      +-------------+
        |                   |                   |
        v                   v                   v
+-------------+    +---------------+      +-------------+
|Git Provider |    |tree-sitter-   |      | SysML v2    |
| API         |    |sysml grammar  |      | API Server  |
+-------------+    +---------------+      +-------------+

Component Responsibilities:

| Component | Responsibility |
|---|---|
| mcp-server | MCP protocol handling, tool dispatch |
| sysml-parser | Rust API wrapping tree-sitter for parse, validate, list_definitions |
| repo-client | Provider-agnostic Git operations (GitLab reference implementation) |
| tree-sitter-sysml | Grammar definition, generated parser |

10.8 Context-Aware Tool Design

Per literature recommendations, tools are organized by context cost. The implemented tools (Phase 1) prioritize low-context operations; future phases will add higher-context analysis and generation tools.

10.8.1 Implemented Tools (Phase 1)

sysml_parse:
  description: Parse SysML v2 text and extract elements
  input:
    source: string
    detail_level: "L0" | "L1" | "L2"  # L0: names only, L1: +types, L2: full AST
  output: Element[]
  context_cost: L0 ~100 tokens, L1 ~300 tokens, L2 ~2000 tokens

sysml_validate:
  description: Validate SysML v2 syntax via tree-sitter parse diagnostics
  input:
    source: string
  output: ValidationResult (errors, warnings)
  context_cost: ~300 tokens

sysml_list_definitions:
  description: List all definitions in SysML v2 text
  input:
    source: string
  output: Definition[] (name, type)
  context_cost: ~200 tokens

repo_list_files:
  description: List files in Git repository
  input:
    project: string
    path: string?
    ref: string?
    sysml_only: boolean?
  output: FileEntry[]
  context_cost: ~300 tokens

repo_get_file:
  description: Read file content from Git repository
  input:
    project: string
    path: string
    ref: string?
  output: FileContent
  context_cost: varies by file size

10.8.2 Planned Tools (Phase 2+)

sysml_query:
  description: Query elements by type/properties
  status: Planned

repo_commit:
  description: Commit file changes to repository
  status: Planned

Design Principles:

  • Stateless calls: No session pollution between tool invocations
  • Server-side pruning: Return minimal sufficient context
  • JSON intermediate representation: Enable staged decomposition
  • Syntax validation: Catch errors early on all generated content

10.9 Implementation Patterns for Progressive Context Budgeting

The tool design principles in Section 10.8 establish what to build; this section addresses how to implement these patterns using proven techniques from the rapidly evolving ecosystem of MCP servers and AI agents. These patterns emerge primarily from practitioner implementations rather than academic literature—many creators are building rapidly without publishing formal papers.

OpenAI’s “Harness Engineering” report [6] provides production-scale validation of these patterns: a team built a million-line product with zero manually-written code by treating repository-local knowledge as the system of record, enforcing architectural invariants mechanically, and using progressive disclosure (a short AGENTS.md as map, not encyclopedia). Their experience confirms that context management patterns are not academic abstractions but production necessities.

This survey reflects the ecosystem as of February 2026. Given the pace of change, we recommend revisiting these patterns monthly.

10.9.1 MCP Server Patterns

Three implementation patterns have emerged for managing context budgets within MCP tool servers, each offering 90%+ token reduction relative to a naive implementation.

10.9.1.1 Two Meta-Tool Architecture

Instead of exposing all tools directly to the agent (consuming 15,000+ tokens), expose only two meta-tools [7]:

┌─────────────────────────────────────────────────────┐
│  Client Context Window                               │
├─────────────────────────────────────────────────────┤
│  Tools available: 2 (~800 tokens)                   │
│  ┌─────────────────────────────────────────────────┐│
│  │ get_tools_in_category(category_path)            ││
│  │ → Returns tool names + descriptions in category ││
│  └─────────────────────────────────────────────────┘│
│  ┌─────────────────────────────────────────────────┐│
│  │ execute_tool(tool_path, parameters)             ││
│  │ → Executes any tool by hierarchical path        ││
│  └─────────────────────────────────────────────────┘│
└─────────────────────────────────────────────────────┘
         │
         ▼ Agent navigates tool tree on demand
┌─────────────────────────────────────────────────────┐
│  Server Tool Registry (~15,000 tokens if exposed)   │
│  └─ sysml/                                          │
│     ├─ query/     (get_element, search, ...)        │
│     ├─ analysis/  (extract_subgraph, trace, ...)    │
│     └─ generate/  (suggest_completion, ...)         │
└─────────────────────────────────────────────────────┘

Token savings: ~95% (15,000 → 800 tokens)
Trade-off: Adds one round-trip for tool discovery per category
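The two meta-tools can be sketched over a server-side registry keyed by category path; the categories and tool names below are illustrative, not the server's actual registry.

```rust
use std::collections::BTreeMap;

/// Illustrative sketch of the two meta-tool pattern: the full tool
/// registry lives server-side; the client only ever sees
/// get_tools_in_category and execute_tool.
struct ToolRegistry {
    // category path -> (tool name, one-line description)
    tools: BTreeMap<&'static str, Vec<(&'static str, &'static str)>>,
}

impl ToolRegistry {
    fn demo() -> Self {
        let mut tools = BTreeMap::new();
        tools.insert("sysml/query", vec![
            ("get_element", "Fetch one element by qualified name"),
            ("search", "Search elements by keyword"),
        ]);
        tools.insert("sysml/analysis", vec![
            ("extract_subgraph", "Reachability-pruned subgraph"),
        ]);
        Self { tools }
    }

    /// Meta-tool 1: list tool names + descriptions in one category.
    fn get_tools_in_category(&self, category: &str) -> Vec<String> {
        self.tools
            .get(category)
            .map(|ts| ts.iter().map(|(n, d)| format!("{n}: {d}")).collect())
            .unwrap_or_default()
    }

    /// Meta-tool 2: execute any tool by hierarchical path.
    fn execute_tool(&self, path: &str, _params: &str) -> Result<String, String> {
        let (category, name) = path.rsplit_once('/').ok_or("bad path")?;
        let known = self
            .tools
            .get(category)
            .map_or(false, |ts| ts.iter().any(|(n, _)| *n == name));
        if known {
            Ok(format!("dispatched {name}")) // a real server would run the tool here
        } else {
            Err(format!("unknown tool: {path}"))
        }
    }
}
```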

10.9.1.2 Cache ID + Summary Pattern

Return concise summaries with cache identifiers; provide details on demand [8]:

Phase 1: Query returns summary + cache ID
─────────────────────────────────────────
sysml_list_elements(package="Vehicle")
→ { 
    cacheId: "elem-abc123",
    summary: { totalElements: 47, parts: 12, ports: 8, ... },
    quickAccess: [ "Vehicle::PowerSubsystem", "Vehicle::Chassis", ... ]
  }
  (~500 tokens)

Phase 2: Details fetched only when needed
─────────────────────────────────────────
sysml_get_details(cacheId: "elem-abc123", filter: "parts-only")
→ { elements: [ ... full element definitions ... ] }
  (~2000 tokens, only when required)

Token savings: ~97% for list operations (18,700 → 540 tokens baseline)

Complementary patterns:

  • RTFM documentation: Tool descriptions minimal (<10 words); agents call rtfm(toolName) for full documentation on demand
  • defer_loading flag: Platform-native lazy discovery (Claude-specific)
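The two-phase flow above can be sketched as a small server-side cache; the identifier format and field names are hypothetical.

```rust
use std::collections::HashMap;

/// Illustrative sketch of the cache-ID + summary pattern: the server
/// stores the full result, hands the client a small summary plus an
/// opaque id, and serves details only on demand.
#[derive(Default)]
struct ResultCache {
    next_id: u32,
    full: HashMap<String, Vec<String>>, // cacheId -> full element list
}

impl ResultCache {
    /// Phase 1: store the full result, return (cacheId, summary).
    fn list_elements(&mut self, elements: Vec<String>) -> (String, String) {
        self.next_id += 1;
        let id = format!("elem-{:06}", self.next_id);
        let summary = format!("{} elements (details via cacheId)", elements.len());
        self.full.insert(id.clone(), elements);
        (id, summary)
    }

    /// Phase 2: fetch details only when the agent actually needs them.
    fn get_details(&self, cache_id: &str) -> Option<&[String]> {
        self.full.get(cache_id).map(Vec::as_slice)
    }
}
```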

10.9.1.3 L0/L1/L2 Tiered Loading

Filesystem-inspired progressive context loading [9]:

| Tier | Content | Tokens | Use Case |
|---|---|---|---|
| L0 (Abstract) | Names + one-line descriptions | ~100 | Quick relevance check |
| L1 (Overview) | Structure, relationships, key properties | ~2,000 | Understand organization |
| L2 (Details) | Full element definitions, constraints | Full | Deep analysis/editing |

Application to SysML: L0 returns qualified names; L1 returns element skeletons with relationship counts; L2 returns full textual notation with documentation.
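One way to realize the tiers is a single render function keyed by detail level; each tier strictly adds detail, so the agent can stop at the cheapest level. The rendered strings below are illustrative, not the server's actual output format.

```rust
/// Illustrative sketch of L0/L1/L2 tiered responses for one element.
#[derive(Clone, Copy, PartialEq)]
enum DetailLevel { L0, L1, L2 }

struct Element {
    qualified_name: &'static str,
    kind: &'static str,
    relationship_count: usize,
    body: &'static str, // full textual notation (L2 only)
}

fn render(e: &Element, level: DetailLevel) -> String {
    match level {
        // L0: qualified name only (~100 tokens for a listing)
        DetailLevel::L0 => e.qualified_name.to_string(),
        // L1: skeleton with relationship counts
        DetailLevel::L1 => format!(
            "{} [{}] ({} relationships)",
            e.qualified_name, e.kind, e.relationship_count
        ),
        // L2: full textual notation
        DetailLevel::L2 => format!(
            "{} [{}] ({} relationships)\n{}",
            e.qualified_name, e.kind, e.relationship_count, e.body
        ),
    }
}
```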

10.9.2 Complementary Techniques

Beyond MCP-specific patterns, several techniques optimize context at other layers.

10.9.2.1 Prompt Compression

LLMLingua [10] uses small language models (GPT-2, LLaMA-7B) to identify and remove non-essential tokens before the main LLM call.

Performance: Up to 20x compression with minimal quality loss
RAG improvement: +21.4% accuracy using only 1/4 tokens
Application: Client-side preprocessing of large retrieved documents

10.9.2.2 KV-Cache-First Design

Manus’s production experience [11] identifies KV-cache hit rate as “the single most important metric for a production AI agent”:

  • Append-only context: Never modify previous actions/observations
  • Stable prefixes: Avoid timestamps or dynamic content at prompt start
  • Tool masking: Constrain action space via logit masking, not tool removal

Cost impact: 10x difference between cached ($0.30/MTok) and uncached ($3.00/MTok)
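The append-only and stable-prefix rules can be enforced structurally: if the context type only ever appends, each rendered prompt is a byte-exact extension of the previous one, which is what keeps the KV cache warm. A minimal sketch, with hypothetical type and method names:

```rust
/// Illustrative sketch of KV-cache-first context assembly.
struct AgentContext {
    prefix: String,       // stable: no timestamps, no dynamic content
    history: Vec<String>, // append-only actions/observations
}

impl AgentContext {
    fn new(system_prompt: &str) -> Self {
        Self { prefix: system_prompt.to_string(), history: Vec::new() }
    }

    fn append(&mut self, event: &str) {
        self.history.push(event.to_string()); // never rewrite past turns
    }

    fn render(&self) -> String {
        let mut out = self.prefix.clone();
        for event in &self.history {
            out.push('\n');
            out.push_str(event);
        }
        out
    }
}
```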

10.9.2.3 Frequent Intentional Compaction

HumanLayer’s workflow [12] structures development around context management:

  1. Research: Subagent explores codebase, returns compressed findings
  2. Plan: Outline steps with verification criteria (human reviews here)
  3. Implement: Execute plan phase-by-phase, compacting state between phases

Key insight: “A bad line of research could lead to thousands of bad lines of code”—focus human review on research and plans, not just generated code.

Context utilization: Maintain 40-60% window utilization for optimal reasoning.

10.9.3 Performance Metrics Framework

The following metrics enable comparison across implementations:

| Category | Metric | Baseline Examples | Source |
|---|---|---|---|
| Token efficiency | Compression ratio | 95-97% (MCP patterns) | [7], [8] |
| Cost impact | Cached vs uncached tokens | 10x difference | [11] |
| Latency | Tool response time | 120ms (semantic) vs 2000ms (visual) | [8] |
| Quality | Task completion rate | 24%→51% with optimization | DSPy MIPROv2 |
| Context utilization | % window used effectively | 40-60% sweet spot | [12] |
| Multi-agent gain | Single vs multi-agent success | 90.2% improvement | [13] |

10.9.4 Implementation Options for open-mcp-sysml

Based on the patterns surveyed, the following implementation options are available, each addressing different aspects of context efficiency:

Cache ID + Summary

All list/query tools return summaries with cache IDs. Add sysml_get_details(cacheId, detailType) for on-demand expansion. Expected savings: ~90% on common workflows.

L0/L1/L2 Tiered Responses

Add detail_level parameter to existing tools. Default to L1; agents request L2 only when editing.

RTFM Documentation Pattern

Minimize tool descriptions to <10 words. Add sysml_rtfm(toolName) for full documentation.

Two Meta-Tool Wrapper

Useful if tool count exceeds ~20. Consider as enhancement once tool surface stabilizes.

KV-Cache Optimization

Design response formats for append-only workflows. Ensure deterministic JSON serialization.

10.9.5 Reference Implementations

| Repository | Stars | Pattern | Language | URL |
|---|---|---|---|---|
| mcp-proxy | 3 | Two meta-tool | Go | github.com/IAMSamuelRodda/mcp-proxy |
| xc-mcp | 59 | Cache ID + RTFM + defer_loading | TypeScript | github.com/conorluddy/xc-mcp |
| OpenViking | 1,100 | L0/L1/L2 tiered loading | Python | github.com/volcengine/OpenViking |
| LLMLingua | 5,800 | Prompt compression | Python | github.com/microsoft/LLMLingua |
| 12-factor-agents | 18,200 | Workflow patterns | Docs | github.com/humanlayer/12-factor-agents |
| Beads | 16,000 | Git-backed agent memory | Go | github.com/steveyegge/beads |
| OpenAI Harness | n/a | Progressive disclosure + golden principles | Internal | [6] |

10.10 Interface Definitions

10.10.1 MCP Protocol Interface

The server implements MCP 2024-11-05:

  • initialize - Protocol handshake
  • tools/list - Enumerate available tools
  • tools/call - Execute a tool
  • resources/list - Enumerate resources
  • resources/read - Read a resource
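MCP messages are JSON-RPC 2.0; an illustrative tools/call request against one of the Phase 1 tools might look like the following (the argument shape follows the tool schemas in Section 10.8.1; the id value is arbitrary):

```json
{
  "jsonrpc": "2.0",
  "id": 7,
  "method": "tools/call",
  "params": {
    "name": "sysml_list_definitions",
    "arguments": { "source": "part def Vehicle;" }
  }
}
```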

10.10.2 Repository Interface Schema

{
  "RepoClient": {
    "methods": {
      "read_file": {
        "params": ["project", "path", "ref"],
        "returns": "bytes"
      },
      "list_files": {
        "params": ["project", "path", "ref"],
        "returns": "FileEntry[]"
      }
    }
  }
}
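The schema above maps naturally onto a Rust trait, which is how the provider-agnostic design would typically be expressed. The trait and the in-memory stand-in below are an illustrative sketch, not the crate's actual API.

```rust
use std::collections::BTreeMap;

// Sketch of the RepoClient schema as a provider-agnostic trait,
// with a hypothetical in-memory provider standing in for GitLab.
struct FileEntry { path: String }

trait RepoClient {
    fn read_file(&self, project: &str, path: &str, r#ref: &str) -> Option<Vec<u8>>;
    fn list_files(&self, project: &str, path: &str, r#ref: &str) -> Vec<FileEntry>;
}

/// In-memory provider: (project, ref) -> { path -> bytes }
struct InMemoryRepo {
    files: BTreeMap<(String, String), BTreeMap<String, Vec<u8>>>,
}

impl RepoClient for InMemoryRepo {
    fn read_file(&self, project: &str, path: &str, r#ref: &str) -> Option<Vec<u8>> {
        self.files
            .get(&(project.to_string(), r#ref.to_string()))?
            .get(path)
            .cloned()
    }

    fn list_files(&self, project: &str, path: &str, r#ref: &str) -> Vec<FileEntry> {
        self.files
            .get(&(project.to_string(), r#ref.to_string()))
            .map(|tree| {
                tree.keys()
                    .filter(|p| p.starts_with(path))
                    .map(|p| FileEntry { path: p.clone() })
                    .collect()
            })
            .unwrap_or_default()
    }
}
```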

10.10.3 Parser Interface Schema

{
  "sysml_parse": {
    "input": {
      "content": "string"
    },
    "output": {
      "success": "boolean",
      "tree": "ParseTree?",
      "errors": "ParseError[]"
    }
  }
}

10.11 Dual-Path Grammar Strategy

Both grammar development paths are actively maintained:

| Path | Repository | Purpose | Status |
|---|---|---|---|
| Brute Force | tree-sitter-sysml | Practical parsing, MCP server | Tier 1 complete |
| Spec-Driven | kebnf-to-tree-sitter | Formal compliance, INCOSE paper | Tool complete, grammar in progress |

Cross-validation: Comparing outputs identifies spec interpretation errors in brute-force and practical parsing issues in spec-driven output.

Why both matter:

  • Brute-force provides immediate practical value
  • Spec-driven enables automated updates and formal traceability
  • Comparison validates both approaches

10.12 Requirements Allocation

Per [2, Sec. 2.3.5.4], requirements are allocated to architecture elements.

| Requirement | Architecture Element | Component |
|---|---|---|
| FR-MCP-001, FR-MCP-002 | MCP Server | mcp-server crate |
| FR-MCP-003 | HTTP Transport | mcp-server crate |
| FR-REPO-001, FR-REPO-002 | Repo Client | repo-client crate |
| FR-SYS-001, FR-SYS-002 | SysML Parser | tree-sitter-sysml |
| NFR-DEP-001 | Build Configuration | Cargo.toml |
| NFR-DEP-002 | Container Image | Containerfile |

10.13 Deployment Architecture

10.13.1 Deployment Modes

| Mode | Transport | Use Case | Configuration |
|---|---|---|---|
| Local Development | stdio | Claude Desktop, VS Code | --transport stdio |
| CI/CD Integration | HTTP | GitLab CI services | --transport http --port 8080 |
| Container | HTTP | Production deployment | Docker/Podman with port mapping |
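For the local development mode, a Claude Desktop configuration entry might look like the following. The binary path and environment variable name are assumptions; the mcpServers shape is the standard Claude Desktop MCP configuration format.

```json
{
  "mcpServers": {
    "open-mcp-sysml": {
      "command": "/usr/local/bin/open-mcp-sysml",
      "args": ["--transport", "stdio"],
      "env": { "GITLAB_TOKEN": "<personal-access-token>" }
    }
  }
}
```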

10.13.2 Development Constraints

Constraint: No local container builds on macOS (no podman machine).

Mitigation:

  • Local development uses cargo build and cargo test directly
  • MCP protocol testing via stdio (no containers required)
  • Container builds run exclusively in GitLab CI

10.14 CI/CD Pipeline

GitLab Ultimate features leveraged:

| Feature | Purpose |
|---|---|
| SAST | Static Application Security Testing for Rust |
| Dependency Scanning | Scan Cargo.lock for vulnerabilities |
| Secret Detection | Prevent accidental credential commits |
| License Compliance | Track crate licenses |
| Code Quality | Clippy integration for Rust linting |

10.14.1 tree-sitter-sysml CI/CD

Dual CI for ecosystem compatibility:

| Platform | Purpose |
|---|---|
| GitHub Actions | Tree-sitter org compatibility |
| GitLab CI | Coverage tracking, security scanning |