Enabling AI-Augmented MBSE with the Model Context Protocol

Systems Engineering Capstone Project

Authors
Affiliations

Andrew Dunn

GitLab, Public Sector

Greg Pappas

Department of Defense, Army DEVCOM

Dr. Stephen Rapp

Wayne State University, Industrial and Systems Engineering

Published

February 17, 2026


0.1 Executive Summary

This document outlines the systems engineering plan for developing an open source SysML v2 Model Context Protocol (MCP) server. The project serves dual purposes:

  1. Open Source Contribution: Provide standalone tooling for AI-augmented Model-Based Systems Engineering (MBSE) workflows, filling a gap between commercial platforms and DIY scripting
  2. Academic Capstone: Demonstrate INCOSE systems engineering principles [1] for a Wayne State University master's engineering capstone project

Commercial tools like SysGit provide mature, licensed Git-based MBSE platforms with comprehensive SysML v2 support. This project offers a lightweight open source alternative focused specifically on AI/LLM integration via the Model Context Protocol.

0.1.1 Key Deliverables

Software:

  • tree-sitter-sysml: Standalone SysML v2 grammar for tree-sitter (100% training file coverage)
  • kebnf-to-tree-sitter: Automated converter from OMG KEBNF specifications to tree-sitter grammars
  • open-mcp-sysml: Rust MCP server with Git integration (GitLab as reference) and SysML v2 support

Publications:

  • NDIA GVSETS 2026: AI-Augmented MBSE via MCP (draft Mar 23, final Jun 5, presentation Aug 11)
  • INCOSE/SysEng Journal: Grammar Transposition Methodology (kebnf-to-tree-sitter)
  • INCOSE 2027: SE Benchmark for LLMs (future work)

Academic:

  • Capstone SE documentation (SEP, SyRS, ADD, VVP, RTM)

0.1.2 Timeline

  • Initial Research: Early January 2026 (SysML v2 specifications and prior art)
  • Concept Phase Start: January 12, 2026 (Week 1)
  • Capstone Delivery: April 25, 2026 (Week 15)
  • Duration: 15 weeks

0.1.3 Project Status (Feb 14, 2026)


Repositories:

Repository             | Status                                           | Next Step
-----------------------|--------------------------------------------------|------------------------------------------
tree-sitter-sysml      | ✅ 99.6% coverage (274/275 files), 125/125 tests | Pre-release cleanup (queries, CHANGELOG)
kebnf-to-tree-sitter   | ◐ 640 rules parsed, 335+ conflicts               | Resolve DD-001 architecture decision
open-mcp-sysml         | ✅ Phase 1 Complete (5 MCP tools, 22 tests)      | Execute benchmark vignettes, Phase 2 token strategies
sysml-grammar-benchmark| 🆕 Scaffolded (0% functional)                    | Add corpus submodules, implement adapter
gvsets                 | ◐ ~7 pages, quantitative claims unvalidated      | Execute V1/V4/V5, replace TODO placeholders
capstone               | ✓ SRR + PDR complete (Feb 14)                    | VVP prose, conclusions rewrite

Publications:

Paper                  | Status                                     | Due
-----------------------|--------------------------------------------|--------------------------
GVSETS 2026            | ◐ Drafted, evaluation section placeholder  | Draft Mar 23, Final Jun 5
Grammar Transposition  | ◐ Conflict resolution in progress          | Q3-Q4 2026
INCOSE 2027 Benchmark  | ○ 8 vignettes defined (Section C.1)        | Q3 2027

tree-sitter-sysml (Brute-Force Grammar): PRODUCTION READY

  • 125/125 corpus tests passing
  • 99.6% coverage across 275 external files (OMG, GfSE, Advent)
  • Context-sensitive definition bodies implemented
  • 6 language bindings (C, Rust, Go, Python, Node.js, Swift)
  • Pre-release cleanup: ~18-25 hours to 1.0.0

kebnf-to-tree-sitter (Spec-Driven Grammar):

  • KEBNF parser complete (640/640 rules)
  • Grammar generation produces 335+ conflicts (vs 54 in brute-force)
  • 4 critical path decisions open (DD-001, DD-008, DD-009, DD-020)

open-mcp-sysml (MCP Server): PHASE 1 COMPLETE

  • 3 crates: sysml-parser, repo-client, mcp-server
  • 5 MCP tools: sysml_parse, sysml_validate, sysml_list_definitions, repo_list_files, repo_get_file
  • L0/L1/L2 detail levels implemented (80-97% token reduction)
  • 22 tests (unit, integration, MCP protocol compliance)
  • Phase 2 PRD ready: 7 token reduction strategies (including overflow detection)
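
MCP itself is plain JSON-RPC 2.0 over the transport. As an illustrative sketch of what a client sends to one of the five tools above (the tool name `sysml_parse` comes from the list; the argument names `file_path` and `detail_level` are assumptions, not the server's documented schema):

```python
import json

# Hypothetical MCP "tools/call" request for the sysml_parse tool.
# The JSON-RPC envelope and "tools/call" method follow the MCP spec;
# the argument names below are illustrative assumptions only.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "sysml_parse",
        "arguments": {
            "file_path": "models/vehicle.sysml",  # assumed parameter name
            "detail_level": "L1",  # assumed: maps to the L0/L1/L2 levels above
        },
    },
}

wire = json.dumps(request)  # what actually crosses stdio or HTTP
print(wire)
```

Over the stdio transport this line would be written to the server's stdin; over HTTP it becomes the request body.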

sysml-grammar-benchmark:

  • Repository scaffolded with PRD, CI, Quarto dashboard
  • Python runner script complete, no adapters or corpora yet
  • Placeholder data in dashboard

GVSETS Paper:

A 3-condition experiment has been designed around benchmark vignettes V1, V4, and V5 (Section C.1):

  1. Baseline: All files concatenated (naive)
  2. Vanilla MCP: Simple tool calls
  3. Optimized MCP: Cache ID + Summary pattern

Next Priority: Execute benchmark vignettes V1/V4/V5 to validate GVSETS quantitative claims

0.2 Problem Statement

The Model Context Protocol [2] ecosystem has 75,000+ GitHub stars and 10+ official SDKs, while SysML v2 [3] achieved OMG adoption in July 2025. Yet their intersection remains largely unexplored. Defense and aerospace organizations need:

  • Standardized AI-tool integration for MBSE workflows
  • Lightweight programmatic access to SysML v2 models
  • CI/CD integration for model validation
  • Open source alternatives to proprietary vendor lock-in

0.3 MCP for SysML Context

The Model Context Protocol [2] standardizes how AI applications access external data and tools. An MCP server bridges AI assistants and domain-specific systemsβ€”in our case, SysML v2 models stored in Git repositories.

WITHOUT MCP SERVER:

  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”                      β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
  β”‚   Engineer   β”‚ ─── copy/paste ────▢ β”‚   AI Assistant   β”‚
  β”‚              β”‚ ◀── copy/paste ───── β”‚  (Claude, etc.)  β”‚
  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜                      β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
         β”‚                                       β”‚
         β–Ό                                       β–Ό
  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”                      β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
  β”‚  Git Repo    β”‚    (no connection)   β”‚  Generic SysML   β”‚
  β”‚    .sysml    β”‚                      β”‚  knowledge only  β”‚
  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜                      β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

  Problems: AI sees snippets, not full project. Cannot validate.
            Cannot commit. Context lost between conversations.


WITH MCP SERVER:

  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”       MCP        β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
  β”‚   Engineer   │◀─── Protocol ───▢│   AI Assistant   β”‚
  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜                  β”‚  (Claude, etc.)  β”‚
                                    β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                                             β”‚
                                             β”‚ MCP
                                             β–Ό
                                    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
                                    β”‚   SysML v2 MCP   β”‚
                                    β”‚      Server      β”‚
                                    β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                                             β”‚
              β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
              β”‚                              β”‚                              β”‚
              β–Ό                              β–Ό                              β–Ό
     β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”               β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”               β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
     β”‚  Git Repo    β”‚               β”‚  SysML v2   β”‚               β”‚    Local    β”‚
     β”‚    .sysml    β”‚               β”‚  API Server β”‚               β”‚    Parser   β”‚
     β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜               β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜               β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

  Benefits: AI reads full project. Validates models. Commits changes.
            Structured understanding. Persists across conversations.

Without MCP                    | With MCP Server
-------------------------------|-----------------------------------
AI sees pasted snippets        | AI reads entire project
No model validation            | Validates against SysML v2 spec
Manual copy/paste workflow     | Direct Git repository integration
Generic SysML knowledge        | Structured element queries
Context lost between sessions  | Project state persists

This transforms the AI from a β€œSysML syntax helper” into an β€œMBSE collaborator” that understands actual project state and can take actions within it. For detailed MCP architecture and server design, see Section 4.1.

0.4 Project Objectives

  1. Develop an open source MCP server for SysML v2
  2. Integrate with Git providers for model persistence (GitLab as reference implementation)
  3. Connect to SysML v2 API Services for validation
  4. Demonstrate AI-augmented MBSE workflows using GitLab Duo
  5. Publish findings at NDIA GVSETS

0.5 Central Thesis: The Harness Matters

This MCP server is one component of a larger harness for leveraging LLMs in MBSE workflows. The thesis of this project is that harness designβ€”how context is selected, structured, and presented to LLMsβ€”may matter more than raw model capability.

This thesis draws inspiration from emerging practitioner frameworks, particularly Dex Horthy’s 12-Factor Agents [4] and the concept of Context Engineeringβ€”the discipline of optimizing what information reaches an LLM and how it’s structured. As Horthy notes: β€œEverything is context engineering. LLMs are stateless functions that turn inputs into outputs. To get the best outputs, you need to give them the best inputs.”

OpenAI’s β€œHarness Engineering” report [5] provides compelling industry validation: a team built a million-line production product with zero manually-written code by investing in environment design rather than direct coding. Their central findingβ€”β€œthe primary job of our engineering team became enabling the agents to do useful work”—directly parallels this project’s thesis. Their experience with progressive disclosure (β€œgive Codex a map, not a 1,000-page instruction manual”), repository-local knowledge stores, and mechanical enforcement of architectural invariants confirms that harness quality determines agent effectiveness, independent of model capability.

0.5.1 The Context Window Problem

Large language models exhibit measurable performance degradation when operating in the back half of their context windows. Even frontier models with 200K+ token limits show reasoning quality drops as context length increases. SysML v2 models exacerbate this:

  • Enterprise systems contain thousands of elements across hundreds of files
  • Naive β€œload everything” approaches exhaust token budgets before work begins
  • Relevant elements become obscured within structural boilerplate
  • Model relationships span files in ways that defeat simple truncation

A 40,000-token β€œproject awareness” overhead (as observed in this document’s own development) leaves limited budget for actual reasoning about complex models.
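
The arithmetic is quickly sketched. The numbers below are round illustrative figures (only the 200K window and 40,000-token overhead come from the text; the response reserve and per-element cost are assumptions):

```python
# Illustrative token-budget arithmetic for the overhead described above.
context_window = 200_000    # frontier-model window cited above
project_overhead = 40_000   # observed "project awareness" overhead
response_reserve = 8_000    # assumed tokens reserved for the model's reply

budget = context_window - project_overhead - response_reserve
print(f"Tokens left for model content and reasoning: {budget}")

# Assumed average cost of a fully serialized SysML element:
tokens_per_element = 150
print(f"Elements that fit naively: {budget // tokens_per_element}")
```

Even under these generous assumptions, a naive load of a thousands-of-elements enterprise model exhausts the window before any reasoning happens.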

0.5.2 Intelligent Context Management

The value proposition extends beyond β€œMCP server provides model access” to β€œMCP server enables selective context presentation”:

Anti-Pattern                     | Harness-Aware Approach
---------------------------------|---------------------------------------------
Return entire .sysml files       | Return specific elements by query
Dump full element hierarchies    | Return element + immediate relationships
Include all metadata             | Filter to semantically relevant properties
Load model into context upfront  | Lazy-load via iterative tool calls
Single monolithic prompt         | Decompose across agents/iterations

The parser/grammar provides the foundation: structured access to model elements. The MCP server provides the interface: tools that can be designed for minimal, targeted context injection. The harness design determines whether this pipeline produces meaningful LLM contributions or context-stuffed hallucinations.
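
One way to picture selective context presentation is as progressively richer projections of a parsed element, in the spirit of the L0/L1/L2 detail levels mentioned earlier. This is a sketch only; the element structure and field names are assumptions, not the server's actual data model:

```python
# Sketch: progressively richer views of a parsed SysML element.
# Element structure and level semantics are illustrative assumptions.
element = {
    "name": "PowerSubsystem",
    "kind": "part def",
    "relationships": ["specializes Subsystem", "contains Battery"],
    "attributes": {"mass": "45 [kg]", "doc": "Primary power source..."},
}

def project(el: dict, level: str) -> dict:
    """Return only the fields a given detail level exposes."""
    if level == "L0":  # identity only: cheapest view for browsing
        return {"name": el["name"], "kind": el["kind"]}
    if level == "L1":  # identity plus immediate relationships
        return {**project(el, "L0"), "relationships": el["relationships"]}
    return dict(el)    # L2: the full element

print(project(element, "L0"))
```

An L0 listing of a hundred elements can cost fewer tokens than a single L2 element, which is where the claimed 80-97% reductions come from.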

0.5.3 Research Questions

This thesis motivates several questions addressed in the literature review (Section 3.1):

  1. What context management strategies do existing AI+MBSE systems employ?
  2. How much context is sufficient for meaningful LLM reasoning over models?
  3. What decomposition patterns (multi-agent, iterative refinement) reduce per-call context?
  4. How do we measure β€œmeaningful performance” for LLM-MBSE interactions?

The architecture (Section 10.1) is designed to enable experimentation with these questions through configurable tool granularity and optional SysML v2 API integration for server-side query resolution.

0.5.4 Research-Plan-Implement Cycles

An emerging pattern in effective LLM agent design is the iterative research-before-planning approach: rather than planning upfront and executing linearly, high-quality agent workflows interleave targeted research slices with incremental planning. Each cycle:

  1. Research: Gather just enough context relevant to the immediate decision
  2. Plan: Make a focused plan for the next concrete step
  3. Implement: Execute the step, capturing results
  4. Repeat: Use implementation results to inform the next research slice

This pattern appears across multiple sourcesβ€”Anthropic’s β€œBuilding Effective Agents” [6] describes the evaluator-optimizer workflow where β€œone LLM call generates a response while another provides evaluation and feedback in a loop.” The 12-Factor Agents framework emphasizes small, focused agents that β€œown their context window” rather than attempting monolithic operations.

For SysML v2 models, this suggests MCP tools should support incremental exploration: query a subsystem, analyze its interfaces, decide what adjacent context is needed, fetch that context, then proceedβ€”rather than loading an entire model upfront. The grammar and parser provide the foundation for these surgical context extractions.
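
The four-step cycle above can be sketched as a minimal agent loop. Everything here is hypothetical scaffolding: `fetch_context`, `plan_step`, and `execute` are stand-ins for real MCP tool calls and LLM invocations, not project code:

```python
# Minimal sketch of a research-plan-implement loop.
# The three stubs stand in for real MCP tool calls / LLM invocations.

def fetch_context(query: str) -> str:
    """Research: pull just enough model context (stub)."""
    return f"context for {query!r}"

def plan_step(context: str) -> str:
    """Plan: decide the next concrete step (stub)."""
    return f"step based on {context}"

def execute(step: str) -> str:
    """Implement: run the step and capture its result (stub)."""
    return f"result of {step}"

def agent_loop(initial_query: str, max_cycles: int = 3) -> list:
    results, query = [], initial_query
    for _ in range(max_cycles):
        context = fetch_context(query)   # 1. research a narrow slice
        step = plan_step(context)        # 2. plan one concrete step
        result = execute(step)           # 3. implement and capture output
        results.append(result)
        query = result                   # 4. result seeds the next slice
    return results

print(agent_loop("PowerSubsystem interfaces"))
```

Each iteration keeps its own context window small: only the slice fetched in step 1 and the single planned step reach the model, rather than the whole project.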

0.6 Scope

0.6.1 In Scope

Grammar Development (Dual-Path):

  • tree-sitter-sysml: Brute-force grammar with 100% training file coverage
  • kebnf-to-tree-sitter: Spec-driven grammar converter for formal traceability

MCP Server:

  • open-mcp-sysml (Rust): Consumes tree-sitter-sysml via bindings
  • Git provider file read/write operations (GitLab as reference)
  • SysML v2 API client integration
  • stdio and HTTP transport mechanisms
  • Container deployment

Publications:

  • GVSETS 2026 paper on MCP architecture
  • Grammar transposition methodology paper
  • SE documentation (SEP, SyRS, ADD, VVP)

0.6.2 Out of Scope (Future Work)

  • sysml.rs: Full SysML v2 semantic analysis in Rust β€” a research instrument for PhD work enabling import resolution, type checking, and constraint evaluation beyond tree-sitter’s syntax-only capabilities
  • AI benchmarking framework (INCOSE 2027 paper topic)
  • Multi-agent architectures
  • Commercial integrations

0.7 Document Structure

This book contains the complete systems engineering documentation:

  • Chapter 1: Foundation (SysML v2 background)
  • Chapter 3: Literature Review (AI + MBSE research, prior art)
  • Chapter 4: Model Context Protocol
  • Chapter 5: Tooling Ecosystem
  • Chapter 6: Systems Engineering Plan (SEP)
  • Chapter 7: Work Breakdown Structure (WBS)
  • Chapter 8: Stakeholder Analysis
  • Chapter 9: System Requirements Specification (SyRS)
  • Chapter 10: Architecture Design Description (ADD)
  • Chapter 11: Verification & Validation Plan (VVP)
  • Chapter 12: Implementation
  • Chapter 13: Conclusions

Appendices include glossary, references, traceability matrix, publication strategy, and benchmark vignettes.