Enabling AI-Augmented MBSE with the Model Context Protocol
Systems Engineering Capstone Project
0.1 Executive Summary
This document outlines the systems engineering plan for developing an open source SysML v2 Model Context Protocol (MCP) server. The project serves dual purposes:
- Open Source Contribution: Provide standalone tooling for AI-augmented Model-Based Systems Engineering (MBSE) workflows, filling a gap between commercial platforms and DIY scripting
- Academic Capstone: Demonstrate INCOSE systems engineering principles [1] for a Wayne State University master's engineering capstone project
Commercial tools like SysGit provide mature, licensed Git-based MBSE platforms with comprehensive SysML v2 support. This project offers a lightweight open source alternative focused specifically on AI/LLM integration via the Model Context Protocol.
0.1.1 Key Deliverables
Software:
- tree-sitter-sysml: Standalone SysML v2 grammar for tree-sitter (100% training file coverage)
- kebnf-to-tree-sitter: Automated converter from OMG KEBNF specifications to tree-sitter grammars
- open-mcp-sysml: Rust MCP server with Git integration (GitLab as reference) and SysML v2 support
Publications:
- NDIA GVSETS 2026: AI-Augmented MBSE via MCP (draft Mar 23, final Jun 5, presentation Aug 11)
- INCOSE/SysEng Journal: Grammar Transposition Methodology (kebnf-to-tree-sitter)
- INCOSE 2027: SE Benchmark for LLMs (future work)
Academic:
- Capstone SE documentation (SEP, SyRS, ADD, VVP, RTM)
0.1.2 Timeline
- Initial Research: Early January 2026 (SysML v2 specifications and prior art)
- Concept Phase Start: January 12, 2026 (Week 1)
- Capstone Delivery: April 25, 2026 (Week 15)
- Duration: 15 weeks
0.1.3 Project Status (Feb 14, 2026)
Repositories:
| Repository | Status | Next Step |
|---|---|---|
| tree-sitter-sysml | 99.6% coverage (274/275 files), 125/125 tests | Pre-release cleanup (queries, CHANGELOG) |
| kebnf-to-tree-sitter | 640 rules parsed, 335+ conflicts | Resolve DD-001 architecture decision |
| open-mcp-sysml | Phase 1 complete (5 MCP tools, 22 tests) | Execute benchmark vignettes, Phase 2 token strategies |
| sysml-grammar-benchmark | Scaffolded (0% functional) | Add corpus submodules, implement adapter |
| gvsets | ~7 pages, quantitative claims unvalidated | Execute V1/V4/V5, replace TODO placeholders |
| capstone | SRR + PDR complete (Feb 14) | VVP prose, conclusions rewrite |
Publications:
| Paper | Status | Due |
|---|---|---|
| GVSETS 2026 | Drafted, evaluation section placeholder | Draft Mar 23, Final Jun 5 |
| Grammar Transposition | Conflict resolution in progress | Q3-Q4 2026 |
| INCOSE 2027 Benchmark | 8 vignettes defined (Section C.1) | Q3 2027 |
tree-sitter-sysml (Brute-Force Grammar): PRODUCTION READY
- 125/125 corpus tests passing
- 99.6% coverage across 275 external files (OMG, GfSE, Advent)
- Context-sensitive definition bodies implemented
- 6 language bindings (C, Rust, Go, Python, Node.js, Swift)
- Pre-release cleanup: ~18-25 hours to 1.0.0
kebnf-to-tree-sitter (Spec-Driven Grammar):
- KEBNF parser complete (640/640 rules)
- Grammar generation produces 335+ conflicts (vs 54 in brute-force)
- 4 critical path decisions open (DD-001, DD-008, DD-009, DD-020)
open-mcp-sysml (MCP Server): PHASE 1 COMPLETE
- 3 crates: sysml-parser, repo-client, mcp-server
- 5 MCP tools: sysml_parse, sysml_validate, sysml_list_definitions, repo_list_files, repo_get_file
- L0/L1/L2 detail levels implemented (80-97% token reduction)
- 22 tests (unit, integration, MCP protocol compliance)
- Phase 2 PRD ready: 7 token reduction strategies (including overflow detection)
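The L0/L1/L2 detail levels above can be thought of as progressively richer renderings of the same parsed element. A minimal sketch of the idea follows; the element representation and field names here are illustrative assumptions, not open-mcp-sysml's actual data model:

```python
# Hypothetical parsed element; open-mcp-sysml's real model differs.
element = {
    "kind": "part def",
    "name": "Vehicle",
    "members": ["engine", "chassis", "wheels"],
    "doc": "Top-level vehicle part definition.",
}

def render(element, level):
    """Render an element at detail level L0 (name only), L1 (+members), L2 (full)."""
    if level == 0:
        return f"{element['kind']} {element['name']}"
    if level == 1:
        return f"{element['kind']} {element['name']} {{ {', '.join(element['members'])} }}"
    return "\n".join([
        f"{element['kind']} {element['name']} {{",
        f"    doc /* {element['doc']} */",
        *[f"    part {m};" for m in element["members"]],
        "}",
    ])

for lvl in (0, 1, 2):
    print(f"L{lvl}: {len(render(element, lvl))} chars")
```

Each level trades fidelity for tokens; the reported 80-97% reduction comes from serving L0/L1 responses wherever full bodies are not needed.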
sysml-grammar-benchmark:
- Repository scaffolded with PRD, CI, Quarto dashboard
- Python runner script complete, no adapters or corpora yet
- Placeholder data in dashboard
GVSETS Paper:
3-condition experiment designed with benchmark vignettes V1, V4, V5 (Section C.1):
- Baseline: All files concatenated (naive)
- Vanilla MCP: Simple tool calls
- Optimized MCP: Cache ID + Summary pattern
Next Priority: Execute benchmark vignettes V1/V4/V5 to validate GVSETS quantitative claims
0.2 Problem Statement
The Model Context Protocol [2] ecosystem has 75,000+ GitHub stars and 10+ official SDKs, while SysML v2 [3] achieved OMG adoption in July 2025. Yet their intersection remains largely unexplored. Defense and aerospace organizations need:
- Standardized AI-tool integration for MBSE workflows
- Lightweight programmatic access to SysML v2 models
- CI/CD integration for model validation
- Open source alternatives to proprietary vendor lock-in
0.3 MCP for SysML Context
The Model Context Protocol [2] standardizes how AI applications access external data and tools. An MCP server bridges AI assistants and domain-specific systems; in our case, SysML v2 models stored in Git repositories.
WITHOUT MCP SERVER:
┌────────────────┐                      ┌────────────────────┐
│    Engineer    │ ── copy/paste ─────▶ │    AI Assistant    │
│                │ ◀── copy/paste ───── │   (Claude, etc.)   │
└────────────────┘                      └────────────────────┘
        │                                         │
        ▼                                         ▼
┌────────────────┐                      ┌────────────────────┐
│    Git Repo    │   (no connection)    │   Generic SysML    │
│     .sysml     │                      │   knowledge only   │
└────────────────┘                      └────────────────────┘
Problems: AI sees snippets, not full project. Cannot validate.
Cannot commit. Context lost between conversations.
WITH MCP SERVER:
┌────────────────┐      MCP      ┌──────────────────┐
│    Engineer    │── Protocol ──▶│   AI Assistant   │
└────────────────┘               │  (Claude, etc.)  │
                                 └────────┬─────────┘
                                          │
                                          │ MCP
                                          ▼
                                 ┌──────────────────┐
                                 │   SysML v2 MCP   │
                                 │      Server      │
                                 └────────┬─────────┘
                                          │
             ┌────────────────────────────┼────────────────────────────┐
             │                            │                            │
             ▼                            ▼                            ▼
     ┌──────────────┐             ┌───────────────┐            ┌───────────────┐
     │   Git Repo   │             │   SysML v2    │            │     Local     │
     │    .sysml    │             │  API Server   │            │    Parser     │
     └──────────────┘             └───────────────┘            └───────────────┘
Benefits: AI reads full project. Validates models. Commits changes.
Structured understanding. Persists across conversations.
| Without MCP | With MCP Server |
|---|---|
| AI sees pasted snippets | AI reads entire project |
| No model validation | Validates against SysML v2 spec |
| Manual copy/paste workflow | Direct Git repository integration |
| Generic SysML knowledge | Structured element queries |
| Context lost between sessions | Project state persists |
This transforms the AI from a "SysML syntax helper" into an "MBSE collaborator" that understands actual project state and can take actions within it. For detailed MCP architecture and server design, see Section 4.1.
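Concretely, MCP clients invoke server tools via JSON-RPC 2.0 `tools/call` messages. A minimal sketch of a request to this project's `sysml_list_definitions` tool is shown below; the `path` and `ref` argument names are hypothetical placeholders, not the server's actual tool schema:

```python
import json

# JSON-RPC 2.0 envelope used by MCP for tool invocation.
# Tool name is from this project's server; the argument names
# ("path", "ref") are illustrative assumptions only.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "sysml_list_definitions",
        "arguments": {
            "path": "models/vehicle.sysml",  # file inside the Git repo
            "ref": "main",                   # branch or commit to read from
        },
    },
}

print(json.dumps(request, indent=2))
```

The same envelope carries every tool in the table above, which is what lets any MCP-capable assistant drive the server without bespoke integration code.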
0.4 Project Objectives
- Develop an open source MCP server for SysML v2
- Integrate with Git providers for model persistence (GitLab as reference implementation)
- Connect to SysML v2 API Services for validation
- Demonstrate AI-augmented MBSE workflows using GitLab Duo
- Publish findings at NDIA GVSETS
0.5 Central Thesis: The Harness Matters
This MCP server is one component of a larger harness for leveraging LLMs in MBSE workflows. The thesis of this project is that harness design (how context is selected, structured, and presented to LLMs) may matter more than raw model capability.
This thesis draws inspiration from emerging practitioner frameworks, particularly Dex Horthy's 12-Factor Agents [4] and the concept of Context Engineering: the discipline of optimizing what information reaches an LLM and how it is structured. As Horthy notes: "Everything is context engineering. LLMs are stateless functions that turn inputs into outputs. To get the best outputs, you need to give them the best inputs."
OpenAI's "Harness Engineering" report [5] provides compelling industry validation: a team built a million-line production product with zero manually written code by investing in environment design rather than direct coding. Their central finding, "the primary job of our engineering team became enabling the agents to do useful work," directly parallels this project's thesis. Their experience with progressive disclosure ("give Codex a map, not a 1,000-page instruction manual"), repository-local knowledge stores, and mechanical enforcement of architectural invariants confirms that harness quality determines agent effectiveness, independent of model capability.
0.5.1 The Context Window Problem
Large language models exhibit measurable performance degradation when operating in the back half of their context windows. Even frontier models with 200K+ token limits show reasoning quality drops as context length increases. SysML v2 models exacerbate this:
- Enterprise systems contain thousands of elements across hundreds of files
- Naive "load everything" approaches exhaust token budgets before work begins
- Relevant elements become obscured within structural boilerplate
- Model relationships span files in ways that defeat simple truncation
A 40,000-token "project awareness" overhead (as observed in this document's own development) leaves limited budget for actual reasoning about complex models.
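To make the budget pressure concrete, here is the arithmetic under the numbers from this section plus two stated assumptions (an average .sysml file size of 2,000 tokens and a 60,000-token reserve for reasoning and output):

```python
window = 200_000            # frontier-model context limit (tokens), per this section
overhead = 40_000           # "project awareness" cost observed in this document's development
tokens_per_file = 2_000     # assumed average size of a .sysml file
reasoning_reserve = 60_000  # assumed tokens held back for actual reasoning/output

budget_for_model = window - overhead - reasoning_reserve
files_that_fit = budget_for_model // tokens_per_file
print(f"{budget_for_model} tokens for model content -> {files_that_fit} files")
# An enterprise model spanning hundreds of files cannot fit;
# selective retrieval is required, not wholesale loading.
```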
0.5.2 Intelligent Context Management
The value proposition extends beyond "MCP server provides model access" to "MCP server enables selective context presentation":
| Anti-Pattern | Harness-Aware Approach |
|---|---|
| Return entire .sysml files | Return specific elements by query |
| Dump full element hierarchies | Return element + immediate relationships |
| Include all metadata | Filter to semantically relevant properties |
| Load model into context upfront | Lazy-load via iterative tool calls |
| Single monolithic prompt | Decompose across agents/iterations |
The parser/grammar provides the foundation: structured access to model elements. The MCP server provides the interface: tools that can be designed for minimal, targeted context injection. The harness design determines whether this pipeline produces meaningful LLM contributions or context-stuffed hallucinations.
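A minimal sketch of the "return specific elements by query" pattern, using a toy regex over SysML v2 textual syntax. A real implementation would walk the tree-sitter parse tree; this matcher is an illustration only and assumes flat, non-nested definition bodies:

```python
import re

SOURCE = """
package Drivetrain {
    part def Engine { attribute power; }
    part def Transmission { attribute ratio; }
}
"""

def find_part_def(source, name):
    """Return just the named part definition instead of the whole file.

    Toy matcher: assumes the definition body contains no nested braces."""
    pattern = rf"part def {re.escape(name)}\s*\{{[^{{}}]*\}}"
    match = re.search(pattern, source)
    return match.group(0) if match else None

snippet = find_part_def(SOURCE, "Engine")
print(snippet)  # far fewer tokens than returning SOURCE wholesale
```

The point is the interface shape, not the matcher: a query tool that answers "give me Engine" injects one definition into context rather than the file, the package, or the model.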
0.5.3 Research Questions
This thesis motivates several questions addressed in the literature review (Section 3.1):
- What context management strategies do existing AI+MBSE systems employ?
- How much context is sufficient for meaningful LLM reasoning over models?
- What decomposition patterns (multi-agent, iterative refinement) reduce per-call context?
- How do we measure "meaningful performance" for LLM-MBSE interactions?
The architecture (Section 10.1) is designed to enable experimentation with these questions through configurable tool granularity and optional SysML v2 API integration for server-side query resolution.
0.5.4 Research-Plan-Implement Cycles
An emerging pattern in effective LLM agent design is the iterative research-before-planning approach: rather than planning upfront and executing linearly, high-quality agent workflows interleave targeted research slices with incremental planning. Each cycle:
- Research: Gather just enough context relevant to the immediate decision
- Plan: Make a focused plan for the next concrete step
- Implement: Execute the step, capturing results
- Repeat: Use implementation results to inform the next research slice
This pattern appears across multiple sources. Anthropic's "Building Effective Agents" [6] describes the evaluator-optimizer workflow, where "one LLM call generates a response while another provides evaluation and feedback in a loop." The 12-Factor Agents framework emphasizes small, focused agents that "own their context window" rather than attempting monolithic operations.
For SysML v2 models, this suggests MCP tools should support incremental exploration: query a subsystem, analyze its interfaces, decide what adjacent context is needed, fetch that context, then proceed, rather than loading an entire model upfront. The grammar and parser provide the foundation for these surgical context extractions.
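The research-plan-implement cycle can be sketched as a loop that never pulls more than a bounded context slice per step. The stub functions and the per-call budget figure below are illustrative assumptions, not part of this project's implementation:

```python
def research(question, budget):
    """Fetch just enough model context for the immediate decision (stub)."""
    return f"context for {question!r} (<= {budget} tokens)"

def plan(context):
    """Produce a focused plan for the next concrete step (stub)."""
    return f"next step given {context!r}"

def implement(step):
    """Execute the step and capture its result (stub)."""
    return f"result of {step!r}"

PER_CALL_BUDGET = 4_000  # assumed per-slice token cap
questions = ["which subsystem?", "which interfaces?", "what adjacent parts?"]

results = []
for q in questions:                  # each cycle: research -> plan -> implement
    ctx = research(q, PER_CALL_BUDGET)
    step = plan(ctx)
    results.append(implement(step))  # result informs the next research slice
print(len(results), "cycles completed")
```

Each iteration keeps the context window small and lets implementation results steer what gets researched next, rather than front-loading the whole model.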
0.6 Scope
0.6.1 In Scope
Grammar Development (Dual-Path):
- tree-sitter-sysml: Brute-force grammar with 100% training file coverage
- kebnf-to-tree-sitter: Spec-driven grammar converter for formal traceability
MCP Server:
- open-mcp-sysml (Rust): Consumes tree-sitter-sysml via bindings
- Git provider file read/write operations (GitLab as reference)
- SysML v2 API client integration
- stdio and HTTP transport mechanisms
- Container deployment
Publications:
- GVSETS 2026 paper on MCP architecture
- Grammar transposition methodology paper
- SE documentation (SEP, SyRS, ADD, VVP)
0.6.2 Out of Scope (Future Work)
- sysml.rs: Full SysML v2 semantic analysis in Rust, a research instrument for PhD work enabling import resolution, type checking, and constraint evaluation beyond tree-sitter's syntax-only capabilities
- AI benchmarking framework (INCOSE 2027 paper topic)
- Multi-agent architectures
- Commercial integrations
0.7 Document Structure
This book contains the complete systems engineering documentation:
- Chapter 1: Foundation (SysML v2 background)
- Chapter 3: Literature Review (AI + MBSE research, prior art)
- Chapter 4: Model Context Protocol
- Chapter 5: Tooling Ecosystem
- Chapter 6: Systems Engineering Plan (SEP)
- Chapter 7: Work Breakdown Structure (WBS)
- Chapter 8: Stakeholder Analysis
- Chapter 9: System Requirements Specification (SyRS)
- Chapter 10: Architecture Design Description (ADD)
- Chapter 11: Verification & Validation Plan (VVP)
- Chapter 12: Implementation
- Chapter 13: Conclusions
Appendices include glossary, references, traceability matrix, publication strategy, and benchmark vignettes.