Appendix B — Publication Strategy

B.1 Overview

The expanded project scope supports three distinct publications, each building on the prior:

Paper                 | Repository              | Target Venue                  | Timing                    | Status
GVSETS 2026           | gvsets/                 | NDIA GVSETS                   | Draft Mar 23, Final Jun 5 | Drafted, evaluation pending
Grammar Transposition | kebnf-to-tree-sitter    | MODELS/SLE 2026 or SE Journal | Q3-Q4 2026                | Outline complete
SE Benchmark for AI   | sysml-grammar-benchmark | INCOSE IS 2027                | Q3 2027                   | Notional

B.1.1 Publication Dependencies

GVSETS 2026 (Foundation)
    │
    ├──→ Grammar Paper (Formal Rigor)
    │         │
    └─────────┴──→ Benchmark Paper (Validation)

B.1.2 Authors (All Papers)

  • Andrew Dunn (GitLab Public Sector)
  • Greg Pappas (DoD, Army, AFC-DEVCOM)
  • Dr. Stephen Rapp (Wayne State University, ISE)

B.2 GVSETS 2026: AI-Augmented MBSE

Working Title: “Enabling AI-Augmented Model-Based Systems Engineering with the Model Context Protocol”

Attribute         | Value
Track             | Digital Engineering / AI
Format            | 8-page technical paper + presentation
Draft Due         | March 23, 2026
Notification      | May 1, 2026
Final Due         | June 5, 2026
Presentations Due | July 23, 2026
Presentation      | August 11, 2026 (Novi, MI)

B.2.1 Key Thesis

MCP provides a standardized interface that lets AI assistants interact with SysML v2 models stored in Git repositories. The approach offers token efficiency through selective retrieval (an estimated 80-97% reduction using L0/L1/L2 detail levels, pending benchmark validation), structured responses that eliminate parsing ambiguity, and authoritative answers from tools connected to real repositories.
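The tiered-retrieval idea can be illustrated with a minimal sketch; the element record, field names, and per-level contents below are hypothetical illustrations, not the MCP server's actual schema:

```python
# Minimal sketch of tiered detail levels (L0/L1/L2) for selective retrieval.
# The element record and level contents are hypothetical, not the real schema.

def render(element: dict, level: str) -> str:
    """Return progressively richer views of a model element."""
    if level == "L0":                      # name and kind only
        return f"{element['kind']} {element['name']}"
    if level == "L1":                      # add owned member names
        members = ", ".join(element["members"])
        return f"{element['kind']} {element['name']} {{ {members} }}"
    return element["full_text"]            # L2: full textual notation

element = {
    "kind": "part def",
    "name": "MiningLaser",
    "members": ["power", "yield"],
    "full_text": "part def MiningLaser { attribute power; attribute yield; }",
}

for lvl in ("L0", "L1", "L2"):
    print(lvl, len(render(element, lvl)), "chars")
```

An assistant that only needs an element's name pays for the L0 view, escalating to L2 only when the full text is actually required.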

B.2.2 Current Status

The paper is drafted in gvsets/paper/main.tex with all sections present. The evaluation section (Section 5) contains placeholder data pending benchmark vignette execution (V1, V4, V5 from Section C.1). Eight TODO markers flag unvalidated quantitative claims that must be resolved before submission.

B.2.3 3-Condition Experiment Design

Condition     | Description
Baseline      | All files concatenated into the prompt (naive approach)
Vanilla MCP   | Simple tool calls without optimization
Optimized MCP | Cache ID + Summary pattern, L0/L1/L2 tiered responses

Critical path: Execute benchmark vignettes V1/V4/V5 against the Eve Mining Frigate model to replace placeholder data with measured results.
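A measurement harness for the three conditions might look roughly like the sketch below; the token counter is a crude whitespace proxy for a real tokenizer, and the file contents and cache-ID string are placeholders, not the actual benchmark code:

```python
# Sketch of a token-count comparison across the three experiment conditions.
# count_tokens is a whitespace proxy; a real study would use the model's
# tokenizer. All payloads below are illustrative placeholders.

def count_tokens(text: str) -> int:
    return len(text.split())

def baseline_prompt(files: dict) -> str:
    # Condition 1: concatenate every file into the prompt.
    return "\n\n".join(files.values())

def vanilla_mcp_prompt(tool_results: list) -> str:
    # Condition 2: include raw tool-call results, no tiering.
    return "\n".join(tool_results)

def optimized_mcp_prompt(summaries: list) -> str:
    # Condition 3: cache IDs + L0/L1 summaries instead of full text.
    return "\n".join(summaries)

files = {"a.sysml": "part def A { attribute x; } " * 20,
         "b.sysml": "part def B { attribute y; } " * 20}
conditions = {
    "baseline": baseline_prompt(files),
    "vanilla": vanilla_mcp_prompt([files["a.sysml"]]),
    "optimized": optimized_mcp_prompt(["cache:abc123 part def A (1 attribute)"]),
}
for name, prompt in conditions.items():
    print(name, count_tokens(prompt))
```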

B.2.4 Relationship to Capstone

This paper establishes the foundation, demonstrating practical AI-MBSE integration and proof of value. The systems engineering artifacts in this capstone (SEP, SyRS, ADD, VVP, RTM) provide the methodological rigor behind the paper's claims, and the broader ecosystem understanding gained during scope exploration (seven projects) supports articulating meaningful future research directions.


B.3 Grammar Transposition Paper

Working Title: “Automated Grammar Transposition: Converting OMG KEBNF Specifications to Tree-sitter Parsers”

Target Venue: MODELS/SLE 2026 or Systems Engineering Journal

Attribute  | Value
Format     | 10-12 page technical paper
Target     | Q3-Q4 2026 submission
Repository | kebnf-to-tree-sitter

B.3.1 Key Thesis

Formal specification grammars (KEBNF) can be systematically converted to practical parser generators (tree-sitter) with ~93% automation and documented semantic mappings. This enables reproducible grammar generation when specifications update, formal traceability from parser rules to specification sources, and a reusable methodology for any OMG KEBNF-based specification (OCL, Alf, future textual notations).

B.3.2 Unique Contribution

This is the first documented methodology for KEBNF → tree-sitter conversion. No existing literature addresses KEBNF specifically, despite OMG’s use of KEBNF across multiple standards and tree-sitter’s rapid adoption across major editors and platforms.

B.3.3 Paper Structure

  1. Introduction (~1 page): MBSE adoption driving need for SysML v2 tooling; gap between OMG specifications and practical parsers; contribution statement
  2. Background (~2 pages): OMG grammar specifications, KEBNF syntax elements (type annotations, property assignments, cross-references, semantic actions), tree-sitter architecture
  3. Methodology (~3 pages): KEBNF pattern taxonomy, conversion algorithm (parse → classify → transform → record mapping → emit), semantic mapping document design
  4. Implementation (~2 pages): Tool architecture (Chumsky parser → Mapper → tree-sitter emitter), technology choices, conflict detection approach
  5. Case Study: SysML v2 (~2 pages): Input corpus (640 rules across KerML + SysML KEBNF files), automation results by category, comparison with hand-written tree-sitter-sysml grammar
  6. Discussion (~1 page): Applicability beyond SysML (OCL, Alf), limitations, tree-sitter enhancement opportunities
  7. Conclusions (~0.5 page)
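The parse → classify → transform → record mapping → emit pipeline from the Methodology section could be sketched as follows; the regex-based KEBNF handling and the category heuristics are toy approximations for illustration, not the Chumsky-based implementation:

```python
# Toy sketch of the parse -> classify -> transform -> record mapping -> emit
# pipeline. The regexes and heuristics are illustrative approximations only.
import re

def classify(rule_body: str) -> str:
    # Mirror the handling categories from B.3.5 (heuristics are illustrative).
    if "/*" in rule_body:
        return "best-effort"          # semantic action to approximate
    if ":" in rule_body or "=" in rule_body:
        return "strip-and-convert"    # type annotation / property assignment
    return "direct"

def transform(name: str, body: str):
    """Return a grammar.js-style rule string plus a semantic-mapping record."""
    category = classify(body)
    stripped = re.sub(r"\s*:\s*\w+", "", body)             # drop type annotations
    stripped = re.sub(r"/\*.*?\*/", "", stripped).strip()  # drop semantic actions
    mapping = {"rule": name, "category": category, "source": body}
    tokens = ", ".join(stripped.split())
    return f"{name}: $ => seq({tokens})", mapping

js_rule, mapping = transform("part_kw", "'part' : Keyword")
print(js_rule)                # a grammar.js-style rule string
print(mapping["category"])    # traceability record back to the KEBNF source
```

The mapping record is what gives the methodology its traceability claim: every emitted rule carries its source text and handling category.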

B.3.4 Current Status

The kebnf-to-tree-sitter tool is functional: the parser is complete (640/640 rules) and the emitter produces tree-sitter grammar.js output. The generated grammar has 335+ conflicts requiring iterative resolution, and the resolution process itself (fixing conflicts one at a time with documented rationale) is a contribution for the paper.

B.3.5 Automation Results

Category          | % Rules | Handling
Direct conversion | 38%     | Basic syntax maps directly
Strip & convert   | 55%     | Remove annotations, keep structure
Best-effort       | 6%      | Approximate semantic actions
Manual review     | <1%     | Complex disambiguation

B.3.6 Dual-Path Cross-Validation

Comparing the generated grammar against the hand-written tree-sitter-sysml identifies spec interpretation errors in the hand-written grammar and practical parsing issues in the generated grammar. This cross-validation is a novel contribution.


B.4 INCOSE 2027: SE Benchmark for AI

Working Title: “Toward a Systems Engineering Benchmark for Large Language Models”

Target Venue: INCOSE International Symposium 2027 or Systems Engineering Journal

Attribute | Value
Format    | 10-12 page technical paper
Target    | Q2-Q3 2027 submission
Builds On | GVSETS 2026 (MCP server), Grammar Paper (formal methodology)

B.4.1 Key Thesis

The systems engineering community needs standardized benchmarks to evaluate AI/LLM capabilities on SE tasks, analogous to SWE-bench for software engineering. No standardized benchmark exists for requirements engineering, architecture definition, or V&V — core SE activities. Without benchmarks, progress in AI4SE cannot be measured objectively.

B.4.2 Gap Analysis

Existing Benchmark    | Domain              | SE Coverage
SWE-bench             | Software bug fixing | None
HumanEval             | Code generation     | None
Humanity’s Last Exam  | Expert knowledge    | Minimal
No existing benchmark | Systems engineering | -

B.4.3 Proposed Framework

SE task taxonomy aligned with INCOSE processes: requirements elicitation, requirements quality assessment, model completion, test generation from requirements, and requirements-to-design traceability. Each task has deterministic ground truth, measurable evaluation metrics, and supports comparison between baseline AI and MCP-enabled AI conditions.
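One way such a benchmark task record might be structured is sketched below; the field names, sample task, and F1 scorer are illustrative assumptions, not a finalized schema:

```python
# Sketch of one benchmark task with deterministic ground truth and a
# measurable metric. Field names and the sample task are hypothetical.
from dataclasses import dataclass

@dataclass
class SETask:
    task_id: str
    process: str       # INCOSE process the task exercises
    prompt: str        # what the model is asked to do
    ground_truth: set  # deterministic expected answer
    metric: str        # e.g. exact-match, F1

def score(task: SETask, predicted: set) -> float:
    """Simple F1 of a predicted set against the ground-truth set."""
    tp = len(predicted & task.ground_truth)
    if tp == 0:
        return 0.0
    precision = tp / len(predicted)
    recall = tp / len(task.ground_truth)
    return 2 * precision * recall / (precision + recall)

task = SETask(
    task_id="REQ-TRACE-001",
    process="requirements-to-design traceability",
    prompt="List the part defs that satisfy requirement R1.",
    ground_truth={"MiningLaser", "OreHold"},
    metric="F1",
)
print(score(task, {"MiningLaser"}))  # partial credit for a partial answer
```

Because the ground truth is a deterministic set, the same scorer applies unchanged to the baseline-AI and MCP-enabled conditions, which is what makes the comparison objective.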

B.4.4 Relationship to Benchmark Vignettes

The benchmark vignettes defined in Section C.1 (V1-V8) serve as pilot tasks for this paper. The GVSETS paper uses V1, V4, V5 for proof of value; this paper expands to the full set and adds formal evaluation methodology.

B.4.5 Timeline

Phase          | Target             | Activities
Foundation     | Current (capstone) | Literature review, MCP implementation, vignette definitions
Task Design    | Q3 2026            | Define 50-100 tasks, evaluation protocols
Pilot Study    | Q4 2026            | Run benchmark, collect data
SME Validation | Q1 2027            | Expert review of tasks and results
Paper Draft    | Q2 2027            | Write and internal review
Submission     | Q3 2027            | Target INCOSE IS 2027