Per [1, Secs. 2.3.5.9, 2.3.5.11], this plan defines how we confirm the system meets requirements (verification) and stakeholder needs (validation).
The V&V strategy mirrors the system's layered architecture. The SysML v2 MCP server is built atop a tree-sitter grammar, wrapped in Rust crates, and exposed via MCP. Each layer has distinct failure modes and a matching verification technique: grammar correctness is verified by corpus tests against known-good parse trees, Rust crate behavior by unit tests run with cargo test, and MCP compliance by integration tests that exercise the full JSON-RPC message flow. Layering the tests this way is intended to catch defects at the earliest stage where they can appear: a grammar error should surface in corpus testing before it propagates to MCP tool responses.
The dual-CI strategy (GitHub Actions for tree-sitter-sysml, GitLab CI for the capstone ecosystem) reflects the grammar’s dual contribution path: tree-sitter organization conventions require GitHub-hosted CI, while project-level coverage tracking and security scanning use GitLab Ultimate features.
| Method | Scope | Environment |
| --- | --- | --- |
| Corpus Testing | tree-sitter grammar constructs | Local (tree-sitter test) |
| Coverage Testing | Training file parse rate | GitLab CI |
| Unit Testing | Rust crates | Local (cargo test) |
| Integration Testing | MCP protocol compliance | Local (stdio) |
| Container Testing | Image builds, runtime | GitLab CI only |
| HTTP Transport Testing | Remote MCP connections | GitLab CI (service containers) |
| Acceptance Testing | End-to-end with Claude/VS Code | Local (stdio) + manual |
System verification (this chapter) establishes that tools are functional and meet their requirements. The benchmark vignettes (Section C.1) then use these verified tools to evaluate AI-MBSE workflow effectiveness for the GVSETS publication — measuring whether MCP-enabled AI outperforms baseline approaches on real SE tasks. The two activities are complementary: verification is a prerequisite for meaningful benchmarking.
Test (T) dominates the verification method assignments below because this is a software-intensive system where most requirements are directly executable. The MCP protocol, repository integration, and SysML parsing requirements all produce observable, deterministic outputs given controlled inputs, which makes automated testing the most efficient and repeatable verification approach. Inspection (I) is reserved for documentation requirements where pass/fail is assessed by human review. Demonstration (D) supplements testing for protocol compliance, where showing a working client interaction provides confidence beyond unit-level assertions. Analysis (A) is used once, for the binary-size portion of NFR-DEP-001.
| Requirement | Method | Rationale |
| --- | --- | --- |
| FR-MCP-001 | T, D | Test server initialization, demonstrate with client |
| FR-MCP-002, FR-MCP-005 | T | Test tool enumeration and execution |
| FR-REPO-001, FR-REPO-002 | T | Test file read from Git repositories |
| FR-SYS-001 | T | Test parsing via tree-sitter corpus tests |
| FR-SYS-006 | T | Test grammar subset via training file parse rate |
| FR-SYS-007 | T | Test error recovery (tree-sitter ERROR nodes) |
| FR-SYS-008 | I | Inspect tree-sitter-sysml README coverage docs |
| NFR-DEP-001 | T, A | Test binary builds, analyze size |
| NFR-DEP-002 | T | Test container builds in CI |
| NFR-DOC-001 | I | Inspect Quarto output for completeness |
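The method codes in the table above can be treated as data, which makes it easy to check mechanically that Test dominates the assignments. The following is a minimal sketch, not code from the project: the `Method` enum, `parse_methods`, `rows_using_test`, and `VMA_ROWS` are hypothetical names introduced here for illustration, and the rows are copied from the table.

```rust
/// Verification methods used in the VMA table.
/// Hypothetical type for illustration; not taken from the project's crates.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Method {
    Test,
    Demonstration,
    Inspection,
    Analysis,
}

/// Rows of the VMA table as (requirement IDs, method codes).
pub const VMA_ROWS: [(&str, &str); 10] = [
    ("FR-MCP-001", "T, D"),
    ("FR-MCP-002, FR-MCP-005", "T"),
    ("FR-REPO-001, FR-REPO-002", "T"),
    ("FR-SYS-001", "T"),
    ("FR-SYS-006", "T"),
    ("FR-SYS-007", "T"),
    ("FR-SYS-008", "I"),
    ("NFR-DEP-001", "T, A"),
    ("NFR-DEP-002", "T"),
    ("NFR-DOC-001", "I"),
];

/// Parse a method cell such as "T, D" into a list of methods.
/// Returns None if any code is not one of T, D, I, A.
pub fn parse_methods(cell: &str) -> Option<Vec<Method>> {
    cell.split(',')
        .map(|code| match code.trim() {
            "T" => Some(Method::Test),
            "D" => Some(Method::Demonstration),
            "I" => Some(Method::Inspection),
            "A" => Some(Method::Analysis),
            _ => None,
        })
        .collect()
}

/// Count the rows that list Test among their methods.
pub fn rows_using_test(rows: &[(&str, &str)]) -> usize {
    rows.iter()
        .filter(|(_, cell)| {
            parse_methods(cell).map_or(false, |methods| methods.contains(&Method::Test))
        })
        .count()
}
```

Calling `rows_using_test(&VMA_ROWS)` counts the rows that include T; the remaining rows are the two inspection-only documentation requirements.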
Tailoring Note
The VMA table above covers the 11 highest-risk requirements that are directly verifiable through the current test infrastructure. The remaining 23 system requirements (covering HTTP transport, SysML v2 API integration, container deployment, and security) are deferred to post-Phase 1 verification as their corresponding features are implemented. Per INCOSE Handbook 4.3.4, this tailoring is appropriate for a software-intensive academic project where verification activities are prioritized by implementation phase.
11.3 Acceptance Criteria
| Requirement Category | Verification Method | Acceptance Criteria |
| --- | --- | --- |
| MCP Protocol Compliance | Integration test | Server initializes, lists tools/resources, executes tools |
| Repository Integration | Integration test | Read files from GitLab (reference) and self-hosted |
| SysML v2 Validation | System test | Validates correct/incorrect SysML syntax |
| Container Deployment | CI pipeline | Image builds, runs, responds to MCP requests |
| Documentation | Inspection | Quarto renders, deploys to GitLab Pages |
11.4 Enabling Systems
Per [1, Sec. 2.3.5.9], enabling systems support verification activities.
| Enabling System | Purpose | Responsibility |
| --- | --- | --- |
| tree-sitter CLI | Grammar testing (tree-sitter test) | Local + CI |
| Cargo Test Framework | Rust unit and integration testing | Built into Rust toolchain |
| GitLab CI/CD | Automated pipeline execution | GitLab SaaS runners |
| GitHub Actions | tree-sitter grammar CI | GitHub runners |
| Buildah/Podman | Container image builds | CI environment only |
| Claude Desktop | Manual acceptance testing | Local development |
| MCP Inspector | Protocol debugging | Local development |
| Quarto | Documentation builds | Local + CI |
11.4.1 Test Environment Configuration
| Environment | Transport | External Services | Use Case |
| --- | --- | --- | --- |
| Local Dev | stdio | Mocked/optional | Unit tests, rapid iteration |
| CI Test | stdio | Mocked | Automated test suite |
| CI Integration | HTTP | GitLab API (PAT) | Integration tests |
| CI Container | HTTP | Service containers | End-to-end container tests |
11.5 Test Cases
11.5.1 MCP Protocol Tests
| ID | Test Case | Expected Result | Method |
| --- | --- | --- | --- |
| TC-MCP-001 | Send initialize request | Server responds with capabilities | T |
| TC-MCP-002 | Request tools/list | Returns list including sysml_parse | T |
| TC-MCP-003 | Call sysml_parse with valid SysML | Returns parsed elements | T |
| TC-MCP-004 | Request resources/list | Returns example resources | T |
| TC-MCP-005 | Read sysml://examples/hello | Returns vehicle model content | T |
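To make TC-MCP-001 and TC-MCP-002 concrete, the sketch below builds the JSON-RPC 2.0 request lines a test harness might write to the server's stdin and does a coarse check on a response line. It uses only the standard library. The helper names (`jsonrpc_request`, `initialize_params`, `frame_for_stdio`, `mentions_tool`), the client name `vv-sketch`, and the choice of protocol version string are assumptions made for illustration; a real integration test would more likely parse responses with a JSON library than match substrings.

```rust
/// Protocol version string sent by this sketch.
/// Assumption: the server under test accepts this version.
pub const PROTOCOL_VERSION: &str = "2024-11-05";

/// Build a JSON-RPC 2.0 request as a single line of JSON.
/// `params_json` must already be valid JSON (for example "{}").
pub fn jsonrpc_request(id: u64, method: &str, params_json: &str) -> String {
    format!(
        r#"{{"jsonrpc":"2.0","id":{},"method":"{}","params":{}}}"#,
        id, method, params_json
    )
}

/// Parameters for the MCP `initialize` request.
/// The client name and version are placeholders for this sketch.
pub fn initialize_params() -> String {
    format!(
        r#"{{"protocolVersion":"{}","capabilities":{{}},"clientInfo":{{"name":"vv-sketch","version":"0.0.0"}}}}"#,
        PROTOCOL_VERSION
    )
}

/// Frame a message for the stdio transport: one message per line.
/// Returns None if the message contains an embedded newline.
pub fn frame_for_stdio(message: &str) -> Option<String> {
    if message.contains('\n') {
        None
    } else {
        Some(format!("{}\n", message))
    }
}

/// Coarse check for TC-MCP-002: the line carries a "result" member
/// and mentions the quoted tool name somewhere in it.
/// This is a substring match, not a JSON parse.
pub fn mentions_tool(response_line: &str, tool: &str) -> bool {
    response_line.contains("\"result\"") && response_line.contains(&format!("\"{}\"", tool))
}
```

A harness would send `frame_for_stdio(&jsonrpc_request(1, "initialize", &initialize_params()))`, read one line back, then send a `tools/list` request with params `{}` and pass the reply to `mentions_tool(line, "sysml_parse")`.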
11.5.2 Repository Integration Tests
| ID | Test Case | Expected Result | Method |
| --- | --- | --- | --- |
| TC-REPO-001 | Read file from public repo | Returns file content | T |
| TC-REPO-002 | Read file with PAT auth | Returns file content | T |
| TC-REPO-003 | List .sysml files in directory | Returns file list | T |
| TC-REPO-004 | Read from self-hosted Git provider | Returns file content | T |
| TC-REPO-005 | Handle non-existent file | Returns appropriate error | T |
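Because the Local Dev and CI Test environments run against mocked external services, the repository test cases can be exercised without network access. The sketch below shows one way this could look, assuming the repository layer sits behind a trait. `RepoReader`, `RepoError`, and `MockRepo` are hypothetical names introduced for illustration and are not the project's actual API.

```rust
use std::collections::HashMap;

/// Error returned by repository reads (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum RepoError {
    /// The requested path does not exist in the repository.
    NotFound(String),
}

/// Minimal read-only view of a Git-hosted repository (hypothetical trait).
pub trait RepoReader {
    fn read_file(&self, path: &str) -> Result<String, RepoError>;
    fn list_sysml_files(&self, dir: &str) -> Vec<String>;
}

/// In-memory stand-in for a Git provider, used in place of a real API.
pub struct MockRepo {
    files: HashMap<String, String>,
}

impl MockRepo {
    /// Build a mock from (path, content) pairs.
    pub fn new(files: &[(&str, &str)]) -> Self {
        let files = files
            .iter()
            .map(|(path, content)| (path.to_string(), content.to_string()))
            .collect();
        MockRepo { files }
    }
}

impl RepoReader for MockRepo {
    /// TC-REPO-001 / TC-REPO-005: return content, or NotFound for unknown paths.
    fn read_file(&self, path: &str) -> Result<String, RepoError> {
        self.files
            .get(path)
            .cloned()
            .ok_or_else(|| RepoError::NotFound(path.to_string()))
    }

    /// TC-REPO-003: list .sysml files under `dir` (including nested
    /// subdirectories), sorted for deterministic assertions.
    fn list_sysml_files(&self, dir: &str) -> Vec<String> {
        let prefix = format!("{}/", dir.trim_end_matches('/'));
        let mut found: Vec<String> = self
            .files
            .keys()
            .filter(|path| path.starts_with(&prefix) && path.ends_with(".sysml"))
            .cloned()
            .collect();
        found.sort();
        found
    }
}
```

The same trait could be implemented by a client for a hosted or self-hosted provider, so TC-REPO-002 and TC-REPO-004 would reuse the assertions while swapping the implementation in the CI Integration environment.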
11.5.3 SysML Parsing Tests
11.5.3.1 tree-sitter Corpus Tests
| ID | Test Case | Expected Result | Method |
| --- | --- | --- | --- |
| TC-SYS-001 | Parse package declaration | Correct CST structure | T (corpus) |
| TC-SYS-002 | Parse part definition | Correct CST structure | T (corpus) |
| TC-SYS-003 | Parse requirement definition | Correct CST structure | T (corpus) |
| TC-SYS-004 | Parse nested elements | Correct CST hierarchy | T (corpus) |
| TC-SYS-005 | Parse with syntax errors | ERROR node in CST, partial parse | T (corpus) |
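TC-SYS-005 and the clean-parse criterion used by the coverage tests both hinge on whether the syntax tree contains error nodes. When tree-sitter prints a tree as an S-expression, recovered errors appear as `(ERROR ...)` nodes and inserted tokens as `(MISSING ...)` nodes. The sketch below counts those markers in such a string. The function names are hypothetical, as are the grammar node names used in the usage note; code that links the tree-sitter Rust bindings would more likely call `Node::has_error` on the root node instead of scanning text.

```rust
/// Count error markers in a tree-sitter S-expression dump.
/// Counts both `(ERROR ...)` nodes and `(MISSING ...)` nodes.
/// This is a textual scan: it assumes the input is a tree dump,
/// not arbitrary text that might contain these strings.
pub fn count_error_nodes(sexp: &str) -> usize {
    sexp.matches("(ERROR").count() + sexp.matches("(MISSING").count()
}

/// A parse is "clean" when the dump contains no error markers.
pub fn is_clean_parse(sexp: &str) -> bool {
    count_error_nodes(sexp) == 0
}

/// A parse is "partial" when it has errors but still produced
/// at least one non-error node alongside them.
pub fn is_partial_parse(sexp: &str) -> bool {
    let opened = sexp.matches('(').count();
    let errors = count_error_nodes(sexp);
    errors > 0 && opened > errors
}
```

For example, a dump such as `(source_file (package_declaration name: (identifier)))` would count as clean, while `(source_file (ERROR (identifier)) (part_definition name: (MISSING identifier)))` would count as a partial parse with two error markers.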
11.5.3.2 Training File Coverage Tests
| ID | Test Case | Expected Result | Method |
| --- | --- | --- | --- |
| TC-COV-001 | Parse Module 01 files | Clean parse (no ERROR nodes) | T (CI) |
| TC-COV-002 | Parse Module 02 files | Clean parse (no ERROR nodes) | T (CI) |
| TC-COV-003 | Calculate overall parse rate | 100% achieved (target was ≥10% Phase 1, ≥50% Phase 2) | A (CI) |
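The parse rate in TC-COV-003 is the share of files that parse cleanly, compared against a phase target. A minimal sketch of that arithmetic follows; the function names are hypothetical, and the thresholds passed in would come from the phase targets in the table rather than from this code.

```rust
/// Parse rate as a percentage: 100 * clean / total.
/// Returns None when there are no files or the counts are inconsistent.
pub fn parse_rate_percent(clean: usize, total: usize) -> Option<f64> {
    if total == 0 || clean > total {
        return None;
    }
    Some(100.0 * clean as f64 / total as f64)
}

/// True when the parse rate is at or above `target_percent`.
/// An undefined rate (no files) never meets a target.
pub fn meets_target(clean: usize, total: usize, target_percent: f64) -> bool {
    parse_rate_percent(clean, total).map_or(false, |rate| rate >= target_percent)
}

/// Format a rate to one decimal place for a CI log line.
pub fn format_rate(clean: usize, total: usize) -> String {
    match parse_rate_percent(clean, total) {
        Some(rate) => format!("{:.1}% ({}/{})", rate, clean, total),
        None => "n/a".to_string(),
    }
}
```

With the training-file result of 100 clean parses out of 100 files, `meets_target(100, 100, 50.0)` holds for the Phase 2 target, and a CI job could fail the pipeline whenever the check returns false.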
11.6 Known Limitations
- Container testing: Cannot be performed locally on macOS; relies on CI (risk R5, accepted).
- HTTP transport: Requires CI service containers or a Linux machine.
- SysML v2 API: Requires a running API server; deferred to post-capstone. Basic parsing and repository operations work without the API dependency.
- Grammar coverage: The tree-sitter-sysml grammar achieves 99.6% coverage across 275 external files (274/275) and 100% coverage of OMG training files (100/100). The single unparseable file uses non-standard UML syntax outside the SysML v2 specification. Full semantic compliance (type checking, import resolution) is out of scope for the tree-sitter grammar and deferred to future work (sysml.rs).