SWP Evaluation Showcase

25 Evaluation Questions
7 Domain Frameworks
96% Format Adherence
92% Tag Accuracy

Evaluation Overview

This evaluation tested an LLM's ability to understand, interpret, and apply Symbol Word Protocol tags across all seven domain frameworks. The AI was provided with the complete SWP documentation and framework specifications, then asked 25 questions spanning tag comprehension, domain-specific compliance, practical application, and edge cases.

Key Finding: When provided with SWP-tagged documents, the LLM consistently produced structured, deterministic responses that correctly referenced regulatory standards, compliance requirements, and tag semantics across all domains.

[Charts: Domain Framework Performance; Response Quality Metrics]

Question Categories & Results

[Charts: Questions by Category; Accuracy by Question Type]

Sample Evaluation Q&A

Selected questions and responses from the evaluation.

What does @weight: 9 signify in an SWP-tagged file? [Tag Semantics]

Evaluation Focus: Understanding numerical tag values and their processing implications.

LLM Response Summary: A weight of 9 indicates the item is near the top of the importance hierarchy. It signals strategic significance: likely a core feature, a compliance-critical component, or a business-critical deliverable. In automated processing, this triggers the highest priority in job queues, more thorough validation, extra logging, and potential throttling of lower-weight tasks.

@weight @priority @risk_level
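
One way to picture the job-queue behavior described in that response is a small priority queue keyed on @weight. The sketch below is illustrative only: the "@name: value" syntax follows the examples on this page, while read_weight, order_by_weight, the default weight of 0 for untagged text, and the sample jobs are assumptions rather than part of SWP.

```python
import heapq
import re

# Assumed tag syntax, taken from the examples on this page: "@name: value".
WEIGHT_TAG = re.compile(r"^@weight:\s*(\d+)\s*$", re.MULTILINE)


def read_weight(tagged_text: str, default: int = 0) -> int:
    """Return the value of the first @weight tag, or `default` if none is present."""
    match = WEIGHT_TAG.search(tagged_text)
    return int(match.group(1)) if match else default


def order_by_weight(documents: list[str]) -> list[str]:
    """Return documents from highest to lowest @weight; ties keep input order."""
    # heapq is a min-heap, so the weight is negated; the index breaks ties.
    heap = [(-read_weight(doc), index, doc) for index, doc in enumerate(documents)]
    heapq.heapify(heap)
    return [heapq.heappop(heap)[2] for _ in range(len(heap))]


# Hypothetical sample jobs.
jobs = [
    "@weight: 3\nRefresh the internal wiki index.",
    "@weight: 9\nValidate the compliance report before filing.",
    "Untagged housekeeping note.",
]
ordered = order_by_weight(jobs)
for job in ordered:
    print(read_weight(job), job.splitlines()[-1])
```

Treating untagged text as weight 0 is a design choice for the sketch; a real pipeline might instead reject untagged input.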

Interpret @data_sensitivity: Patient Data, Clinical Records for an LLM [Healthcare]

Evaluation Focus: Domain-specific compliance understanding.

LLM Response Summary: The LLM correctly identified HIPAA and GDPR compliance requirements. It specified that data must be anonymized, secure, used only for inference (not stored), and that any training should use synthetic data or federated learning. The response included requirements for encryption, access control, audit logs, consent, de-identification, and PII removal.

@data_sensitivity @regulatory_standards @compliance_posture @audit_trail

Why is @fail_safe_protocol: Required critical in robotics? [Robotics]

Evaluation Focus: Safety-critical domain understanding.

LLM Response Summary: The response correctly cited ISO 13482, IEC 61508, and ISO 26262 safety standards. It explained that fail-safe protocols protect human safety, prevent property damage, and ensure regulatory compliance. The LLM described specific components: safe state transitions, emergency stops, watchdog timers, hardware interlocks, and graceful degradation modes.

@fail_safe_protocol @real_time_processing @regulatory_standards

What is @risk_classification's role in finance documents? [Finance]

Evaluation Focus: Financial regulatory compliance.

LLM Response Summary: The LLM explained that risk classification categorizes risks by type, magnitude, impact, and likelihood. It correctly referenced Basel III, SOX, and GDPR frameworks. The response detailed how classification influences risk appetite, mitigation strategies, monitoring, reporting to regulators, and capital allocation decisions.

@risk_classification @regulatory_standards @audit_trail

Generate a new SWP tag for data science projects [Generative]

Evaluation Focus: Creative application of SWP principles.

LLM Response Summary: The LLM proposed @experiment_id: <UUID>, a unique identifier for each experiment in the data science workflow. It explained that the tag would support reproducibility, audit trails, and collaboration by linking datasets, models, evaluations, and deployment artifacts. The response included integration examples with MLflow and Weights & Biases.

@experiment_id (proposed) @phase @status

Evaluation Methodology

The evaluation was designed to test comprehensive understanding of the Symbol Word Protocol across multiple dimensions.

Context Provided

Complete SWP documentation, all 7 domain frameworks (General, Healthcare, Robotics, Legal, EdTech, Finance, AI-Agents), and sample tagged documents.

Question Types

Tag identification, semantic interpretation, domain compliance, practical application, edge cases, and generative tasks.

Model Used

The evaluation was conducted using a reasoning-capable LLM with chain-of-thought processing visible in its responses.

Scoring Criteria

Tag accuracy, regulatory standard citations, format consistency, actionable guidance quality, and reasoning transparency.

What This Demonstrates

1. Format Adherence

The LLM consistently produced structured responses with tables, lists, and clear headers, a pattern the evaluation attributes to the structured nature of SWP tags.

2. Determinism

Responses followed predictable patterns based on tag semantics. Similar tags across different domains produced consistently structured outputs.

3. Domain Awareness

The LLM correctly mapped tags to their appropriate regulatory standards: HIPAA/GDPR for Healthcare, ISO 13482 for Robotics, SOX/Basel III for Finance.

4. Reasoning Transparency

Chain-of-thought processing showed how SWP tags guided decision-making, making the AI's reasoning auditable and verifiable.

Try It Yourself

Experience how SWP transforms your documents into AI-readable formats with explicit structural signals.


What is Neurosymbolic AI?

Neurosymbolic AI is the convergence of two traditions in artificial intelligence that have historically developed in isolation:

Neural networks (the "neuro") — systems like GPT, Claude, and Llama that learn patterns from vast datasets. Powerful at language, creative generation, and fuzzy reasoning, but prone to hallucination, inconsistency, and opaque decision-making.
Symbolic reasoning (the "symbolic") — rule-based systems with explicit logic, structured knowledge, and deterministic behavior. Reliable and auditable, but brittle and difficult to scale to natural language.

For decades, researchers have sought ways to combine these strengths. The goal: neural flexibility guided by symbolic precision. Most approaches require custom model architectures, specialized training, or deep infrastructure changes. They work in labs but rarely reach production.

SWP takes a fundamentally different approach. Instead of modifying the neural network, it structures the input layer with deterministic symbolic tags that constrain and guide how the model processes information. The result is neurosymbolic behavior achieved through middleware — no model surgery required.
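
To make the idea concrete, here is a minimal sketch of deterministic tagging at the input layer. It is not the SWP classifier, which this page does not describe: the keyword rules, the classify_domain and encode names, and the phase value "Review" are hypothetical. Only the "@name: value" tag syntax and the standards listed per domain come from this page.

```python
# Hypothetical keyword rules standing in for SWP's classification step.
DOMAIN_KEYWORDS = {
    "Healthcare": ("patient", "clinical"),
    "Finance": ("ledger", "capital"),
}
# Standards per domain as listed in the evaluation summary on this page.
DOMAIN_STANDARDS = {
    "Healthcare": "HIPAA, GDPR",
    "Finance": "SOX, Basel III",
}


def classify_domain(text: str) -> str:
    """Pick the first domain whose keywords appear in the text, else General."""
    lowered = text.lower()
    for domain, keywords in DOMAIN_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return domain
    return "General"


def encode(text: str, phase: str, weight: int) -> str:
    """Prepend tag lines to the document; the same input always gives the same output."""
    tags = [f"@phase: {phase}", f"@weight: {weight}"]
    standards = DOMAIN_STANDARDS.get(classify_domain(text))
    if standards:
        tags.append(f"@regulatory_standards: {standards}")
    return "\n".join(tags) + "\n\n" + text


tagged = encode("Patient intake notes for the clinical team.", phase="Review", weight=9)
print(tagged)
```

The point of the sketch is that the tagging step contains no model call: it is ordinary rule-based code, so its output can be reproduced and inspected before anything reaches the LLM.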

The SWP Architecture

The Symbol Word Protocol operates as a lightweight symbolic layer between humans and large language models. Here is how the three layers interact:

Neural Layer: LLM Processing (GPT, Claude, Llama, Mistral, or any other foundation model)
→
Symbolic Layer: SWP Tags (phase, weight, compliance, domain frameworks)
→
Verification Layer: Proof Chain (SHA-256 hashes, tamper detection, audit trail)

Each layer serves a distinct function:

The Neural Layer remains untouched. SWP works with any LLM — no fine-tuning, no weight modification, no custom architectures. The model reads SWP-tagged input and naturally produces more structured, deterministic output because it has explicit signals to follow.
The Symbolic Layer is where SWP operates. Deterministic classification algorithms analyze your document and prepend structured tags: @phase tells the model what processing stage applies, @weight signals priority, and domain-specific tags like @regulatory_standards: HIPAA inject compliance constraints. These are not suggestions to the model — they are cognitive scaffolding that channels how the model reasons.
The Verification Layer provides what neural networks alone cannot: cryptographic proof. Every document processed through the API receives a SHA-256 proof hash chained to the previous proof, creating a tamper-evident lineage: altering any earlier document or proof changes every hash that follows it. This is the deterministic accountability that regulated industries demand.
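
The chaining idea in the Verification Layer can be sketched with the standard library. This is an assumption-laden illustration, not the API's actual proof format: the record fields, the JSON serialization, the all-zero GENESIS value, and the function names are hypothetical. Only the use of SHA-256 and the "each proof includes the previous proof" structure come from the description above.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder "previous hash" for the first proof


def make_proof(document: str, previous_hash: str) -> dict:
    """Hash the document, then hash that digest together with the previous proof hash."""
    document_hash = hashlib.sha256(document.encode("utf-8")).hexdigest()
    payload = json.dumps(
        {"document_hash": document_hash, "previous_hash": previous_hash},
        sort_keys=True,  # stable serialization so the hash is reproducible
    )
    proof_hash = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return {
        "document_hash": document_hash,
        "previous_hash": previous_hash,
        "proof_hash": proof_hash,
    }


def build_chain(documents: list[str]) -> list[dict]:
    """Create one proof per document, each linked to the proof before it."""
    chain, previous = [], GENESIS
    for document in documents:
        proof = make_proof(document, previous)
        chain.append(proof)
        previous = proof["proof_hash"]
    return chain


def verify_chain(documents: list[str], chain: list[dict]) -> bool:
    """Recompute every proof from the documents and compare with the stored chain."""
    if len(documents) != len(chain):
        return False
    previous = GENESIS
    for document, proof in zip(documents, chain):
        if make_proof(document, previous) != proof:
            return False
        previous = proof["proof_hash"]
    return True


docs = ["@weight: 9\nFirst record", "@weight: 4\nSecond record"]
chain = build_chain(docs)
print(verify_chain(docs, chain))                              # unmodified documents
print(verify_chain(["@weight: 1\nFirst record", docs[1]], chain))  # edited first document
```

A chain like this detects changes after the fact; it does not prevent them. Detection also depends on the verifier holding a trusted copy of the latest proof hash.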

Why "Lightweight" Changes Everything

The distinction between SWP and traditional neurosymbolic approaches is not academic — it is practical, and it determines who can actually use the technology.

Traditional Neurosymbolic AI

Requires custom model architectures
Needs specialized training pipelines
Demands ML engineering expertise
Months of development time
Tightly coupled to specific models
Difficult to audit or explain
Rarely leaves the research lab

SWP Neurosymbolic Middleware

Works with any LLM out of the box
No training or fine-tuning required
API call — no ML expertise needed
Integrate in hours, not months
Model-agnostic by design
Every decision is tagged and auditable
Production-ready today

Traditional neurosymbolic research pursues approaches like neural theorem provers, differentiable logic programs, and neuro-symbolic concept learners. These are intellectually rigorous and scientifically important. But they require teams of researchers, custom infrastructure, and years of development.

SWP pursues the same fundamental goal, combining symbolic precision with neural flexibility, through a radically simpler mechanism: structured input. By adding explicit symbolic tags at the input layer, SWP gives the neural model deterministic guidance without any architectural modification. Because the tags are part of the input, the model can attend to them directly, which in the evaluation above produced more consistent, auditable, and regulation-aware outputs.

The Evidence

The evaluation data above demonstrates exactly what neurosymbolic integration should produce:

96% format adherence: symbolic structure in the input produces structured, predictable output. The LLM is not guessing at format; it is following explicit symbolic signals.
92% tag accuracy: the model correctly interprets symbolic tags like @regulatory_standards: HIPAA and maps them to appropriate compliance behavior.
Cross-domain consistency: the same symbolic framework produced reliable results across the General, Healthcare, Robotics, Legal, EdTech, Finance, and AI-Agents frameworks, which suggests the approach generalizes.
Reasoning transparency: SWP tags make the model's decision-making auditable. Unlike pure neural outputs, every tagged response can be traced back to the symbolic constraints that shaped it.

This is the core insight: You do not need to rebuild the neural network to get symbolic behavior. You need to speak to it in a language that activates its existing capacity for structured reasoning. SWP is that language.

Where This Vision Leads

SWP as neurosymbolic middleware opens pathways that pure neural or pure symbolic systems cannot reach alone:

Regulated Industries

Healthcare, finance, and legal sectors cannot adopt AI without audit trails and compliance guarantees. SWP's symbolic tags and proof chain provide both, making LLM adoption possible in spaces that have resisted it.

Autonomous Systems

Robotics and autonomous agents need deterministic fail-safe logic alongside adaptive behavior. SWP's @fail_safe_protocol and @real_time_processing tags inject hard constraints into flexible neural reasoning.

Multi-Agent Orchestration

As AI agents collaborate, they need shared symbolic contracts defining roles, handoffs, and fallbacks. SWP's Agent Workflow framework provides this — explicit task sequences and handoff triggers that multiple agents can reliably follow.

LLM Fine-Tuning

SWP-tagged datasets are ideal for creating domain-specific LoRA adapters. Every encode response includes fine-tuning guidance, enabling teams to build specialized models that natively understand symbolic structure.

Verifiable AI

The proof chain creates a tamper-evident record of every AI processing event. As governments move toward AI accountability legislation, SWP provides the cryptographic infrastructure for compliance.

Knowledge Graphs

SWP tags create structured metadata that maps naturally to knowledge graph nodes and edges. Tagged document collections become navigable, queryable knowledge bases without manual ontology design.
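
As a rough illustration of that mapping, tag lines can be read as (document, tag, value) triples, with each triple acting as an edge from a document node to a value node. The sketch below assumes the "@name: value" syntax shown on this page and assumes that comma-separated values (as in "@data_sensitivity: Patient Data, Clinical Records") are separate items; the function names and sample documents are hypothetical.

```python
import re
from collections import defaultdict

# Assumed tag syntax from the examples on this page: "@name: value".
TAG_LINE = re.compile(r"^@(\w+):\s*(.+?)\s*$", re.MULTILINE)


def tags_to_triples(doc_id: str, tagged_text: str) -> list[tuple[str, str, str]]:
    """Turn each tag value into a (document, tag name, value) edge."""
    triples = []
    for name, raw_value in TAG_LINE.findall(tagged_text):
        for value in raw_value.split(","):  # assumption: commas separate values
            value = value.strip()
            if value:
                triples.append((doc_id, name, value))
    return triples


def index_by_value(triples: list[tuple[str, str, str]]) -> dict:
    """Group document ids by (tag name, value) so shared nodes are easy to query."""
    index = defaultdict(set)
    for doc_id, name, value in triples:
        index[(name, value)].add(doc_id)
    return index


# Hypothetical sample documents.
doc_a = "@regulatory_standards: HIPAA, GDPR\n@data_sensitivity: Patient Data, Clinical Records\nIntake notes."
doc_b = "@regulatory_standards: SOX, GDPR\nQuarterly filing."

triples = tags_to_triples("doc-a", doc_a) + tags_to_triples("doc-b", doc_b)
index = index_by_value(triples)
print(sorted(index[("regulatory_standards", "GDPR")]))
```

Here the query for documents that share the GDPR node is a dictionary lookup; a graph database would serve the same role at larger scale.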

A New Category

IdeaPhase is not competing with LLM providers like OpenAI or Anthropic. It is not a prompt engineering tool. It is not a fine-tuning platform.

SWP creates a new category: neurosymbolic middleware for large language models. It sits between the human and the model, adding the symbolic structure and cryptographic verification that neural networks cannot provide on their own.

The academic community has pursued heavy neurosymbolic integration for years. SWP demonstrates that lightweight, production-ready neurosymbolic behavior is achievable today — not through architectural complexity, but through structured input that activates the reasoning capacity already present in modern language models.

The Symbol Word Protocol is patent pending with the USPTO. IdeaPhase is the first implementation of lightweight neurosymbolic middleware — making structured, verifiable, compliance-aware AI accessible to anyone with an API key.

Explore the API

See the full technical documentation for integrating SWP neurosymbolic encoding into your systems.
