
Morph Compiler Architecture Overview

Core Model

Morph currently compiles through these major stages:

Source
-> Lexer
-> Parser
-> Frontend
-> Semantic Analysis
-> NIR
-> MIR
-> Backend Host
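
To make the ordering concrete, here is a minimal C++ sketch of a fixed stage list run in sequence, each stage consuming the previous stage's output. The stage names are copied from the list above; Stage, make_stub_pipeline, and run_pipeline are hypothetical names for illustration only, and the stub bodies merely record that a stage ran rather than doing real compilation work.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stage record: a display name plus a transformation.
struct Stage {
    std::string name;
    std::function<std::string(const std::string&)> run;
};

// Builds stub stages in the documented order. Each stub appends its own
// name to the value it receives, so the result doubles as a trace.
std::vector<Stage> make_stub_pipeline() {
    const char* names[] = {"Lexer", "Parser", "Frontend", "Semantic Analysis",
                           "NIR", "MIR", "Backend Host"};
    std::vector<Stage> stages;
    for (const char* n : names) {
        std::string name = n;
        stages.push_back(Stage{
            name, [name](const std::string& in) { return in + " -> " + name; }});
    }
    return stages;
}

// Runs every stage in order, feeding each result into the next stage.
std::string run_pipeline(const std::vector<Stage>& stages,
                         const std::string& source) {
    std::string value = source;
    for (const Stage& stage : stages) {
        value = stage.run(value);
    }
    return value;
}
```

Calling run_pipeline(make_stub_pipeline(), "Source") yields a trace string that visits the seven stages in the order listed above.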

The important architectural shift is that MorphAPI is no longer only a frontend-lowering tool. Covered construct families can now carry ownership from semantic resolution through backend route emission.

Core Is A Blind Orchestrator

Core (src/) knows nothing about the language it compiles.

This is not a limitation — it is a deliberate, non-negotiable invariant:

  • Core does not know what "is", "method", "+", "gpu", or any other keyword means.
  • Core does not know that variable declarations exist.
  • Core does not know what a Tensor or Int type is.
  • Core does not know which tokens are valid in a source file.
  • Core does not know how any construct is lowered, optimized, or emitted.

All of that knowledge lives exclusively in packages (morphs/). Tokens are declared in [tokens] inside each package's morph.toml file. Grammar rules live in [forms.*] inside block.toml files. Semantic rules, NIR lowering, backend routes, and runtime symbols live in feature.toml and in the C++ code registered through the MorphABI.

Core is a generic, domain-agnostic orchestration host: it runs the pipeline stages, dispatches to registered plugins, and enforces the ABI contract — but it does not own a single language concept. Any syntax, type, token, or semantic assumption found in src/ is an architectural leak and must be moved into the appropriate package.

If you are looking for where the "is" token is defined: it is not in src/. It is in morphs/Core/morph.toml under [tokens].
If you are looking for how a variable declaration is parsed: it is not in src/parser/. It is in morphs/Core/blocks/*/block.toml under [forms.*].
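
One way to picture this "blind" behaviour is a core-side table that only stores what packages declare and answers lookups, with no spelling special-cased in the host. The C++ sketch below assumes this shape purely for illustration: TokenTable, declare, and kind_of are hypothetical names, not part of the Morph source, and the loader that would read [tokens] from a package's morph.toml is not shown.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical core-side token table. It has no built-in entries: every
// spelling it recognises was declared on behalf of some package.
class TokenTable {
public:
    // Would be called once per entry found in a package's [tokens] table.
    void declare(const std::string& spelling, const std::string& kind) {
        kinds_[spelling] = kind;
    }

    // Returns the declared kind, or nullptr when no package declared the
    // spelling. Note that no keyword is hardcoded in this lookup.
    const std::string* kind_of(const std::string& spelling) const {
        std::map<std::string, std::string>::const_iterator it =
            kinds_.find(spelling);
        return it == kinds_.end() ? nullptr : &it->second;
    }

private:
    std::map<std::string, std::string> kinds_;
};
```

Under this sketch, a freshly constructed table does not recognise "is" at all; it only does so after a package-side call such as declare("is", "keyword"), which mirrors the rule that the definition lives under morphs/ rather than src/.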


Main Subsystems

Compiler (src/)

  • lexer/: tokenization
  • parser/: raw syntax tree construction
  • frontend/: source loading and diagnostic setup
  • sema/: type checking, ownership rules, Morph surface semantic dispatch
  • nir/: typed SSA IR and Morph-backed optimization
  • mir/: canonical backend handoff IR built from NIR
  • codegen/: backend hosts, LLVM emission, JIT support
  • lsp/: editor-facing analysis and debug views
  • vcon/: VM container format, bytecode, and sandbox-adjacent tooling used by the CLI (morph vcon …)

Runtime (runtime/)

  • io/: input/output runtime calls
  • tensor/: tensor and dispatch runtime
  • graphics/: graphics and scene runtime
  • gpu/: device/runtime integration
  • platform/: platform abstraction

Morph Ownership

Morph packages live under morphs/ and are the exclusive source of truth for all language knowledge. Core does not hardcode any of it. Dozens of domains (Core, Types, Flow, Ops, Test, Tensor, GPU, Graphics, Wasm, Access, LLVM, platform packs, and others; see morphs/README.md) each own their slice of the pipeline end-to-end: grammar (block.toml [forms.*]), features and routes (feature.toml), semantic roles, NIR lowers/optimizers, backend hosts (LLVM, Vulkan, SPIR‑V, NN, REPL), runtime families, and tooling commands. All of these are declared through the root morph.toml graph and wired via the MorphABI. See Plugin-governed pipeline and Feature manifests.

The boundary rule is absolute: if domain knowledge (a token, a type name, a syntax form, a runtime symbol) appears anywhere in src/, that is a bug, not a feature.

Illustrative covered examples:

  • Input root/chain lowering and REPL behavior
  • Add surface semantic ownership with host/gpu/shader route selection
  • core.input and core.add host.llvm backend routes

Data Flow

AST
-> Morph surface semantic resolution
-> NIR emission
-> Morph-backed optimizer
-> MIR builder
-> backend host
-> target artifact

Compile, REPL, JIT, and MIR/LSP debug paths now share the NIR -> MIR -> backend host boundary.
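
The shared boundary can be sketched as a single "tail" function that every entry path calls once it has NIR in hand. This is a simplified C++ illustration under stated assumptions: NirModule, MirModule, Artifact, BackendHost, build_mir, lower_and_emit, and the three *_path functions are all hypothetical names, and the real modules carry far more than a name string.

```cpp
#include <cassert>
#include <string>

// Placeholder IR and output types (hypothetical, for illustration only).
struct NirModule { std::string name; };
struct MirModule { std::string name; };
struct Artifact  { std::string description; };

// Stand-in for a backend host such as host.llvm.
struct BackendHost {
    std::string id;
    Artifact emit(const MirModule& mir) const {
        return Artifact{id + ":" + mir.name};
    }
};

// Stand-in for the MIR builder: MIR is always derived from NIR.
MirModule build_mir(const NirModule& nir) { return MirModule{nir.name}; }

// The shared boundary: NIR -> MIR -> backend host.
Artifact lower_and_emit(const NirModule& nir, const BackendHost& host) {
    return host.emit(build_mir(nir));
}

// Each entry path differs only in how it obtains NIR and a host (omitted
// here); all of them funnel into the same tail.
Artifact compile_path(const NirModule& nir, const BackendHost& host) {
    return lower_and_emit(nir, host);
}
Artifact repl_path(const NirModule& nir, const BackendHost& host) {
    return lower_and_emit(nir, host);
}
Artifact jit_path(const NirModule& nir, const BackendHost& host) {
    return lower_and_emit(nir, host);
}
```

The point of the design is that a fix or a new backend route applied at this boundary is picked up by every path at once, rather than being reimplemented per entry point.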

Backend Direction

LLVMCodeGen is being reduced to a backend host/orchestrator for host.llvm. Domain-specific final emission is moving into Morph backend route components.

This migration is partial, not complete:

  • covered semantic families use Morph backend routes
  • uncovered legacy instructions still use the generic host implementation
  • configured Morph backend routes that are missing fail fast
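
The three cases above can be sketched as a small route lookup. This is a hedged C++ illustration, not the LLVMCodeGen implementation: RouteTable, configure, register_route, and emit are hypothetical names, the family ids core.add and core.input are borrowed from the covered examples earlier in this document, and legacy.op is an invented placeholder for an uncovered instruction family.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <set>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical route table held by a backend host.
class RouteTable {
public:
    typedef std::function<std::string(const std::string&)> Emitter;

    // Records that a package manifest configured a route for `family`.
    void configure(const std::string& family) { configured_.insert(family); }

    // Supplies the component that performs emission for `family`.
    void register_route(const std::string& family, Emitter emitter) {
        routes_[family] = std::move(emitter);
    }

    std::string emit(const std::string& family,
                     const std::string& instruction) const {
        std::map<std::string, Emitter>::const_iterator it = routes_.find(family);
        if (it != routes_.end()) {
            return it->second(instruction);  // covered: Morph backend route
        }
        if (configured_.count(family) != 0) {
            // Configured but never registered: fail fast.
            throw std::runtime_error("missing backend route for " + family);
        }
        return "generic:" + instruction;  // uncovered: generic host path
    }

private:
    std::set<std::string> configured_;
    std::map<std::string, Emitter> routes_;
};
```

In this sketch the generic path is taken only for families that were never configured, so a misconfigured package surfaces as an error at emission time instead of silently using the generic implementation. That reading of "fail fast" is an assumption about the intended behaviour.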

See pipeline.md for the stage-by-stage view.