Morph Compiler Architecture Overview
Core Model
Morph currently compiles through these major stages:
Source
-> Lexer
-> Parser
-> Frontend
-> Semantic Analysis
-> NIR
-> MIR
-> Backend Host
The important architectural shift is that MorphAPI is no longer only a frontend-lowering tool. Covered construct families can now carry ownership from semantic resolution through backend route emission.
Core Is A Blind Orchestrator
Core (src/) knows nothing about the language it compiles.
This is not a limitation — it is a deliberate, non-negotiable invariant:
- Core does not know what is, method, +, gpu, or any other keyword means.
- Core does not know that variable declarations exist.
- Core does not know what a Tensor or Int type is.
- Core does not know which tokens are valid in a source file.
- Core does not know how any construct is lowered, optimized, or emitted.
All of that knowledge lives exclusively in packages (morphs/). Tokens are declared in [tokens] inside package morph.toml files. Grammar rules live in [forms.*] inside block.toml files. Semantic rules, NIR lowering, backend routes, and runtime symbols live in feature.toml and the C++ registered through the MorphABI.
Core is a generic, domain-agnostic orchestration host: it runs the pipeline stages, dispatches to registered plugins, and enforces the ABI contract — but it does not own a single language concept. Any syntax, type, token, or semantic assumption found in src/ is an architectural leak and must be moved into the appropriate package.
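The orchestration role described above can be illustrated with a minimal sketch. It assumes a registry keyed by stage name; StageRegistry, StageFn, Payload, and the stage labels are hypothetical names chosen for illustration and are not taken from the Morph source tree or the MorphABI.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical names throughout; not the real MorphABI.
using Payload = std::string;  // opaque data handed from one stage to the next
using StageFn = std::function<Payload(const Payload&)>;

class StageRegistry {
public:
    // Packages register handlers; the host never inspects what a handler does.
    void registerHandler(const std::string& stage, StageFn fn) {
        handlers_[stage].push_back(std::move(fn));
    }

    // Runs the stages in the given order, chaining each handler's output.
    // The host only sequences and dispatches; it owns no language concept.
    Payload run(const std::vector<std::string>& order, Payload input) const {
        for (const auto& stage : order) {
            auto it = handlers_.find(stage);
            if (it == handlers_.end() || it->second.empty()) {
                throw std::runtime_error("no handler registered for stage: " + stage);
            }
            for (const auto& fn : it->second) {
                input = fn(input);
            }
        }
        return input;
    }

private:
    std::map<std::string, std::vector<StageFn>> handlers_;
};
```

In this sketch the host would reject a stage with no registered handler instead of falling back to built-in behavior, which mirrors the rule that Core carries no language knowledge of its own.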
If you are looking for where the is token is defined: it is not in src/. It is in morphs/Core/morph.toml under [tokens].
If you are looking for how a variable declaration is parsed: it is not in src/parser/. It is in morphs/Core/blocks/*/block.toml under [forms.*].
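As a rough illustration of what package-declared tokens imply for the host, the sketch below labels words using only a table that a package would fill in. TokenTable, Token, the kind labels, and the whitespace-only splitting are all assumptions made for this example; the real lexer and the actual morph.toml schema are not shown in this document.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical types for illustration only.
struct Token {
    std::string kind;  // package-chosen label, opaque to the host
    std::string text;  // spelling as it appeared in the source
};

class TokenTable {
public:
    // Called with data a package declared (for example from a [tokens] table).
    void declare(const std::string& spelling, const std::string& kind) {
        kinds_[spelling] = kind;
    }

    // Returns nullptr when no package declared this spelling.
    const std::string* lookup(const std::string& spelling) const {
        auto it = kinds_.find(spelling);
        return it == kinds_.end() ? nullptr : &it->second;
    }

private:
    std::unordered_map<std::string, std::string> kinds_;
};

// Splits on whitespace only (a simplification) and labels each word via the
// table. No spelling is hardcoded here; unknown words get a neutral label.
std::vector<Token> tokenize(const std::string& source, const TokenTable& table) {
    std::vector<Token> out;
    std::istringstream in(source);
    std::string word;
    while (in >> word) {
        const std::string* kind = table.lookup(word);
        out.push_back(Token{kind ? *kind : std::string("undeclared"), word});
    }
    return out;
}
```

The point of the design is that removing a declaration from the table changes what the tokenizer recognizes without touching host code.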
Main Subsystems
Compiler (src/)
- lexer/: tokenization
- parser/: raw syntax tree construction
- frontend/: source loading and diagnostic setup
- sema/: type checking, ownership rules, Morph surface semantic dispatch
- nir/: typed SSA IR and Morph-backed optimization
- mir/: canonical backend handoff IR built from NIR
- codegen/: backend hosts, LLVM emission, JIT support
- lsp/: editor-facing analysis and debug views
- vcon/: VM container format, bytecode, and sandbox-adjacent tooling used by the CLI (morph vcon …)
Runtime (runtime/)
- io/: input/output runtime calls
- tensor/: tensor and dispatch runtime
- graphics/: graphics and scene runtime
- gpu/: device/runtime integration
- platform/: platform abstraction
Morph Ownership
Morph packages live under morphs/ and are the exclusive source of truth for all language knowledge. Core does not hardcode any of it. Dozens of domains (Core, Types, Flow, Ops, Test, Tensor, GPU, Graphics, Wasm, Access, LLVM, platform packs, and others; see morphs/README.md) each own their slice of the pipeline end to end:
- grammar (block.toml [forms.*])
- features and routes (feature.toml)
- semantic roles
- NIR lowers/optimizers
- backend hosts (LLVM, Vulkan, SPIR‑V, NN, REPL)
- runtime families
- tooling commands
All of these are declared through the root morph.toml graph and wired via the MorphABI. See Plugin-governed pipeline and Feature manifests.
The boundary rule is absolute: if domain knowledge (a token, a type name, a syntax form, a runtime symbol) appears anywhere in src/, that is a bug, not a feature.
Illustrative covered examples:
- Input root/chain lowering and REPL behavior
- Add surface semantic ownership with host/gpu/shader route selection
- core.input and core.add host.llvm backend routes
Data Flow
AST
-> Morph surface semantic resolution
-> NIR emission
-> Morph-backed optimizer
-> MIR builder
-> backend host
-> target artifact
Compile, REPL, JIT, and MIR/LSP debug paths now share the NIR -> MIR -> backend host boundary.
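One way to picture the shared boundary is a single lowering function that every entry point calls. The sketch below is an assumption-laden illustration: NirModule, MirModule, BackendHost, TextHost, and the entry-point functions are hypothetical stand-ins, and the "mir." prefixing is a placeholder for the real MIR builder.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins; the real NIR and MIR types are not shown here.
struct NirModule { std::vector<std::string> instructions; };
struct MirModule { std::vector<std::string> ops; };

// Any backend host consumes MIR and produces an artifact (plain text here).
struct BackendHost {
    virtual ~BackendHost() = default;
    virtual std::string emit(const MirModule& mir) = 0;
};

// Placeholder MIR builder: copies each NIR instruction with a "mir." prefix.
MirModule buildMir(const NirModule& nir) {
    MirModule mir;
    for (const auto& inst : nir.instructions) {
        mir.ops.push_back("mir." + inst);
    }
    return mir;
}

// The single shared boundary: NIR -> MIR -> backend host.
std::string lowerToArtifact(const NirModule& nir, BackendHost& host) {
    return host.emit(buildMir(nir));
}

// Entry points differ in how they obtain NIR, not in how they lower it.
std::string compileEntry(const NirModule& nir, BackendHost& host) {
    return lowerToArtifact(nir, host);
}
std::string replEntry(const NirModule& nir, BackendHost& host) {
    return lowerToArtifact(nir, host);
}

// Toy host that joins MIR ops with ';' so the result is easy to inspect.
struct TextHost : BackendHost {
    std::string emit(const MirModule& mir) override {
        std::string out;
        for (const auto& op : mir.ops) {
            out += op + ";";
        }
        return out;
    }
};
```

Because both entry points funnel through lowerToArtifact, the same NIR yields the same artifact regardless of which path produced it.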
Backend Direction
LLVMCodeGen is being reduced to a backend host/orchestrator for host.llvm.
Domain-specific final emission is moving into Morph backend route components.
This migration is partial, not complete:
- covered semantic families use Morph backend routes
- uncovered legacy instructions still use the generic host implementation
- missing configured Morph backend routes fail fast
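The three cases above can be sketched as a small dispatch routine. This is an illustration under stated assumptions, not the LLVMCodeGen implementation: HostLlvmSketch, RouteFn, configure, registerRoute, and the "generic:" output prefix are hypothetical, and the family name legacy.mul in the usage note is invented for the example.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <set>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical route signature: takes an instruction, returns emitted text.
using RouteFn = std::function<std::string(const std::string&)>;

class HostLlvmSketch {
public:
    // Marks a semantic family as expected to have a Morph backend route.
    void configure(const std::string& family) { configured_.insert(family); }

    // Supplies the route component for a family.
    void registerRoute(const std::string& family, RouteFn fn) {
        routes_[family] = std::move(fn);
    }

    std::string emit(const std::string& family, const std::string& instruction) const {
        auto route = routes_.find(family);
        if (route != routes_.end()) {
            return route->second(instruction);  // covered family: use the route
        }
        if (configured_.count(family) != 0) {
            // Configured but not registered: fail fast instead of falling back.
            throw std::runtime_error("configured backend route missing: " + family);
        }
        return "generic:" + instruction;  // uncovered legacy: generic host path
    }

private:
    std::set<std::string> configured_;
    std::map<std::string, RouteFn> routes_;
};
```

With this sketch, a registered route for core.add would handle its instructions, an unconfigured family such as legacy.mul would take the generic path, and a family that is configured without a registered route would throw at emission time.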
See pipeline.md for the stage-by-stage view.