Compiler Pipeline Reference
Overview
Source (.mx)
-> Lexer
-> Parser
-> Frontend
-> Semantic Analysis
-> NIR Builder
-> Morph-backed Optimizer
-> MIR Builder
-> Backend Host
-> Target Artifact
Current target artifacts include native binaries through the host.llvm backend host. Morph-owned backend routes are now part of the pipeline for covered construct families.
Stage 1 - Lexer (src/lexer/)
Input: source text
Output: std::vector<Token>
Role:
- tokenizes source text
- attaches source locations
- leaves context-sensitive decisions to later stages
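A minimal sketch of the lexer contract described above, assuming hypothetical Token, TokenKind, and SourceLoc shapes (the real definitions in src/lexer/ are not shown here and may differ). It illustrates the three points: text is split into tokens, each token carries a source location, and ambiguous characters are emitted as plain symbols for later stages to interpret.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical types for illustration; not the actual src/lexer/ definitions.
enum class TokenKind { Identifier, Number, Symbol };

struct SourceLoc {
    std::size_t line;    // 1-based
    std::size_t column;  // 1-based
};

struct Token {
    TokenKind kind;
    std::string text;
    SourceLoc loc;
};

std::vector<Token> tokenize(const std::string& src) {
    std::vector<Token> out;
    std::size_t i = 0, line = 1, col = 1;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (c == '\n') { ++i; ++line; col = 1; continue; }
        if (std::isspace(c)) { ++i; ++col; continue; }

        const std::size_t start = i;
        const SourceLoc loc{line, col};
        TokenKind kind;
        if (std::isalpha(c) || c == '_') {
            kind = TokenKind::Identifier;
            while (i < src.size() &&
                   (std::isalnum(static_cast<unsigned char>(src[i])) || src[i] == '_')) ++i;
        } else if (std::isdigit(c)) {
            kind = TokenKind::Number;
            while (i < src.size() && std::isdigit(static_cast<unsigned char>(src[i]))) ++i;
        } else {
            // Any other character becomes a one-character Symbol. What it means
            // in context is deliberately not decided here.
            kind = TokenKind::Symbol;
            ++i;
        }
        out.push_back(Token{kind, src.substr(start, i - start), loc});
        col += i - start;
    }
    return out;
}
```

Keeping the lexer context-free in this way means the token stream for a given source text is always the same, regardless of what the parser or semantic analysis later decide about it.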
Stage 2 - Parser (src/parser/)
Input: token stream
Output: raw AST
Role:
- builds the structural AST
- does not own Morph surface semantics
- keeps operators, calls, member access, indexing, and context blocks as syntax-level constructs
Stage 3 - Frontend (src/frontend/)
Input: file paths and CLI settings
Output: loaded source graph and diagnostic setup
Role:
- source loading
- import resolution
- diagnostic context wiring
Stage 4 - Semantic Analysis (src/sema/)
Input: raw AST
Output: type-checked AST plus Morph surface facts for owned constructs
Role:
- name resolution and type checking
- generic semantic validation
- Morph surface semantic dispatch through the structural surface host
- construction of typed semantic facts for Morph-owned families
Implemented Morph-owned semantic families currently include Input and Add.
Stage 5 - NIR Build + Optimization (src/nir/)
Input: semantic AST and Morph surface facts
Output: optimized NIR module
Role:
- lowers semantic AST into typed SSA NIR
- preserves Morph ownership by tagging Morph-owned instructions with semantic family metadata
- runs the default Morph-backed optimizer pipeline loaded from morphs/pipelines.toml
Morph-owned lowering roots live under morphs/<Domain>/Lowering/<Root>/ and Morph-owned surface/operator families live under morphs/<Domain>/Surface/<Family>/.
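The ownership tagging described in this stage can be sketched as follows. This is an illustration under stated assumptions, not the NIR builder's actual code: SemaNode, NirInst, lowerToNir, and the "nir." opcode prefix are hypothetical, and the family ids used in the usage note are placeholders.

```cpp
#include <string>
#include <vector>

// Hypothetical shapes for illustration; not the actual src/nir/ types.
struct SemaNode {
    std::string op;        // operation name from the semantic AST
    std::string familyId;  // Morph surface fact; empty if the node is not Morph-owned
};

struct NirInst {
    std::string opcode;
    std::string semanticFamilyId;  // empty means a generic, non-Morph-owned instruction
    bool isMorphOwned() const { return !semanticFamilyId.empty(); }
};

std::vector<NirInst> lowerToNir(const std::vector<SemaNode>& nodes) {
    std::vector<NirInst> out;
    out.reserve(nodes.size());
    for (const SemaNode& n : nodes) {
        // Carry the family id forward so later stages can still tell
        // which Morph family owns the instruction.
        out.push_back(NirInst{"nir." + n.op, n.familyId});
    }
    return out;
}
```

For example, lowering a node tagged with a family id such as "core.add" yields an instruction for which isMorphOwned() is true, while an untagged call stays generic.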
Stage 6 - MIR Construction (src/mir/)
Input: optimized NIR module
Output: MIR module
Role:
- consumes NIR, not AST
- mirrors the optimized NIR function/block structure into MIR
- preserves operationId, structuralInstKind, executionHint, and semanticFamilyId
- retains an owned NIR backing module for backend hosts during the transition to fully Morph-owned backend emission
MIR is now the canonical codegen handoff boundary used by compile, REPL, JIT, and MIR debug views.
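A rough sketch of the mirroring step, assuming hypothetical struct layouts. Only the four preserved field names (operationId, structuralInstKind, executionHint, semanticFamilyId) come from this document; their types, the block and function containers, and mirrorFunction are assumptions made for illustration.

```cpp
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical layouts; field types are assumed, not taken from src/nir/ or src/mir/.
struct NirInst {
    std::uint32_t operationId;
    std::string structuralInstKind;
    std::string executionHint;
    std::string semanticFamilyId;
};
struct NirBlock { std::string label; std::vector<NirInst> insts; };
struct NirFunction { std::string name; std::vector<NirBlock> blocks; };

struct MirInst {
    std::uint32_t operationId;
    std::string structuralInstKind;
    std::string executionHint;
    std::string semanticFamilyId;
};
struct MirBlock { std::string label; std::vector<MirInst> insts; };
struct MirFunction { std::string name; std::vector<MirBlock> blocks; };

// Mirrors one NIR function into MIR: same blocks, same order, and the
// four metadata fields copied unchanged.
MirFunction mirrorFunction(const NirFunction& fn) {
    MirFunction out{fn.name, {}};
    for (const NirBlock& block : fn.blocks) {
        MirBlock mirrored{block.label, {}};
        for (const NirInst& inst : block.insts) {
            mirrored.insts.push_back(MirInst{inst.operationId, inst.structuralInstKind,
                                             inst.executionHint, inst.semanticFamilyId});
        }
        out.blocks.push_back(std::move(mirrored));
    }
    return out;
}
```

The sketch reads only NIR and never touches an AST, which is the property that lets MIR serve as a single handoff point to the backend host.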
Stage 7 - Backend Host (src/codegen/)
Input: MIR module
Output: target artifact
Role:
- LLVMCodeGen is the host.llvm backend host/orchestrator
- Morph-owned backend route components can emit covered semantic families through the opaque backend route API
- uncovered legacy instructions still fall through to the generic backend host implementation during migration
- route selection uses MorphContext and explicit project-configured route ids
Currently implemented host.llvm Morph backend routes include:
- core.input
- core.add
Missing backend routes for Morph-owned semantic families are hard failures.
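The dispatch rules in this stage can be sketched roughly as below. This is a simplified illustration, not the LLVMCodeGen implementation: BackendHost, Route, registerRoute, and emit are hypothetical names, routes are keyed directly by family id rather than selected through MorphContext and project-configured route ids, and emission is modelled as returning a string.

```cpp
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical MIR instruction; only semanticFamilyId is named in this document.
struct MirInst {
    std::string opcode;
    std::string semanticFamilyId;  // empty for uncovered legacy instructions
};

// A route turns one Morph-owned instruction into emitted output (a string here).
using Route = std::function<std::string(const MirInst&)>;

class BackendHost {
public:
    void registerRoute(const std::string& familyId, Route route) {
        routes_[familyId] = std::move(route);
    }

    std::string emit(const MirInst& inst) const {
        if (inst.semanticFamilyId.empty()) {
            // Legacy instruction: fall through to the generic host path.
            return "generic:" + inst.opcode;
        }
        auto it = routes_.find(inst.semanticFamilyId);
        if (it == routes_.end()) {
            // Morph-owned family without a route: hard failure, no silent fallback.
            throw std::runtime_error("no backend route for Morph-owned family: " +
                                     inst.semanticFamilyId);
        }
        return it->second(inst);
    }

private:
    std::map<std::string, Route> routes_;
};
```

The asymmetry is the point of the sketch: untagged instructions may use the generic path during migration, but a tagged instruction never does, so a missing route surfaces immediately instead of producing generic code for a Morph-owned construct.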
Runtime Paths
The following entry points now use the same public stage boundary:
- compile pipeline
- REPL pipeline
- JIT path
- MIR debug views
They all build NIR first, then MIR, then invoke the backend host.
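One way to picture the shared boundary is a single driver function that every entry point calls. The sketch below is an assumption-laden illustration: NirModule, MirModule, buildNir, buildMir, runBackendHost, and runSharedPipeline are hypothetical names, and the trace vector exists only to make the stage order observable.

```cpp
#include <string>
#include <vector>

// Hypothetical placeholder modules; the real ones carry functions and blocks.
struct NirModule { std::string name; };
struct MirModule { std::string name; };

NirModule buildNir(const std::string& name, std::vector<std::string>& trace) {
    trace.push_back("nir");
    return NirModule{name};
}

MirModule buildMir(const NirModule& nir, std::vector<std::string>& trace) {
    trace.push_back("mir");
    return MirModule{nir.name};
}

std::string runBackendHost(const MirModule& mir, std::vector<std::string>& trace) {
    trace.push_back("backend");
    return mir.name + ".artifact";
}

// Compile, REPL, JIT, and MIR debug views would all go through this one
// function, so none of them can reach the backend without passing through MIR.
std::string runSharedPipeline(const std::string& name, std::vector<std::string>& trace) {
    NirModule nir = buildNir(name, trace);
    MirModule mir = buildMir(nir, trace);
    return runBackendHost(mir, trace);
}
```

Calling runSharedPipeline from each entry point, rather than letting each assemble its own stage sequence, keeps the NIR, MIR, backend order in one place.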
Validation
Use morphc after pipeline changes:
.\morphc.bat test "mir"
.\morphc.bat test "llvm"
.\morphc.bat build