HDL: Verilog
Assembly: The processor executes a custom instruction set called PySM (not a very original name, I know :) ). It's inspired by CPython bytecode — stack-based, dynamically typed — but streamlined for efficient hardware pipelining.
I'm not sharing the full ISA publicly yet, but I'm happy to describe the general structure: it includes instructions for stack manipulation, binary operations, comparisons, branching, function calling, and memory access.
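To give a feel for what those categories look like in practice, CPython's own bytecode (which PySM is inspired by) is a reasonable stand-in; the standard `dis` module prints it for any function:

```python
import dis

def demo(a, b, limit):
    s = a + b           # binary operation
    if s > limit:       # comparison, then a conditional branch
        s = abs(limit)  # function call
    return s

# Prints the stack-based bytecode (LOAD_FAST, BINARY_OP/BINARY_ADD, COMPARE_OP,
# POP_JUMP_IF_*, CALL, RETURN_VALUE); exact opcode names vary by CPython version.
dis.dis(demo)
```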
Why not ARM/x86/etc...
Existing CPUs are optimized for static, register-based compiled languages like C/C++.
Python’s dynamic nature — stack-based execution, runtime type handling, dynamic dispatch — maps very poorly onto conventional CPUs, resulting in a lot of wasted work (interpreter overhead, dynamic typing penalties, reference counting, poor cache locality, etc.).
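As a very rough illustration (schematic Python, not CPython's actual implementation), this is the shape of the work a conventional CPU ends up doing for a single dynamic `a + b` inside a bytecode interpreter:

```python
# Schematic sketch of the per-operation cost of one dynamic "a + b";
# not CPython's real code, just the general shape of the work.
def binary_add(stack):
    b = stack.pop()                        # pop operands from the value stack
    a = stack.pop()
    result = type(a).__add__(a, b)         # dispatch on the left operand's runtime type
    if result is NotImplemented:
        result = type(b).__radd__(b, a)    # fall back to the right operand
    stack.append(result)
    # Real CPython also updates reference counts on a, b and the result,
    # adding memory traffic and hurting cache locality.

stack = [1, 1.0]
binary_add(stack)   # mixed int/float add resolved at run time; stack is now [2.0]
```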
Wow, this is fascinating stuff. Just a side question (and please understand I am not a low-level hardware expert, so pardon me if this is a stupid question): does this arch support any sort of speculative execution, and if so do you have any sort of concerns and/or protections in place against the sort of vulnerabilities that seem to come inherent with that?
Right now, PyXL runs fully in-order with no speculative execution.
This is intentional for a couple of reasons:
First, determinism is really important for real-time and embedded systems — avoiding speculative behavior makes timing predictable and eliminates a whole class of side-channel vulnerabilities.
Second, PyXL is still at an early stage — the focus right now is on building a clean, efficient architecture that makes sense structurally, without adding complex optimizations like speculation just for the sake of performance.
In the future, if there's a clear real-world need, limited forms of prediction could be considered — but always very carefully to avoid breaking predictability or simplicity.
> it includes instructions for stack manipulation, binary operations
Your example contains some integer arithmetic; I'm curious whether you've implemented any other Python data types like floats/strings/tuples yet. If you have, how does your ISA handle binary operations on two different types, like `1 + 1.0`? Is there some sort of dispatch table based on the types on the stack?
Python the language isn't stack-based, though CPython's bytecode is. You could implement it just as well on top of a register-based instruction set. You may have a point about the other features that make it hard to compile, though.
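For contrast, the same statement looks quite different under the two models; the register-style encoding in the comments below is hypothetical, loosely in the style of a register VM such as Lua's:

```python
import dis

# CPython's stack-based bytecode for "c = a + b" is roughly:
#     LOAD_FAST  a      # push a
#     LOAD_FAST  b      # push b
#     BINARY_OP  (+)    # pop both, push the sum
#     STORE_FAST c
# A register-based encoding of the same statement could instead be a single
# three-address instruction, e.g. ADD r2, r0, r1 (hypothetical syntax).
def assign(a, b):
    c = a + b
    return c

dis.dis(assign)  # prints the stack-based form; opcode names vary by version
```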
This sounds like your 'arch' (sorry, I don't 100% know the correct term here) could potentially also run Ruby/JS if the toolchain can translate it into your assembly language?
Good question — I’m not 100% sure.
I'm not an expert on Ruby or JS internals, and I haven’t studied their execution models deeply.
But in theory, if the language is stack-based (or can be mapped cleanly onto a stack machine), and if the ISA is broad enough to cover their needs, it could be possible.
Right now, PyXL’s ISA is tuned around Python’s patterns — but generalizing it for other languages would definitely be an interesting challenge.
Edit: Just want to mention that this sounds like a super interesting project. I have to admit I struggled to see where Python was actually run on the hardware when you mentioned custom toolchains and a compilation step. But the important aspect is that your hardware runs this the way a VM would, with all the dynamic aspects of the language included. Like a parent comment, I wonder whether something similar for Wasm would be worth having.