noone_youknow's comments

While I agree this might be a fun resource and useful example code for various aspects of legacy x86 interfacing, I would urge anyone who hopes to actually get into OS development to ignore this (and in fact every other tutorial I’ve ever seen, including those hosted on the popular sites).

For all the reasons stated in the link from the README [1] and agreed by the author, this project should not be followed if one wants to gain an understanding of the design and implementation of operating systems for modern systems. Following it will likely lead only to another abandoned “hello world plus shell” that runs only in emulation of decades old hardware.

My advice is to get the datasheets and programmers' manuals (which are largely free) and use those to find ways to implement your own ideas.

[1] https://github.com/cfenollosa/os-tutorial/issues/269


People interested in a "read the manual and code it up on real hardware"-type guide should take a look at Stanford's CS140E[1] repo! Students write a bare metal OS for a Raspberry Pi Zero W (ARMv6) in a series of labs, and we open source each year's code.

Disclaimer: I'm on the teaching team

[1] https://github.com/dddrrreee/cs140e-25win


Do you have the course online? Looks like a bunch of files

There aren't any 1hr+ lectures, just some readings (selected from the manuals in `docs/`) and a bit of exposition from the professor before diving into the lab. Lots of "as needed" assistance

Interesting. Now I am wondering if I could build an OS for the Zero. I have five of them sitting in my drawer

You absolutely can, and should! :)

Not really my cup of tea, but some random feedback: take a look at nand2tetris, as it's really "user friendly" and, if I'm not mistaken, it was even made into a game on Steam.

It is user friendly, and it's astounding how much they managed to pack into a single semester. Highly recommended!

However, it's arguably too idealized and predetermined. I think you could get all the way through building the computer in https://nandgame.com/ without really learning anything beyond basic logic design as puzzle solving, but computer organization, as they call it in EE classes, is more about designing the puzzles than solving them. Even there, most of what you learn is sort of wrong.

I haven't worked through the software part, but it looks like it suffers from the same kind of problems. IIRC there's nothing about virtual memory, filesystems, race conditions, deadly embraces, interprocess communication, scheduling, or security.

It's great but it's maybe kind of an appetizer.


IMO this is kind of the tradeoff. In 140E we do touch on virtual memory (w/ coherency handling on our specific ARM core), FAT32 "from scratch", etc. but it comes at the expense of prerequisites. There is a lot of effort to "minify" the labs to their core lesson, but there is an inevitable amount of complexity that can't (or shouldn't) be erased.

disclaimer: I really have no idea about OSes, but hey...

Maybe it's a matter of marketing the product and managing expectations, but many of these projects are A-OK being "legacy and obsolete" just for the sake of simplicity when introducing basic concepts.

Let's take two random examples.

(1) "Let's create 3D graphics from scratch." It's quite easy to grab "graphics gems" and create a comprehensive tutorial on a software renderer. Sure, it won't be practical, and sure, it will likely end at Phong shading, but for those wanting to understand how 3D models are translated into pixels on screen it's way more approachable than studying papers on Nanite.

(2) "Let's create a browser from scratch." It's been widely discussed that creating a new browser today would be complete madness (yet there's Ladybird!), but shrinking the scope, even if the result couldn't run most modern websites, would make for an interesting journey for someone who's interested in how things work.

PS. Ages ago I made a Flash webpage that was supposed to mimic a desktop computer for an ad campaign for a TV show. The webpage acted as the main character's personal computer, and people could poke around in it between episodes to read his emails, check his browser history, etc. I took it as an opportunity to learn about OS architecture and spent an ungodly amount of unpaid overtime making it as close to Win 3.1 running on DOS as possible. Was it really an OS? Of course not, but it was a chance to get a grasp of certain things, and it was extremely rewarding to have an easter egg: a command.com you could launch to interact with the system.

Would I ever try to build a real OS? Hell no, I'm not as smart as lcamtuf, who could invite a couple of friends over for drinks and start Argante. :)


To pick on graphics, since I'm more familiar with that domain, the problem isn't that this tutorial is about software rasterization, it's that the tutorial is a raytracer that doesn't do shading, textures, shadows, or any geometry but spheres, and spends most of its word count talking about implementing the trig functions on fixed-point numbers instead of just using math.h functions on IEEE floats.

Well put! This succinctly sums up the crux of my argument in my other comments.

great counterpoint :)

That simulated personal computer for a TV character actually sounds really cool. I love the idea that the environment would change from week to week with each new episode. What was the TV show?

Obviously you'll need to read the manuals to get much done, but these kinds of tutorials are complementary.

The issue with x86_64 is that you need to understand some of the legacy "warts" to actually use 64-bit mode. The CPU does not start in "long mode" - you have to gradually enable certain features to get yourself into it. Getting into 32-bit protected mode is prerequisite knowledge for getting into long mode. I recall there was some effort by Intel to resolve some of this friction by breaking backward compatibility, but I'm not sure where that's at.
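
For a flavour of what that entails, here's a rough sketch (illustrative only - it assumes you're already in 32-bit protected mode with page tables built and a 64-bit GDT loaded, and the helper names are mine, not from any tutorial):

    /* Illustrative sketch only (32-bit C, GNU inline asm): the hand-off
       from protected mode into long mode, assuming a PML4 has already
       been built and a GDT with a 64-bit code segment is loaded. */
    #include <stdint.h>

    #define IA32_EFER 0xC0000080u              /* EFER MSR; bit 8 = LME */

    static inline uint64_t rdmsr(uint32_t msr) {
        uint32_t lo, hi;
        __asm__ volatile("rdmsr" : "=a"(lo), "=d"(hi) : "c"(msr));
        return ((uint64_t)hi << 32) | lo;
    }

    static inline void wrmsr(uint32_t msr, uint64_t val) {
        __asm__ volatile("wrmsr" :: "c"(msr),
                         "a"((uint32_t)val), "d"((uint32_t)(val >> 32)));
    }

    void enter_long_mode(uint32_t pml4_phys) {
        uint32_t cr;

        __asm__ volatile("mov %%cr4, %0" : "=r"(cr));
        cr |= (1u << 5);                                     /* CR4.PAE */
        __asm__ volatile("mov %0, %%cr4" :: "r"(cr));

        __asm__ volatile("mov %0, %%cr3" :: "r"(pml4_phys)); /* page tables */

        wrmsr(IA32_EFER, rdmsr(IA32_EFER) | (1u << 8));      /* EFER.LME */

        __asm__ volatile("mov %%cr0, %0" : "=r"(cr));
        cr |= (1u << 31);                                    /* CR0.PG */
        __asm__ volatile("mov %0, %%cr0" :: "r"(cr));

        /* Now in compatibility mode; a far jump to the 64-bit code
           segment is still needed to actually run 64-bit code. */
    }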

The reason most hobby OS projects die has more to do with drivers. While it's trivial to support VGA and serial ports, for a modern machine we need USB 3+, SATA, PCIe, GPU drivers, Wi-Fi and so forth. The effort for all these drivers dwarfs getting a basic kernel up and running. The few novel operating systems that support more modern hardware tend to utilize Linux drivers by providing a compatible API over a hardware abstraction layer - which forces certain design constraints on the kernel, such as (at least partial) POSIX compatibility.


Even taking only x86_64 as an example, going from real mode to long mode is primarily of concern to those writing firmware these days - a modern operating system will take over from UEFI or a bootloader (itself usually a UEFI executable). The details of enabling A20, setting up segmentation and the GDT, loading sectors via the BIOS, etc. are of course historically interesting (which is fine if that's the goal!) but just aren't that useful today.

The primary issue with most tutorials that I’ve seen is they don’t, when completed, leave one in a position of understanding “what’s next” in developing a usable system. Sticking with x86_64, those following will of course have set up a basic GDT, and even a bare-bones TSS, but won’t have much understanding of why they’ve done this or what they’ll need to do next to support syscall, say, or properly lay out interrupt stacks for long mode.
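
(As an aside, to show how little of that "what's next" actually is: supporting syscall on x86_64 mostly comes down to a handful of MSR writes. A rough sketch, with placeholder selectors and symbol names of my own choosing:)

    /* Rough sketch of what "supporting syscall" involves on x86_64: a few
       MSR writes. Selector values and the entry symbol are placeholders,
       not taken from any particular tutorial or kernel. */
    #include <stdint.h>

    #define IA32_EFER  0xC0000080u   /* bit 0 = SCE, the syscall enable    */
    #define IA32_STAR  0xC0000081u   /* kernel/user segment selector bases */
    #define IA32_LSTAR 0xC0000082u   /* RIP loaded on syscall (entry stub) */
    #define IA32_FMASK 0xC0000084u   /* RFLAGS bits cleared on entry       */

    extern void syscall_entry(void); /* hypothetical assembly entry stub   */

    static inline uint64_t rdmsr(uint32_t msr) {
        uint32_t lo, hi;
        __asm__ volatile("rdmsr" : "=a"(lo), "=d"(hi) : "c"(msr));
        return ((uint64_t)hi << 32) | lo;
    }

    static inline void wrmsr(uint32_t msr, uint64_t v) {
        __asm__ volatile("wrmsr" :: "c"(msr),
                         "a"((uint32_t)v), "d"((uint32_t)(v >> 32)));
    }

    void syscall_init(void) {
        /* Assumed GDT layout: kernel code at 0x08; user base at 0x18, laid
           out so SYSRET finds user SS at base+8 and 64-bit CS at base+16. */
        wrmsr(IA32_STAR,  ((uint64_t)0x18 << 48) | ((uint64_t)0x08 << 32));
        wrmsr(IA32_LSTAR, (uint64_t)(uintptr_t)syscall_entry);
        wrmsr(IA32_FMASK, 1u << 9);                    /* mask IF on entry */
        wrmsr(IA32_EFER,  rdmsr(IA32_EFER) | 1u);      /* EFER.SCE         */
    }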

By focusing mainly on the minutiae of legacy initialisation (which nobody needs) and racing toward “bang for my buck” interactive features, the tutorials tend to leave those completing them with a patchy, outdated understanding of the basics and a simple bare-metal program that is in no way architected as a good base upon which to continue toward building a usable OS kernel.


You can just copy and paste the initialization code. It only runs once and there's very little of value to learn from it (unless you're into retrocomputing).

I don't even know where to _begin_ writing an operating system.

If I wanted to learn just so I have a concept of what an OS does, what would you recommend?

I'm not trying to write operating systems per se. I'm trying to become a better developer by understanding operating systems.


Go do the xv6 labs from the MIT 6.828 course, like yesterday. Leave all textbooks aside, even though there are quite a few good ones; forget all the GitHub tutorials with patchy support and the blogs that promise you pie in the sky.

The good folks at MIT were gracious enough to make it available for free, free as in free beer.

I did this course over ~3 months and learnt immeasurably more than from reading any blog, tutorial or textbook. There’s broad coverage of topics like virtual memory, trap processing, how device drivers work (at a high level), etc. that are core to any modern OS.

Most of all, you get feedback about your implementations in the form of tests, which can help you gauge whether you have a working and effective solution.

10/10 highly recommended.


Tanenbaum's textbook is highly readable, comprehensive (it surveys every major known solution to each major problem), and mostly correct. xv6 may be a smaller, more old-fashioned, and more practical approach. RISC-V makes the usually hairy and convoluted issues of paging and virtual memory seem simple. QEMU's GDB server, OpenOCD, JTAG, and SWD can greatly reduce the amount of time you waste wondering why things won't boot. Sigrok/Pulseview may greatly speed up your device driver debugging. But I haven't written an operating system beyond some simple cooperative task-switching code, so take this with a grain of salt.

Funny question since you bring up JTAG and RISC-V -- do you have a cheapish RISC-V device you'd recommend that actually exposes its JTAG? The Milk-V Duo S, Milk-V Jupiter, and Pine64 Oz64 all seem not to expose one; IIRC, the Jupiter even wires TDO as an input (on the other side of a logic level shifter)...

That doesn't seem off-topic at all to me!

I don't know what to recommend there. I have no relevant experience, because all my RISC-V hardware leaves unimplemented the privileged ISA, which is the part that RISC-V makes so much simpler. The unprivileged ISA is okay, but it's nothing to write home about, unless you want to implement a CPU instead of an OS.


If you want to practice, try: https://littleosbook.github.io/

I am an occasional uni TA that teaches OS and I use littleosbook as the main reference for my own project guidebook.

It's a decent warm-up project for undergraduates, giving them a first-hand experience programming in a freestanding x86 32-bit environment.


Start with a simple program loader and file system, like DOS.

I suggest the opposite. Never DOS, particularly never MS-DOS, and never x86 or anything in that family. They are an absolute horror show of pointless legacy problems on a horrifyingly obsolete platform. Practically everything you'd learn is useless and ugly.

Start with an RTOS on a microcontroller. You'll see what the difference is between a program or a library and a system that does context switching and multitasking. That's the critical jump and a very short one. Easy diversion to timers and interrupts and communication, serial ports, buses and buffers and real-time constraints. Plus, the real-world applications (and even job opportunities) are endless.
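
To give a concrete flavour of that jump, here's a minimal sketch in the FreeRTOS style (the task bodies, priorities and stack sizes are placeholders; a real project needs board setup and a suitable FreeRTOSConfig.h):

    /* Minimal two-task sketch in the FreeRTOS style, to show the jump
       from "a program" to "a system that schedules tasks". The task
       bodies are placeholders; board/clock setup is assumed done. */
    #include "FreeRTOS.h"
    #include "task.h"

    static void blink_task(void *params) {
        (void)params;
        for (;;) {
            /* toggle_led();  -- hypothetical board-specific call */
            vTaskDelay(pdMS_TO_TICKS(500));  /* block; scheduler runs other tasks */
        }
    }

    static void sensor_task(void *params) {
        (void)params;
        for (;;) {
            /* read_sensor_and_log();  -- hypothetical */
            vTaskDelay(pdMS_TO_TICKS(100));
        }
    }

    int main(void) {
        xTaskCreate(blink_task,  "blink",  configMINIMAL_STACK_SIZE, NULL, 1, NULL);
        xTaskCreate(sensor_task, "sensor", configMINIMAL_STACK_SIZE, NULL, 2, NULL);
        vTaskStartScheduler();   /* context switching starts here */
        for (;;) {}              /* only reached if the scheduler fails to start */
    }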


If you have that sort of disrespect for computing history, you will only destroy and reinvent things badly.

RISC-V seems to be a clean-sheet design, and that should be a good starting point (imho).

fwiw, xv6, the pedagogical OS, has migrated to it.


Nice work! Looks like most of the basics are covered, and meanwhile in my current kernel the RISC-V entrypoint is >700 lines (of C) just to get to the arch-independent entrypoint!

I was just looking around for your input/output code. I don’t know Zig, but I expected to find putChar in kernel.zig based on the import in common.zig, and I don’t see it - should I be looking somewhere else? I also couldn’t find the simple command line processing mentioned in the README.

Mostly I was just looking around because your README mentioned VGA (and you seem to have a BIOS boot), which struck me as interesting on a RISC-V project. I was curious whether you were actually using the SBI routines or had really mapped in a VGA text mode buffer?


There is a todo in the putchar stub. Looks like it’s not implemented yet.

I have it implemented here in my own roughly 1k line zig kernel: https://github.com/Fingel/aeros-v/blob/ddc6842291e9cf4876729...


Thanks! I see that you’re using the SBI routines, which is what I was expecting here but couldn’t find - the reference to “output text to VGA” in the post made me curious.

I did see the putchar stub in the user.zig but, lacking understanding of zig, wasn’t sure how that could work given common.zig is looking for putChar in kernel.zig as far as I could tell.

I just jumped through a couple of hoops to get zig 0.13 installed and see an error about “root struct of file ‘kernel’ had no member named ‘putchar’” so I guess maybe it’s not implemented here at all.


I don’t know what “output text to VGA” means in the OP. I suspect that entire project was vibe coded, including the README. It’s a common first step when writing an OS for x86, but on RISC-V you don’t use VGA to get text output - you go through the UART.
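
(Concretely, before a proper UART driver exists the usual shortcut is the SBI console call - roughly this, as a sketch in C, though the same ecall works from Zig or asm:)

    /* Rough sketch: the SBI legacy console_putchar call (extension ID
       0x01), which the SBI firmware (e.g. OpenSBI) forwards to the
       platform UART. Function names here are just illustrative. */
    static void sbi_putchar(char c) {
        register unsigned long a0 __asm__("a0") = (unsigned long)(unsigned char)c;
        register unsigned long a7 __asm__("a7") = 0x01;  /* legacy console_putchar */
        __asm__ volatile("ecall"
                         : "+r"(a0)
                         : "r"(a7)
                         : "memory");
    }

    static void sbi_puts(const char *s) {
        while (*s) sbi_putchar(*s++);   /* no newline translation here */
    }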

Are you saying you had trouble getting putchar working in my implementation? As far as I know common.console is fully implemented and working. Let me know if something is wrong.


Sorry for the confusion - I was referring to putchar not being implemented in OP’s code. I got yours working right away (nice work btw).

What you mention about VGA being common on x86 was what got me curious, since it’s not a thing on RISC-V. In my own OS project I’m using a framebuffer to show a graphical terminal on both x86 and riscv64, so I wondered if OP was doing something similar or was really using SBI output.


Same here. I got a Bambu X1C and have been very happy with it.


> It's a bit disappointing that every time somebody decides to write their own kernel, the first thing they do is implement some subset of the POSIX spec.

Well, not quite _every_ time. For example, I’m deliberately not doing POSIX with my latest one[0], so I can be a bit more experimental.

[0] https://github.com/roscopeco/anos


Kudos for doing so! This is seriously a great endeavor. Regarding its relation to UNIX concepts, I do spot a classical hierarchical file system there, though. ;) Is it only an "add-on" (similar to IFS on IBM i), or is it fundamental?


Thank you!

At this early stage, the filesystem exists only to prove the disk drivers and the IPC interfaces connecting them. I chose FAT32 for this since there has to be a FAT partition anyway for UEFI.

The concept of the VFS may stick around as a useful thing, but it’s strictly for storage, there is no “everything is a file” ethos. It’s entirely a user space concept - the kernel knows nothing of virtual filesystems, files, or the underlying hardware like disks.


That makes sense.


Kudos for trying something else.


It’s a very noticeable step down from Cursor in terms of AI integration IMO, but also a huge step up in almost everything else.

For a while I was running both Cursor and RubyMine in tandem and switching between them as needed, but lately I’ve been using Claude Code for most stuff, in a RubyMine terminal, and I hardly miss Cursor at all.


Can you please share your workflow? I am kinda at a crossroads. GitHub Copilot cuts the context window for the models, while the new JetBrains AI Assistant pricing model is doubtful. AI Assistant seems to be much better integrated lately than GitHub Copilot.

On the other hand, properly used, Claude Code is very nice. Long story short: I am thinking about reducing my AI usage to CC only and attempting to use it as a chat assistant too, plus maybe subscribing to a minimal package from JetBrains or Copilot for nice commit generation and occasional chats with other models.


Our wonderful government “news” agency trying to legitimise this while also paying lip service to balanced reporting: https://www.bbc.co.uk/news/live/cwy05zznyplt?post=asset%3A3e...


> Overall, the agency told BBC Verify that deleting 1,000 emails with attachments would save approximately 77.5 litres of water per year.

I very, very strongly doubt that.


I don't doubt the agency said that.


Nice article! Always good to see easy-to-follow explainers on these kinds of concepts!

One minor nit, for the “odd corner case that likely never exists in real code” of taken branches to the next instruction, I can think of at least one example where this is often used: far jumps to the next instruction with a different segment on x86[_64] that are used to reload CS (e.g. on a mode switch).

Aware that’s a very specific case, but it’s one that very much does exist in real code.
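
For reference, the pattern I mean looks roughly like this (a sketch using GNU inline asm; 0x08 is just the conventional first code descriptor, and in 64-bit mode the same trick is usually done with a far return instead):

    /* Rough 32-bit sketch: a far jump whose target is literally the next
       instruction, taken purely so CS is reloaded with the new selector
       (0x08 assumed to be the code descriptor in the freshly loaded GDT). */
    static inline void reload_cs(void) {
        __asm__ volatile(
            "ljmp $0x08, $1f\n"
            "1:\n"
            ::: "memory");
    }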


Author here. I'll work this in. Thank you.


I’ve been trying out various LLMs for working on assembly code in my toy OS kernel for a few months now. It’s mostly low-level device setup and bootstrap code, and I’ve found they’re pretty terrible at it generally. They’ll often generate code that won’t quite assemble, they’ll hallucinate details like hardware registers etc, and very often they’ll come up with inefficient code. The LLM attempt at an AP bootstrap (real-mode to long) was almost comical.

All that said, I’ve recently started a RISC-V port, and I’ve found they’re actually quite good at porting bits of low-level init code from x86 (NASM) to RISC-V (GAS) - I guess because it’s largely a simple translation job and they already have the logic to work from.


> They’ll often generate code that won’t quite assemble

Have you tried using a coding agent that can run the compiler itself and fix any errors in a loop?

The first version I got here didn't compile. Firing up Claude Code and letting it debug in a loop fixed that.


I have, and to be fair that has solved the “basically incorrect code” issue with reasonable regularity. Occasionally the error messages don’t seem helpful enough for it, which is understandable, and I’ve had a few occurrences of it getting “stuck” in a loop trying to e.g. use an invalid addressing mode (it may have gotten itself out of those situations if I were more patient) but generally, with one of the Claude 4 models in agent mode in cursor or Claude code, I’ve found it’s possible to get reasonably good results in terms of “does it assemble”.

I’m still working on a good way to integrate more feedback for this kind of workflow, e.g. for the attempt it made at AP bootstrap - debugging that is just hard, and giving an agent enough control over the running code and the ability to extract the information it would need to debug the resulting triple fault is an interesting challenge (even if probably not all that generally useful).

I have a bunch of pretty ad-hoc test harnesses and the like that I use for general hosted testing, but that can only get you so far in this kind of low-level code.


Similar experience - they seem to generally have a lot more problems with ASM than structured languages. I don't know if this reflects less training data, or difficulty.


As far as I can tell they have trouble with sustained satisfaction of multiple constraints, and asm has more of that than higher-level languages. (An old boss once said his record for bug density was in asm: he'd written 3 bugs in a single opcode.)


I agree with this. Just the need to keep track of stack, flags and ad-hoc register allocations is something I’ve found they really struggle with. I think this may be why it does so much better at porting from one architecture to another - but even then I’ve seen it have problems with e.g. m68k assembly, where the rules for which moves affect flags are different from, say, x86.


The few times I've messed with it I've noticed they're pretty bad at keeping track of registers as they move between subroutines. They're just not great at coming up with a consistent "sub language" the way human assembly programmers tend to.


A bit tangential, but I've found 4 Sonnet to be much, much better at SIMD intrinsics (in my case, in Rust) than Sonnet 3.5 and 3.7, which were kind of atrocious. For example, 3.7 would write a scalar for loop and tell you "I've vectorized...", when I explicitly asked it to do the operations with x86 intrinsics and gave it the capabilities of the hardware. Also, telling it that AVX2 was supported would not stop it from using SSE, or it would emit conditionals to choose between them, which makes no sense. Seems Claude 4 solves most of that.

Edit: that -> than


This fits my experience. I’m definitely getting considerably better results with 4 than previous Claudes. I’d essentially dropped sonnet from my rotation before 4 became available, but now it’s a go-to for this sort of thing.


Definitely going to build this, I’ve been worried I might miss a nuke going off for a while now and this looks like just the thing!


For me, it’s not the typing - it’s the understanding. If I’m typing code, I have a mental model already or am building one as I type, whereas if I have an LLM generate the code then it’s “somebody else’s code” and I have to take the time to understand it anyway in order to usefully review it. Given that’s the case, I find it’s often quicker for me to just key the code myself, and come away with a better intuition for how it works at the end.

