As you all well know, ChatGPT is a pretty dang cool piece of software. Now a million people have written a million things about it, but today we’ll be covering a random idea I had in the shower about 30 minutes ago.
What if ChatGPT wrote a kernel? Now of course we’re not going for anything nuts, just a simple kernel that boots in QEMU and says hello world. Many of you wonderful readers may have done this before as a fun exercise and know it’s not the easiest of tasks, especially if you mostly work with high-level languages (like I do, so if I screw something up, it’s actually because I’m stupid).
Enough talk, let’s get into it.
The first thing we’ll have it do is make a “hello world” kernel in assembly and give it a Makefile:
Ok, and without giving it any review, let’s see if it builds…
It does! (With a slight modification to the Makefile: I had to tell the linker to please be 32-bit, but we’ll let that slide.)
Ok, now let’s ask how we’d run it:
Ok, as a guy who’s seen a lot of QEMU, that’s obvious… So let’s see it run:
To the surprise of no one, it doesn’t work. Why? Well, it’s not a kernel. It’s a regular old hello world written in assembly, and a working one at that: running ./kernel resulted in a perfect Hello World!
So let’s ask ChatGPT to please rewrite it as an actual kernel:
I’ll save you the time: it’s still not even close to working, but it’s kinda getting the gist… sorta…
Ok, I’m going to be even more specific and give it a fair chance.
Yeah, that didn’t work either. A note for the reader who doesn’t know low-level code: it’s missing a proper “entry point” and basically all of the multiboot header. Let’s look up what a multiboot header looks like on the ol’ Stack Overflow and tell it to base its code on that…
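For anyone who hasn’t seen one, the mandatory part of a Multiboot (version 1) header is just three 32-bit values. What follows is a minimal sketch for illustration, not ChatGPT’s output and not the Stack Overflow code. It assumes the kernel is linked as an ELF file, so it leaves out the extra address fields a flat binary needs. The MB_* names and the .multiboot section name are made up for this example.

```nasm
; Minimal Multiboot 1 header sketch (assumes an ELF kernel).
; The boot loader scans the first 8 KiB of the kernel file for these three dwords.
MB_MAGIC    equ 0x1BADB002              ; fixed magic value from the Multiboot spec
MB_FLAGS    equ (1 << 0) | (1 << 1)     ; page-align modules, request memory info
MB_CHECKSUM equ -(MB_MAGIC + MB_FLAGS)  ; magic + flags + checksum must sum to 0 (mod 2^32)

section .multiboot                      ; hypothetical section name; the linker script
                                        ; has to place it within the first 8 KiB
align 4                                 ; the header must be 4-byte aligned
        dd MB_MAGIC
        dd MB_FLAGS
        dd MB_CHECKSUM
```

The checksum is the only computed field: it is whatever value makes the three dwords add up to zero in 32-bit arithmetic, which is how the loader tells a real header from random bytes.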
So after giving it the Stack Overflow header and a little playing around, it gives me this:
    [BITS 32]
    [global _start]
    [ORG 0x100000]          ;If using '-f bin' we need to specify the
                            ;origin point for our code with ORG directive
                            ;multiboot loaders load us at physical
                            ;address 0x100000

    MULTIBOOT_AOUT_KLUDGE   equ 1 << 16
                            ;FLAGS[16] indicates to GRUB we are not
                            ;an ELF executable and the fields
                            ;header address, load address, load end address;
                            ;bss end address and entry address will be available
                            ;in Multiboot header
    MULTIBOOT_ALIGN         equ 1<<0    ; align loaded modules on page boundaries
    MULTIBOOT_MEMINFO       equ 1<<1    ; provide memory map

    MULTIBOOT_HEADER_MAGIC  equ 0x1BADB002
                            ;magic number GRUB searches for in the first 8k
                            ;of the kernel file GRUB is told to load

    MULTIBOOT_HEADER_FLAGS  equ MULTIBOOT_AOUT_KLUDGE|MULTIBOOT_ALIGN|MULTIBOOT_MEMINFO
    CHECKSUM                equ -(MULTIBOOT_HEADER_MAGIC + MULTIBOOT_HEADER_FLAGS)

    KERNEL_STACK            equ 0x00200000  ; Stack starts at the 2mb address & grows down
    VGA_MEM_LOC             equ 0xB8000
    VGA_WIDTH               equ 80
    VGA_HEIGHT              equ 25

    _start:
        xor eax, eax                ;Clear eax and ebx in the event
        xor ebx, ebx                ;we are not loaded by GRUB.
        jmp multiboot_entry         ;Jump over the multiboot header
        align 4                     ;Multiboot header must be 32
                                    ;bits aligned to avoid error 13
    multiboot_header:
        dd MULTIBOOT_HEADER_MAGIC   ;magic number
        dd MULTIBOOT_HEADER_FLAGS   ;flags
        dd CHECKSUM                 ;checksum
        dd multiboot_header         ;header address
        dd _start                   ;load address of code entry point
                                    ;in our case _start
        dd 00                       ;load end address : not necessary
        dd 00                       ;bss end address : not necessary
        dd multiboot_entry          ;entry address GRUB will start at

    multiboot_entry:
        mov esp, KERNEL_STACK       ;Setup the stack
        push 0                      ;Reset EFLAGS
        popf
        push eax                    ;2nd argument is magic number
        push ebx                    ;1st argument multiboot info pointer
        call _Main                  ;Call _Main
        add esp, 8                  ;Cleanup 8 bytes pushed as arguments
        cli
    endloop:
        hlt
        jmp endloop

    _Main:
    hello_world: db 'Hello World!', 0
        mov ebx, hello_world
        mov eax, 13
        call _vga_write
        jmp endloop

    _vga_write:
        ; Input: eax = pointer to null-terminated string
        ; Output: None
        ; Modifies: ebx, ecx, edx, esi, edi
        mov ebx, eax                ; Save string pointer in ebx
        mov ecx, 0x0B8000           ; Address of video memory (VGA text mode)
        ; Get current cursor position
        mov edx, 0x3D4
        mov al, 0x0E
        out dx, al
        inc edx
        in al, dx
        mov bh, al
        mov edx, 0x3D5
        in al, dx
        mov bl, al
        ; Calculate starting address of cursor position
        shl bx, 1
        add ecx, ebx
        ; Write string to video memory
        mov esi, ebx
        mov edi, ecx
        mov al, [ebx]
        cld
        mov ah, 0x0F
        inc ebx
        rep stosw
        ret
It boots! But it doesn’t print hello world. So giving ChatGPT an entire multiboot header lets it figure out multiboot, but it still can’t get the printing part right.
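For the curious, reading the listing (not a debugger trace, so treat this as a best guess) a few things look off. The string bytes sit at the very top of _Main, so the CPU tries to execute 'Hello World!' as instructions. The caller puts the string pointer in ebx while _vga_write expects it in eax. And rep stosw uses ecx as a repeat count, but by then ecx holds a video memory address, so there is no loop that walks the string at all.

Below is a minimal sketch of what a working version might look like, as a drop-in replacement for _Main and _vga_write in the listing above. It assumes the loader leaves us in 32-bit protected mode with flat segments (which the Multiboot spec promises) and that the display is in the usual 80x25 VGA text mode. The vga_write name is hypothetical, and the sketch ignores the cursor position and simply writes at the top-left corner of the screen.

```nasm
; Sketch only: replaces _Main and _vga_write from the listing above.
; Relies on VGA_MEM_LOC (0xB8000) and the cli/hlt loop defined there.

_Main:
        mov     esi, hello_world        ; esi = address of the string
        call    vga_write
        ret                             ; return to the caller, which halts

; vga_write: copy a null-terminated string to the top-left of the screen.
; Input:    esi = pointer to the string
; Clobbers: eax, esi, edi
vga_write:
        mov     edi, VGA_MEM_LOC        ; edi = start of the VGA text buffer
        mov     ah, 0x0F                ; attribute byte: white on black
        cld                             ; make lodsb/stosw move forward
.next:
        lodsb                           ; al = [esi], then esi += 1
        test    al, al                  ; hit the terminating zero?
        jz      .done
        stosw                           ; store char + attribute, then edi += 2
        jmp     .next
.done:
        ret

hello_world:
        db      'Hello World!', 0       ; data sits after the code so it is never executed
```

Each character cell in VGA text mode is two bytes (the character, then its color attribute), which is why the loop reads one byte at a time but writes a word at a time.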
Now, I’m sure if I spent all day giving ChatGPT specific instructions it would work, but I honestly don’t want to (if a reader would like to, I would love to see the results), and at some point it defeats the purpose of using a chatbot to code.
ChatGPT failing at assembly suggests it wasn’t trained on much of it. That’s not too surprising if you consider that there is probably very little assembly in its data set, and on top of that probably not much about how low-level code works in general (I believe GitHub was its primary code dataset, so that’s hardly surprising).
I was honestly expecting this to be an interesting post where we tested the limits of ChatGPT but in the end got something working. We didn’t, and I think a part 2 will be in the works where we maybe use C or straight up feed it technical documentation.
If you enjoyed the post, please subscribe and follow me on Twitter.
Happy Hacking!