MASM comes on 5 disks. Install it somewhere and add it to your PATH.
        ❯ set MASM_INSTALL_PATH "$HOME/<path_to_ml.exe>"
       
        ❯ dosbox-x -c "MOUNT C: "(pwd) -c "MOUNT D "$MASM_INSTALL_PATH -c "SET PATH=%PATH%;D:\\" 
       
Check if that works. Once I have a simple "Hello World," I’ll create a Makefile. For now, let’s just try to get something running.
        .386
        .model small
        .stack 100h

        .data
        msg db "Hello World!", 13, 10, '$'

        .code
        .startup
            mov ax, @data
            mov ds, ax

            mov ah, 9
            mov dx, offset msg
            int 21h

            mov ah, 4Ch
            mov al, 0
            int 21h

        .exit
        end
       
Let's assemble it:
        C:\>ML DAY01.ASM
        Microsoft (R) Macro Assembler Version 6.11
        Copyright (C) Microsoft Corp 1981-1993. All rights reserved.

        Assembling: DAY01.ASM
        DAY01.ASM(9): error A2199: .STARTUP does not work with 32-bit segments
        DAY01.ASM(21): error A2198: .EXIT does not work with 32-bit segments
        DAY01.ASM(14): error A2022: instruction operands must be the same size
       
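Oops. What’s going on here? It took me a while and one question (not even an answer) on Stack Overflow to figure out that the order of assembler directives really matters. If `.386` comes before `.model`, MASM makes the segments 32-bit by default; put `.model small` first and the segments stay 16-bit while the 386 instructions remain available. Amusingly, this detail is only spelled out in the MASM 5.1 manual and seems to have vanished from the MASM 6.0 docs…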

Documents archived by BitSavers.org
Microsoft Macro Assembler 5.1 Programmer's Guide
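A minimal sketch of the corrected listing, assuming the fix is simply to declare the memory model before selecting the processor. Since `.startup` already loads `DS` and `.exit` already calls INT 21h with AH=4Ch, the hand-written versions of both are left out here; that is a simplification of mine, not something the error messages require.

```asm
.model small            ; memory model first: segments stay 16-bit
.386                    ; then allow 80386 instructions and registers
.stack 100h

.data
msg db "Hello World!", 13, 10, '$'

.code
.startup                ; generated startup code loads DS for us
    mov ah, 9           ; DOS: print the '$'-terminated string at DS:DX
    mov dx, offset msg
    int 21h
.exit 0                 ; generated exit code: INT 21h, AH=4Ch, AL=0
end
```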
With that small change, everything’s back to how it should be. Next up: debugging. DosBox-X ships with a pretty nice debugger, but you need to build it yourself. The version you get from Homebrew doesn’t have debugging enabled. Fortunately, you can compile it by hand:
        # inside a checkout of the dosbox-x repository (master branch)
        ❯ ./build-debug-macos-sdl2
       
Before we jump into the puzzle, let’s see how to get command-line arguments. This is probably the second time I’ve had to dig into the PSP recently, and it turns out that if you want argv in DOS, you need to read the length byte at offset `80h` and then the string stored between `81h` and `FFh`. It’s capped at 127 bytes.
        C:\>DEBUGBOX DAY01.EXE EXAMPLE.TXT

`DEBUGBOX` is an internal command that loads a program with parameters and sets a breakpoint at the first instruction. It looks like this:

Screenshot by author.
Program Segment Prefix memory view under DosBox-X debugger
When I look at the Program Segment Prefix under Data View, I can see the `EXAMPLE.TXT` string in the command-line tail section (from offset `80h`), but I also see it lower down, starting at offset `5Dh`. That was unexpected. It turns out there are two File Control Block (FCB) entries in the PSP, at `5Ch` and `6Ch`. FCBs predate file handles and come from CP/M. DOS kept them for backward compatibility, but they aren’t typically relevant for modern usage. Wikipedia states:
The FCB originates from CP/M and is also present in most variants of DOS, though only as a backward compatibility measure in MS-DOS versions 2.0 and later.
This is spot on. MS-DOS 5.0 still documents the old FCB functions, albeit briefly, and doesn’t explain the FCB fields in the PSP in any great detail. For that, you have to check older manuals like the Microsoft MS-DOS 3.2 Programmer's Reference:

Documents archived by Internet Archive
Microsoft MS-DOS 3.2 Programmer's Reference
MS-DOS populates these blocks with the first two parameters passed on the command line. If a parameter includes a path, DOS stores only the drive (with no filename). For instance:
        C:\>DAY01.EXE C:

The Data View might look like this (the `03` at offset `5Ch` is the drive number for C:, and the name field after it is all spaces):
        0814:00000050 CD 21 CB 00 00 00 00 00 00 00 00 00 03 20 20 20  .!...........   
        0814:00000060 20 20 20 20 20 20 20 20 00 00 00 00 00 20 20 20          .....   

If I do:
        C:\>DAY01.EXE DOESNTEX.IST

Then Data View looks like:
        0814:00000050 CD 21 CB 00 00 00 00 00 00 00 00 00 00 44 4F 45  .!...........DOE
        0814:00000060 53 4E 54 45 58 49 53 54 00 00 00 00 00 20 20 20  SNTEXIST.....   

Notice how DOS doesn’t validate whether the argument is a filename. It just takes the first one and populates the FCB. It also doesn’t preserve the dot (.), so `DOESNTEX.IST` becomes `DOESNTEXIST` internally (FAT-style). That’s a conversation for another day.
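
To look at that first FCB without the debugger, a fragment along these lines could print its name field. It is only a sketch: it assumes it runs inside a normal DOS program, and the label `print_fcb_name` is made up for the example. The drive byte sits at `5Ch`, followed by 8 bytes of name and 3 bytes of extension, all space padded.

```asm
    mov ah, 51h         ; DOS: get PSP segment in BX
    int 21h
    mov es, bx          ; ES = PSP

    mov si, 5Dh         ; 5Ch is the drive byte, the name starts at 5Dh
    mov cx, 11          ; 8 bytes of name + 3 bytes of extension
print_fcb_name:
    mov dl, es:[si]
    mov ah, 02h         ; DOS: write the character in DL to stdout
    int 21h
    inc si
    loop print_fcb_name
```

Run as `DAY01.EXE DOESNTEX.IST`, this should print the same eleven name bytes that show up in the dump above.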
What’s also interesting is that the MS-DOS 3.2 manual still refers to the FCB as something used by old system calls. Wikipedia notes that FCBs were introduced in MS-DOS 1.0, and by version 2.0 they were only kept for compatibility. I can’t fully verify this, but from what I’ve seen, the 3.2 manual labels them as “old,” while the MS-DOS 2.00 Programmer's Manual doesn’t. Without digging into the actual source code, it’s hard to say for sure. What we do know is that DOS 3.0 underwent a major kernel rewrite, at which point file-handle-based I/O became the primary focus.
Anyway, for our purpose, the single byte at `80h` is the length of the command-line arguments. The actual string starts at `81h` and begins with the space that separated it from the program name; a carriage return (`0Dh`) follows its last character.

Documents archived by Internet Archive
Microsoft MS-DOS 3.2 Programmer's Reference
Let’s update our program to retrieve the filename from the PSP. First, we declare a buffer:
            FileNameBuffer       db  128 dup (0) 
       
Then we ask DOS for the PSP segment, just to be sure:
            mov ah, 51h            ; Get PSP segment
            int 21h
            mov es, bx             ; ES now points to PSP

Check if there’s a parameter string:
            mov si, 80h
            xor ax, ax
            mov al, es:[si]         ; AL = length of parameter string
            cmp al, 0
            jz  exit                ; If zero, no arguments => print usage

If not zero, we copy it out of the PSP into our own buffer. Sure, we could use a string instruction such as `rep movsb`, but that would mean juggling segment registers between our data segment and the PSP. Instead, I just wrote a small copy loop:
            add si, 2               ; first character in the command-line tail is a space!
            mov cx, ax
            dec cx                  ; the length counts that space, so copy one byte less
            jz  exit                ; nothing after the space => print usage
            mov di, offset FileNameBuffer
            cld

        copy_loop:
            mov al, es:[si]
            mov ds:[di], al
            inc si
            inc di
            loop copy_loop

            mov al, '$'
            mov ds:[di], al         ; INT 21h, AH=09h prints '$'-terminated strings, not null-terminated ones
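
For comparison, this is roughly what a string-instruction variant of the copy could look like. It is a sketch rather than the code used in the solution, and it assumes `ES` still points at the PSP, `DS` at the data segment, and `CX` already holds the number of bytes to copy.

```asm
    ; Assumes: ES = PSP, DS = data segment, CX = number of bytes to copy
    push ds                 ; keep a copy of the data segment for later
    push es
    push ds
    pop  es                 ; ES = data segment (destination for MOVSB)
    pop  ds                 ; DS = PSP (source for MOVSB)

    mov  si, 82h            ; first character after the leading space
    mov  di, offset FileNameBuffer
    cld                     ; copy forwards
    rep  movsb              ; copy CX bytes from DS:SI to ES:DI

    pop  ds                 ; DS = data segment again
    mov  byte ptr [di], '$' ; terminate for INT 21h, AH=09h
```

The copy itself shrinks to one instruction, but the segment shuffle around it costs about as much as the hand-written loop.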
       
All that remains is a proper Makefile. It’s tricky to capture DosBox-X output directly, but DOS redirection doesn’t get passed to the PSP, so we can safely use it. Then we can just `cat` the output file in our Makefile:
        MASM_INSTALL_PATH ?= "${HOME}/.dos/MASM611"
        MASM_MOUNT ?= "D"
        DOSBOX_CMD ?= "dosbox-x"
        ROOT_DIR:=$(shell dirname $(realpath $(firstword $(MAKEFILE_LIST))))

        ASM_SRC = day01.asm
        EXE = $(basename $(ASM_SRC)).exe

        all: $(EXE)

        $(EXE): $(ASM_SRC)
        	@$(DOSBOX_CMD) -silent -nolog -nogui \
        	    -c "MOUNT C $(ROOT_DIR)" \
        	    -c "MOUNT $(MASM_MOUNT) $(MASM_INSTALL_PATH)/BIN" \
        	    -c "SET PATH=%PATH%;$(MASM_MOUNT):\\" \
        	    -c "C:" \
        	    -c "ML /Zd /Zi $(notdir $<) > ML.TXT" \
        	    -c "exit"
        	@cat ML.TXT
        	@rm -f ML.TXT

        .PHONY: run
        run: $(EXE)
        	@$(DOSBOX_CMD) -silent -nolog -nogui \
        	    -c "MOUNT C $(ROOT_DIR)" \
        	    -c "C:" \
        	    -c "$(EXE) > RUN.TXT" \
        	    -c "exit"
        	@cat RUN.TXT
        	@rm -f RUN.TXT

        .PHONY: clean
        clean:
        	@rm -f $(EXE)
       
Now let’s tackle the puzzle.
I decided to write a simple Bubble Sort function for sorting a list of 32-bit integers. I toyed with the idea of doing Quick Sort, but with only 1000 integers, Bubble Sort is good enough and way more compact.
I used a custom calling convention that relies on registers only. Typically, register-based x86 calling conventions pass arguments in a fixed set of general-purpose registers, but I chose `CX` for the element count, `SI` for the array offset, and so on, to minimize trouble and for semantic clarity:
        ;--------------------------------------------------------------------
        ; Sub-procedure to sort an array of 32-bit unsigned integers
        ; In:
        ;   CX - element count
        ;   SI - offset to array
        ; Destroys:
        ;   BX, CX, EDX, SI, DI
        ;--------------------------------------------------------------------
        BubbleSort32 PROC
            ; if array has < 2 elements, no sort needed
            cmp cx, 2
            jb bs32_exit

            ; Set DI to the start of the array, for the inner loops
            mov di, si
            dec cx

        bs32_outer_loop:
            cmp cx, 0
            je bs32_exit

            mov bx, cx                      ; set counter for the inner loop
            mov si, di                      ; set pointer for the inner loop

            bs32_inner_loop:
                mov  edx, [si]              ; get array[i]
                cmp  edx, [si + 4]          ; compare edx (array[i]) and array[i+1]
                jbe  bs32_no_swap           ; unsigned: if array[i] <= array[i+1] -> no swap needed

                xchg edx, [si + 4]          ; array[i+1] is set to array[i], while edx holds old array[i+1] value
                mov [si], edx               ; array[i] = old array[i+1]

            bs32_no_swap:
                add si, 4
                dec bx
                jnz bs32_inner_loop

            dec cx
            jmp bs32_outer_loop

        bs32_exit:
            ret
        BubbleSort32 ENDP
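
A call site for this convention might look like the following sketch. It assumes `LineCount` is a word-sized variable holding the number of parsed lines and that `LeftArray` and `RightArray` are the two dword arrays, which are the names used in the calls further down.

```asm
    mov cx, [LineCount]         ; number of elements
    mov si, offset LeftArray
    call BubbleSort32

    mov cx, [LineCount]         ; reload: CX was destroyed by the first call
    mov si, offset RightArray
    call BubbleSort32
```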
       
Then I wrote another function to compute the sum of differences between two arrays. I used the stdcall calling convention, which I remembered from my WATCOM C++ days: parameters go on the stack, pushed right to left, and the callee cleans them up:
        push offset RightArray
        push offset LeftArray
        push [LineCount]
        call FindSumOfDifferences
       
Under the DosBox-X debugger, you can see the stack in Data View:
![Screenshot of DosBox-X debugger. There are 4 windows: Register Overview, Data view, Code Overview and Output. Code Overview has a piece of code highlighted. The code does push 4C50; push 3B20; push word [3B1E]; call 000001D0; The data view has a rectangle highlighting corresponding stack area and shows how parameters and return address for the function look on the stack.](/images/2025/01/babel-of-code-w01/dosbox-stdcall.png)
Screenshot by author.
Stack memory view after function call under DosBox-X debugger
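The top of the stack is at `SS:SP`. The first value there is the return address: the `call` instruction automatically saves the `IP` of the next instruction, so `ret` knows where to return to. Next are our parameters: the value of `LineCount`, then `3B20` (the offset of `LeftArray`) and `4C50` (the offset of `RightArray`).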
Here's the final code:
        FindSumOfDifferences PROC
            push bp
            push si             ; SI and DI must be preserved;
            push di             ; only EAX, ECX and EDX are free to use
            mov  bp, sp

            mov cx, [bp + 8]    ; Lines Count. +8 because we also pushed BP, SI
                                ; and DI, so the stack looks like this:
                                ; BP+0  OLD DI
                                ; BP+2  OLD SI
                                ; BP+4  OLD BP
                                ; BP+6  RET ADDRESS
                                ; BP+8  LinesCount
                                ; BP+10 offset LeftArray
                                ; BP+12 offset RightArray
            mov si, [bp + 10]
            mov di, [bp + 12]

            xor eax, eax        ; EAX = running sum
            xor edx, edx

        fsod_loop:
            mov edx, [si]
            cmp edx, [di]
            jbe fsod_smaller
            sub edx, [di]       ; left > right: EDX = left - right
            jmp fsod_advance

        fsod_smaller:
            mov edx, [di]
            sub edx, [si]       ; left <= right: EDX = right - left

        fsod_advance:
            add si, 4
            add di, 4
            add eax, edx
            loop fsod_loop

            pop di
            pop si
            pop bp
            ret 6               ; stdcall: callee removes the three word-sized parameters
        FindSumOfDifferences ENDP
       
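WATCOM C++ had (in a roundabout way) a register calling convention, so I took a stab at something similar for the next function. It’s not terribly complicated, it just requires a bit of planning with registers:
        ;--------------------------------------------------------------------
        ; Adds up every left-array value once for each time it occurs in the
        ; right array.
        ; In:
        ;   EAX - element count (same for both arrays)
        ;   DX  - offset to left array
        ;   BX  - offset to right array
        ; Out:
        ;   EAX - sum
        ; Destroys:
        ;   EBX, ECX, EDX
        ;--------------------------------------------------------------------
        FindOccurencesCount PROC
            push si
            push di

            mov si, dx                  ; SI = left array
            mov di, bx                  ; DI = right array

            mov ecx, eax                ; CX = outer loop counter
            xor edx, edx                ; EDX = running sum
        foc_left_loop:
            push ecx                    ; save outer counter, ECX is reused below
            mov ebx, eax                ; EBX = inner loop counter
            mov ecx, [si]               ; ECX = current left value
            push edi                    ; remember the start of the right array
            foc_right_loop:
                cmp ecx, [di]
                jne foc_right_advance
                add edx, ecx            ; match: add the value to the sum

            foc_right_advance:
                add di, 4
                dec ebx
                jnz foc_right_loop

            pop edi                     ; rewind to the start of the right array
            add si, 4
            pop ecx                     ; restore outer counter
            loop foc_left_loop

            mov eax, edx
            pop di
            pop si
            ret
        FindOccurencesCount ENDP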
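
Judging from how the procedure reads its registers, a call could look like the sketch below. It again assumes `LineCount` is a word-sized variable. The zero-extension matters because the inner loop counts down with the full 32-bit `EBX`, which is loaded from `EAX`.

```asm
    movzx eax, word ptr [LineCount] ; element count, upper 16 bits cleared
    mov dx, offset LeftArray
    mov bx, offset RightArray
    call FindOccurencesCount        ; result comes back in EAX
```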
       
And there you have it. The complete solution is on Codeberg. Enjoy!