Reverse-Engineering DOS 1.0 – Part 1: The Boot Sector

Update: The source is available at github.com/mist64/msdos1

The bootsector of DOS 1.0 is celebrating its 28th birthday today (it contains the timestamp “7-May-81”), so let’s look at it more closely.

Here it is:

00000000  eb 2f 14 00 00 00 60 00  20 37 2d 4d 61 79 2d 38  |........ 7-May-8|
00000010  31 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |1...............|
00000020  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000030  00 fa 8c c8 8e d8 ba 00  00 8e d2 bc 00 7c fb a1  |................|
00000040  06 7c 8e d8 8e c0 ba 00  00 8b c2 cd 13 72 41 e8  |................|
00000050  58 00 72 fb 2e 8b 0e 02  7c 51 bb 00 00 33 d2 b9  |................|
00000060  08 00 be 01 00 56 b0 01  b4 02 cd 13 72 22 5e 58  |................|
00000070  e8 e7 00 2b c6 74 14 fe  c5 b1 01 be 08 00 3b c6  |................|
00000080  73 04 8b f0 eb 01 96 56  50 eb dd 2e ff 2e 04 7c  |................|
00000090  be 44 7d b8 42 7d 50 32  ff ac 24 7f 74 0b 56 b4  |................|
000000a0  0e bb 07 00 cd 10 5e eb  f0 c3 bb 00 00 b9 04 00  |................|
000000b0  b8 01 02 cd 13 1e 72 34  8c c8 8e d8 bf 00 00 b9  |................|
000000c0  0b 00 26 80 0d 20 26 80  8d 20 00 20 47 e2 f3 bf  |................|
000000d0  00 00 be 76 7d b9 0b 00  fc f3 a6 75 0f bf 20 00  |................|
000000e0  be 82 7d b9 0b 00 f3 a6  75 02 1f c3 be f9 7c e8  |................|
000000f0  a5 ff b4 00 cd 16 1f f9  c3 0d 0a 4e 6f 6e 2d 53  |...........Non-S|
00000100  79 73 74 65 6d 20 64 69  73 6b 20 6f 72 20 64 69  |ystem disk or di|
00000110  73 6b 20 65 72 72 6f f2  0d 0a 52 65 70 6c 61 63  |sk erro?..Replac|
00000120  65 20 61 6e 64 20 73 74  72 69 6b 65 20 61 6e 79  |e and strike any|
00000130  20 6b 65 79 20 77 68 65  6e 20 72 65 61 64 f9 0d  | key when read?.|
00000140  0a 00 cd 18 0d 0a 44 69  73 6b 20 42 6f 6f 74 20  |......Disk Boot |
00000150  66 61 69 6c 75 72 e5 0d  0a 00 50 52 8b c6 bf 00  |failur?.........|
00000160  02 f7 e7 03 d8 5a 58 c3  52 6f 62 65 72 74 20 4f  |........Robert O|
00000170  27 52 65 61 72 20 69 62  6d 62 69 6f 20 20 63 6f  |'Rear ibmbio  co|
00000180  6d b0 69 62 6d 64 6f 73  20 20 63 6f 6d b0 c9 00  |m.ibmdos  com...|
00000190  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000200

DOS 1.0 shipped on a 160 KB single-sided disk. The boot code in the IBM PC’s BIOS loaded the first sector into RAM at segment 0x0000, offset 0x7C00 and ran it. Later versions of the BIOS checked for 0xAA55 in the last word of the bootsector, but the first version did not. Note that DOS 1.0 also predates the BIOS Parameter Block, i.e. the bootsector does not contain any information about the physical layout of the disk, since there was only a single disk size.
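To make the layout of the first few bytes concrete, here is a minimal Python sketch that decodes the header of the dump above and checks the last word for 0xAA55. The field names (`numsectors`, `offset`, `segment`) are borrowed from the disassembly further down, and `parse_header` is a hypothetical helper. Only the first 18 bytes are copied from the dump; the code bytes in between are replaced by zero padding here, which is an assumption made purely to keep the example short (the dump does show zeros from 0x190 to the end of the sector).

```python
import struct

# First 18 bytes of the boot sector, copied from the hex dump.
HEADER = bytes.fromhex(
    "eb 2f 14 00 00 00 60 00 "
    "20 37 2d 4d 61 79 2d 38 31 00"
)

# Illustration only: pad to a full sector with zeros instead of the real code.
SECTOR = HEADER.ljust(512, b"\x00")


def parse_header(sector: bytes) -> dict:
    """Decode the jump, the three load parameters and the timestamp."""
    assert sector[0] == 0xEB, "expected a short jump (opcode 0xEB)"
    numsectors, offset, segment = struct.unpack_from("<HHH", sector, 2)
    end = sector.index(0, 8)  # the timestamp is zero-terminated
    (last_word,) = struct.unpack_from("<H", sector, 510)
    return {
        # A short jump is relative to the address of the next instruction.
        "entry": 2 + sector[1],
        "numsectors": numsectors,
        "offset": offset,
        "segment": segment,
        "timestamp": sector[8:end].decode("ascii"),
        "has_aa55": last_word == 0xAA55,
    }


if __name__ == "__main__":
    for key, value in parse_header(SECTOR).items():
        print(f"{key:10} {value!r}")
```

The jump target comes out as 0x31, which is where the first instruction (`fa`, i.e. `cli`) sits in the dump.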

What the boot sector is supposed to do is read “IBMBIO.COM” and “IBMDOS.COM” into RAM and run them – these are the DOS system files for machine abstraction and DOS API, respectively. In MS-DOS, they would be called “IO.SYS” and “MSDOS.SYS”.

But the DOS 1.0 bootsector takes quite a lot of shortcuts. It assumes the disk is always single-sided with 40 tracks of 8 sectors each, and that the two files occupy the first sectors of the data area contiguously – something that SYS.COM could guarantee when making a disk bootable.

So the bootsector first loads the first sector of the root directory (hardcoded to track 0, sector 4) and compares the first two entries with “IBMBIO.COM” and “IBMDOS.COM”. For some reason, the comparison is case-insensitive, although DOS only allows uppercase filenames.

00000600  49 42 4d 42 49 4f 20 20  43 4f 4d 06 00 00 00 00  |IBMBIO  COM.....|
00000610  00 00 00 00 00 00 00 00  f7 02 02 00 80 07 00 00  |................|
00000620  49 42 4d 44 4f 53 20 20  43 4f 4d 06 00 00 00 00  |IBMDOS  COM.....|
00000630  00 00 00 00 00 00 00 00  0d 03 06 00 00 19 00 00  |................|
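The check on these two entries can be sketched in a few lines of Python. `lowered_name`, `decode` and `is_system_disk` are hypothetical helpers. The ORing with 0x20 mirrors what the boot sector does before comparing. The field offsets in `decode` (date at 0x18, start cluster at 0x1A, size at 0x1C, with the usual packed FAT date) are an assumption based on the directory format of later DOS versions, although the values they yield for this dump look plausible.

```python
import struct

# The two directory entries from the dump above (root directory, sector 4).
DIRECTORY = bytes.fromhex(
    "49 42 4d 42 49 4f 20 20 43 4f 4d 06 00 00 00 00 "
    "00 00 00 00 00 00 00 00 f7 02 02 00 80 07 00 00 "
    "49 42 4d 44 4f 53 20 20 43 4f 4d 06 00 00 00 00 "
    "00 00 00 00 00 00 00 00 0d 03 06 00 00 19 00 00"
)


def lowered_name(entry: bytes) -> bytes:
    """OR each of the 11 name bytes with 0x20, like the boot sector does."""
    return bytes(b | 0x20 for b in entry[:11])


def is_system_disk(directory: bytes) -> bool:
    """True if the first two entries are IBMBIO.COM and IBMDOS.COM."""
    return (lowered_name(directory[0:32]) == b"ibmbio  com"
            and lowered_name(directory[32:64]) == b"ibmdos  com")


def decode(entry: bytes) -> dict:
    """Decode a 32-byte entry (assumed layout, see the text above)."""
    date, cluster, size = struct.unpack_from("<HHI", entry, 0x18)
    return {
        "name": entry[:11].decode("ascii"),
        "attr": entry[11],
        "date": (1980 + (date >> 9), (date >> 5) & 0x0F, date & 0x1F),
        "cluster": cluster,
        "size": size,
    }


if __name__ == "__main__":
    print(is_system_disk(DIRECTORY))
    print(decode(DIRECTORY[0:32]))
    print(decode(DIRECTORY[32:64]))
```

Note that ORing with 0x20 leaves the padding spaces (already 0x20) unchanged, so the 11-byte comparison works on the padded 8+3 name as a whole.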

If they are not there, it prompts the user to replace the disk and tries again. Otherwise, it loads 20 sectors starting from track 0, sector 8 to segment 0x0060, offset 0x0000 into memory and jumps there.
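The 20 sectors are not fetched in one go. The sketch below models the read loop of the disassembly in Python, under the geometry the boot sector assumes (8 sectors per track, one side): one sector to finish track 0, then up to 8 sectors per track. `read_plan` and `image_offset` are hypothetical names, and the code is a model of the loop rather than a translation of it.

```python
SECTOR_SIZE = 512
SECTORS_PER_TRACK = 8


def image_offset(track: int, sector: int) -> int:
    """Byte offset in a single-sided disk image; sectors are 1-based."""
    return (track * SECTORS_PER_TRACK + (sector - 1)) * SECTOR_SIZE


def read_plan(total: int = 20, load_segment: int = 0x60) -> list:
    """Return (track, first_sector, count, linear_address) per BIOS read."""
    plan = []
    track, sector, offset = 0, 8, 0
    remaining = total
    count = 1  # only sector 8 is left on track 0
    while remaining > 0:
        plan.append((track, sector, count, load_segment * 16 + offset))
        offset += count * SECTOR_SIZE        # advance the load address
        remaining -= count
        track, sector = track + 1, 1         # next track, start at sector 1
        count = min(remaining, SECTORS_PER_TRACK)
    return plan


if __name__ == "__main__":
    for track, sector, count, address in read_plan():
        print(f"track {track} sector {sector}: {count} sector(s) "
              f"-> 0x{address:04X}")
    print(hex(image_offset(0, 4)), hex(image_offset(0, 8)))
```

With the default values this gives four reads of 1, 8, 8 and 3 sectors. The two offsets printed at the end, 0x600 and 0xe00, match the positions of the directory and IBMBIO.COM dumps in this post.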

0x60:0x0000 is the same as the linear address 0x0600. On the IBM PC, 0x0000 to 0x03FF are occupied by the interrupt vectors, 0x0400 to 0x04FF is used by BIOS for its variables, and DOS 1.0 uses 0x0500 to 0x05FF as the “DOS Communication Area”, so code can start at 0x0600.
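The address arithmetic is simple enough to check in a few lines of Python. `linear` and `region_of` are hypothetical helpers, and the table just restates the three regions listed above.

```python
def linear(segment: int, offset: int) -> int:
    """Real-mode address: segment * 16 + offset, wrapped to 20 bits."""
    return ((segment << 4) + offset) & 0xFFFFF


# The low-memory regions named in the text, as (first, last, purpose).
LOW_MEMORY = [
    (0x0000, 0x03FF, "interrupt vectors"),
    (0x0400, 0x04FF, "BIOS variables"),
    (0x0500, 0x05FF, "DOS Communication Area"),
]


def region_of(address: int) -> str:
    """Name the reserved region an address falls into, or 'free'."""
    for first, last, purpose in LOW_MEMORY:
        if first <= address <= last:
            return purpose
    return "free"


if __name__ == "__main__":
    load_start = linear(0x0060, 0x0000)
    load_end = load_start + 20 * 512      # 20 sectors of 512 bytes
    print(hex(load_start), region_of(load_start))
    print(hex(load_end), hex(linear(0x0000, 0x7C00)))
```

The last line shows that the 20 sectors end at 0x2E00, well below the boot sector at 0x7C00, so the loader does not overwrite itself or its stack while reading.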

00000e00  e9 62 01 e9 6d 00 e9 b2  00 e9 d8 00 e9 e8 00 e9  |................|
00000e10  24 01 e9 3a 01 e9 51 03  e9 52 03 e9 44 01 b1 00  |................|
00000e20  22 00 42 49 4f 53 20 56  65 72 73 69 6f 6e 20 31  |..BIOS Version 1|
00000e30  2e 30 30 a0 32 32 2d 4a  75 6c 2d 38 31 00 0d 0a  |.00.22-Jul-81...|

These are the first few bytes of IBMBIO.COM. Its birthday is about eleven weeks from now, but I am going to post about its internals next week.

Let us finally look at the complete commented disassembly of the boot sector. It assembles with NASM, but NASM will emit a few bytes differently because of variations in the instruction encoding; variations in size have been compensated with NOPs. It is actually quite easy to read.

;-----------------------------------------------------------------------------
; DOS 1.0 Boot Sector (disk image MD5 73c919cecadf002a7124b7e8bfe3b5ba)
;   http://www.pagetable.com/
;-----------------------------------------------------------------------------

                org 0x7C00

                jmp     short start

;-----------------------------------------------------------------------------

os_numsectors   dw 20                   ; how many sectors to read
os_offset       dw 0                    ; offset to load code into
os_segment      dw 0x60                 ; segment to load code into

                db " 7-May-81",0        ; timestamp
                times 31 db 0           ; padding

;-----------------------------------------------------------------------------

start           cli
                mov     ax, cs
                mov     ds, ax          ; DS := CS
                mov     dx, 0
                mov     ss, dx          ; SS := 0000
                mov     sp, 0x7C00      ; stack below code
                sti
                mov     ax, [os_segment]
                mov     ds, ax
                mov     es, ax          ; ES := DS := where to load DOS
                mov     dx, 0
                mov     ax, dx
                int     0x13            ; reset drive 0
                jc      disk_error
again           call    check_sys_files ; check for presence of IBMDOS/IBMBIO
                jc      again           ; not found, try another disk
                mov     cx, [cs:os_numsectors]
                push    cx              ; remaining sectors
                mov     bx, 0
                xor     dx, dx          ; drive 0, head 0
                mov     cx, 8           ; track 0, sector 8
                mov     si, 1           ; read 1 sector in first round
                push    si
                mov     al, 1           ; 1 sector
read_loop       mov     ah, 2
                int     0x13            ; read sector(s)
                jc      disk_error
                pop     si              ; sectors read
                pop     ax              ; remaining sectors
                call    add_si_sectors  ; bx += si*512
                sub     ax, si          ; remaining -= read
                jz      done            ; none left
                inc     ch              ; next track
                mov     cl, 1           ; start at sector 1
                mov     si, 8           ; read up to 8 sectors
                cmp     ax, si          ; how many are left to read?
                jae     at_least_8_left ; at least 8
                mov     si, ax          ; only read remaining amount
                jmp     short skip
at_least_8_left xchg    ax, si          ; read 8 sectors this time
skip            push    si              ; number of remaining sectors
                push    ax              ; number of sectors to read this time
                jmp     read_loop       ; next read
done            jmp     far [cs:os_offset]; jump to IBMBIO.COM

disk_error      mov     si, FAILURE     ; string to print
                mov     ax, rom_basic   ; put return address of "int 18" code
                push    ax              ; onto stack

;-----------------------------------------------------------------------------
; print zero-terminated string pointed to by DS:SI
;-----------------------------------------------------------------------------

print           xor     bh, bh          ; XXX unnecessary
print_loop      lodsb
                and     al, 0x7F        ; clear bit 7 XXX why is it set?
                jz      ret0            ; zero-termination
                push    si
                mov     ah, 0x0E
                mov     bx, 7           ; light grey, text page 0
                int     0x10            ; write character
                pop     si
                jmp     print_loop
ret0            retn

;-----------------------------------------------------------------------------
; test for IBMBIO.COM and IBMDOS.COM in the first two directory entries
;-----------------------------------------------------------------------------

check_sys_files mov     bx, 0           ; read to address 0 in the DOS segment
                mov     cx, 4           ; track 0, sector 4
                mov     ax, 0x0201
                int     0x13            ; read 1 sector
                push    ds
                jc      non_system_disk ; error case
                mov     ax, cs
                mov     ds, ax          ; DS := CS
                mov     di, 0
                mov     cx, 11          ; convert 11 bytes of first two
to_lower        or      byte [es:di], 0x20; directory entries to lowercase
                or      byte [es:di+0x20], 0x20
                nop                     ; XXX original assembler wasted a byte
                inc     di
                loop    to_lower
                mov     di, 0           ; first entry
                mov     si, IBMBIO_COM
                mov     cx, 11
                cld
                rep cmpsb               ; compare first entry with IBMBIO.COM
                jnz     non_system_disk
                mov     di, 0x20        ; second entry
                mov     si, IBMDOS_COM
                mov     cx, 11
                rep cmpsb               ; compare second entry with IBMDOS.COM
                jnz     non_system_disk
                pop     ds
                retn                    ; return with carry clear
non_system_disk mov     si, NON_SYSTEM_DISK
                call    print
                mov     ah, 0
                int     0x16            ; wait for key
                pop     ds
                stc
                retn                    ; return with carry set

;-----------------------------------------------------------------------------

NON_SYSTEM_DISK db 13,10
                db "Non-System disk or disk erro",'r'+0x80
                db 13,10
                db "Replace and strike any key when read",'y'+0x80
                db  13,10,0

;-----------------------------------------------------------------------------

rom_basic       int     0x18            ; ROM BASIC

;-----------------------------------------------------------------------------

FAILURE         db 13,10
                db "Disk Boot failur",'e'+0x80
                db 13,10,0

;-----------------------------------------------------------------------------

add_si_sectors  push    ax              ; bx += si*512
                push    dx
                mov     ax, si
                mov     di, 512
                mul     di
                add     bx, ax
                pop     dx
                pop     ax
                retn

;-----------------------------------------------------------------------------

                db "Robert O'Rear "

IBMBIO_COM      db "ibmbio  com"
                db 0xB0                 ; XXX unused
IBMDOS_COM      db "ibmdos  com"
                db 0xB0, 0xC9           ; XXX unused

;-----------------------------------------------------------------------------

                times 512-($-$$) db 0

;-----------------------------------------------------------------------------
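The message strings have bit 7 set on the last letter of each line, and the print routine masks it off again before checking for the terminating zero. Here is a small Python model of that loop, with the two strings built the same way as the `db` lines in the listing. `printed_text` is a hypothetical name, and this is a sketch of the logic, not of the BIOS call.

```python
# The strings as they appear in the listing: last letter of each line + 0x80.
NON_SYSTEM_DISK = (
    b"\r\n"
    b"Non-System disk or disk erro" + bytes([ord("r") + 0x80]) +
    b"\r\n"
    b"Replace and strike any key when read" + bytes([ord("y") + 0x80]) +
    b"\r\n\x00"
)
FAILURE = b"\r\nDisk Boot failur" + bytes([ord("e") + 0x80]) + b"\r\n\x00"


def printed_text(memory: bytes, start: int = 0) -> str:
    """Model of the print loop: mask bit 7, stop on a zero result."""
    out = []
    for byte in memory[start:]:
        ch = byte & 0x7F          # and al, 0x7F
        if ch == 0:               # jz ret0
            break
        out.append(chr(ch))       # int 0x10, AH=0x0E would output it
    return "".join(out)


if __name__ == "__main__":
    print(repr(printed_text(FAILURE)))
    print(repr(printed_text(NON_SYSTEM_DISK)))
```

In this model the letters with bit 7 set are printed like any other character, and only the zero byte at the end stops the loop. A byte of 0x80 would stop it as well, since it also masks down to zero.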

19 thoughts on “Reverse-Engineering DOS 1.0 – Part 1: The Boot Sector”

  1. When I researched the inner workings of DOS a few years ago, and realized how simplistic its overall operation really was, it made me wish I had had such knowledge 10-15 years ago when I was stuck still using old 8088 machines and such. To better understand it all, I even wrote my own boot loader replacement, and then an EXE loader, just to see if I could. I only emulated a small few DOS calls though, so only a handful of classic games (like Alleycat and NYET) would run.

    DOS and a lot of the hardware seemed kind of like a big mystery back when I actually used to use it. But now, the combination of electronics and assembly knowledge make it sound like one big playground. There’s so much I would have done back then had I known these things.

  2. With reference to the print function, I think the comment “zero-termination” is incorrect. Have a close look what the author is doing to the last character in the string. He is adding 0x80. That “and al, 0x7F” enables the loop exit since it affects the ZF. The problem is that it will not actually print the last character as far as I can see.

    Just some random thoughts.

  3. bi:

    You are correct. I noticed this after I had played around with my assembler a bit. Unfortunately I could not correct my error here. One is left to speculate as to the reason for some of the redundancies in the code. Perhaps it was to cope with idiosyncrasies from the BIOS or something. Must have an interesting story behind it. Correctly implemented, setting the high bit for the last character in the string (generally 0x10) for prints would have saved an additional byte (the zero terminator). An “and al, 0x80, js ret0” moved to the end of the print function immediately after the pop si would then terminate the print loop I think.

    One’s got to appreciate the idea. Encoding the terminator to save a byte and squeeze the last little bit of space. Only got 510 bytes to work with!

  4. The “and al, 0x7F” mask has to stay. But we can use the redundant “xor bh, bh”, so:

    print mov ah, 0x0E
    print_loop lodsb
    push ax
    and al, 0x7F
    push si
    mov bx, 7
    int 0x10
    pop si
    pop ax
    and al, 0x80
    jns print_loop
    retn

    I think both implementations are 19 bytes.

  5. But… PUSH & POP are way too slow, so:

    print xor bh, bh
    print_loop lodsb
    mov dl, al
    and al, 0x7F
    push si
    mov ah, 0x0E
    mov bx, 7
    int 0x10
    pop si
    and dl, 0x80
    jns print_loop
    retn

    Same size, faster with old redundancy (just in case we need to put the operating system in the boot sector!!!)

  6. If the code reads the directory to check the first two entries, it might as well use the start position of the file; it might still assume it is contiguous. That would give the SYS command the possibility to install MSDOS on a non-empty floppy in many cases.

  7. hi everyone
    i ve a MS dos boot loader program n i ve to get its details in Assembly lang 80386. So cald reverse engineering
    pls help me if possible
    thanks in advance
