Update: The source is available at github.com/mist64/msdos1
The bootsector of DOS 1.0 is celebrating its 28th birthday today (it contains the timestamp “7-May-81”), so let’s look at it more closely.
Here it is:
00000000  eb 2f 14 00 00 00 60 00  20 37 2d 4d 61 79 2d 38  |........ 7-May-8|
00000010  31 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |1...............|
00000020  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000030  00 fa 8c c8 8e d8 ba 00  00 8e d2 bc 00 7c fb a1  |................|
00000040  06 7c 8e d8 8e c0 ba 00  00 8b c2 cd 13 72 41 e8  |................|
00000050  58 00 72 fb 2e 8b 0e 02  7c 51 bb 00 00 33 d2 b9  |................|
00000060  08 00 be 01 00 56 b0 01  b4 02 cd 13 72 22 5e 58  |................|
00000070  e8 e7 00 2b c6 74 14 fe  c5 b1 01 be 08 00 3b c6  |................|
00000080  73 04 8b f0 eb 01 96 56  50 eb dd 2e ff 2e 04 7c  |................|
00000090  be 44 7d b8 42 7d 50 32  ff ac 24 7f 74 0b 56 b4  |................|
000000a0  0e bb 07 00 cd 10 5e eb  f0 c3 bb 00 00 b9 04 00  |................|
000000b0  b8 01 02 cd 13 1e 72 34  8c c8 8e d8 bf 00 00 b9  |................|
000000c0  0b 00 26 80 0d 20 26 80  8d 20 00 20 47 e2 f3 bf  |................|
000000d0  00 00 be 76 7d b9 0b 00  fc f3 a6 75 0f bf 20 00  |................|
000000e0  be 82 7d b9 0b 00 f3 a6  75 02 1f c3 be f9 7c e8  |................|
000000f0  a5 ff b4 00 cd 16 1f f9  c3 0d 0a 4e 6f 6e 2d 53  |...........Non-S|
00000100  79 73 74 65 6d 20 64 69  73 6b 20 6f 72 20 64 69  |ystem disk or di|
00000110  73 6b 20 65 72 72 6f f2  0d 0a 52 65 70 6c 61 63  |sk erro?..Replac|
00000120  65 20 61 6e 64 20 73 74  72 69 6b 65 20 61 6e 79  |e and strike any|
00000130  20 6b 65 79 20 77 68 65  6e 20 72 65 61 64 f9 0d  | key when read?.|
00000140  0a 00 cd 18 0d 0a 44 69  73 6b 20 42 6f 6f 74 20  |......Disk Boot |
00000150  66 61 69 6c 75 72 e5 0d  0a 00 50 52 8b c6 bf 00  |failur?.........|
00000160  02 f7 e7 03 d8 5a 58 c3  52 6f 62 65 72 74 20 4f  |........Robert O|
00000170  27 52 65 61 72 20 69 62  6d 62 69 6f 20 20 63 6f  |'Rear ibmbio  co|
00000180  6d b0 69 62 6d 64 6f 73  20 20 63 6f 6d b0 c9 00  |m.ibmdos  com...|
00000190  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000200
DOS 1.0 shipped on a 160 KB single-sided disk. The boot code in the IBM PC’s BIOS loaded the first sector into RAM at segment 0x0000, offset 0x7C00 and ran it. Later versions of the BIOS checked for 0xAA55 in the last word of the bootsector, but the first version did not. Note that DOS 1.0 also predates the BIOS Parameter Block, i.e. the bootsector does not contain any information about the physical layout of the disk, since there was only a single disk size.
What the boot sector is supposed to do is read “IBMBIO.COM” and “IBMDOS.COM” into RAM and run them – these are the DOS system files for machine abstraction and DOS API, respectively. In MS-DOS, they would be called “IO.SYS” and “MSDOS.SYS”.
But the DOS 1.0 bootsector takes quite a lot of shortcuts. It assumes a 40-track, 8-sector, single-sided disk, and that the two files occupy the first sectors of the data area contiguously – something that SYS.COM could guarantee when making a disk bootable.
So the bootsector first loads the first sector of the root directory (hardcoded to track 0, sector 4) and compares the first two entries with “IBMBIO.COM” and “IBMDOS.COM”. For some reason, the comparison is case-insensitive, although DOS only allows uppercase filenames.
00000600  49 42 4d 42 49 4f 20 20  43 4f 4d 06 00 00 00 00  |IBMBIO  COM.....|
00000610  00 00 00 00 00 00 00 00  f7 02 02 00 80 07 00 00  |................|
00000620  49 42 4d 44 4f 53 20 20  43 4f 4d 06 00 00 00 00  |IBMDOS  COM.....|
00000630  00 00 00 00 00 00 00 00  0d 03 06 00 00 19 00 00  |................|
If they are not there, it prompts the user to replace the disk and tries again. Otherwise, it loads 20 sectors, starting at track 0, sector 8, into memory at segment 0x0060, offset 0x0000, and jumps there.
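The check can be modelled in a few lines of Python. This is only a sketch: the helper names are made up, the entry bytes are copied from the directory hexdump above, and the position of the file size (a 32-bit little-endian field at offset 28 of each entry) is an assumption based on the usual FAT directory layout rather than something the boot sector itself relies on.

```python
# The two root directory entries from the hexdump above (32 bytes each).
entry_bio = bytes.fromhex(
    "49424d42494f2020434f4d0600000000"
    "0000000000000000f702020080070000")
entry_dos = bytes.fromhex(
    "49424d444f532020434f4d0600000000"
    "00000000000000000d03060000190000")

def matches(entry, expected):
    """Mimic the boot sector: OR the 11 name bytes with 0x20, then compare."""
    lowered = bytes(b | 0x20 for b in entry[:11])
    return lowered == expected

def size_in_sectors(entry):
    # Assumption: file size is a 32-bit little-endian value at offset 28.
    size = int.from_bytes(entry[28:32], "little")
    return -(-size // 512)  # round up to whole 512-byte sectors

is_system_disk = (matches(entry_bio, b"ibmbio  com")
                  and matches(entry_dos, b"ibmdos  com"))
total = size_in_sectors(entry_bio) + size_in_sectors(entry_dos)
print(is_system_disk, total)
```

Under that assumption the two files need 4 + 13 = 17 sectors, which fits within the 20 sectors the boot sector loads. Note that ORing with 0x20 leaves the padding spaces (0x20) unchanged, which is why the reference strings in the boot sector are lowercase with spaces.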
0x0060:0x0000 is the same as the linear address 0x0600. On the IBM PC, 0x0000 to 0x03FF is occupied by the interrupt vectors, 0x0400 to 0x04FF is used by the BIOS for its variables, and DOS 1.0 uses 0x0500 to 0x05FF as the “DOS Communication Area”, so code can start at 0x0600.
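For readers who want to check the address arithmetic, here is a minimal sketch; the helper name `linear` is invented for this example.

```python
def linear(segment, offset):
    """8086 real-mode address: segment * 16 + offset, wrapped to 20 bits."""
    return ((segment << 4) + offset) & 0xFFFFF

load_start = linear(0x0060, 0x0000)      # where the system files are loaded
boot_sector = linear(0x0000, 0x7C00)     # where the BIOS puts the boot sector
load_end = load_start + 20 * 512         # end of the 20 sectors being loaded

print(hex(load_start), hex(load_end), hex(boot_sector))
```

The 20 sectors end at 0x2E00, well below the boot sector and its stack at 0x7C00, so the loader does not overwrite itself.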
00000e00  e9 62 01 e9 6d 00 e9 b2  00 e9 d8 00 e9 e8 00 e9  |................|
00000e10  24 01 e9 3a 01 e9 51 03  e9 52 03 e9 44 01 b1 00  |................|
00000e20  22 00 42 49 4f 53 20 56  65 72 73 69 6f 6e 20 31  |..BIOS Version 1|
00000e30  2e 30 30 a0 32 32 2d 4a  75 6c 2d 38 31 00 0d 0a  |.00.22-Jul-81...|
These are the first few bytes of IBMBIO.COM. Its birthday is about eleven weeks from now, but I am going to post about its internals next week.
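The offsets in the hexdumps follow directly from the hardcoded geometry. The sketch below (the function name `chs_to_offset` and the constants are illustrative, not taken from DOS) maps a track and sector to a byte offset in a raw disk image, assuming a single head, tracks numbered from 0 and sectors numbered from 1.

```python
# Geometry the DOS 1.0 boot sector takes for granted.
TRACKS = 40
SECTORS_PER_TRACK = 8
BYTES_PER_SECTOR = 512

def chs_to_offset(track, sector):
    """Byte offset of a sector in a raw single-sided disk image."""
    if not (0 <= track < TRACKS and 1 <= sector <= SECTORS_PER_TRACK):
        raise ValueError("outside the 40x8 geometry")
    lba = track * SECTORS_PER_TRACK + (sector - 1)
    return lba * BYTES_PER_SECTOR

capacity = TRACKS * SECTORS_PER_TRACK * BYTES_PER_SECTOR
print(capacity // 1024, "KB")        # total disk size
print(hex(chs_to_offset(0, 4)))      # first sector of the root directory
print(hex(chs_to_offset(0, 8)))      # first sector loaded as IBMBIO.COM
```

Track 0, sector 4 lands at 0x600 and track 0, sector 8 at 0xE00, matching the offsets of the directory and IBMBIO.COM dumps.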
Let us finally look at the complete commented disassembly of the boot sector. It assembles with NASM, but NASM will emit a few bytes differently because of variations in the instruction encoding; differences in size have been compensated for with NOPs. It is actually quite easy to read.
;-----------------------------------------------------------------------------
; DOS 1.0 Boot Sector (disk image MD5 73c919cecadf002a7124b7e8bfe3b5ba)
; http://www.pagetable.com/
;-----------------------------------------------------------------------------

                org 0x7C00

                jmp short start

;-----------------------------------------------------------------------------

os_numsectors   dw 20                   ; how many sectors to read
os_offset       dw 0                    ; offset to load code into
os_segment      dw 0x60                 ; segment to load code into

                db " 7-May-81",0        ; timestamp

                times 31 db 0           ; padding

;-----------------------------------------------------------------------------

start:
                cli
                mov ax, cs
                mov ds, ax              ; DS := CS
                mov dx, 0
                mov ss, dx              ; SS := 0000
                mov sp, 0x7C00          ; stack below code
                sti

                mov ax, [os_segment]
                mov ds, ax
                mov es, ax              ; ES := DS := where to load DOS

                mov dx, 0
                mov ax, dx
                int 0x13                ; reset drive 0
                jc disk_error

again:
                call check_sys_files    ; check for presence of IBMDOS/IBMBIO
                jc again                ; not found, try another disk

                mov cx, [cs:os_numsectors]
                push cx                 ; remaining sectors
                mov bx, 0
                xor dx, dx              ; drive 0, head 0
                mov cx, 8               ; track 0, sector 8
                mov si, 1               ; read 1 sector in first round
                push si
                mov al, 1               ; 1 sector

read_loop:
                mov ah, 2
                int 0x13                ; read sector(s)
                jc disk_error
                pop si                  ; sectors read
                pop ax                  ; remaining sectors
                call add_si_sectors     ; bx += si*512
                sub ax, si              ; remaining -= read
                jz done                 ; none left
                inc ch                  ; next track
                mov cl, 1               ; start at sector 1
                mov si, 8               ; read up to 8 sectors
                cmp ax, si              ; how many are left to read?
                jae at_least_8_left     ; at least 8
                mov si, ax              ; only read remaining amount
                jmp short skip
at_least_8_left:
                xchg ax, si             ; read 8 sectors this time
skip:
                push si                 ; number of remaining sectors
                push ax                 ; number of sectors to read this time
                jmp read_loop           ; next read

done:
                jmp far [cs:os_offset]  ; jump to IBMBIO.COM

disk_error:
                mov si, FAILURE         ; string to print
                mov ax, rom_basic       ; put return address of "int 18" code
                push ax                 ; onto stack

;-----------------------------------------------------------------------------
; print zero-terminated string pointed to by DS:SI
;-----------------------------------------------------------------------------
print:
                xor bh, bh              ; XXX unnecessary
print_loop:
                lodsb
                and al, 0x7F            ; clear bit 7 XXX why is it set?
                jz ret0                 ; zero-termination
                push si
                mov ah, 0x0E
                mov bx, 7               ; light grey, text page 0
                int 0x10                ; write character
                pop si
                jmp print_loop
ret0:
                retn

;-----------------------------------------------------------------------------
; test for IBMBIO.COM and IBMDOS.COM in the first two directory entries
;-----------------------------------------------------------------------------
check_sys_files:
                mov bx, 0               ; read to address 0 in the DOS segment
                mov cx, 4               ; track 0, sector 4
                mov ax, 0x0201
                int 0x13                ; read 1 sector
                push ds
                jc non_system_disk      ; error case
                mov ax, cs
                mov ds, ax              ; DS := CS
                mov di, 0
                mov cx, 11              ; convert 11 bytes of first two
to_lower:
                or byte [es:di], 0x20   ; directory entries to lowercase
                or byte [es:di+0x20], 0x20
                nop                     ; XXX original assembler wasted a byte
                inc di
                loop to_lower
                mov di, 0               ; first entry
                mov si, IBMBIO_COM
                mov cx, 11
                cld
                rep cmpsb               ; compare first entry with IBMBIO.COM
                jnz non_system_disk
                mov di, 0x20            ; second entry
                mov si, IBMDOS_COM
                mov cx, 11
                rep cmpsb               ; compare second entry with IBMDOS.COM
                jnz non_system_disk
                pop ds
                retn                    ; return with carry clear

non_system_disk:
                mov si, NON_SYSTEM_DISK
                call print
                mov ah, 0
                int 0x16                ; wait for key
                pop ds
                stc
                retn                    ; return with carry set

;-----------------------------------------------------------------------------

NON_SYSTEM_DISK db 13,10
                db "Non-System disk or disk erro",'r'+0x80
                db 13,10
                db "Replace and strike any key when read",'y'+0x80
                db 13,10,0

;-----------------------------------------------------------------------------

rom_basic:
                int 0x18                ; ROM BASIC

;-----------------------------------------------------------------------------

FAILURE         db 13,10
                db "Disk Boot failur",'e'+0x80
                db 13,10,0

;-----------------------------------------------------------------------------

add_si_sectors:
                push ax                 ; bx += si*512
                push dx
                mov ax, si
                mov di, 512
                mul di
                add bx, ax
                pop dx
                pop ax
                retn

;-----------------------------------------------------------------------------

                db "Robert O'Rear "

IBMBIO_COM      db "ibmbio  com"
                db 0xB0                 ; XXX unused
IBMDOS_COM      db "ibmdos  com"
                db 0xB0, 0xC9           ; XXX unused

;-----------------------------------------------------------------------------

                times 512-($-$$) db 0

;-----------------------------------------------------------------------------
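The push/pop juggling in read_loop is easier to follow as a trace. The following Python sketch (the function name `read_schedule` is invented here, and it models only the bookkeeping, not the BIOS calls) lists the reads the loop issues for a given sector count.

```python
def read_schedule(total_sectors=20):
    """Return (track, sector, count) for each INT 13h read the loop issues."""
    if total_sectors < 1:
        raise ValueError("the loop assumes at least one sector")
    reads = []
    track, sector, count = 0, 8, 1      # first round: track 0, sector 8 only
    remaining = total_sectors
    while True:
        reads.append((track, sector, count))
        remaining -= count              # sub ax, si
        if remaining == 0:              # jz done
            return reads
        track += 1                      # inc ch
        sector = 1                      # mov cl, 1
        count = min(remaining, 8)       # at most one full track per read

for track, sector, count in read_schedule():
    print(f"track {track}, sector {sector}: {count} sector(s)")
```

For the default of 20 sectors this gives one sector from track 0, then 8 + 8 + 3 sectors from tracks 1 to 3, so no read ever crosses a track boundary.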
When I researched the inner workings of DOS a few years ago, and realized how simplistic its overall operation really was, it made me wish I had had such knowledge 10-15 years ago when I was stuck still using old 8088 machines and such. To better understand it all, I even wrote my own boot loader replacement, and then an EXE loader, just to see if I could. I only emulated a small few DOS calls though, so only a handful of classic games (like Alleycat and NYET) would run.
DOS and a lot of the hardware seemed kind of like a big mystery back when I actually used it. But now, the combination of electronics and assembly knowledge makes it sound like one big playground. There’s so much I would have done back then had I known these things.
But who was Robert O’Rear? (00000160-170)
The long displacement near to_lower can be reproduced by writing
or byte [word es:di+0x20], 0x20
(Yes, that’s some really perverse nasm syntax…)
— bi
Robert O’Rear was one of the original MS employees. Apparently he has retired and is living the extreme high life. 🙂
http://en.wikipedia.org/wiki/Bob_O'Rear
With reference to the print function, I think the comment “zero-termination” is incorrect. Have a close look at what the author is doing to the last character in the string. He is adding 0x80. That “and al, 0x7F” enables the loop exit since it affects the ZF. The problem is that it will not actually print the last character, as far as I can see.
Just some random thoughts.
Zibbly:
Not really true. and al, 0x7f will only set the zero flag if the lower 7 bits of al are zero, i.e. the loaded byte is either precisely 0x00 or precisely 0x80. If the byte is anything else, it’ll still go on to print the character.
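A small Python model of the loop makes this easy to verify. The function name `bios_print` is made up for this sketch, and the test string is the FAILURE message from the listing.

```python
def bios_print(data):
    """Model of the boot sector's print loop over a byte string."""
    out = []
    for byte in data:
        ch = byte & 0x7F          # and al, 0x7F
        if ch == 0:               # jz ret0: the byte was 0x00 or 0x80
            break
        out.append(chr(ch))       # int 0x10, AH=0x0E
    return "".join(out)

FAILURE = b"\r\nDisk Boot failur" + bytes([ord("e") | 0x80]) + b"\r\n\x00"
print(repr(bios_print(FAILURE)))
```

The 'e' with bit 7 set is printed as a plain 'e', and the loop only stops at the trailing zero byte.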
bi:
You are correct. I noticed this after I had played around with my assembler a bit. Unfortunately I could not correct my error here. One is left to speculate as to the reason for some of the redundancies in the code. Perhaps it was to cope with idiosyncrasies of the BIOS or something. There must be an interesting story behind it. Correctly implemented, setting the high bit on the last character of the string (generally 0x0A) would have saved an additional byte (the zero terminator). An “and al, 0x80” followed by “js ret0”, moved to the end of the print function immediately after the “pop si”, would then terminate the print loop, I think.
One’s got to appreciate the idea. Encoding the terminator to save a byte and squeeze out the last little bit of space. Only got 510 bytes to work with!
The “and al, 0x7f” mask has to stay. But we can use the redundant “xor bh, bh”, so:
print mov ah, 0x0E
print_loop lodsb
push ax
and al, 0x7F
push si
mov bx, 7
int 0x10
pop si
pop ax
and al, 0x80
jns print_loop
retn
I think both implementations are 19 bytes.
But… PUSH & POP are way too slow, so:
print xor bh, bh
print_loop lodsb
mov dl, al
and al, 0x7F
push si
mov ah, 0x0E
mov bx, 7
int 0x10
pop si
and dl, 0x80
jns print_loop
retn
Same size, faster with old redundancy (just in case we need to put the operating system in the boot sector!!!)
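One caveat, as far as I can tell: both variants stop after the first byte that has bit 7 set, so the strings in the boot sector would have to change too. A hedged Python model (the function name is invented, and the data is the NON_SYSTEM_DISK message from the listing):

```python
def print_until_high_bit(data):
    """Model of the proposed loops: print each byte, stop after one with bit 7 set."""
    out = []
    for byte in data:
        out.append(chr(byte & 0x7F))   # and al, 0x7F / int 0x10
        if byte & 0x80:                # and ..., 0x80 / jns print_loop
            break
    return "".join(out)

NON_SYSTEM_DISK = (b"\r\nNon-System disk or disk erro" + bytes([ord("r") | 0x80])
                   + b"\r\nReplace and strike any key when read"
                   + bytes([ord("y") | 0x80]) + b"\r\n\x00")
print(repr(print_until_high_bit(NON_SYSTEM_DISK)))
```

With the existing data, the message would end after “error”, so the high bit would have to move to the final line feed and the mid-string markers would have to go.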
Hello
who is the guy who posted this article?
I am working on a change in a MBR FAT32 to WINXP and needing a little help!
Thanks, all
net_hw@bol.com.br
If the code reads the directory to check the first two entries, it might as well use the start position of the file; it might still assume it is contiguous. That would give the SYS command the possibility to install MSDOS on a non-empty floppy in many cases.
hi everyone
I have an MS-DOS boot loader program and I have to work out its details in 80386 assembly language, so-called reverse engineering.
pls help me if possible
thanks in advance