GB Programming: Difference between revisions

>Torchickens
No edit summary
>Torchickens
(Minor grammar fixes)
Line 1: Line 1:
Welcome to this tutorial ! The purpose of this page is to allow everyone to program anything on a Game Boy. This can sound complex, but the Game Boy is a very well-documented system, and programming for it can be learned quite easily.
Welcome to this tutorial! The purpose of this page is to allow everyone to program anything on a Game Boy. This may sound complex, but the Game Boy is a very well-documented system, and programming for it can be learned quite easily.


Our goal will first be to create [[arbitrary code execution]] programs, but later sections of this tutorial will give you tools for more general-purpose programming, so you'll be able to create your own games.
Our goal will first be to create [[arbitrary code execution]] programs, but later sections of this tutorial will give you tools for more general-purpose programming, so you'll be able to create your own games.


Let's get started !
Let's get started!




Line 9: Line 9:
This part is a collection of terms, concepts and notations that are '''vital''' for the rest of this tutorial. Do NOT skip anything here unless the text specifies you can. I mean it.
This part is a collection of terms, concepts and notations that are '''vital''' for the rest of this tutorial. Do NOT skip anything here unless the text specifies you can. I mean it.


If you don't understand something later on, read this part again, and chances are, you'll understand it.
If you don't understand something later on, read that part again, and chances are, you'll understand it.


===Numeric systems===
===Numeric systems===
Line 28: Line 28:
However, decimal is the base humans like to count in. But computers don't. Instead, they prefer '''binary'''. Binary is '''base 2''', that is, instead of working with powers of 10, we work with powers of 2. Also, only 2 symbols are allowed, 0 and 1. Each of them is called a '''bit'''.
However, decimal is the base humans like to count in. But computers don't. Instead, they prefer '''binary'''. Binary is '''base 2''', that is, instead of working with powers of 10, we work with powers of 2. Also, only 2 symbols are allowed, 0 and 1. Each of them is called a '''bit'''.


This paragraph is some trivia about why computers use binary instead of decimal. You can skip it if you will. So, why binary ? Because we need computers to be efficient. So we need to store information using electricity. The easiest variable to manipulate is "Is power running ?". The answer is either 0 (it doesn't) or 1 (it does). And there you got it ! Why do computers count in binary ? To keep them at reasonable prices !
This paragraph is some trivia about why computers use binary instead of decimal. You can skip it if you want. So, why binary? Because we need computers to be efficient, and they store information using electricity. The easiest thing to check is "Is power running?". The answer is either 0 (it isn't) or 1 (it is). And there you have it! Why do computers count in binary? To keep them at reasonable prices!


To differentiate decimal numbers from binary numbers, binary numbers will be prepended with a % symbol. So, 10 is decimal, and %10 is binary. Got it ? Okay.
To differentiate decimal numbers from binary numbers, binary numbers will be prefixed with a % symbol. So, 10 is decimal, and %10 is binary. Got it? Okay.


Here is an example :
Here is an example :
Line 61: Line 61:
</pre>
</pre>


Why using hexadecimal ? Well, writing binary numbers is quite tedious. Take both examples together : we have 149 = %10010101 = $95. Now, consider the digits individually :
Why use hexadecimal? Well, writing binary numbers is quite tedious. Take both examples together : we have 149 = %10010101 = $95. Now, consider the digits individually :
<pre>
<pre>
$9 = %1001
$9 = %1001
Line 70: Line 70:
This way, we have a more readable way of writing numbers that can be converted to binary in a snap.
This way, we have a more readable way of writing numbers that can be converted to binary in a snap.
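
If you want to double-check conversions like these on a computer, here is a small Python sketch. Python is only used as a calculator here (it is not Game Boy code), and the helper name ''notations'' is made up for this example.

```python
# Hypothetical helper: write an 8-bit value in the three notations of this tutorial.
def notations(value):
    """Return (decimal, %binary, $hex) strings for an 8-bit value."""
    return str(value), "%" + format(value, "08b"), "$" + format(value, "02X")

print(notations(149))

# Each hex digit stands for exactly four bits.
high, low = 0x95 >> 4, 0x95 & 0x0F
print(format(high, "04b"), format(low, "04b"))
```

The first line printed groups the three spellings of the same number; the second one shows the two 4-bit halves that match the hex digits 9 and 5.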


For the rest of this tutorial, we will mostly be using hexadecimal, but always remember the binary lying down below !
For the rest of this tutorial, we will mostly be using hexadecimal, but always remember the binary lying underneath!






==A dip into technicals==
==A dip into technical information==
===Registers===
===Registers===
Registers are small storage cells located within the CPU itself (they are not part of RAM). They are what you will be working with, alongside memory. But we'll see memory later.
Registers are small storage cells located within the CPU itself (they are not part of RAM). They are what you will be working with, alongside memory. But we'll see memory later.


There are 8 different registers, which can be actually paired up. These are A, B, C, D, E, F, H and L. These are NOT hex digits, so beware !
There are 8 different registers, which can actually be paired up. These are A, B, C, D, E, F, H and L. These are NOT hex digits, so beware!


Any of these registers can hold an '''unsigned 8-bit value'''. That means :
Any of these registers can hold an '''unsigned 8-bit value'''. That means :
Line 95: Line 95:
* A is the '''Accumulator'''. It is the register you have to use to perform arithmetic operations and, most of the time, to access memory.
* A is the '''Accumulator'''. It is the register you have to use to perform arithmetic operations and, most of the time, to access memory.
* B is usually an 8-bit counter.
* B is usually an 8-bit counter.
* C is also used as a 8-bit counter, but also for port access. We'll see that wayy later.
* C is also used as an 8-bit counter, but also for port access. We'll see that much later.
* D, E, H and L have no special role as 8-bit registers. However, when paired, they do.
* D, E, H and L have no special role as 8-bit registers. However, when paired, they do.
* F holds the CPU's '''Flags'''. It is very special, as you cannot use it as a general-purpose register. You can't even directly access it ! We'll see how to use it later.
* F holds the CPU's '''Flags'''. It is very special, as you cannot use it as a general-purpose register. You can't even directly access it! We'll see how to use it later.
* HL is roughly the equivalent of A, but 16-bit. Its name comes from the fact that it stores the '''High''' and '''Low''' bytes of a memory address.
* HL is roughly the equivalent of A, but 16-bit. Its name comes from the fact that it stores the '''High''' and '''Low''' bytes of a memory address.
* BC is mostly used as a '''Byte Counter'''. It can also be used together with A to access memory.
* BC is mostly used as a '''Byte Counter'''. It can also be used together with A to access memory.
Line 108: Line 108:
|Store the value of ''source'' into ''destination''.
|Store the value of ''source'' into ''destination''.
|}
|}
Did I mention that nothing is case-sensitive ?
Did I mention that nothing is case-sensitive?


However, you can't use LD any way you wish; there are restrictions :
However, you can't use LD any way you wish; there are restrictions :
Line 408: Line 408:
Trying to do something like ''ld a, $100'' isn't possible. Like, physically impossible. You'll see why much, much later.
Trying to do something like ''ld a, $100'' isn't possible. Like, physically impossible. You'll see why much, much later.


Note that ''ld a, -1'' is valid, but actually, the "-1" wraps. Storing -1 will truly store 255 in a 8-bit register, and 65535 in a 16-bit register. Why ? Coming soon.
Note that ''ld a, -1'' is valid, but the "-1" wraps: storing -1 will really store 255 in an 8-bit register, and 65535 in a 16-bit register. Why? Coming soon.
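
To see the wrap-around in numbers, here is a tiny Python sketch. It only models the idea of "keep the lowest bits"; the helper name ''wrap'' is invented for this example and this is not how an assembler is actually written.

```python
def wrap(value, bits):
    """Keep only the lowest `bits` bits, as a register of that width would."""
    return value & ((1 << bits) - 1)

print(wrap(-1, 8))    # value that ends up in an 8-bit register
print(wrap(-1, 16))   # value that ends up in a 16-bit register pair
```

The two printed values are the ones quoted above for 8-bit and 16-bit registers.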


Notice that F and AF aren't usable anywhere. Actually, only a few instructions use them.
Notice that F and AF aren't usable anywhere. Actually, only a few instructions use them.
Line 414: Line 414:


===Negative numbers===
===Negative numbers===
Time to confess : I've lied to you. Actually, 8-bit and 16-bit registers can hold negative numbers.
Time to confess: I've lied to you. Actually, 8-bit and 16-bit registers can hold negative numbers.


I've told you, "individual registers can hold unsigned 8-bit values, and pairs unsigned 16-bit values". However, these aren't true : these values can be signed. How does that work ?
I've told you, "individual registers can hold unsigned 8-bit values, and pairs unsigned 16-bit values". However, this isn't the whole truth: these values can be signed. How does that work?


What we will be doing is cutting our number range in half, and telling one half is composed of negative numbers. But how to distinguish positive and negative numbers ? Well, we tell the MSB is no longer meaning the symbol in front of 2^7, but it will give away the sign of the number (0 = positive, 1 = negative). So, instead of having values in ranges 0 - 255 and 0 - 65535, we will have values in ranges -128 - 127 and -32768 - 32767. Neat !
What we will do is cut our number range in half, and declare that one half is made of negative numbers. But how do we distinguish positive and negative numbers? Well, we decide that the MSB no longer stands for the digit in front of 2^7 (or 2^15 for a 16-bit value); instead, it gives away the sign of the number (0 = positive, 1 = negative). So, instead of having values in the ranges 0 to 255 and 0 to 65535, we will have values in the ranges -128 to 127 and -32768 to 32767. Neat!


How to multiply by -1 ? Easy ! You can either :
How do you multiply by -1? Easy! You can either :
* Calculate zero minus your number (just like in real life). However, you should consider 0 the same as 256 (in 8-bit mode) or 65536 (in 16-bit mode).
* Calculate zero minus your number (just like in real life). However, you should consider 0 the same as 256 (in 8-bit mode) or 65536 (in 16-bit mode).
* Flip the state of every bit, then add one.
* Flip the state of every bit, then add one.
Line 429: Line 429:
Add 1 %10000000
Add 1 %10000000
</pre>
</pre>
So, uh, -(-128) = -128 ? Oops. You cannot negate -128 with only 8 bits. Try with 16 bits, and you'll see it works ! However, with 16 bits, you can't negate -32768 for the same reason.
So, uh, -(-128) = -128? Oops. You cannot negate -128 with only 8 bits. Try with 16 bits, and you'll see it works! However, with 16 bits, you can't negate -32768 for the same reason.
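
Both negation methods, and the -128 oddity, can be checked with a short Python sketch. The function names are made up for this example, and Python only plays the role of a calculator here.

```python
def negate8(value):
    """Two's complement negation: flip all eight bits, then add one."""
    return ((value ^ 0xFF) + 1) & 0xFF

def as_signed8(value):
    """Read an 8-bit pattern as a signed number (MSB = sign)."""
    return value - 256 if value & 0x80 else value

for n in (5, 1, 0x80):
    flipped = negate8(n)
    print(as_signed8(n), "->", as_signed8(flipped), "(bits: %" + format(flipped, "08b") + ")")

# "Zero minus your number" (with 0 treated as 256) gives the same bits.
same = all(negate8(v) == (256 - v) & 0xFF for v in range(256))
print(same)
```

The loop negates 5, 1 and -128 (the pattern $80); only the last one comes back unchanged.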


Now, let's see how the CPU handles the difference between unsigned and signed values. Surprise, it doesn't ! Why ? Because making the same operations using signed or unsigned values give the same result !
Now, let's see how the CPU handles the difference between unsigned and signed values. Surprise, it doesn't! Why? Because performing the same operation on signed or unsigned values gives the same result!
<pre>
<pre>
unsigned signed
unsigned signed
Line 438: Line 438:
%1 00000000 = 256 = 0 (Disqualify ninth bit)
%1 00000000 = 256 = 0 (Disqualify ninth bit)
</pre>
</pre>
And, you just saw why I told you to consider 0 the same as 256 : they are similar ! As 256 uses 8 zero bits preceded by a 1... but the 1 is discarded.
And you just saw why I told you to consider 0 the same as 256 : they are the same once only 8 bits are kept! 256 is written as 8 zero bits preceded by a 1... but the 1 is discarded.
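
Here is the same idea as a Python sketch: one 8-bit addition, read once as unsigned and once as signed. The helper names are invented for this example, and the second pair of values is simply an extra test case, not something taken from the tutorial.

```python
def as_signed8(value):
    """Read an 8-bit pattern as a signed number (MSB = sign)."""
    return value - 256 if value & 0x80 else value

def add8(a, b):
    """Add two bytes and drop the ninth bit."""
    return (a + b) & 0xFF

for a, b in ((0xFF, 0x01), (0x90, 0x20)):
    result = add8(a, b)
    print("unsigned:", a, "+", b, "->", result)
    print("signed:  ", as_signed8(a), "+", as_signed8(b), "->", as_signed8(result))
```

The CPU computes a single bit pattern; whether you call it 255 + 1 or -1 + 1 is purely your interpretation.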




===Memory===
===Memory===
Finally ! So what is "memory", you ask ? It's a stream of bytes. A stream of numbers. And guess what ? Everything is a stream of numbers. From numbers to this cute kitten video, everything is numbers. Including programs. That's why [[Arbitrary code execution]] happens. Remember this sentence : ''Data is whatever you define it to be.''
Finally! So what is "memory", you ask? It's a stream of bytes. A stream of numbers. And guess what? Everything is a stream of numbers. From numbers to this cute kitten video, everything is numbers. Including programs. That's why [[arbitrary code execution]] happens. Remember this sentence : ''Data is whatever you define it to be.''


Ever wondered why you could open a JPEG in Word ? Now you know.
Ever wondered why you could open a JPEG in Word? Now you know.


How is every byte differentiated from its neighbors ? Well, everyone gets a 16-bit unsigned integer called their "address" (thus ranging from $0000 to $FFFF). To access a byte, use its address, just like to reach your friend, you use his e-mail address.
How is every byte differentiated from its neighbors? Well, each one gets a 16-bit unsigned integer called its "address" (thus ranging from $0000 to $FFFF). To access a byte, use its address, just like you use your friend's e-mail address to reach them.


So, how does running a program works ? What happens is that a special register is incremented (its value is raised by one), then the processor fetches the byte located at the address held by that register, and processes it as an opcode ; when done, everyting is repeated. Instructions can be one to three opcodes (bytes) large, so this cycle may repeat for a single instruction.
So, how does running a program work? What happens is that the processor fetches the byte located at the address held by a special register (the program counter, PC), increments that register (raises its value by one), and processes the byte as an opcode ; when done, everything is repeated. Instructions can be one to three bytes long, so this cycle may repeat several times for a single instruction.
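
To make the "one to three bytes" part concrete, here is a very simplified Python sketch that walks through a stream of bytes one instruction at a time. The three opcodes and their lengths are real ones ($00 is NOP, $3E is LD A, n and $C3 is JP nn), but everything else is an assumption made for illustration: the function ''walk'' is invented, it knows only these three opcodes, and it does not actually execute anything (it does not even follow the jump).

```python
# Byte lengths of three real opcodes; the rest of this sketch is a simplification.
LENGTHS = {
    0x00: 1,  # NOP
    0x3E: 2,  # LD A, n   (opcode + one data byte)
    0xC3: 3,  # JP nn     (opcode + two address bytes)
}

def walk(memory, pc, count):
    """Return (address, instruction bytes) for `count` instructions starting at pc."""
    seen = []
    for _ in range(count):
        opcode = memory[pc]                       # fetch the byte the register points at
        length = LENGTHS[opcode]                  # how many bytes this instruction uses
        seen.append((pc, bytes(memory[pc:pc + length])))
        pc += length                              # move on to the next instruction
    return seen

program = bytes([0x00, 0x3E, 0x95, 0xC3, 0x00, 0xC0])
for address, raw in walk(program, 0, 3):
    print(format(address, "04X"), raw.hex())
```

Notice that the byte $95 is never treated as an opcode here: it is consumed as the data byte of the instruction before it.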


So now, how to access memory ? With parentheses ! To access memory address $CD38, you just have to use ($CD38). Yay !
So now, how to access memory? With parentheses! To access memory address $CD38, you just have to use ($CD38). Yay!


To access the memory location pointed to by HL, just do... (hl) ! It's the same with BC and DE.
To access the memory location pointed to by HL, just do... (hl)! It's the same with BC and DE.


So, to retrieve the value at memory address $6511 into register A : ''ld a, ($6511)''
So, to retrieve the value at memory address $6511 into register A : ''ld a, ($6511)''
Line 460: Line 460:
Remember to refer to the chart above for the legal LD combinations.
Remember to refer to the chart above for the legal LD combinations.


Obviously, ''ld ($6511), a'' will overwrite the previous value stored here. But ''ld ($6511), hl'' will store a 16-bit value, which is a word long, that is two bytes long ! So, not only will ($6511) be overwritten, but ($6512) too ! Always be very careful about the memory you're touching. Otherwise, stuff like the [[ZZAZZ Glitch]] happen.
Obviously, ''ld ($6511), a'' will overwrite the previous value stored there. But ''ld ($6511), hl'' will store a 16-bit value, which is a word long, that is, two bytes long! So, not only will ($6511) be overwritten, but ($6512) too! Always be very careful about the memory you're touching. Otherwise, stuff like the [[ZZAZZ glitch]] happens.


For those wondering, ''ld a, ($6511)'' leaves ($6511) untouched.
For those wondering, ''ld a, ($6511)'' leaves ($6511) untouched.
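
The difference between touching one byte and touching two can be modeled with a Python sketch. Memory is represented as a plain array of 65536 bytes, the function names are invented for this example, and the stored values ($AA and $1234) are arbitrary.

```python
memory = bytearray(0x10000)          # one byte per address, $0000 to $FFFF

def store8(address, value):
    """Model of an 8-bit store: exactly one byte changes."""
    memory[address] = value & 0xFF

def store16(address, value):
    """Model of a 16-bit store: two neighboring bytes change."""
    memory[address] = value & 0xFF             # low byte goes first
    memory[address + 1] = (value >> 8) & 0xFF  # high byte lands in the next address

def load8(address):
    """Reading a byte leaves memory untouched."""
    return memory[address]

store8(0x6512, 0xAA)
store16(0x6511, 0x1234)              # silently overwrites the $AA stored at $6512
print(hex(load8(0x6511)), hex(load8(0x6512)))
```

The byte written at $6512 by the first store is gone after the 16-bit store, which is exactly the kind of accident to watch out for.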
Line 466: Line 466:


==Flags==
==Flags==
Remember that "special" F register ? Well, each of its bits is called a "flag", and holds information about (usually) the accumulator. A flag is dubbed "set" if it equals 1, and "reset" otherwise.
Remember that "special" F register? Well, each of its bits is called a "flag", and holds information about (usually) the accumulator. A flag is dubbed "set" if it equals 1, and "reset" otherwise.


Here are the 8 flags :
Here are the 8 flags :
Line 474: Line 474:
|S||Z||-||H||-||P/V||N||C
|S||Z||-||H||-||P/V||N||C
|}
|}
Both "-" are unsued flags. They behavior is very complicated, and isn't "official". Treat them as random.
Both "-" are unused flags. Their behavior is very complicated, and they aren't "official". Treat them as random.


===S : Sign===
===S : Sign===
Line 505: Line 505:


==Manipulating data==
==Manipulating data==
===Instructions get !===
===Instructions get!===
Let's get these :
Let's get these :
{| class="wikitable"
{| class="wikitable"
Line 559: Line 559:
If you want to get information about any instruction, go [http://tutorials.eeems.ca/ASMin28Days/ref/z80is.html there].
If you want to get information about any instruction, go [http://tutorials.eeems.ca/ASMin28Days/ref/z80is.html there].


Q : Hey, but where's MULT ?
Q : Hey, but where's MULT?


A : Nowhere :D To multiply, you must write your own routines ! However, a nice lil' trick : to do A <- A*2, simply ''add a, a'' ! To do A <- A*3, do ''ld b, a'', ''add a, a'', ''add a, b'' (you can swap B with any other register, of course). I'll leave you A <- A*4, A*5, A*6 and A*7 as an exercise.
A : Nowhere :D To multiply, you must write your own routines! However, a nice lil' trick : to do A <- A*2, simply ''add a, a''! To do A <- A*3, do ''ld b, a'', ''add a, a'', ''add a, b'' (you can swap B with any other register, of course). I'll leave you A <- A*4, A*5, A*6 and A*7 as an exercise.
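
If you want to check your own multiplication routines, here is a Python sketch of the general "double and add" idea. It is only a model under stated assumptions: the function names are invented, ''times3'' mirrors the three instructions given above, and ''multiply8'' is one possible approach rather than the routine any particular game uses.

```python
def add8(a, b):
    """8-bit addition: anything past the eighth bit is dropped."""
    return (a + b) & 0xFF

def times3(a):
    """Mirror of: ld b, a / add a, a / add a, b."""
    b = a
    a = add8(a, a)
    return add8(a, b)

def multiply8(a, factor):
    """Shift-and-add: double `a` once per bit of `factor`, adding it in when the bit is 1."""
    result = 0
    while factor:
        if factor & 1:
            result = add8(result, a)
        a = add8(a, a)
        factor >>= 1
    return result

print(times3(7), multiply8(7, 3), multiply8(20, 13))
```

Keep in mind that the result is still an 8-bit value, so large products wrap around just like any other addition.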


For the rest of the tutorial, you'll see some text prefixed by a ";". These are comments, and are NOT part of the code. This line : "ld (hl), a ; Store the mon's ID" will be interpreted as "ld (hl), a". Everything following a ";" is ignored.
For the rest of the tutorial, you'll see some text prefixed by a ";". These are comments, and are NOT part of the code. This line : "ld (hl), a ; Store the mon's ID" will be interpreted as "ld (hl), a". Everything following a ";" is ignored.
Line 588: Line 588:
add a, 119
add a, 119
</pre>
</pre>
What value will hold A ? The naive guess would be 322, or %101000010. However, this is 9 bits large, and won't fit in A. The way it works is that the eight rightmost bits are kept in the register, the rest being discarded. Because a non-zero value is dicarded, the C flag is set.
What value will A hold? The naive guess would be 322, or %101000010. However, this is 9 bits long, and won't fit in A. The way it works is that the eight rightmost bits are kept in the register, the rest being discarded. Because a non-zero value is discarded, the C flag is set.


Thus, the result is A equals 66 = %01000010, and the C flag is set.
Thus, the result is A equals 66 = %01000010, and the C flag is set.
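
The same computation as a Python sketch, assuming A held 203 before the addition (that is the value implied by the naive sum of 322). The function name is invented for this example.

```python
def add8_with_carry(a, b):
    """Return (8-bit result, carry flag) for an 8-bit addition."""
    total = a + b
    return total & 0xFF, total > 0xFF

result, carry = add8_with_carry(203, 119)
print(result, format(result, "08b"), carry)
```

The carry is simply "did anything spill past the eighth bit?".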
Line 594: Line 594:


===Register pairs and RAM===
===Register pairs and RAM===
Let's say you run a ld hl, $D361. $D361 is put into HL, but since it is registers H and L paired up, what happen to them ?
Let's say you run ''ld hl, $D361''. $D361 is put into HL, but since HL is registers H and L paired up, what happens to them?


Because two hex digits mean one byte, $D3, as well as $61, is a byte. Since $D3 and H are leftmost in both cases, ld hl, $D361 is actually a shorter form of ld h, $D3 then ld l, $61.
Because two hex digits mean one byte, $D3, as well as $61, is a byte. Since $D3 and H are leftmost in both cases, ld hl, $D361 is actually a shorter form of ld h, $D3 then ld l, $61.
Line 602: Line 602:
Stop here, and go over this until it becomes natural to you, because this "little-endian"ness is very tricky for beginners. It is ''very'' important when working with memory.
Stop here, and go over this until it becomes natural to you, because this "little-endian"ness is very tricky for beginners. It is ''very'' important when working with memory.
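
A quick way to train your eye is to let a computer split a 16-bit value for you. This Python sketch uses the $D361 value from above; the helper name ''split_pair'' is made up for the example.

```python
def split_pair(value16):
    """Split a 16-bit value into its (high, low) bytes, like HL into H and L."""
    return value16 >> 8, value16 & 0xFF

h, l = split_pair(0xD361)
print(format(h, "02X"), format(l, "02X"))      # what H and L hold

# In memory, little-endian means the low byte comes first.
in_memory = (0xD361).to_bytes(2, "little")
print(in_memory.hex())
```

The register pair reads "high then low", while the bytes in memory appear "low then high".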


Here is an exercise : what values will ($C000) to ($C00F) contain after this code is ran ?
Here is an exercise : what values will ($C000) to ($C00F) contain after this code is run?


Initial values :
Initial values :
Line 633: Line 633:


==Stacks==
==Stacks==
===A stack ? Can you eat that ?===
===A stack? Can you eat that?===
No it's not. It's a data structure that has the cool property of not being fixed-length. How does it work ? Just like a stack of plates. Imagine you're washing some plates in the back of a restaurant. Next to you is a pile of plates you need to wash. Waiters come and place ("push") plates on top of the stack and when finished washing a plate, you take ("pop") the topmost one. This way of working is called LIFO (Last In First Out).
No, you can't. It's a data structure that has the cool property of not being fixed-length. How does it work? Just like a stack of plates. Imagine you're washing some plates in the back of a restaurant. Next to you is a pile of plates you need to wash. Waiters come and place ("push") plates on top of the stack, and whenever you finish washing a plate, you take ("pop") the topmost one. This way of working is called LIFO (Last In First Out).
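
The plate analogy translates directly into a few lines of Python, where a list plays the role of the pile. The plate colors are of course made up.

```python
stack = []
for plate in ("red", "green", "blue"):
    stack.append(plate)      # push: the new plate goes on top

first = stack.pop()          # pop: the last plate pushed comes off first
second = stack.pop()
print(first, second, stack)
```

The plate pushed last is the first one to come back, and the pile can grow or shrink as needed.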


In our case, we will do it by saving the top of the stack as a memory address. This value is called the ''stack pointer''. Here is an example, with the stack growing to the right :
In our case, we will do it by saving the top of the stack as a memory address. This value is called the ''stack pointer''. Here is an example, with the stack growing to the right :
Line 671: Line 671:


===Coding a stack===
===Coding a stack===
Let's say we have our stack pointer saved at memory address $C000 (because it is a 16-bit value, it also uses memory address $C001 !!).
Let's say we have our stack pointer saved at memory address $C000 (because it is a 16-bit value, it also uses memory address $C001!!).


To push register DE :
To push register DE :
Line 694: Line 694:


===Good news===
===Good news===
Okay, coding a stack is cool, but... isn't there a faster way of doing it ? Of course ! Because the Game Boy's CPU has a stack by itself ! Meet
Okay, coding a stack is cool, but... isn't there a faster way of doing it? Of course! Because the Game Boy's CPU has a stack by itself! Meet
{| class="wikitable"
{| class="wikitable"
|PUSH reg16
|PUSH reg16
Line 736: Line 736:
Note that the stack grows '''downwards''' (i.e., PUSH decreases the value of SP, and POP increases it). Also, POP doesn't alter memory.
Note that the stack grows '''downwards''' (i.e., PUSH decreases the value of SP, and POP increases it). Also, POP doesn't alter memory.


You cannot PUSH / POP with 8-bit registers. Instead, to save register B, you '''must''' ''push bc''. You don't have to push and pop to the same register ! For example :
You cannot PUSH / POP with 8-bit registers. Instead, to save register B, you '''must''' ''push bc''. You don't have to push and pop to the same register! For example :
<pre>
<pre>
push af
push af
Line 744: Line 744:
is completely valid. (Note that E's value after the POP is equal to F's when PUSHing, so this is the only way to directly access the F register)
is completely valid. (Note that E's value after the POP is equal to F's when PUSHing, so this is the only way to directly access the F register)
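
Here is a Python sketch of what PUSH and POP do to SP and to memory. It is a model only: the starting value of SP ($D000) and the contents of AF ($12B0) are invented for the example, and the function names are hypothetical.

```python
memory = bytearray(0x10000)
sp = 0xD000                      # hypothetical starting value for SP

def push16(value):
    """PUSH: SP goes down by two, and the pair is written below the old top."""
    global sp
    sp = (sp - 1) & 0xFFFF
    memory[sp] = value >> 8      # high byte
    sp = (sp - 1) & 0xFFFF
    memory[sp] = value & 0xFF    # low byte

def pop16():
    """POP: read two bytes back, SP goes up by two, memory is left as it was."""
    global sp
    low = memory[sp]
    sp = (sp + 1) & 0xFFFF
    high = memory[sp]
    sp = (sp + 1) & 0xFFFF
    return (high << 8) | low

af = 0x12B0                      # A = $12, F = $B0 (made-up values)
push16(af)
sp_after_push = sp
de = pop16()                     # popping into a different pair is fine
print(hex(sp_after_push), hex(de), hex(sp), hex(memory[0xCFFE]))
```

After the POP, the low byte of the result (E in the ''push af'' / ''pop de'' case) holds what F held, SP is back where it started, and the pushed bytes are still sitting in memory.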


Beware with the stack, even more when you're not coding your own game : everyone uses the stack ; even the CPU ! (We're about to see how) The best practice to have is to leave the stack identical before and after your code. Otherwise, expect some crashes, yay !
Be careful with the stack, even more so when you're not coding your own game : everyone uses the stack ; even the CPU! (We're about to see how.) The best practice is to leave the stack identical before and after your code. Otherwise, expect some crashes, yay!




==Control structures==
==Control structures==
===Rollin' around===
===Rollin' around===
...at the speed of sound ! (I'M SO SORRY)
...at the speed of sound! (I'M SO SORRY)


Up until now, we've seen only programs that begin somewhere and that are ran top to bottom, in that order. However, this never happens in a more complex context. So, let's see how to manipulate code flow !
Up until now, we've seen only programs that begin somewhere and are run top to bottom, in that order. However, this rarely happens in more complex programs. So, let's see how to manipulate code flow!


We have two instructions that allow execution to jump somewhere else in memory :
We have two instructions that allow execution to jump somewhere else in memory :
Line 761: Line 761:
|Has execution jumping over offset8 bytes
|Has execution jumping over offset8 bytes
|}
|}
What's the difference ?
What's the difference?


First, JP can go '''anywhere'''. JP tells the CPU "jump to this memory address". JR is much more limited, as it can only reach a signed 8-bit range (128 bytes backwards, or 127 bytes forwards).
First, JP can go '''anywhere'''. JP tells the CPU "jump to this memory address". JR is much more limited, as it can only reach a signed 8-bit range (128 bytes backwards, or 127 bytes forwards).
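
To get a feel for JR's reach, here is a Python sketch that computes where a JR lands. It assumes the usual convention that the offset is counted from the address right after the 2-byte JR instruction; the function names and the address $C000 are made up for the example.

```python
def as_signed8(value):
    """Read an 8-bit pattern as a signed number (MSB = sign)."""
    return value - 256 if value & 0x80 else value

def jr_target(jr_address, offset_byte):
    """Destination of a 2-byte JR located at jr_address."""
    next_instruction = jr_address + 2
    return (next_instruction + as_signed8(offset_byte)) & 0xFFFF

print(hex(jr_target(0xC000, 0x7F)))   # furthest forward
print(hex(jr_target(0xC000, 0x80)))   # furthest backward
print(hex(jr_target(0xC000, 0xFE)))   # offset -2: jumps onto itself
```

In practice you write a label and let the assembler compute the offset byte, but this is the arithmetic it does for you.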
Line 769: Line 769:
Third, JR takes 7 or 12 CPU cycles to run, whereas JP always takes 10.
Third, JR takes 7 or 12 CPU cycles to run, whereas JP always takes 10.


And this brings us to the next part !
And this brings us to the next part!




===Conditionals===
===Conditionals===
JP and JR can be executed in unconditional ways, meaning the jump will always occur. This can be useful, but sometimes we don't want that. And that's where flags come in handy ! Because we are able to trigger jumps depending on the status of the flags.
JP and JR can be executed in unconditional ways, meaning the jump will always occur. This can be useful, but sometimes we don't want that. And that's where flags come in handy! Because we are able to trigger jumps depending on the status of the flags.
{| class="wikitable"
{| class="wikitable"
|JP condition, label
|JP condition, label
Line 797: Line 797:


==Solutions to the exercises==
==Solutions to the exercises==
===Instructions get !===
===Instructions get!===
B can be swapped with any other register (except A)
B can be swapped with any other register (except A)
{| class="wikitable"
{| class="wikitable"
Line 863: Line 863:
sub a, b ; A = A - B = $06 - $DE = $06 + (-$DE) = $06 + ($21 + $01) = $28, C flag = 1 (a borrow occurred, since $DE > $06)
sub a, b ; A = A - B = $06 - $DE = $06 + (-$DE) = $06 + ($21 + $01) = $28, C flag = 1 (a borrow occurred, since $DE > $06)
<br/>
<br/>
Notice here that doing ''sub a, b'' actually increased A's value !
Notice here that doing ''sub a, b'' actually increased A's value!


inc (hl) ; (HL) = ($C004) = $DF
inc (hl) ; (HL) = ($C004) = $DF