PowerPC Assembly Language

Like x86, PowerPC machine code consists of bytes, with addresses, that represent assembly instructions and operands. PowerPC machine code also spends most of its time manipulating values in registers.

r0, register 0, is a scratch register, often used as a temporary.
r1 is the stack pointer.
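r2 is reserved by the system. Depending on the platform's calling convention it holds a table-of-contents pointer (for reaching global variables) or a thread pointer, so leave it alone.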
r3 is used to return values (like eax), and as the first argument (like rdi).
r4 is the second argument.
r5-r10 are also argument registers (or generic scratch).
r13-r31 are saved registers: a function must restore any of them it changes before returning.

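The simplest complete function loads a constant into the return value register, then returns:

li r3, 7 ; "load immediate": r3 = 7
blr ; "branch to link register": return to the caller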

Add, like most arithmetic on PowerPC, takes *three* registers: two sources, and a destination.

li r8, 8
li r9, 1000
add r3,r8,r9 ; r3 = r8 + r9 (the destination comes first)
blr

There's a separate instruction named "addi" (add immediate) to add a constant; plain "add" only works on registers.

li r8, 8
addi r3,r8,1000 ; r3 = r8 + 1000
blr

PowerPC machine code always uses four bytes for every instruction (it's RISC), while x86 uses anywhere from one to 15 bytes per instruction (it's CISC). Here's a good but long retrospective article on the RISC-vs-CISC war, which got pretty intense during the 1990s. Nowadays, RISC machines compress their instructions (like CISC), while CISC machines decode their instructions into fixed-size blocks (like RISC), so the war ended in the best possible way: both sides have basically joined forces!
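To make the four-byte format concrete, here's a small Python sketch (not one of the original examples) that packs "li r3, 7" into a single 32-bit word. It assumes the standard D-form layout used by "addi" (a 6-bit opcode, two 5-bit register fields, and a 16-bit signed immediate), and relies on "li" being assembler shorthand for "addi" with a zero source. The function names are made up for illustration.

```python
def encode_addi(rd, ra, simm):
    """Pack an addi instruction (D-form) into one 32-bit word."""
    if not -0x8000 <= simm <= 0x7FFF:
        raise ValueError("immediate does not fit in 16 signed bits")
    opcode = 14  # primary opcode of addi
    return (opcode << 26) | (rd << 21) | (ra << 16) | (simm & 0xFFFF)


def encode_li(rd, value):
    # "li rD, value" is shorthand for "addi rD, 0, value"
    return encode_addi(rd, 0, value)


word = encode_li(3, 7)
print(f"{word:08x}")                     # prints 38600007
print(word.to_bytes(4, "big").hex(" "))  # prints 38 60 00 07
```

Everything has to share those 32 bits, so the constant only gets 16 of them.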

One effect of fixed-size instructions is that you can't load a 32-bit constant in a single instruction. The instruction itself is only 32 bits, and "li" has room for just a 16-bit signed constant:

li r3, 0xabcdef ; ERROR! out of range!
blr

Instead, you break the 32-bit constant into two 16-bit pieces. PowerPC has a dedicated load-and-shift instruction, "lis", for the high piece:

lis r3, 0xab ; "load immediate shifted" (the high half)
ori r3,r3, 0xcdef ; "or immediate" (the low half)
blr
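The arithmetic behind that lis/ori pair can be sketched in Python. This is only a model, assuming 32-bit registers; the helper names are hypothetical, chosen to mirror the two instructions.

```python
def split_constant(value):
    """Split a 32-bit constant into the 16-bit halves used by lis and ori."""
    value &= 0xFFFFFFFF
    return value >> 16, value & 0xFFFF


def lis(imm16):
    # load immediate shifted: the immediate lands in the high half
    return (imm16 << 16) & 0xFFFFFFFF


def ori(reg, imm16):
    # or immediate: the immediate is zero-extended, then OR'd in
    return reg | (imm16 & 0xFFFF)


high, low = split_constant(0xabcdef)
r3 = ori(lis(high), low)
print(hex(high), hex(low), hex(r3))  # prints 0xab 0xcdef 0xabcdef
```

Using "ori" for the low half is convenient because it doesn't sign-extend, so a low half like 0xcdef (top bit set) doesn't disturb the high half.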

Accessing Memory

Memory is accessed with the "lwz" (load word and zero) and "stw" (store word) instructions, plus their relatives for other sizes such as bytes and halfwords. Unlike x86, loads and stores are the *only* instructions that access memory; you can't do an "add" with one operand in memory!

lwz r3, 0(r1) ; load register r3 from the stack
blr

Here I'm writing an integer out to the stack, then reading it in again.

li r7, 123
stw r7, 0(r1) ; store register r7 to the stack
lwz r3, 0(r1) ; load register r3 from the stack
blr

There are "updating" variants of load and store called "lwzu" and "stwu". These actually change the value of the pointer register used as an address. For example, this instruction does two things:

stwu r7, -4(r1)

Store r7 into memory at address (r1-4).
Modify r1 = r1-4.

Here's an example:

li r7, 123
stwu r7, -16(r1) ; store register r7 to the stack (with push)
lwzu r3, 0(r1) ; load register r3 from the stack
addi r1,r1,16 ; clean up the stack
blr
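If the updating forms still seem mysterious, here's a toy Python model of the example above. It's a sketch under simplifying assumptions: memory is a dictionary of words, the stack pointer starts at a made-up address 0x1000, and the class and method names are invented for illustration.

```python
class Machine:
    def __init__(self, stack_pointer=0x1000):
        self.reg = [0] * 32
        self.reg[1] = stack_pointer  # r1 is the stack pointer
        self.mem = {}                # address -> 32-bit word

    def li(self, rd, value):
        self.reg[rd] = value & 0xFFFFFFFF

    def addi(self, rd, ra, value):
        # simplified: ignores the special case where rA = 0 means constant 0
        self.reg[rd] = (self.reg[ra] + value) & 0xFFFFFFFF

    def stwu(self, rs, offset, ra):
        address = (self.reg[ra] + offset) & 0xFFFFFFFF
        self.mem[address] = self.reg[rs]  # 1. store the word
        self.reg[ra] = address            # 2. update the pointer

    def lwzu(self, rd, offset, ra):
        address = (self.reg[ra] + offset) & 0xFFFFFFFF
        self.reg[rd] = self.mem.get(address, 0)  # 1. load the word
        self.reg[ra] = address                   # 2. update the pointer


m = Machine()
m.li(7, 123)
m.stwu(7, -16, 1)   # push: r1 drops by 16
after_push = m.reg[1]
m.lwzu(3, 0, 1)     # offset 0, so r1 stays where it is
m.addi(1, 1, 16)    # clean up the stack
print(hex(after_push), m.reg[3], hex(m.reg[1]))  # prints 0xff0 123 0x1000
```

Note that the "lwzu" here uses offset 0, so its update step leaves r1 unchanged; a plain "lwz" would behave the same in this model.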

Array indexing mostly has to be done manually. If r5 is the start of the array, and r6 is the index, you have to do something like this:

ori r5,r1,0 ; array pointer==stack pointer
li r6,2 ; array index
mulli r8,r6,4 ; array index*4
add r8,r8,r5 ; add base pointer
lwz r3,0(r8) ; access memory there
blr

You can combine the add and lwz with a "lwzx":

ori r5,r1,0 ; array pointer==stack pointer
li r6,2 ; array index
mulli r8,r6,4 ; array index*4
lwzx r3,r5,r8 ; access memory at base + index
blr
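The address arithmetic in both versions is the same; here's a short Python sketch of it, assuming 4-byte integers and a made-up base address of 0x2000. The function names are hypothetical.

```python
def element_address(base, index, element_size=4):
    """Byte address of array[index], as computed by mulli + add."""
    return base + index * element_size


def lwzx(memory, ra_value, rb_value):
    # load word indexed: the address is the sum of two registers
    return memory[ra_value + rb_value]


base = 0x2000
memory = {base + 4 * i: value for i, value in enumerate([10, 20, 30, 40])}

r6 = 2        # array index
r8 = r6 * 4   # mulli r8,r6,4
print(hex(element_address(base, r6)), lwzx(memory, base, r8))  # prints 0x2008 30
```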

Calling Functions

You can get into a function pretty easily, with a "b" (branch, like "jmp") instruction:

li r3,99
b _print_int ; jump to _print_int, never to come back here
blr ; never reached

Here, _print_int will end with its own "blr", which will jump straight back to main, skipping us. Getting control back from a function is much trickier. The problem is a function will end with "blr" (Branch to the Link Register); the Link Register can only hold one value at a time. So if you just overwrite the Link Register with your own value, you can't return to main!

So this "bl" (Branch and Link) will return control back to you, but then *keep* returning control back to you, in an infinite loop:

li r3,99
bl _print_int
blr ; Oops! We trashed LR with the "bl" above!

The sequence of events here is:

Main calls us with "bl foo". "bl" will overwrite LR to point back to main.
We call "bl _print_int". "bl" will overwrite LR to point back to us.
_print_int returns with "blr". That transfers control back to LR, which is us.
We return with "blr", but that just transfers control back to us again!
... repeat forever ...

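The fix is to stash main's link register somewhere before the call, and put it back afterwards. The first try uses a saved register:

mflr r28 ; save main's link register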

li r3,99
bl _print_int ; "bl" will overwrite LR, so print_int can return here

mtlr r28 ; restore main's link register
blr ; now this works. sorta

OK! Everybody returns correctly now, but main complains that we overwrote its preserved data: r28 is a saved register, so we're supposed to hand it back unchanged.

So now we save the old link register onto the stack:

mflr r0 ; save main's link register...
stwu r0,-32(r1) ; ... onto the stack

li r3,99
bl _print_int ; "bl" will overwrite LR, so print_int can return here

lwz r0,0(r1); grab main's link register from the stack
addi r1,r1,32 ; restore the stack
mtlr r0 ; restore main's link register
blr ; finally, this works correctly!
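Here's a toy Python simulation of the link register, to show why the unsaved version loops forever and the saved version gets back to main. It's a deliberately tiny model, not a real PowerPC emulator: instructions are tuples in a list, addresses are list indices, and "save_lr" / "restore_lr" are invented stand-ins for the mflr+stwu and lwz+addi+mtlr sequences.

```python
BACK_IN_MAIN = -1  # stand-in for main's return address


def run(program, max_steps=20):
    """Return (trace, returned_to_main) for a toy program."""
    lr = BACK_IN_MAIN  # main's "bl" left its return address in LR
    stack = []
    pc = 0
    trace = []
    for _ in range(max_steps):
        if pc == BACK_IN_MAIN:
            return trace, True
        op, *args = program[pc]
        trace.append(op)
        if op == "bl":            # branch and link: overwrite LR, then jump
            lr = pc + 1
            pc = args[0]
        elif op == "blr":         # branch to whatever LR holds
            pc = lr
        elif op == "save_lr":     # mflr r0 ; stwu r0,-32(r1)
            stack.append(lr)
            pc += 1
        elif op == "restore_lr":  # lwz r0,0(r1) ; addi r1,r1,32 ; mtlr r0
            lr = stack.pop()
            pc += 1
        else:                     # "li" and other straight-line work
            pc += 1
    return trace, False           # gave up: stuck in a loop


# index 3 plays the role of _print_int, which just returns
broken = [("li",), ("bl", 3), ("blr",), ("blr",)]
# index 5 plays the role of _print_int
fixed = [("save_lr",), ("li",), ("bl", 5), ("restore_lr",), ("blr",), ("blr",)]

print(run(broken)[1])  # prints False
print(run(fixed)[1])   # prints True
```

In the broken program, our final "blr" keeps jumping to the instruction after the "bl", which is that same "blr". In the fixed program, LR holds main's address again by the time we reach it.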

Whew! The x86 "call" and "ret" are looking a lot better now!

More Info

The IBM 32-Bit PowerPC Programming Environment gives all the instructions in chapter 8.1 and a good overview in chapter 4. The IBM Compiler Writer's Guide gives the calling conventions.