PASM from the Beginning (by Brad Manske)

"PASM can wait till after 1.0" was my reaction when I heard the plans to include a cross-platform assembler with PPL. While it was not a lot of code, the complexity was way up there. To run on multiple processors, the assembly code had to target a virtual processor, which is then compiled to each real one. There were 36 addressing modes for 22 assembly instructions, each of which could compile to a series of instructions for 2 different processors. The project required complex and intimate knowledge of the processors, and the testing challenge was not going to be easy.

My earliest e-mail on this project (that I kept) is dated the 20th of March 2004. We had already worked together for almost 2 years when the topic came up. There is no backing down from a complex technical challenge, so even though the pressure to release 1.0 was high, PASM went forward.

This article will introduce you to PASM, which may be just the raw speed boost your code needs. I will start off with some explanation for people who have not had much exposure to assembly language. Assembly is a text representation of the 1s and 0s that the computer actually executes. For example, take this PPL statement:

X$ = 10;

In assembly, this would be an instruction to move the value 10 into the variable X$:

move x, 10

Sound simple? Well, yes, if the processor supports moving a value into memory without going through a register first. And if the value fits in a 32 bit register (Windows CE requires a 32 bit processor). And, etc...

This is the reason that PASM uses a virtual processor internally. We couldn't guarantee that these conditions would be met for each processor since all of the code written for PASM must run on all of our supported platforms.

The PASM virtual processor is made up of 4 general purpose registers named R0, R1, R2 and R3. There are also more specialized registers like the Stack Pointer (SP) and the Stack Frame (SF). The Arithmetic and Logic Unit of our processor is a simplified, RISC-like (Reduced Instruction Set Computing) design. The instruction set consists of about 22 different assembly operations. This isn't much compared to the hundreds of instructions supported by some processors, but there are 36 addressing modes to offset the simplicity of the instruction set.

Here is a quick look at the MOV instruction and the addressing modes that it supports. For 32 bit values:

mov Register, Value                       move from Value to Register
mov Absolute, Value                       move from Value to Absolute address
mov [Register], Value                     move from Value to Indexed Register
mov [Absolute], Value                     move from Value to Indexed Absolute address
mov Register, Register                    move from Register to Register
mov Register, [Register]                  move from Indexed Register to Register
mov [Register], Register                  move from Register to Indexed Register
mov [Register], [Register]                move from Indexed Register to Indexed Register
mov Absolute, Register                    move from Register to Absolute address
mov [Absolute], Register                  move from Register to Indexed Absolute address
mov Register, [Register+offset]           move from Index+Offset Register to Register
mov [Register+offset], Register           move from Register to Index+Offset Register
mov [Register+offset], [Register+offset]  move from Index+Offset Register to Index+Offset Register
mov Absolute+offset, Register             move from Register to Absolute+Offset address
mov Absolute+offset, Value                move from Value to Absolute+Offset address
mov [Absolute+offset], Register           move from Register to Index+Offset Absolute address
mov [Absolute+offset], Value              move from Value to Index+Offset Absolute address
mov [Register+offset], Value              move from Value to Index+Offset Register

MOV also supports a size modifier for 8 bit (byte) and 16 bit (word) values:

mov size Register, Register                    move size from Register to Register
mov size Register, [Register]                  move size from Indexed Register to Register
mov size [Register], Register                  move size from Register to Indexed Register
mov size [Register], [Register]                move size from Indexed Register to Indexed Register
mov size Absolute, Register                    move size from Register to Absolute address
mov size Absolute, Value                       move size from Value to Absolute address
mov size [Absolute], Register                  move size from Register to Indexed Absolute address
mov size [Absolute], Value                     move size from Value to Indexed Absolute address
mov size Register, Value                       move size from Value to Register
mov size [Register], Value                     move size from Value to Indexed Register
mov size Register, [Register+offset]           move size from Index+Offset Register to Register
mov size [Register+offset], Register           move size from Register to Index+Offset Register
mov size [Register+offset], [Register+offset]  move size from Index+Offset Register to Index+Offset Register
mov size Absolute+offset, Register             move size from Register to Absolute+Offset address
mov size Absolute+offset, Value                move size from Value to Absolute+Offset address
mov size [Absolute+offset], Register           move size from Register to Index+Offset Absolute address
mov size [Absolute+offset], Value              move size from Value to Index+Offset Absolute address
mov size [Register+offset], Value              move size from Value to Index+Offset Register

A few quick notes about the notation above. The brackets [] mean that the value of the expression inside the brackets is the memory location that will be operated on. Register is one of the registers R0 to R3 or one of the special registers. Absolute is a number representing a specific memory location. Offset is an integer value that lets you adjust the memory address operated on without the need to modify the base.
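To make the notation a little more concrete, here is a rough model of a few of these operand forms. It is written in Python rather than PASM and is only a sketch: the `Machine` class, its `mov` method and the operand tuples are names invented for this illustration, memory is modeled as a dictionary of 32 bit cells, and the indexed absolute forms are left out because their exact semantics are not spelled out here.

```python
# Illustrative model only. Machine, mov and the operand tuples are
# hypothetical names for this sketch and are not part of PASM.

MASK32 = 0xFFFFFFFF


class Machine:
    def __init__(self):
        self.regs = {"R0": 0, "R1": 0, "R2": 0, "R3": 0}
        self.mem = {}  # address -> 32 bit value

    def _address(self, operand):
        """Resolve a memory operand to the address it refers to."""
        kind = operand[0]
        offset = operand[2] if len(operand) > 2 else 0
        if kind == "abs":   # Absolute or Absolute+offset
            return operand[1] + offset
        if kind == "ind":   # [Register] or [Register+offset]
            return self.regs[operand[1]] + offset
        raise ValueError("not a memory operand: %r" % (operand,))

    def read(self, operand):
        kind = operand[0]
        if kind == "val":   # immediate Value
            return operand[1] & MASK32
        if kind == "reg":   # Register
            return self.regs[operand[1]]
        return self.mem.get(self._address(operand), 0)

    def mov(self, dst, src):
        value = self.read(src)
        if dst[0] == "reg":
            self.regs[dst[1]] = value
        elif dst[0] == "val":
            raise ValueError("an immediate value cannot be a destination")
        else:
            self.mem[self._address(dst)] = value


m = Machine()
m.mov(("abs", 0x1000), ("val", 10))      # mov Absolute, Value
m.mov(("reg", "R0"), ("val", 0x1000))    # mov Register, Value
m.mov(("reg", "R1"), ("ind", "R0"))      # mov Register, [Register]
m.mov(("ind", "R0", 4), ("reg", "R1"))   # mov [Register+offset], Register
```

After these four moves, R1 holds the 10 that was stored at address 0x1000, and the same value has been copied to address 0x1004 without R0 itself being changed, which is the point of the offset.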
Here is a very simple example:

#include "console.ppl"

func WinMain;
InitConsole;
ShowConsole;

new(startVal$, tint);
new(endVal$, tint);

StartVal$ = 0;
EndVal$ = 0;

asmCall$ = asm(1024, );

callasm(asmCall$, 20, 30);

writeln("Test "+ startVal$ + ", "+ endVal$);

freeasm(asmCall$);

free(startVal$, endVal$);

return(true);
end;


If you read my previous articles, you know that I'm a fan of using the console for my examples, so it should be no surprise that I first include the console. Next I declare some variables in the PPL memory space outside of PASM. Then comes the assembly code, followed by the CallASM instruction. Finally, some values are written and the assembly code and variables are freed.

The call to ASM takes 2 arguments: the first is the size of the byte buffer that holds the assembled code, and the second is the string of assembly instructions. The buffer size is specified in bytes, and a multiplier is applied to it depending on what you are doing.

For example, when you run your code under debug, it is possible to step through and break on assembly instructions. To support this, extra machine code instructions are inserted, so the buffer must be expanded. It also means that your code will execute slower under debug than it will in run mode.

The buffer is created at run time, and the assembler runs against the 2nd argument, which is the text containing all of the assembly instructions. So keep in mind that if you make a change to the assembly code, any errors will not be found until run time. It also means that if you plan on using the assembler, you may want to place your ASM instructions at startup and keep the result for the duration of the program, so that the code is not reassembled during execution when you really need the speed.
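The assemble-once advice can be pictured with a small Python sketch. Nothing here is PPL: `fake_assemble` is a stand-in for the run-time assembly step, and `RoutineTable` is a hypothetical helper that holds on to the assembled result so the hot loop only ever makes calls.

```python
# Illustrative sketch only. fake_assemble and RoutineTable are invented
# for this example; they are not PPL or PASM functions.

assemble_count = 0


def fake_assemble(source):
    """Stand-in for an expensive run-time assembly step."""
    global assemble_count
    assemble_count += 1
    lines = [line.strip() for line in source.splitlines() if line.strip()]
    # Pretend "machine code": report how many instructions were assembled
    # and echo back the arguments the routine was called with.
    return lambda *args: (len(lines), args)


class RoutineTable:
    """Holds assembled routines for the lifetime of the program."""

    def __init__(self):
        self._routines = {}

    def register(self, name, source):
        # Assemble exactly once, when the routine is registered.
        self._routines[name] = fake_assemble(source)

    def call(self, name, *args):
        # Calling never reassembles; it only looks up the stored routine.
        return self._routines[name](*args)


routines = RoutineTable()
routines.register("demo", "mov R0, 1\nmov R1, 2")  # done at startup

for _ in range(1000):                              # the speed-critical part
    routines.call("demo", 20, 30)
```

However many times the loop runs, the assembly step is paid for only once, at registration.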

The CallASM instruction invokes the code created in the buffer by the ASM command. CallASM can take additional arguments that will be passed into the assembly code as parameters. The parameters are placed into an AARGS$ array, and the size of the array is placed into AARGSCOUNT$. Each parameter is treated as a 4 byte (32 bit) value, so the value of 20 is at [AARGS$] and the value of 30 is at [AARGS$+4].
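The 4 byte spacing of the parameters can be illustrated with Python's `struct` module. This is only a model of the layout described above: `pack_args` and `read_arg` are hypothetical helper names, and the little-endian byte order and signed interpretation are assumptions of the sketch, not something stated about PASM.

```python
import struct

# Illustrative sketch only. pack_args and read_arg are invented helpers;
# little-endian, signed 32 bit values are an assumption of this example.


def pack_args(*values):
    """Lay out each argument as a 4 byte integer, one after another."""
    return struct.pack("<%di" % len(values), *values)


def read_arg(buffer, offset):
    """Read the 32 bit value stored `offset` bytes from the start."""
    return struct.unpack_from("<i", buffer, offset)[0]


aargs = pack_args(20, 30)        # models the AARGS$ array
aargs_count = len(aargs) // 4    # models AARGSCOUNT$

first = read_arg(aargs, 0)       # like [AARGS$]
second = read_arg(aargs, 4)      # like [AARGS$+4]
```

With two arguments the buffer is 8 bytes long, and the second value sits exactly 4 bytes after the first.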

The line ":main" above indicates the entry point to your assembly code. This is a label, and labels are used as the targets of jump instructions. The line "#DEASM" above instructs the ASM instruction to write the actual assembly instructions generated for your processor to the DebugLog file. This does take extra time, so it shouldn't be used in production programs. The #debugoff pragma can be used to disable this for the entire project. Here is a simple example of using #DEASM. The 2 lines from above:

mov StartVal$, 1
mov EndVal$, 2

On Intel processors these are translated into:

mov edi, D45B28h(STARTVAL$)
mov [edi], 01h
mov edi, D45B98h(ENDVAL$)
mov [edi], 02h


On ARM processors they are translated into:

ldr r9, 34C9B0h(STARTVAL$)
ldr r10, #01h
str r10, [r9]
str r10, #01h
ldr r9, 34CA40h(ENDVAL$)
ldr r10, #02h
str r10, [r9]
str r10, #02h
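The idea of one virtual instruction expanding into a different sequence per processor can be sketched in Python. This is not PASM's code generator: `lower_mov_abs_value` is a hypothetical function that only imitates the load-address, load-value, store pattern visible in the listings above, and it does not reproduce the additional `str r10, #...` lines of the ARM output.

```python
# Illustrative sketch only. lower_mov_abs_value is an invented function
# that mimics the shape of the listings; it is not how PASM works inside.


def lower_mov_abs_value(target, address, value, symbol):
    """Expand a virtual 'mov Absolute, Value' into target-specific text."""
    if target == "x86":
        # The value can be stored straight to memory through a register
        # that holds the address.
        return [
            f"mov edi, {address:X}h({symbol})",
            f"mov [edi], {value:02X}h",
        ]
    if target == "arm":
        # A load/store design: both the address and the value go through
        # registers before the store.
        return [
            f"ldr r9, {address:X}h({symbol})",
            f"ldr r10, #{value:02X}h",
            "str r10, [r9]",
        ]
    raise ValueError("unknown target: %s" % target)


x86_lines = lower_mov_abs_value("x86", 0xD45B28, 1, "STARTVAL$")
arm_lines = lower_mov_abs_value("arm", 0x34C9B0, 1, "STARTVAL$")
```

The same virtual instruction becomes two lines for one target and three for the other, which is why the virtual processor hides these differences from the code you write.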

The lines from the PASM example above:

savesp
pplpush [AARGSCOUNT$]
ppl showmessage


show how to save the position on the stack, push an argument onto the stack where PPL can get to it, and then call a PPL function. Here is another example:

savesp
pplpushstr tstStr1$
pplpushstr tstStr2$
pplpushstr tstStr3$
ppl concat
pplpull


In this example, the stack pointer is saved, all of the required arguments are placed onto the stack, and the PPL concat function is called to concatenate the strings together. The stack is restored to its previous state after the PPL call, then the address of the new string is pulled from the stack. The new string is created in a new memory space that the garbage collector will clean up automatically.

I'll leave you with one more example. It demonstrates the use of jump instructions and of an assembly procedure. The entry point is at ":main". It tests the number of arguments passed into the assembly code to see if there is only one. In this case there is only one, so the value of 20 is passed to the function "!asmCalc". As in high level code, the string in parentheses becomes a variable for the function. The "Var FinSum" instruction declares a local variable for use within the function. The function then adds up all of the numbers n + (n-1) + ... + 2 + 1 for the number n passed to it. The example labels this a Fibonacci calculation, although strictly speaking this sum is the nth triangular number, n(n+1)/2.

#include "console.ppl"

func WinMain;
InitConsole;
ShowConsole;

new(startVal$, tint);
new(endVal$, tint);

asmFib$ = asm(1024, );

t$ = tick;

callasm(asmFib$, 20);

writeln("Fibbon("+ startVal$ + ")="+ endVal$ + " time =" + (tick - t$));

freeasm(asmFib$);

free(startVal$, endVal$);

return(true);
end;
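For readers who want to check the arithmetic, here is a Python sketch of the calculation as described: an argument-count test followed by a countdown loop that adds n + (n-1) + ... + 2 + 1. It is not the assembly from the example. The names `asm_calc` and `main` only echo "!asmCalc" and ":main", and returning `None` when the argument count is not one is an assumption of this sketch.

```python
# Illustrative sketch only; this models the arithmetic described in the
# text, not the actual PASM routine.


def asm_calc(n):
    """Add n + (n-1) + ... + 2 + 1 with an explicit countdown loop."""
    fin_sum = 0        # plays the role of the local variable FinSum
    while n > 0:       # in assembly: a conditional jump back to the top
        fin_sum += n
        n -= 1
    return fin_sum


def main(*args):
    """Entry point: only do the work when exactly one argument is passed."""
    if len(args) != 1:
        return None    # assumed behaviour, not taken from the article
    return asm_calc(args[0])


result = main(20)
```

For an input of 20 the loop adds up to 210, which agrees with the closed form n(n+1)/2.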


Play with PASM a while and let us know what you think in the Forums. In the next newsletter, I will address some more advanced examples.