Retro Assembler
About the Assembler
Retro Assembler was created as a hobby project for working with source code that targets microcomputers, hence the name Retro. My background is in coding a lot of demos and a few games on the Commodore 64 and Plus/4, so the main target is the 6502 CPU family, which is the closest to my heart. I happen to know I'm not alone with that feeling. But I also worked on the Amiga and Gameboy Color, so the assembler was designed to support multiple platforms with little effort on my part. Ultimately the goal is to support several CPUs and to make software development easier with neat assembler features. Gameboy CPU and standard Zilog Z80 support is planned; Motorola 68000 and ARM V7 are to be decided.
Supported CPU types
CPU Name | Description |
---|---|
6502 | The well known MOS6502 CPU and its standardized variants. |
6510 | Alternative name for the standard 6502, because the Commodore 64 deserves it. No functional difference. |
65C02 | An extension over the standard 6502 instruction set, with additional vendor specific WDC/Rockwell instructions. |
65SC02 | A version of the 65C02 without bit instructions (BBR, BBS, RMB, SMB). |
Configuration
The application has various command line switches that can change the assembler's behavior. You can set up your own defaults in the RetroAssembler.config file for most of these, along with your preferred include paths. Just be careful with future updates that would overwrite this customized file of yours.
Assembler mode (-a switch)
This is the default mode, where the assembler loads the main source file (with optional source file includes in it) and compiles it according to the rules of the target platform. It's set to 6502 with 64KB RAM by default.
The assembler works with separate memory segments. When the end result is saved, it saves a file made of all segments merged together, and if the source code used more than just one segment, it saves each individual segment as separate files, named after the segment's unique name.
When the output format is T64, the merged file and the optional segments are all saved into a single T64 container file. One container can hold up to 30 files, so keep your segments at bay when you use T64. But using the default Bin file format is recommended.
Disassembler mode (-d switch)
Optionally, the assembler can load a .bin binary file and disassemble it into a text file according to the rules of the target platform that the binary file was made for. The default platform is 6502; for other CPU types you have to choose one with the -C=<name> switch. Files loaded without a load address header (-r switch) default to the starting memory address $0000.
Other switches
Switch | Description | Mode |
---|---|---|
-c | Turns on Case Sensitive mode for Labels, Functions and Macros. | A |
-m | Puts the start-end memory addresses into the output file name(s). | A |
-x | Prints out Global Labels and their values after a successful code compiling. | A |
-r | Saves (assembler) or Loads (disassembler) raw binary files without the load address header. | A / D |
-u | Allows using undocumented CPU instructions for the selected target CPU. | A / D |
-C=<name> | Sets the target platform's CPU type. See Supported CPU Types for accepted values. | A / D |
| | Sets the target platform's memory size in kilobytes. The upper limit is 128 megabytes (131072 kilobytes). | A / D |
-O=<type> | Sets the output file's type. The value is handled as case insensitive. bin: raw binary file, with or without load address header (works with -r switch); h6x: text file with a printed memory dump; t64: T64 file (C64S Tape File Format) for emulators, which may contain multiple files; d64: D64 file (Disk Image Format) for emulators. For d64 the assembler creates a T64 file, then converts it using the utility "c1541" found in (Win)VICE. You must set up your VICE directory path in RetroAssembler.config in order to use this option, unless "c1541" happens to be in your search path. | A |
The application was written for Windows (32/64 bit) in C# and requires .NET Framework 4.7. A .NET Core version is planned, but alternatively it runs perfectly on Linux (and macOS) using Mono. It has been tested on Raspberry Pi 3.
If you use Retro Assembler, I would be happy to hear from you. You can tweet me at @Peter_Tihanyi
Enjoy!
Value Types
The following value types can be used in directives and instructions:
Numbers
Numbers can be decimal, hexadecimal or binary values.
    123      //Decimal value.
    $12      //Hexadecimal value.
    0x12     //Hexadecimal value (alternative).
    $1234    //Hexadecimal value, 16 bit.
    %100101  //Binary value.
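As an illustration of the literal formats listed above, here is a minimal Python sketch of how such number literals could be interpreted. The `parse_number` helper is hypothetical and not part of Retro Assembler; it only models the prefixes described in this section.

```python
def parse_number(text: str) -> int:
    """Convert a number literal in the notation described above to an int."""
    text = text.strip()
    if text.startswith("$"):            # $12 -> hexadecimal
        return int(text[1:], 16)
    if text.lower().startswith("0x"):   # 0x12 -> hexadecimal (alternative)
        return int(text[2:], 16)
    if text.startswith("%"):            # %100101 -> binary
        return int(text[1:], 2)
    return int(text, 10)                # 123 -> decimal


for literal in ("123", "$12", "0x12", "$1234", "%100101"):
    print(literal, "=", parse_number(literal))
```

Both hexadecimal spellings of 12 yield the same value, so they can be mixed freely in this model.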
Strings
Strings are one or more characters in double quotes, translating to ASCII bytes.

    "Hello world!"   //Normal text
    "Hello\nWorld!"  //Normal text with an escaped newline character.
Characters
A character is one character in single quotes, translating to an ASCII byte.
    'X'       //X character
    '\r'      //An escaped carriage return character.
    lda #'X'  //The X character used as a byte value in an instruction.
Expressions and Operators
Expressions constructed of operators, values and Labels can be used in virtually any directive or instruction, which allows for some clever code building mechanics.
Brackets can be used to form more complex expressions in directives (except for .function and .macro), but not in instructions due to the complexity raised by the addressing types using brackets. However, you can still use expressions within instructions, just make sure they evaluate correctly without using brackets.
The available operators, in the order of evaluation:

Symbol | Type of operation |
---|---|
( ) | Expression |
* / | Multiplicative |
+ - | Additive |
<N >N | Low and High Byte |
<< >> | Bitwise shift |
& | Bitwise-AND |
^ | Bitwise-exclusive-OR |
\| | Bitwise-inclusive-OR |
== != | Equality |
< > <= >= | Relational |
&& | Logical-AND |
\|\| | Logical-OR |
, | Sequential evaluation |
Labels (including the Current Memory Address pointer) get replaced by their number value during expression evaluation, and in directives it's required to use labels with prior definitions in order to build reliable code.
The comparison operators (== != <= etc) and logical operators (&& ||) work best in .if and .while directives, because their end result is either 1 for true, or 0 for false. If you use them with .equ, you'll just end up with a 0/1 number that you can use as a flag.
Examples
    MyValue .equ (MyConstant << 4) + 5

    .if MyValue >= 8 || OtherValue != 13
        (code lines)
    .endif

    //Assuming IrqRoutine is at $087c
    lda #<IrqRoutine  //Low byte : $7c
    sta $0314
    lda #>IrqRoutine  //High byte: $08
    sta $0315
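The low and high byte operators in the example above boil down to simple bit arithmetic. The following Python sketch shows that arithmetic for the assumed address $087c. The function names are illustrative only and say nothing about how the assembler implements the operators internally.

```python
def low_byte(value: int) -> int:
    """Model of the <N operator: keep the lowest 8 bits."""
    return value & 0xFF


def high_byte(value: int) -> int:
    """Model of the >N operator: keep the next 8 bits."""
    return (value >> 8) & 0xFF


irq_routine = 0x087C  # assumed address, taken from the example above
print(f"low  byte: ${low_byte(irq_routine):02x}")
print(f"high byte: ${high_byte(irq_routine):02x}")
```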
Labels
Labels (also known as Symbols) are constants or memory addresses that can be used in directives and instructions as parts of expressions and operands.
A label's name can only contain letters, numbers and the "_" character, and it can't begin with a number.
White spaces are usually ignored in the source code file. The only place where you must use a space or tab (or a colon) as separator is between the main label of the code line and the directive/instruction after it.
To maintain compatibility with other assemblers, if the label is followed by a ":" (colon) character, this character gets processed as white space. So "MyLabel:" is the same as "MyLabel", you can format them either way.
Global Labels
The scope of a Global Label is the entire source code, so they must be named uniquely. Labels can be defined in the following ways:

    MyLabel    .equ $73  //MyLabel gets the constant value $73
    MemAddress lda #$00  //MemAddress gets the value of the current memory address, eg $0813
Then the value of these labels can be used in directives and instructions with ease. Keep in mind that only those labels can be used in directives, that have been defined before the directive's code line.
    MyLabel     .equ $73           //MyLabel gets the constant value $73
                lda #MyLabel+2     //lda #$75
                sta WhoKnows       //sta $0000 because WhoKnows is unknown
                                   //and therefore the assembler assumes a 16 bit value.
    MyNewLabel  .equ MyLabel+4     //$77
    MyNewLabel2 .equ WhoKnows+4    //ERROR: The label WhoKnows is not defined yet.
    MyLabel     .equ 55            //ERROR: The label "MyLabel" is already defined.
    MyVariable  .var 10            //Create a variable with the initial value 10
    MyVariable  .equ MyVariable+1  //The variable is updated to 10 + 1 = 11
    MyVariable  = MyVariable+1     //The variable is updated to 11 + 1 = 12
You can use Labels almost anywhere. They will get replaced by the number value they hold (constant or memory address) and this value will be used in calculations, or in determining the addressing type of certain instructions.
If the label's value is not known at the time the assembler gets to an instruction that uses it (in the 1st Pass), the assembler assumes that it's a 16 bit memory address for those instruction addressing types that work with memory addresses.
For example an "sta MyLabel,x" instruction placed before MyLabel's definition will be handled as "sta $0000,x" (3 bytes) instead of "sta $73,x" (2 bytes), then in the 2nd Pass the code will be entered as "sta $0073,x" (3 bytes) because the worst case was already assumed in 1st Pass, in order to get the number of bytes that the instruction will take in memory.
So if you want to use zero page values for a faster code execution, define those labels before the code lines that try to use them.
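The sizing rule described above can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the assembler's code: `instruction_size` is a hypothetical helper, and it only covers an instruction that has both a zero page (2 bytes) and an absolute (3 bytes) form.

```python
def instruction_size(label: str, known_labels: dict) -> int:
    """Bytes taken by an instruction whose operand is a label (opcode included)."""
    value = known_labels.get(label)
    if value is None:
        return 3                        # unknown in the 1st Pass: assume 16 bit
    return 2 if value <= 0xFF else 3    # zero page operand needs one byte


labels = {}
size_used_first = instruction_size("MyLabel", labels)    # used before definition
labels["MyLabel"] = 0x73                                 # MyLabel .equ $73
size_defined_first = instruction_size("MyLabel", labels)
print(size_used_first, size_defined_first)
```

In this model the instruction costs one extra byte whenever the zero page label is defined after its first use.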
Local Labels
There is a Local Label type, that can be used with a limited scope. Defining a new Global Label and using certain directives close down the currently "open" Local labels, by setting a range of start and end line numbers in the merged source code, where the Local label can be addressed.
By having this automatic closure, the Local label names can be redefined in various sections of the code. For example, small loop branches can utilize the same @Loop label in most places, without running into any "label exists" errors.
The name must start with the "@" character, then any letter, number or the "_" character can be used in any combination. This means that even the really simple "@1" is accepted as a local label name.
Example
    FillMem  ldx #$00
             lda #$ff
    @Loop    sta MemAddress,x  //Define a new Local label to "use it, then forget about it".
             inx
             cpx #$28
             bne @Loop         //Branches back to "sta MemAddress,x" as it should.

    NewLabel lda #$00
             beq @Loop         //ERROR: The @Loop label can't be found, because the
                               //definition of NewLabel closed its range.
The Local label ranges are closed by the directives .function, .loop, .if, .while and their .end* counterparts, as well as Function and Macro references.
It's recommended to use local labels inside Macro code lines, loops and whiles, and wherever else you need labels with a short life span.
Regional Labels
If you understand how Local labels like "@Loop" work, you will understand Regional labels too. They are almost the same: labels with a limited scope. But unlike Local labels, these don't get closed down by the definition of a Global label, or by using the directives .loop, .if and .while.
They were designed to be used in Macros and Functions, so they stay open and available until an .endmacro or .endfunction directive is used. This allows the programmer to avoid using Global labels mainly in Macros (and also in Functions), which would lead to a "Global label exists" error when a Macro is referenced in the source code more than once.
The name must start with "@@" characters, then any letter, number or the "_" character can be used in any combination. This means that even the really simple "@@1" is accepted as a Regional label name.
Example
    .macro MyMacro()
    FillMem  ldx #$00
             lda #$ff
    @@Loop   sta MemAddress,x  //@@Loop is now a Regional label.
             inx
             cpx #$28
             bne @@Loop        //Branches back to "sta MemAddress,x" as it should.
    NewLabel lda #$00          //Defining a new Global label here, that would close down Local labels.
             beq @@Loop        //It's OK! Compared to the Local label example above, this works fine.
    .endmacro

    //Let's use this macro in our code.
    //It will inject the macro's code lines at place, with modifications.
    MyMacro()
    jmp @@Loop  //ERROR: The @@Loop label can't be found, because its range has been closed.

The Regional label ranges are closed by the directives .function, .endfunction, .macro and .endmacro.
Current Memory Address
Another kind of label that's worth mentioning is the "*" (asterisk) character, and the "pc" keyword on the 6502 family.
"*" or "pc" (on 6502) gets replaced by the current memory address, like $0813 during expression evaluation.
Please note that the "*" character is also used in multiplications, so the assembler tries to determine the context where the "*" character is used in, and acts accordingly.
Examples
    ldx #$07
    dex
    bne *-3           //Branches back to "ldx #$07"
    jmp *             //Infinite loop to the memory address where the "jmp" instruction is.
    jmp pc            //The same, only works on 6502 because that has no addressable "pc" register.
    MyLabel = *+$20   //The current memory address +$20 will be set as value for MyLabel.
    MyLabel2 = 5 * 6  //MyLabel2 will get the value 30 due to the multiplication.
Comments
Comments can be placed at the end of any directive or instruction, or they can be the only content in a code line. They are ignored by the assembler, so a code line that has been commented out is not processed at all. The comment markers are // and ; that work equally.
Block comments are also supported, where multiple code lines can be commented out, or the source code can contain a bigger block of text without prefixing each line with the comment marker. The block comment markers are /* for opening and */ for closing the block.
Examples
    MyLabel  lda #$0e    //This is a comment for the instruction.
    MyLabel2 ldx #$06    ;This is also a comment, with the alternative comment marker ";".
    MyLabel3 //ldy #$00  //Now this is just a line with "MyLabel3" in it, the instruction is ignored.

    /*
    Some optional comment text, and the encapsulated code lines are ignored.
    sta $d020  //Ignored.
    stx $d021  //Ignored.
    */

    nop  //The "nop" instruction is actually processed as valid code content.
Directives
Directives are control commands for the assembler. The generally accepted format is:
[label] .directive parameter(s) [comment]
Specific parameters and formatting exceptions will be explained in the description of each directive. Certain directives have alternate names (aliases), they are interchangeable with the official name.
Please note that Labels used as directive parameters must have prior definitions, meaning their value (usually a constant or a memory address) must be defined before the directive code line in the source code.
.target
Sets the target architecture by specifying the CPU type and memory size. In some cases it's useful to set this up from the project's main source code file.
Once any instruction has been entered, this directive can no longer be used.
See the supported CPU types listed above. They are case insensitive.
The maximum supported memory size is 128 megabytes (131072 kilobytes).
Format
.target "CPU type", MemorySizeKB
Examples
    .target "6510", 64
    .target "65C02", 64
.org
Instructions and data bytes are always placed in the currently selected memory segment, right at the Current Memory Address, which is also called the Program Counter. The Originate directive sets this pointer to a defined memory address, to control where the program will be compiled in memory.
Format
[label] .org MemoryAddress
Optionally you can put a label in front of .org, then this label will get the selected address as value.
Examples
    .org $2000
    *= $2000
    pc = $2000  //This works only on 6502 because that CPU has no addressable "pc" register.

Alternatives: .pc, *=, pc=
.equ
Assigns a constant value to a Label, which later can be used as directive parameter, instruction operand, part of an expression etc. The value may be a number, another Label with previous definition, or an expression that evaluates to a number. Label values can be assigned only once, unless you use the .var directive.
If the Label is an existing entry marked as Variable, then this directive updates its value.
Format
Label .equ Value
The = character can also be used instead of .equ to make programming easier.
Examples
    MyValue .equ 123         //Works only for the first time, unless it's a Variable.
    MyValue = Start + $0200  //Works only for the first time, unless it's a Variable.
    Counter = Counter + 1    //This is a Variable in our example, so this works anytime.

Alternative: =
.var
Creates a Label marked as Variable. It has to be a Label that doesn't exist yet (neither as a Constant nor as a Variable), and then it can be updated in the code with the .equ directive. Since .equ has the shortcut "=", a variable can be updated simply as VariableName = NewValue.
Format
Label .var Constant
Example
    Counter .var 10        //Create a Variable with the initial value 10.
    Counter = Counter - 1  //Update the Variable's value, even by using expressions.
.align
Aligns the upcoming instructions or data to the next "round" memory address. The alignment value must be a power of 2, such as 2, 4, 8, 16, 32, 64 etc. The default filler byte is the selected CPU's "nop" instruction opcode.
Alignment works in relocatable memory segments as well.
Format
[label] .align Alignment, [Filler byte]
Examples
    .align $100
    .align $80, $ea
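To illustrate the arithmetic behind alignment, here is a small Python sketch that computes the next aligned address for a power-of-2 alignment. The `align_address` helper is hypothetical, the starting address $0813 is only an assumed example, and the sketch assumes that an address already on a boundary is left unchanged, which this manual does not state.

```python
def align_address(address: int, alignment: int) -> int:
    """Round address up to the next multiple of a power-of-2 alignment."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a power of 2")
    return (address + alignment - 1) & ~(alignment - 1)


current = 0x0813  # assumed current memory address
for alignment in (0x100, 0x80):
    aligned = align_address(current, alignment)
    filler_count = aligned - current  # number of filler bytes needed
    print(f"align ${alignment:x}: ${aligned:04x} ({filler_count} filler bytes)")
```

The bit mask trick only works because the alignment is a power of 2, which matches the restriction on the Alignment parameter.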
.storage
Reserves the next Length number of bytes in memory to be used as storage bytes. The default filler byte is $00.
Format
[label] .storage Length, [Filler byte]
Examples
    .storage $20
    .storage $20, $ff

Alternatives: .ds, .fill
.closelabels
This directive forcibly closes the range of currently opened (still addressable) Local and Regional labels. It's mainly for unconventional use cases of Regional labels, should you decide to utilize them outside of a Macro or Function for some reason. Then by using .closelabels you can reset these labels for reusability reasons.
Format
.closelabels
.break
Creates a breakpoint, that will be saved into "moncommands.txt" for debugging in the VICE Monitor.
If there is at least one breakpoint in the source code, the monitor commands list gets saved and the list of labels is also included. But if there are no breakpoints set, the "moncommands.txt" file is not created, and an existing one gets deleted. So if all you need is the labels, just set a breakpoint somewhere in the source code, even after the last instruction, to force the file's creation.
Format
.break [IF Condition]
Examples
    nop
    .break
    lda CurrentColor
    .break "A == $01"
    sta $d021

The output in "moncommands.txt" will be something like this:

    break 0821
    break 0824 if A == $01
    al 0070 .NtscSystemFlag
    al 0071 .NtscPlayCounter
    al 0855 .irq
    al 08b1 .CurrentColor
.debug
Prints the debug text on the console while compiling the code in the 2nd pass. The parameters can be combinations of strings, numbers, labels and expressions.
Format
.debug parameter(s)
Example
.debug "The current memory address is " * ", how cool is that!"
Alternative: .out
.include
Includes the content of a source code file at the directive's code line, as if it was part of the main source code file's contents. Include files may use the .include directive to load other files, up to 16 levels of depth.
If the file is without a full path, the assembler tries to find it in known Include directories. The defaults are the input source code file's directory, the assembler application's base directory and the Include directory under these two. The rest of the lookup directories can be set up in the app.config file.
Format
.include "filename.ext"
Example
.include "C64_Registers.inc"
.incbin
Loads a binary file into the current memory segment, either at the current memory address (that you can control with a prior .org command), or at the memory location specified by the file's 2-byte load address header in auto mode. It's good for loading graphics assets, music and other data content, if you want to use the assembler as a linker.
If the file is without a full path, the assembler tries to find it in known Include directories. The defaults are the input source code file's directory, the assembler application's base directory and the Include (alternatively include) directory under these two. The rest of the lookup directories can be set up in the app.config file.
Unless you use the auto property, you can optionally set an Offset and even a Length, to control what sections to load from a more complex binary file. But in this mode the file is loaded at the current memory address, so make sure you set that up correctly beforehand.
Format
    .incbin "filename.ext", [Offset], [Length]
    .incbin "filename.ext", auto
Example
    .incbin "music.bin", auto  //Load the file at $1000 where it belongs.
.segment
Segments are handled as separate virtual memory buffers within the assembler. They can be used to separate parts of the code, data blocks, memory banks on systems where they can be paged in, etc. Each segment can be set up to be of a certain size in kilobytes, but if you don't specify that, the target architecture's default memory size will be used. Therefore segments may overlap each other, but not in the assembler's memory buffers.
When a segment is first mentioned, it gets created. In further cases the assembler simply switches to the existing segment by the same name and continues to put instructions and data at the selected segment's current memory address.
The assembler creates three standard segments by default: Code, Data and BSS, in this order. These can be accessed with directive shortcuts. The Code segment has a default start address of $0800, the others are created without a specific start address, meaning they can be relocated during code compilation.
It's possible to have a segment where you start out with relocatable data, but once you put in an .org directive to set a specific memory address within the segment, the rest of the code will be absolute and will not be relocated.
When a segment's data is relocated, "the previous segment's last used memory address + 1" is used as starting memory address.
At the end of code compilation, if there are multiple segments in actual use, the assembler saves a merged file where the segments may overlap (resulting in some possible data loss) and also saves each segment's used memory bytes individually. It's recommended to turn on the "show memory addresses in filename" option, to make binary file management a bit easier.
Format
.segment "Name", [Start Address], [Memory Size in KB]
Examples
    .segment "Scroll"
    .segment "Bank2", $8000, 16
You may do rapid switching between segments to separate certain data types. For example:
    .segment "Code"  //".code" would do the same, as shortcut.
    (Scroller subroutine code lines)

    .segment "Data"  //".data" would do the same, as shortcut.
    .stext "hello, this is my scroll text!"
    .byte ' ', $ff   //End of the scroll text with an additional space before repeat.

    .segment "Code"  //".code" would do the same, as shortcut.
    (other subroutines)
This way the scroller code and the scroll text can be kept near each other in the source code itself (they may even come from an include file), but the Code and Data would still be separated in memory during code compilation. In this example the scroll text bytes get placed after the (other subroutines) instructions in the output binary file.
Using segments is not necessary in most cases, you can just put all your code and data into the default Code segment and never change it. But if you must work with overlapping data, give segments a try. You can even put together a whole Commodore 64 demo that spans through a floppy disk, in a single project. If you're into such things.
.code, .data, .bss
Shortcut to the default Code, Data and BSS segments, respectively. It's the same as using the .segment directive with the selected segment's name, such as:
.segment "Data"
Format
    .code
    .data
    .bss
.region
Regions are logical blocks that encapsulate one or more source code lines. This directive is ignored by the assembler, but can be used in certain text editors to fold regions on demand. It's just like the #region directive in Visual Studio, purely a visual element.
This directive must be closed by using .endregion
Format
    .region [Region name as free text]
    (Code lines)
    .endregion
.endregion
Closes the previously opened .region directive, so the encapsulated source code lines can be folded in certain text editors. It's just like the #endregion directive in Visual Studio, purely a visual element.
Format
.endregion
.function
Functions are logical blocks that encapsulate one or more source code lines, that are meant to be called as a subroutine. They don't have calling parameters, and internally this directive just converts the function's name into a Label.
Functions can be called in the code using "FunctionName()". The assembler replaces this with the "jsr FunctionName" instruction for 6502 code, or with the appropriate mnemonic for other CPU types.
This directive must be closed by using .endfunction
Format
.function FunctionName()
Example
    .function Scroller()
        lda #$07
        sta $d016
        (other code lines)
    .endfunction  //Serves as "rts".
Calling example
    lda #$0f
    sta $d020
    Scroller()  //Same as "jsr Scroller".
    lda #$00
    sta $d020
Please note that you can't open a new .function or .macro inside a function, but you are free to use .loop, .if and .while directives.
.endfunction
Closes the previously opened .function directive. The assembler replaces this with the "rts" instruction for 6502 code, or with the appropriate mnemonic for other CPU types, so the subroutine can return automatically after the last code line. You may put your own "rts" instruction into the function at the point of your chosen return, but at the end the assembler will always add an "rts" as closure.
Format
[label] .endfunction
Examples
    .endfunction
    Return .endfunction  //Marks the "rts" instruction with the global label "Return".

Alternative: .endf
.macro
Macros are logical blocks that encapsulate one or more source code lines, that are compiled into the segment memory at the place of the macro call, using the arguments set by the macro call.
A macro can have zero, one or more parameters, and each parameter can optionally get a default value, in case the parameter is not specified in the macro call itself. If you don't set a default value for a parameter, it will be handled as the number 0, so you should set a value for that parameter in the actual macro call, unless you happen to need 0 there.
The parameter names don't actually get created as global labels, so you can reuse parameter names or you can use names that you defined as a label elsewhere. Also, macros of course can use global labels and local labels inside the code block, but if you need to define a label there, you had better use a local label.
As this might be a bit complicated, pay attention to the example below, where I'll try to highlight the features.
This directive must be closed by using .endmacro
Format
.macro MacroName( [Parameters=[DefaultValues]] )
Example
    .macro SetColors(BackgroundColor=$06, BorderColor=$0e, MemAddress)
        lda #BackgroundColor
        sta $d021
        lda #BorderColor
        sta $d020
        ldx MemAddress
        stx $3300
    .endmacro
Calling examples
    MyColor .equ $09
    SetColors($0b, $0f)     //BackgroundColor is $0b, BorderColor is $0f.
    SetColors(, MyColor+1)  //BackgroundColor is the default $06, BorderColor is $0a ($09 + 1).
    SetColors()             //Use the default values $06 and $0e for the colors.
Note that we never set a value for MemAddress, so it keeps reading the value from memory address $0000.
Please note that you can't open a new .macro or .function inside a macro, but you are free to use .loop, .if and .while directives, that may be controlled by the calling arguments of the macro.
.endmacro
Closes the previously opened .macro directive.
Format
.endmacro
Alternative: .endm
.loop
Loop blocks are logical blocks that encapsulate one or more source code lines, that get compiled into the segment memory LoopCount times in a row.
This directive must be closed by using .endloop
Format
.loop LoopCount
Example
    .loop 8
        nop
    .endloop
This example is intentionally simplistic, but you can do some clever things with loops, especially if you keep modifying a variable value inside the loop, and use that value as a code modifier.
.endloop
Closes the previously opened .loop directive.
Format
.endloop
Alternative: .endl
.if
If blocks are logical blocks that encapsulate one or more source code lines, that get compiled into the segment memory only if the conditional value or expression evaluates to 1 (true).
Since it works with expressions, arithmetic and logical comparisons etc, it can be a rather powerful tool.
This directive must be closed by using .endif
Format
.if Condition
Example
    .if (SomeValue >= 2) || (OtherValue == 13)
        lda #$00
        sta $d020
    .endif
.endif
Closes the previously opened .if directive.
Format
.endif
.while
While blocks are logical blocks that encapsulate one or more source code lines, that get compiled into the segment memory in a loop, as long as the conditional value or expression keeps evaluating to 1 (true) during each iteration.
Since it works with expressions, arithmetic and logical comparisons etc, it can be a rather powerful tool.
Given how easy it is to get into an endless loop with a badly set condition, you must be careful with this. Having some variable like a counter or other value that is constantly (or just sometimes) updated inside the block is key.
This directive must be closed by using .endwhile
Format
.while Condition
Example
    MyCounter .var 0
    .while MyCounter != 20
        sta $3200 + MyCounter
        MyCounter = MyCounter+1
    .endwhile
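To make the effect of this example easier to picture, the Python sketch below mimics the expansion and collects the instruction lines that the block would produce. It is only a model of the described behaviour: the `expand_while` function is hypothetical, and the textual form of each generated line is an assumption.

```python
def expand_while() -> list:
    """Mimic the .while block above and return the expanded code lines."""
    lines = []
    my_counter = 0                       # MyCounter .var 0
    while my_counter != 20:              # .while MyCounter != 20
        lines.append(f"sta ${0x3200 + my_counter:04x}")
        my_counter = my_counter + 1      # MyCounter = MyCounter+1
    return lines                         # .endwhile


expanded = expand_while()
print(len(expanded), "instructions, from", expanded[0], "to", expanded[-1])
```

If the counter update were left out, the Python loop would never end, which is the same endless loop risk mentioned above.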
.endwhile
Closes the previously opened .while directive.
Format
.endwhile
Alternative: .endw
.byte
Puts one or more bytes at the current memory address.
The accepted, comma separated value types are 8-bit numbers (0-255), characters and strings. Characters and strings are converted to ASCII bytes, just like the .text directive does it.
Format
[label] .byte Value, [Values]
Example
.byte $12, %1001, <MyLabel, '\t', "My string value"
Alternative: .b
.word
Puts one or more words (16 bit values) at the current memory address.
The accepted, comma separated value types are 16-bit numbers (0-65535), that also include 8-bit numbers (0-255).
The word's two bytes are put into the memory buffer in the order of the target CPU's endianness. For the 6502 family it means that $1234 is entered as "$34, $12".
Format
[label] .word Value, [Values]
Example
.word $1234, $12, %1001, MyLabel
Alternative: .w
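The byte order described for the 6502 family is little-endian, which can be checked with a short Python sketch. The `word_bytes` helper is hypothetical and only demonstrates the byte order, not the assembler's internals.

```python
import struct


def word_bytes(value: int) -> bytes:
    """Encode a 16 bit value in little-endian order (low byte first)."""
    return struct.pack("<H", value)


for word in (0x1234, 0x12):
    encoded = word_bytes(word)
    print(f"${word:04x} ->", ", ".join(f"${b:02x}" for b in encoded))
```

An 8 bit value such as $12 still takes two bytes, with $00 as its high byte.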
.lobyte
Puts one or more bytes at the current memory address, using the entered value's Low byte (bits 1-8).
The accepted, comma separated value types are 16-bit numbers (0-65535), that also include 8-bit numbers (0-255).
Format
[label] .lobyte Value, [Values]
Example
.lobyte $1234, MyLabel
This is the same as
.byte <$1234, <MyLabel
.hibyte
Puts one or more bytes at the current memory address, using the entered value's High byte (bits 9-16).
The accepted, comma separated value types are 16-bit numbers (0-65535), that also include 8-bit numbers (0-255).
Format
[label] .hibyte Value, [Values]
Example
.hibyte $1234, MyLabel
This is the same as
.byte >$1234, >MyLabel
.text
Puts one or more bytes at the current memory address, by converting strings to ASCII text bytes.
The accepted, comma separated value types are characters and strings.
Format
[label] .text Value, [Values]
Example
.text "My String Value", 'c'
Alternatives: .t, .txt
.stext
Puts one or more bytes at the current memory address, by converting strings to simplified ASCII bytes used in scroll texts and other, more compact text displays.
The strings are first converted to lowercase, then the ASCII bytes are rearranged, so $60-$7f (@abc...) are at $00-$1f, while the symbols and numbers are left intact in the $20-$3f range. The end result is text that fits into a character set of 64 characters, which is typical in demos and games.
The accepted, comma separated value types are characters and strings.
Format
[label] .stext Value, [Values]
Example
.stext "my scroll text!", 'c'
Alternatives: .st, .stxt
.generate
This directive generates byte values at the current memory address, depending on the selected Mode, and on the Parameter(s) that the Mode expects. Usually these are data tables that can be utilized for demo effects.
Each Mode has default parameters that they can fall back to if a parameter is not provided, but it wouldn't hurt to set each parameter to your liking to avoid strange results.
Depending on what the generator does, if the resulting data values can't be saved in a single byte (for example making a sinwave between $00 and $03ff values), the generator creates two identically sized data outputs in a row. First the Low Bytes, then the High Bytes of each corresponding data value.
Format
[label] .generate "Mode", [Parameter(s)]
Modes with examples
//Generates a Sine Wave data table.
.generate "sinwave", MinValue, MaxValue, Length, RotationDegrees
.generate "sinwave", $00, $7f, $100, 270 //Make a wave starting with $00 by rotating it 270 degrees.
.generate "sinwave", $00, $7f, $100 //No rotation requested.
//Generates a Cosine Wave data table.
.generate "coswave", MinValue, MaxValue, Length, RotationDegrees
.generate "coswave", $20, $bf, $80, 180 //Start from the middle of the wave by rotating it 180 degrees.
.generate "coswave", $20, $bf, $80 //No rotation requested.
//Generates a Bounce Wave data table, an arch between Min-Max-Min.
.generate "bouncewave", MinValue, MaxValue, Length, Flip
.generate "bouncewave", $00, $7f, $100 //Bounce through $00-$7f-$00
.generate "bouncewave", $00, $7f, $100, 1 //Flip the data to be through $7f-$00-$7f
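To give an idea of what a "sinwave" table contains, here is a Python sketch of one plausible formula. The assembler's real implementation is not documented here, so the rounding (floor) and the way the rotation is applied are assumptions; they were chosen because they reproduce the first bytes of the sample output shown in the .memorydump section. The function name `sinwave` is only borrowed from the mode name.

```python
import math

def sinwave(min_value, max_value, length, rotation_degrees=0):
    """Model of a sine table: one full period spread over `length` entries."""
    middle = (min_value + max_value) / 2
    amplitude = (max_value - min_value) / 2
    phase = math.radians(rotation_degrees)  # assumed: rotation is a phase shift
    table = []
    for i in range(length):
        angle = 2 * math.pi * i / length + phase
        # Assumed rounding: floor, which matches the documented sample bytes.
        table.append(math.floor(middle + amplitude * math.sin(angle)))
    return table

print(" ".join(f"${v:02x}" for v in sinwave(0x00, 0x7F, 0x100)[:8]))
```

With a rotation of 270 degrees the same function starts at the minimum value, which agrees with the comment in the first example above.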
.memory
Executes various memory operations in the current memory segment.
The current memory location for new code/data doesn't get changed by this directive.
Format
.memory "Mode", StartAddress, Length, [Parameter(s)]
Mode | Parameter 1 | Parameter 2 | Description |
---|---|---|---|
fill | Byte value to fill with. | - | Fills the selected memory fragment with the selected byte. |
copy | Destination address. | - | Copies the selected memory fragment to the selected destination address. |
move | Destination address. | - | Moves the selected memory fragment to the selected destination address, then zeroes out the bytes at the original memory location. |
replace | Original byte value. | Replacement byte value. | Replaces every occurrence of the selected byte value with another in the selected memory fragment. |
add | Byte value to add. | - | Adds the selected byte to each byte in the selected memory fragment. |
subtract, sub | Byte value to subtract. | - | Subtracts the selected byte from each byte in the selected memory fragment. |
shiftleft, left | Number of bits to shift. | - | Shifts each byte in the selected memory fragment to the left by the selected number of bits. |
shiftright, right | Number of bits to shift. | - | Shifts each byte in the selected memory fragment to the right by the selected number of bits. |
negate, neg | - | - | Negates (inverts) the bits in each byte in the selected memory fragment. |
xor, eor | Byte value to use. | - | Performs a bit-wise XOR on each byte in the selected memory fragment. |
or | Byte value to use. | - | Performs a bit-wise OR on each byte in the selected memory fragment. |
and | Byte value to use. | - | Performs a bit-wise AND on each byte in the selected memory fragment. |
Example
.org $2000
.generate "sinwave", $00, $7f, $100
.memory "copy", $2000, $100, $2100
.memory "add", $2100, $100, $80
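The effect of the "copy" and "add" modes in the example can be modeled in Python on a plain 64KB buffer. This is a sketch, not the assembler's code: the function names are hypothetical, and the wraparound at 8 bits in `memory_add` is an assumption, since the table does not say what happens when a sum exceeds $ff.

```python
def memory_copy(mem, start, length, dest):
    """Copy `length` bytes from `start` to `dest`; the source stays intact."""
    mem[dest:dest + length] = mem[start:start + length]

def memory_add(mem, start, length, value):
    """Add `value` to each byte in the fragment."""
    for addr in range(start, start + length):
        mem[addr] = (mem[addr] + value) & 0xFF  # assumed 8-bit wraparound

mem = bytearray(0x10000)  # 64KB of zeroed memory
mem[0x2000:0x2004] = bytes([0x00, 0x01, 0x7F, 0xFF])
memory_copy(mem, 0x2000, 4, 0x2100)
memory_add(mem, 0x2100, 4, 0x80)
print(mem[0x2100:0x2104].hex(" "))  # 80 81 ff 7f
```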
.memorydump
Saves the selected number of bytes from the target's virtual memory buffer into a text file, as .byte source code lines. There will be up to 8 bytes in each line and a separator after every 256 bytes in the generated source code text file.
This can be useful if you need to convert an existing binary file into a source code insert, or if you generate something inside the assembler that you want to save and then manage manually.
Optionally the bytes can be saved into a binary file instead, by setting the Binary mode parameter, as in the second .memorydump line of the example.
Format
.memorydump "filename.ext", StartAddress, Length, [Binary mode]
Example
.org $2000
.generate "sinwave", $00, $7f, $100
.memorydump "MyWaveData.s", $2000, $100
.memorydump "MyWaveData.bin", $2000, $100, 1
//And the output in the file is like this:
//Memory Dump - Start Address $2000
.byte $3f, $41, $42, $44, $45, $47, $48, $4a
.byte $4b, $4d, $4e, $50, $51, $53, $54, $56
(...)
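If you ever need to produce the same kind of text from your own tools, the line layout is easy to imitate. The Python sketch below formats bytes as .byte lines with up to 8 values each, following the sample output above. The function name `dump_lines` is made up, and the separator that the assembler writes after every 256 bytes is left out because its exact form is not shown here.

```python
def dump_lines(data, per_line=8):
    """Format bytes as '.byte $xx, $xx, ...' lines, `per_line` values each."""
    lines = []
    for offset in range(0, len(data), per_line):
        chunk = data[offset:offset + per_line]
        lines.append(".byte " + ", ".join(f"${b:02x}" for b in chunk))
    return lines

sample = bytes([0x3F, 0x41, 0x42, 0x44, 0x45, 0x47, 0x48, 0x4A, 0x4B])
print("\n".join(dump_lines(sample)))
```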
Instructions
Instructions are the actual assembly code that will be converted to an instruction opcode and operand bytes. The generally accepted format is:
[label] mnemonic [operand with chosen addressing type] [comment]
Example
BackgroundColor sta $d020 //In memory this looks like $8d $20 $d0
6502 Family
On the 6502 family the assembler uses the standard mnemonics and addressing types, as described in 6502 opcodes on 6502.org.
The standard mnemonics are:
adc, and, asl, bcc, bcs, beq, bit, bmi, bne, bpl
brk, bvc, bvs, clc, cld, cli, clv, cmp, cpx, cpy
dec, dex, dey, eor, inc, inx, iny, jmp, jsr, lda
ldx, ldy, lsr, nop, ora, pha, php, pla, plp, rol
ror, rti, rts, sbc, sec, sed, sei, sta, stx, sty
tax, tay, tsx, txa, txs, tya
The assembler accepts xor as an alternative to eor, just in case it's in muscle memory.
Some undocumented (also called illegal) instructions are also supported using the optional -u switch. They are described on oxyron.de (which is an awesome demo scene group, check out their work) but as a brief recap, here are the supported mnemonics and their alternatives:
ahx, alr, anc, arr, axs, dcp, isb, isc, las, lax
rla, rra, sax, sbc, shx, shy, tas, xaa
slo (aso), sre (lse), kil (jam, hlt)
Instead of using dnp or tnp or other variants for double and triple nop, the assembler uses the actual nop mnemonic with various addressing types. These are actually useful in demos, the other undocumented instructions not so much, and those can be unstable on different CPUs.
nop #$nn //$80, uses 2 CPU cycles
nop $nn //$04, uses 3 CPU cycles
nop $nn,x //$14, uses 4 CPU cycles
nop $nnnn //$0c, uses 4 CPU cycles
nop $nnnn,x //$1c, uses 4 CPU cycles, +1 for crossing a page boundary
These nops can be useful in precise timing, and the nop $nnnn variant is a good way to temporarily comment out a jsr or jmp instruction, by replacing its opcode with $0c in self-modifying code. The memory address it refers to remains intact, the instruction gets ignored by the CPU, and it can be restored when needed.
Mind you, 65C02, 65SC02 and 65C816 use these opcodes to implement new instructions, so if you want 100% compatibility with those CPUs, just refrain from using illegal instructions, including the above mentioned NOPs. You can just use "bit $1234" to temporarily disable a jsr or jmp instruction.
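The opcode swap itself is just a one-byte patch. The Python sketch below models it on a byte buffer to show which byte changes and which stay put; on real hardware you would do the same with a store to the instruction's first byte. The helper name `set_call_enabled` and the address $1000 are made up for this example.

```python
JSR_ABS = 0x20  # jsr $nnnn
NOP_ABS = 0x0C  # undocumented nop $nnnn (plain 6502 only)
BIT_ABS = 0x2C  # bit $nnnn, the documented alternative (it changes CPU flags)

def set_call_enabled(mem, addr, enabled, disable_opcode=NOP_ABS):
    """Patch only the opcode byte; the two operand bytes are left untouched."""
    mem[addr] = JSR_ABS if enabled else disable_opcode

mem = bytearray(0x10000)
mem[0x1000:0x1003] = bytes([JSR_ABS, 0x34, 0x12])  # jsr $1234

set_call_enabled(mem, 0x1000, False)
print(mem[0x1000:0x1003].hex(" "))  # 0c 34 12

set_call_enabled(mem, 0x1000, True)
print(mem[0x1000:0x1003].hex(" "))  # 20 34 12
```

Passing `BIT_ABS` as `disable_opcode` gives the variant that also works on the 65C02 family.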
65C02 Family
The 65C02 / 65SC02 CPU is an extension over the standard 6502, that came with three new addressing modes and several new instructions. Undocumented (illegal) instructions are not allowed for this CPU type. For information about the changes see 65C02 opcodes on 6502.org.
The additional mnemonics for 65SC02:
plx, ply, phx, phy, stz, trb, tsb
inc a, dec a (also as inc, dec implied)
And some more for the WDC/Rockwell 65C02, above what's in 65SC02:
bbr0 - bbr7, bbs0 - bbs7
rmb0 - rmb7, smb0 - smb7
stp, wai
It's a bit of a chicken-egg thing, but you can just use 65C02 and utilize or ignore the vendor specific instructions, depending on the system you are targeting.
Integration with Text Editors
Source code for Retro Assembler can be written by using any text editor, but Notepad++ is specifically supported by the User Defined Language I created to work with Retro Assembler source code files. It supports syntax highlighting by colorization, folding of comments, functions and macros, and also regions.
You can set this all up by clicking the menu item Language -> Define your language... and importing one (or both) of these files that came with the assembler's package, using the Import... button.
File name | Description |
---|---|
RetroAssembler_6502-npp.xml | Syntax highlighting for the 6502 CPU family's instructions and registers. |
RetroAssembler_65C02-npp.xml | Syntax highlighting for the 65C02 CPU family's instructions and registers. |
After this you must restart Notepad++ so it will utilize the imported file(s) correctly.
The automatically recognized source code file extension within Notepad++ is ".s", like "musicplayer.s", but you can change it or add more by going back to the language editor. Select Retro Assembler or one of its variants and edit the Ext. field at the top. By default it contains the value "s", but you can change it to "s code", for example, so that it recognizes both "musicplayer.s" and "musicplayer.code". Using ".asm" is certainly doable as well, but Notepad++'s built-in x86 assembly support may clash with it, and you may have to choose the edited file's language manually in the Language menu.
If you create a syntax highlighter for your favorite editor, I would be happy to hear from you and it can be added to the assembler package and its documentation with proper crediting.
Change Log
2/5/2018
- Silent update before more changes.
- Bugs fixed in handling the addressing modes of the instruction JMP in 6502, 65C02 and 65SC02 CPUs.
- The Notepad++ syntax highlighting file for All CPUs (RetroAssembler-npp.xml) is discontinued and removed from the package.
12/12/2017
- Version number updated to 1.2
- 65C02 and 65SC02 CPU support added with new addressing types and mnemonics.
- New directive ".break" to create breakpoints for debugging in VICE.
- New Notepad++ syntax highlighting for 65C02.
- The supported CPU naming convention changed a little, 6502 and 6510 lost the MOS prefix. You should update the .config file.
- The D64 converter now works on Linux too, by calling "c1541" instead of "c1541.exe".
12/7/2017
- Version number updated to 1.1.1 for this quick fix release.
- Linux support added in form of making the application Mono compatible. File paths are handled correctly, using the system-native directory path separator.
- The default Include directories are located both as "Include" and "include" (lowercase) to support case-sensitive file systems.
- The default output directory of the ".memorydump" directive is now the input source code file's directory.
12/6/2017
- Version number updated to 1.1
- The .memorydump directive is now capable of saving binary files, too.
- New directive ".memory" added to manipulate bytes in a selected memory segment.
- For clarity, the following CPU types have been removed: 6502C, 7501, 8500, 8501, 8502. You can always refer to 6502 or 6510 instead, they are the same opcode-wise anyway.
- Bug fixes.
8/3/2017
- The .align directive became more advanced, now it works in relocatable segments too.
- Bugs fixed in the .storage directive, in the expression evaluator and in the label handler.
7/21/2017
- Initial release of Version 1.0
6/18/2017
- Retro Assembler development milestones in the beginning:
- June 18, 2017 - Development started, source code loader, partial tokenizer implemented
- June 21, 2017 - Tokenizer finished, target architecture memory handling implemented
- June 22, 2017 - Parser working OK, basic directives implemented
- June 23, 2017 - Parser is solid, 6502 instruction database built
- June 24, 2017 - The assembler compiled the first working Commodore 64 code: Music player
- June 26, 2017 - Loop directive implemented, error handling code prettied up
- June 29, 2017 - Macro directive implemented
- July 1, 2017 - Segment directive implemented, memory handling and file saving updated
- July 3, 2017 - More complex expression evaluator implemented, brackets supported
- July 4, 2017 - File includes improved by the usage of known include paths
- July 5, 2017 - If and While directives implemented, expression evaluator handles comparers
- July 6, 2017 - Disassembler implemented
- July 7, 2017 - Notepad++ syntax highlighter (User-Defined Language) created
- July 8, 2017 - Function directive implemented based on Macro
- July 11, 2017 - Documentation is nearly finished
- July 12, 2017 - T64, D64 and H6X output file formats implemented
- July 13, 2017 - Generate and Memorydump directives implemented, first samples coded
- July 15, 2017 - Var directive replaced .redefine, Regional labels implemented