.TH XA "1" "5 March 2024" .SH NAME xa \- 6502/R65C02/65816 cross-assembler .SH SYNOPSIS .B xa [\fIOPTION\fR]... \fIFILE\fR .SH DESCRIPTION .B xa is a multi-pass cross-assembler for the 8-bit processors in the 6502 series (such as the 6502, 65C02, 6504, 6507, 6510, 7501, 8500, 8501 and 8502), the Rockwell R65C02, and the 16-bit 65816 processor. For a description of syntax, see .B ASSEMBLER SYNTAX further in this manual page. .SH OPTIONS .TP .B \-E Do not stop after 20 errors, but show all errors. .TP .B \-v Verbose output. .TP .B \-C No CMOS opcodes (default is to allow R65C02 opcodes). .TP .B \-W No 65816 opcodes (default). .TP .B \-w Allow 65816 opcodes. .TP .B \-B Show lines with block open/close (see .BR PSEUDO-OPS ). .TP .B \-c Produce o65 object files instead of executable files (no linking performed); files may contain undefined references. .TP .B \-o filename Set output filename. The default is .BR a.o65 ; use the special filename .BR \- to output to standard output. .TP .B \-P filename Set listing filename. The default is none; use the special filename .BR \- to print the listing to standard output. .TP .B \-F format Set listing format; default is .BR plain . The only other currently supported format is .BR html . .TP .B \-e filename Set errorlog filename; default is none. .TP .B \-l filename Set labellist filename; default is none. This is the symbol table and can be used by disassemblers such as .BR dxa (1) to reconstruct source. .TP .B \-r Add cross-reference list to labellist (requires .BR \-l ). .TP .B \-Xcompatset Enables compatibility settings to become more (not fully!) compatible with other 6502 assemblers and codebases. Currently supported are compatibility sets .BR MASM , .BR CA65 and .BR C , with .B XA23 available as a deprecated option for codebases relying on compatibility with the previous version of .BR xa . Multiple compatibility sets may be specified and combined, e.g., .B \-XMASM .BR \-XXA23 . .IP .B \-XMASM allows colons to appear in comments for MASM compatibility. This does not affect colon interpretation elsewhere and may become the default in a future version. .IP .B \-XCA65 adds syntactic features more compatible with .BR ca65 (1). It permits .B := for defining labels (instead of plain .BR = ), and adds support for unnamed labels and "cheap" local labels using the .B @ character, but disables its other meaning for 24-bit mode (see .B ASSEMBLER .BR SYNTAX ). .IP .B \-XC enables the usage of .B 0xHEX and .B 0OCTAL C-style number encodings. .IP .B \-XXA23 restores partial compatibility with .B xa 2.3.x. In particular, it uses .B ^ for generating control characters, disables escaped characters with .BR \e , allows nested multi-line comments, and disables all predefined .B xa preprocessor macros. Although some portions of this option remain supported syntax, the option itself is inherently deprecated and may be removed in the next 2.x or 3.x release. .TP .B \-a Support .BR ca65 (1)-style unnamed labels using colons, but not the remainder of the other supported .BR ca65 (1) features. This allows their use with 65816 mode, for example. Implies .BR -XMASM . .TP .B \-M This option is deprecated and will be removed in a future version; use .B \-XMASM instead. Allows colons to appear in comments for MASM compatibility. This does not affect colon interpretation elsewhere, and may become the default in a future version. .TP .B \-k Allow the carat .RB ( ^ ) to mask a character with $1f/31. This can be used as a shorthand for control characters, such as .B "^m^j" becoming a carriage return followed by a linefeed. .TP .B \-R Start assembler in relocating mode, i.e. use segments. .TP .B \-U Do not allow undefined labels in relocating mode. .TP .B \-Llabel Defines .B label as an absolute (but undefined) label even when linking. .TP .B \-b? addr Set segment base for segment .B ? to address .BR addr \&. .B ? should be t, d, b or z for text, data, bss or zero segments, respectively. .TP .B \-A addr Make text segment start at an address such that when the file starts at address .BR addr , relocation is not necessary. Overrides .BR \-bt ; other segments still have to be taken care of with .BR \-b \&. .TP .B \-G Suppress list of exported globals. .TP .B \-DDEF=TEXT Define a preprocessor macro on the command line (see .BR PREPROCESSOR ). .TP .B \-I dir Add directory .B dir to the include path (before .BR XAINPUT ; see .BR ENVIRONMENT ). .TP .B \-O charset Define the output charset for character strings. Currently supported are ASCII (default), PETSCII (Commodore ASCII), PETSCREEN (Commodore screen codes) and HIGH (set high bit on all characters). .TP .B \-p? Set the alternative preprocessor character to .BR ? . This is useful when you wish to use .BR cpp (1) and the built-in preprocessor at the same time (see .BR PREPROCESSOR ). Characters may need to be quoted for your shell (example: .B \-p'~' ). .TP .B \-\-help Show summary of options .RB ( -? is a synonym). .TP .B \-\-version Show version of program. .SH ASSEMBLER SYNTAX An introduction to 6502 assembly language programming and mnemonics is beyond the scope of this manual page. We invite you to investigate any number of the excellent books on the subject; one useful title is "Machine Language For Beginners" by Richard Mansfield (COMPUTE!), covering the Atari, Commodore and Apple 8-bit systems, and is widely available on the used market. .LP .B xa supports both the standard NMOS 6502 opcodes as well as the Rockwell CMOS opcodes used in the 65C02 (R65C02). With the .B \-w option, .B xa will also accept opcodes for the 65816. NMOS 6502 undocumented opcodes are intentionally not supported, and should be entered manually using the .B \.byte pseudo-op (see .BR PSEUDO-OPS ). Due to conflicts between the R65C02 and 65816 instruction sets and undocumented instructions on the NMOS 6502, their use is discouraged. .LP In general, .B xa accepts the more-or-less standard 6502 assembler format as popularised by MASM and TurboAssembler. Values and addresses can be expressed either as literals, or as expressions; to wit, .TP 10 .B 123 decimal value .TP .B $234 hexadecimal value .RB ( 0x234 accepted with .BR -XC ) .TP .B &123 octal .RB ( 0123 accepted with .BR -XC ) .TP .B %010110 binary .TP .B * current value of the program counter .LP The ASCII value of any quoted character is inserted directly into the program text (example: .B """A""" inserts the byte "A" into the output stream); see also the .B PSEUDO-OPS section. This is affected by the currently selected character set, if any. .LP .B Labels define locations within the program text, just as in other multi-pass assemblers. A label is defined by anything that is not an opcode; for example, a line such as .IP .B label1 lda #0 .LP defines .B label1 to be the current location of the program counter (thus the address of the .B LDA opcode). A label can be explicitly defined by assigning it the value of an expression, such as .IP .B label2 = $d000 .LP which defines .B label2 to be the address $d000, namely, the start of the VIC-II register block on Commodore 64 computers. The program counter .B * is considered to be a special kind of label, and can be assigned to with statements such as .IP .B * = $c000 .LP which sets the program counter to decimal location 49152. If .B \-XCA65 is specified, you can also use .B := as well as .BR = . .LP With the exception of the program counter, labels cannot be assigned multiple times. To explicitly declare redefinition of a label, place a - (dash) before it, e.g., .IP .B \-label2 = $d020 .LP which sets .B label2 to the Commodore 64 border colour register. The scope of a label is affected by the block it resides within (see .B PSEUDO-OPS for block instructions). A label may also be hard-specified with the .B \-L command line option. .LP Redefining a label does not change previously assembled code that used the earlier value. Therefore, because the program counter is a special type of label, changing the program counter to a lower value does not reorder code assembled previously and changing it to a higher value does not issue padding to put subsequent code at the new location. This is intentional behaviour to facilitate generating relocatable and position-independent code, but can differ from other assemblers which use this behaviour for linking. However, it is possible to use pseudo-ops to simulate other assemblers' behaviour and use .B xa as a linker; see .B PSEUDO-OPS and .BR LINKING . .LP If .B \-XCA65 or .B \-a is specified, "unnamed" labels may be specified with .B : (i.e., no label, just a colon); branches may then reference these unnamed labels with a colon and plus signs for forward branching or minus signs for backward branching. For example (from the .BR ca65 (1) documentation), .LP : lda (ptr1),y ; #1 .BR cmp (ptr2),y .BR bne :+ ; -> #2 .BR tax .BR beq :+++ ; -> #4 .BR iny .BR bne :- ; -> #1 .BR inc ptr1+1 .BR inc ptr2+1 .BR bne :- ; -> #1 .BR .BR : bcs :+ ; #2 -> #3 .BR ldx #$FF .BR rts .BR .BR : ldx #$01 ; #3 .BR : rts ; #4 .LP Additionally, in .B \-XCA65 mode, "cheap" local labels may be used, marked by the .B @ prefix. These temporary labels exist only between two regular labels and automatically go out of scope with the next regular label. This allows, with reasonable care, reuse of common label names like "loop." .LP For those instructions where the accumulator is the implied argument (such as .B asl and .BR lsr ; .B inc and .B dec on R65C02; etc.), the idiom of explicitly specifying the accumulator with .B a is unnecessary as the proper form will be selected if there is no explicit argument. In fact, for consistency with label handling, if there is a label named .BR a , this will actually generate code referencing that label as a memory location and not the accumulator. Otherwise, the assembler will complain. .LP Labels and opcodes may take .B expressions as their arguments to allow computed values, and may themselves reference other labels and/or the program counter. An expression such as .B lab1+1 (which operates on the current value of label .B lab1 and increments it by one) may use the following operands, given from highest to lowest priority: .TP 8 .B * multiplication (priority 10) .TP .B / integer division (priority 10) .TP .B + addition (priority 9) .TP .B \- subtraction (9) .TP .B << shift left (8) .TP .B >> shift right (8) .TP .B >= => greater than or equal to (7) .TP .B > greater than (7) .TP .B <= =< less than or equal to (7) .TP .B < less than (7) .TP .B = equal to (6); .B == also accepted .TP .B <> >< does not equal (6); .B != also accepted .TP .B & bitwise AND (5) .TP .B ^ bitwise XOR (4) .TP .B | bitwise OR (3) .TP .B && logical AND (2) .TP .B || logical OR (1) .LP Parentheses are valid. When redefining a label, combining arithmetic or bitwise operators with the = (equals) operator such as .B += and so on are valid, e.g., .IP .B \-redeflabel += (label12/4) .LP Normally, .B xa attempts to ascertain the value of the operand and (when referring to a memory location) use zero page, 16-bit or (for 65816) 24-bit addressing where appropriate and where supported by the particular opcode. This generates smaller and faster code, and is almost always preferable. .LP Nevertheless, you can use these prefix operators to force a particular rendering of the operand. Those that generate an eight bit result can also be used in 8-bit addressing modes, such as immediate and zero page. .TP .B < low byte of expression, e.g., .B lda # high byte of expression .TP .B ! in situations where the expression could be understood as either an absolute or zero page value, do not attempt to optimize to a zero page argument for those opcodes that support it (i.e., keep as 16 bit word) .TP .B @ render as 24-bit quantity for 65816, even if smaller than 24 bits (must specify .B \-w command-line option, must not specify .BR \-XCA65 ) .TP .B ` force further optimization, even if the length of the instruction cannot be reliably determined (see .BR NOTES'N'BUGS ) .LP Expressions can occur as arguments to opcodes or within the preprocessor (see .B PREPROCESSOR for syntax). For example, .IP .B lda label2+1 .LP takes the value at .B label2+1 (using our previous label's value, this would be $d021), and will be assembled as .B $ad $21 $d0 to disk. Similarly, .IP .B lda # , .BR >= , .B != and .BR <> . .TP .B \.include "filename" Includes another file in place of the pseudo-op, as if the preprocessor had done so with an .B #include directive (see .BR PREPROCESSOR ), but at the assembler phase after preprocessing has already occurred. .LP The following pseudo-op applies to listing mode. .TP .B \.listbytes number In the listing output, sets the maximum number of hex bytes to be printed in the listing for pseudo-ops like .BR .byt , by default 8. The special argument .B unlimited sets no upper limit. If listing mode is disabled, this pseudo-op has no observable effect. .LP The following pseudo-ops apply primarily to relocatable .B .o65 objects. A full discussion of the relocatable format is beyond the scope of this manpage; see .B http://www.6502.org/users/andre/o65/ for the most current specification. .TP .B .text .data .bss .zero These pseudo-ops switch between the different segments, .B .text being the actual code section, .B .data being the data segment, .B .bss being uninitialized label space for allocation and .B .zero being uninitialized zero page space for allocation. In .B .bss and .BR .zero , only labels are evaluated. These pseudo-ops are valid in relocating and absolute modes. .TP .B .code For .B ca65 compatibility, this is currently mapped to .BR .text . .TP .B .zeropage For .B ca65 compatibility, this is currently mapped to .BR .zero . .TP .B .align value Aligns the current segment to a byte boundary (2, 4 or 256) as specified by .B value (and places it in the header when relocating mode is enabled). Other values generate an error. .TP .B .fopt type, value1, value2, value3, ... Acts like .B .byt/.asc except that the values are embedded into the object file as file options. The argument .B type is used to specify the file option being referenced. A table of these options is in the relocatable o65 file format description. The remainder of the options are interpreted as values to insert. Any number of values may be specified, and may also be strings. .TP .B .import label1, label2, label3, ... Defines the given labels as global labels which are imported and resolved during the link stage, like the .B -L command line parameter. .TP .B .importzp label1, label2, label3, ... Analogous to .BR .import , except that it only imports zeropage labels (i.e., byte values). .SH PREPROCESSOR .B xa implements a preprocessor very similar to that of the C-language preprocessor .BR cpp (1) and many oddiments apply to both. For example, as in C, the use of .B /* */ for comment delimiters is also supported in .BR xa , and so are comments using the double slash .BR // . The preprocessor also supports continuation lines, i.e., lines ending with a backslash (\\); the following line is then appended to it as if there were no dividing newline. This too is handled at the preprocessor level. .LP For reasons of memory and complexity, the full breadth of the .BR cpp (1) syntax is not fully supported. In particular, macro definitions may not be forward-defined (i.e., a macro definition can only reference a previously defined macro definition), except for macro functions, where recursive evaluation is supported; e.g., to .B #define WW AA , .B AA must have already been defined. Certain other directives are not supported, nor are most standard pre-defined macros, and there are other limits on evaluation and line length. Because the maintainers of .B xa recognize that some files will require more complicated preparsing than the built-in preprocessor can supply, the preprocessor will accept .BR cpp (1)-style line/filename/flags output. When these lines are seen in the input file, .B xa will treat them as .B cc would, except that flags are ignored. .B xa does not accept files on standard input for parsing reasons, so you should dump your .BR cpp (1) output to an intermediate temporary file, such as .IP .B cc -E test.s > test.xa .br .B xa test.xa .LP No special arguments need to be passed to .BR xa ; the presence of .BR cpp (1) output is detected automatically. .LP Note that passing your file through .BR cpp (1) may interfere with .BR xa 's own preprocessor directives. In this case, to mask directives from .BR cpp (1), use the .B \-p option to specify an alternative character instead of .BR # , such as the tilde (e.g., .B \-p'~' ). With this option and argument specified, then instead of .BR #include , for example, you can also use .BR ~include , in addition to .B #include (which will also still be accepted by the .B xa preprocessor, assuming any survive .BR cpp (1)). Any character can be used, although frankly pathologic choices may lead to amusing and frustrating glitches during parsing. You can also use this option to defer preprocessor directives that .BR cpp (1) may interpret too early until the file actually gets to .B xa itself for processing. .LP The following predefined macros are supported, except if .B \-XXA23 is specified: .TP .B XA_MAJOR The current major version of .BR xa . .TP .B XA_MINOR The current minor version of .BR xa . .LP The following preprocessor directives are supported: .TP .B #include """filename""" Inserts the contents of file .B filename at this position. If the file is not found, it is searched using paths specified by the .B \-I command line option or the environment variable .B XAINPUT (q.v.). When inserted, the file will also be parsed for preprocessor directives. .TP .B #echo comment Inserts comment .B comment into the errorlog file, specified with the .B \-e command line option. .TP .B #print expression Computes the value of expression .B expression and prints it into the errorlog file. .TP .B #error message Displays the message as an error and terminates assembly. .TP .B #define DEFINE text Equates macro .B DEFINE with text .B text such that wherever .B DEFINE appears in the assembly source, .B text is substituted in its place (just like .BR cpp (1) would do). In addition, .B #define can specify macro functions like .BR cpp (1) such that a directive like .B #define mult(a,b) ((a)*(b)) would generate the expected result wherever an expression of the form .B mult(a,b) appears in the source. This can also be specified on the command line with the .B \-D option. The arguments of a macro function may be recursively evaluated, unlike other .BR #define s; the preprocessor will attempt to re-evaluate any argument refencing another preprocessor definition up to ten times before complaining. .LP The following directives are conditionals. If the conditional is not satisfied, then the source code between the directive and its terminating .B #endif are expunged and not assembled. Up to fifteen levels of nesting are supported. .TP .B #ifdef DEFINE True only if macro .B DEFINE is defined. .TP .B #ifndef DEFINE The opposite; true only if macro .B DEFINE has not been previously defined. .TP .B #if expression True if expression .B expression evaluates to non-zero. .B expression may reference other macros. .TP .B #iflused label True if label .B label has been used (but not necessarily instantiated with a value). .I This works on labels, not macros! .TP .B #ifldef label True if label .B label is defined .I and assigned with a value. .I This works on labels, not macros! .TP .B #else Implements alternate path for a conditional block. .TP .B #endif Closes a conditional block. .LP Unclosed conditional blocks at the end of included files generate warnings; unclosed conditional blocks at the end of assembly generate an error. .LP .B #iflused and .B #ifldef are useful for building up a library based on labels. For example, you might use something like this in your library's code: .IP .B #iflused label .br .B #ifldef label .br .B #echo label already defined, library function label cannot be inserted .br .B #else .br .B label /* your code */ .br .B #endif .br .B #endif .SH LINKING .B xa is oriented towards generating sequential binaries. Code is strictly emitted in order even if the program counter is set to a lower location than previously assembled code, and padding is not automatically emitted if the program counter is set to a higher location. Changing the program location only changes new labels for code that is subsequently emitted; previous emitted code remains unchanged. Fortunately, for many object files these conventions have no effect on their generation. .LP However, some applications may require generating an object file built from several previously generated components, and/or submodules which may need to be present at specific memory locations. With a minor amount of additional specification, it is possible to use .B xa for this purpose as well. .LP The first means of doing so uses the o65 format to make relocatable objects that in turn can be linked by .BR ldo65 (1) (q.v.). .LP The second means involves either assembled code, or insertion of previously built object or data files with .BR .bin , using .B .dsb pseudo-ops with computed expression arguments to insert any necessary padding between them, in the sequential order they are to reside in memory. Consider this example: .LP .br .word $1000 .br * = $1000 .br .br ; this is your code at $1000 .br part1 rts .br ; this label marks the end of code .br endofpart1 .br .br ; DON'T PUT A NEW .word HERE! .br * = $2000 .br .dsb (*-endofpart1), 0 .br ; yes, set it again .br * = $2000 .br .br ; this is your code at $2000 .br part2 rts .br .LP This example, written for Commodore microcomputers using a 16-bit starting address, has two "modules" in it: one block of code at $1000 (4096), indicated by the code between labels .B part1 and .BR endofpart1 , and a second block at $2000 (8192) starting at label .BR part2 . .LP The padding is computed by the .B .dsb pseudo-op between the two modules. Note that the program counter is set to the new address and then a computed expression inserts the proper number of fill bytes from the end of the assembled code in part 1 up to the new program counter address. Since this itself advances the program counter, the program counter is reset again, and assembly continues. .LP When the object this source file generates is loaded, there will be an .B rts instruction at address 4096 and another at address 8192, with null bytes between them. .LP Should one of these areas need to contain a pre-built file, instead of assembly code, simply use a .B .bin pseudo-op to load whatever portions of the file are required into the output. The computation of addresses and number of necessary fill bytes is done in the same fashion. .LP Although this example used the program counter itself to compute the difference between addresses, you can use any label for this purpose, keeping in mind that only the program counter determines where relative addresses within assembled code are resolved. .SH ENVIRONMENT .B xa utilises the following environment variables, if they exist: .TP .B XAINPUT Include file path; components should be separated by `,'. .TP .B XAOUTPUT Output file path. .SH NOTES'N'BUGS The R65C02 instructions .B ina (often rendered .B inc .BR a ) and .B dea .RB ( dec .BR a ) must be rendered as bare .B inc and .B dec instructions respectively. .LP The 65816 instructions .B mvn and .B mvp use two eight bit parameters, the only instructions in the entire instruction set to do so. Older versions of .B xa took a single 16-bit absolute value. As of 2.4.0, this old syntax is no longer accepted. .LP Forward-defined labels -- that is, labels that are defined after the current instruction is processed -- cannot be optimized into zero page instructions even if the label does end up being defined as a zero page location, because the assembler does not know the value of the label in advance during the first pass when the length of an instruction is computed. On the second pass, a warning will be issued when an instruction that could have been optimized can't be because of this limitation. (Obviously, this does not apply to branching or jumping instructions because they're not optimizable anyhow, and those instructions that can .I only take an 8-bit parameter will always be casted to an 8-bit quantity.) If the label cannot otherwise be defined ahead of the instruction, the backtick prefix .B ` may be used to force further optimization no matter where the label is defined as long as the instruction supports it. Indiscriminately forcing the issue can be fraught with peril, however, and is not recommended; to discourage this, the assembler will complain about its use in addressing mode situations where no ambiguity exists, such as indirect indexed, branching and so on. .SH "SEE ALSO" .BR file65 (1), .BR ldo65 (1), .BR reloc65 (1), .BR uncpk (1), .BR dxa (1) .SH AUTHOR This manual page was written by David Weinehall , Andre Fachat and Cameron Kaiser . Original xa package (C)1989-1997 Andre Fachat. Additional changes (C)1989-2024 Andre Fachat, Jolse Maginnis, David Weinehall, Cameron Kaiser. The official maintainer is Cameron Kaiser. .SH OVER 30 YEARS OF XA Yay us? .SH WEBSITE http://www.floodgap.com/retrotech/xa/