NAME
xa - 6502/R65C02/65816 cross-assembler
SYNOPSIS
xa [OPTION]... FILE
DESCRIPTION
xa is a multi-pass cross-assembler for the 8-bit processors in the 6502
series (such as the 6502, 65C02, 6504, 6507, 6510, 7501, 8500, 8501 and 8502),
the Rockwell R65C02, and the 16-bit 65816 processor. For a description of
syntax, see
ASSEMBLER SYNTAX further in this manual page.
OPTIONS
- -v
- Verbose output.
- -x
- Use old filename behaviour (overrides -o, -e and -l).
This option is now deprecated.
- -C
- No CMOS opcodes (default is to allow R65C02 opcodes).
- -W
- No 65816 opcodes (default).
- -w
- Allow 65816 opcodes.
- -B
- Show lines with block open/close (see PSEUDO-OPS).
- -c
- Produce o65 object files instead of executable files (no linking
performed); files may contain undefined references.
- -o filename
- Set output filename. The default is a.o65; use the special filename - (a single
dash) to output to standard output.
- -e filename
- Set errorlog filename, default is none.
- -l filename
- Set labellist filename, default is none. This is the symbol table and can
be used by disassemblers such as dxa(1) to reconstruct source.
- -r
- Add cross-reference list to labellist (requires -l).
- -M
- Allow colons to appear in comments; for MASM compatibility. This does not
affect colon interpretation elsewhere.
- -R
- Start assembler in relocating mode.
- -Llabel
- Defines label as an absolute (but undefined) label even when
linking.
- -b? addr
- Set segment base for segment ? to address addr. ?
should be t, d, b or z for text, data, bss or zero segments,
respectively.
- -A addr
- Make text segment start at an address such that when the file starts at
address addr, relocation is not necessary. Overrides -bt;
other segments still have to be taken care of with -b.
- -G
- Suppress list of exported globals.
- -DDEF=TEXT
- Define a preprocessor macro on the command line (see
PREPROCESSOR).
- -I dir
- Add directory dir to the include path (before XAINPUT; see
ENVIRONMENT).
- -O charset
- Define the output charset for character strings. Currently supported are
ASCII (default), PETSCII (Commodore ASCII), PETSCREEN (Commodore screen
codes) and HIGH (set high bit on all characters).
- -p?
- Set the alternative preprocessor character to ?. This is useful
when you wish to use cpp(1) and the built-in preprocessor at the
same time (see PREPROCESSOR). Characters may need to be quoted for
your shell (example: -p'~' ).
- --help
- Show summary of options.
- --version
- Show version of program.
ASSEMBLER SYNTAX
An introduction to 6502 assembly language programming and mnemonics is beyond
the scope of this manual page. We invite you to investigate any number of the
excellent books on the subject; one useful title is "Machine Language For
Beginners" by Richard Mansfield (COMPUTE!), which covers the Atari, Commodore
and Apple 8-bit systems and is widely available on the used market.
xa supports the standard NMOS 6502 opcodes as well as the Rockwell
CMOS opcodes used in the 65C02 (R65C02). With the
-w option,
xa
will also accept opcodes for the 65816. NMOS 6502 undocumented opcodes are
intentionally not supported, and should be entered manually using the
.byt pseudo-op (see
PSEUDO-OPS). Due to conflicts between the
R65C02 and 65816 instruction sets and undocumented instructions on the NMOS
6502, their use is discouraged.
In general,
xa accepts the more-or-less standard 6502 assembler format as
popularised by MASM and TurboAssembler. Values and addresses can be expressed
either as literals, or as expressions; to wit,
- 123
- decimal value
- $234
- hexadecimal value
- &123
- octal value
- %010110
- binary value
- *
- current value of the program counter
The ASCII value of any quoted character is inserted directly into the program
text (example:
"A" inserts the byte "A" into the
output stream); see also the
PSEUDO-OPS section. This is affected by
the currently selected character set, if any.
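As a brief illustration of the literal forms above (a sketch only; the load address $c000 is an arbitrary assumption), each of the following immediate loads places the same value, 65, in the accumulator when the default ASCII character set is in effect:

```asm
        * = $c000          ; assumed load address, chosen arbitrarily
        lda #65            ; decimal
        lda #$41           ; hexadecimal
        lda #&101          ; octal
        lda #%01000001     ; binary
        lda #"A"           ; quoted character, $41 in the default ASCII set
```

Every line should assemble to the same two bytes, $a9 $41.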
Labels define locations within the program text, just as in other
multi-pass assemblers. A label is defined by anything that is not an opcode;
for example, a line such as
- label1 lda #0
defines
label1 to be the current location of the program counter (thus
the address of the
LDA opcode). A label can be explicitly defined by
assigning it the value of an expression, such as
- label2 = $d000
which defines
label2 to be the address $d000, namely, the start of the
VIC-II register block on Commodore 64 computers. The program counter
*
is considered to be a special kind of label, and can be assigned to with
statements such as
- * = $c000
which sets the program counter to decimal location 49152. With the exception of
the program counter, labels cannot be assigned multiple times. To explicitly
declare redefinition of a label, place a - (dash) before it, e.g.,
- -label2 = $d020
which sets
label2 to the Commodore 64 border colour register. The scope
of a label is affected by the block it resides within (see
PSEUDO-OPS
for block instructions). A label may also be hard-specified with the
-L
command line option.
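The label rules above can be seen together in a short fragment. This is a sketch under assumptions: the name start is hypothetical, the addresses are those used in the examples above, and the values in the comments are what the description predicts rather than captured assembler output.

```asm
        * = $c000          ; program counter starts at $c000
label2  = $d000            ; explicit definition
start   lda #0             ; start takes the value $c000
        sta label2         ; stores to $d000
-label2 = $d020            ; redefinition needs the leading dash
        sta label2         ; expected to store to $d020
```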
For those instructions where the accumulator is the implied argument (such as
asl and
lsr;
inc and
dec on R65C02; etc.), the
idiom of explicitly specifying the accumulator with
a is unnecessary as
the proper form will be selected if there is no explicit argument. In fact,
for consistency with label handling, if there is a label named
a, this
will actually generate code referencing that label as a memory location and
not the accumulator. Otherwise, the assembler will complain.
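A minimal sketch of the bare accumulator forms described above; the opcode values in the comments are the standard 6502 and 65C02 encodings, and the fragment assumes no label named a has been defined.

```asm
        asl                ; shift the accumulator left, opcode $0a
        lsr                ; shift the accumulator right, opcode $4a
        inc                ; R65C02 only, increment the accumulator, opcode $1a
        dec                ; R65C02 only, decrement the accumulator, opcode $3a
```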
Labels and opcodes may take
expressions as their arguments to allow
computed values, and may themselves reference other labels and/or the program
counter. An expression such as
lab1+1 (which operates on the current
value of label
lab1 and increments it by one) may use the following
operators, given from highest to lowest priority:
- *
- multiplication (priority 10)
- /
- integer division (priority 10)
- +
- addition (priority 9)
- -
- subtraction (9)
- <<
- shift left (8)
- >>
- shift right (8)
- >= =>
- greater than or equal to (7)
- >
- greater than (7)
- <= =<
- less than or equal to (7)
- <
- less than (7)
- =
- equal to (6)
- <> ><
- does not equal (6)
- &
- bitwise AND (5)
- ^
- bitwise XOR (4)
- |
- bitwise OR (3)
- &&
- logical AND (2)
- ||
- logical OR (1)
Parentheses are valid. When redefining a label, an arithmetic or bitwise
operator may be combined with the = (equals) operator, as in
+= and so on,
e.g.,
- -redeflabel += (label12/4)
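The effect of the priorities listed above can be sketched as follows. The label names are hypothetical, and the values in the comments are worked out by hand from the priority table, not taken from assembler output.

```asm
base    = $1000
count   = 4
addr1   = base+count*2     ; multiplication first, giving $1008
addr2   = (base+count)*2   ; parentheses first, giving $2008
mask    = 1<<count+1       ; addition (9) before shift (8), so 1<<5 = $20
-count  += 2               ; redefinition with a combined operator, count is now 6
```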
Normally,
xa attempts to ascertain the value of the operand and (when
referring to a memory location) use zero page, 16-bit or (for 65816) 24-bit
addressing where appropriate and where supported by the particular opcode.
This generates smaller and faster code, and is almost always preferable.
Nevertheless, you can use these prefix operators to force a particular rendering
of the operand. Those that generate an eight bit result can also be used in
8-bit addressing modes, such as immediate and zero page.
- <
- low byte of expression, e.g., lda #<vector
- >
- high byte of expression
- !
- in situations where the expression could be understood as either an
absolute or zero page value, do not attempt to optimize to a zero page
argument for those opcodes that support it (i.e., keep as 16 bit
word)
- @
- render as 24-bit quantity for 65816 (must specify -w command-line
option). This is required to specify any 24-bit
quantity!
- `
- force further optimization, even if the length of the instruction cannot
be reliably determined (see NOTES'N'BUGS)
Expressions can occur as arguments to opcodes or within the preprocessor (see
PREPROCESSOR for syntax). For example,
- lda label2+1
takes the value at
label2+1 (using our previous label's value, this would
be $d021), and will be assembled as
$ad $21 $d0 to disk. Similarly,
- lda #<label2
will take the lowest 8 bits of
label2 (i.e., $20), and assign them to the
accumulator (assembling the instruction as
$a9 $20 to disk).
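A common use of the byte-selection prefixes is to store a 16-bit address into a zero page pointer. The sketch below assumes two hypothetical labels, vector and zploc; the bytes in the comments are what the rules above predict.

```asm
vector  = $c123            ; hypothetical 16-bit address
zploc   = $10              ; hypothetical zero page location
        lda #<vector       ; low byte, assembles as $a9 $23
        sta zploc          ; zero page store, $85 $10
        lda #>vector       ; high byte, assembles as $a9 $c1
        sta zploc+1        ; zero page store, $85 $11
        lda !zploc         ; forced 16-bit operand, expected $ad $10 $00
```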
Comments are specified with a semicolon (;), such as
- ;this is a comment
They can also be specified in the C language style, using
/* */ and
// which are understood at the
PREPROCESSOR level (q.v.).
Normally, the colon (:) separates statements, such as
- label4 lda #0:sta $d020
or
- label2: lda #2
(note the use of a colon for specifying a label, similar to some other
assemblers, which
xa also understands with or without the colon). This
also applies to semicolon comments, such that
- ; a comment:lda #0
is understood as a comment followed by an opcode. To defeat this, use the
-M command line option to allow colons within comments. This does not
apply to
/* */ and
// comments, which are dealt with at the
preprocessor level (q.v.).
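The separator and comment rules above, gathered into one hedged sketch (label5 is a hypothetical name, and the fragment assumes -M is not in effect):

```asm
label4  lda #0:sta $d020   ; two statements, then this comment
label5: lda #2             ; trailing colon after the label is optional
        /* a C style comment: colons here are harmless */
        rts                // so is this: handled by the preprocessor
```

Note that none of the semicolon comments contains a colon; without -M, anything after a colon in such a comment would be assembled as a new statement.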
PSEUDO-OPS
Pseudo-ops are false opcodes used by the assembler to denote meta- or
inlined commands. Like most assemblers,
xa has a rich set.
- .byt value1,value2,value3,...
- Specifies a string of bytes to be directly placed into the assembled
object. The arguments may be expressions. Any number of bytes can be
specified.
- .asc "text1" ,"text2",...
- Specifies a character string which will be inserted into the assembled
object. Strings are understood according to the currently specified
character set; for example, if ASCII is specified, they will be rendered
as ASCII, and if PETSCII is specified, they will be translated into the
Commodore ASCII equivalent. Other non-standard ASCIIs such as
ATASCII for Atari computers should use the ASCII equivalent characters;
graphic and control characters should be specified explicitly using
.byt for the precise character you want. Note that when specifying
the argument of an opcode, .asc is not necessary; the quoted
character can simply be inserted (e.g., lda #"A" ), and
is also affected by the current character set. Any number of character
strings can be specified.
.byt and
.asc are synonymous, so you can mix things such as
.byt $43, 22, "a character string" and get the expected
result. The string is subject to the current character set, but the remaining
bytes are inserted without modification.
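For instance, the following two lines (a sketch with hypothetical label names) should produce identical output, which under the default ASCII character set is $0d $48 $45 $4c $4c $4f $00:

```asm
msg     .byt $0d, "HELLO", 0        ; control byte, string, terminator
same    .asc $0d, "HELLO", 0        ; same bytes, since the two are synonyms
```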
- .aasc "text1" ,"text2",...
- Specifies a character string that is always rendered in true ASCII
regardless of the current character set. Like .asc, it is
synonymous with .byt.
- .word value1,value2,value3...
- Specifies a string of 16-bit words to be placed into the assembled object
in 6502 little-endian format (that is, low-byte/high-byte). The arguments
may be expressions. Any number of words can be specified.
- .dsb length,fillbyte
- Specifies a data block; a total of length repetitions of
fillbyte will be inserted into the assembled object. For example,
.dsb 5,$10 will insert five bytes, each being 16 decimal, into the
object. The arguments may be expressions.
- .bin offset,length,"filename"
- Inlines the binary file specified by filename, without further
interpretation, starting at offset offset for a length of length. This
allows you to insert data such as a previously assembled object file or an
image or other binary data structure, inlined directly into this file's
object. If length is zero, then the length of filename,
minus the offset, is used instead. The arguments may be expressions.
- .(
- Opens a new block for scoping. Within a block, all labels defined are
local to that block and any sub-blocks, and go out of scope as soon as the
enclosing block is closed (i.e., lexically scoped). All labels defined
outside of the block are still visible within it. To explicitly declare a
global label within a block, precede the label with + or precede it
with & to declare it within the previous level only (or
globally if you are only one level deep). Sixteen levels of scoping are
permitted.
- .)
- Closes a block.
- .as .al .xs .xl
- Only relevant in 65816 mode (with the -w option specified). These
pseudo-ops set what size accumulator and X/Y-register should be used for
future instructions; .as and .xs set 8-bit operands for the
accumulator and index registers, respectively, and .al and
.xl set 16-bit operands. These pseudo-ops deliberately do not
automatically issue sep and rep instructions to set the
specified width in the CPU; set the processor bits as you need, or
consider constructing a macro. .al and .xl generate errors
if -w is not specified.
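The data and block pseudo-ops above can be combined as in the following sketch. All label names and the load address are assumptions made for illustration; the byte values in the comments follow from the descriptions above.

```asm
        * = $c000
vecs    .word $c000, $1234          ; emits $00 $c0 $34 $12
fill    .dsb 5, $10                 ; emits $10 five times
        .(                          ; open a block
loop    dex                         ; loop is local to this block
        bne loop
+done   rts                         ; leading + makes done global
        .)                          ; close the block
        .(
loop    dey                         ; a different loop, no clash
        bne loop
        .)
        jsr done                    ; visible here because it is global
```

Reusing the name loop in both blocks is the point of the example; outside the blocks neither loop is visible, while done remains reachable.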
The following pseudo-ops apply primarily to relocatable .o65 objects. A full
discussion of the relocatable format is beyond the scope of this manpage, as
it is currently a format in flux. Documentation on the proposed v1.2 format is
in
doc/fileformat.txt within the
xa installation directory.
- .text .data .bss .zero
- These pseudo-ops switch between the different segments, .text being the
actual code section, .data being the data segment, .bss being
uninitialized label space for allocation and .zero being uninitialized
zero page space for allocation. In .bss and .zero, only labels are
evaluated. These pseudo-ops are valid in relative and absolute modes.
- .align value
- Aligns the current segment to a byte boundary (2, 4 or 256) as specified
by value (and places it in the header when relative mode is
enabled). Other values generate an error.
- .fopt type,value1,value2,value3,...
- Acts like .byt/.asc except that the values are embedded into the
object file as file options. The argument type is used to specify
the file option being referenced. A table of these options is in the
relocatable o65 file format description. The remainder of the options are
interpreted as values to insert. Any number of values may be specified,
and may also be strings.
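A rough sketch of how the segment pseudo-ops might be laid out in a relocatable source file. It assumes that .dsb may be used to reserve space in the .zero and .bss segments, which this page does not state explicitly, and all label names are hypothetical.

```asm
        .zero
ptr     .dsb 2             ; two bytes of zero page, reserved but not emitted
        .bss
buffer  .dsb 256           ; uninitialized work area
        .data
greet   .asc "HI", 0       ; initialized data
        .text
start   lda #<greet        ; code goes in the text segment
        sta ptr
        lda #>greet
        sta ptr+1
        rts
```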
PREPROCESSOR
xa implements a preprocessor very similar to that of the C-language
preprocessor
cpp(1) and many oddiments apply to both. For example, as
in C, the use of
/* */ for comment delimiters is also supported in
xa, and so are comments using the double slash
//. The
preprocessor also supports continuation lines, i.e., lines ending with a
backslash (\); the following line is then appended to it as if there were no
dividing newline. This too is handled at the preprocessor level.
For reasons of memory and complexity, the full breadth of
cpp(1)
syntax is not supported. In particular, macro definitions may not be
forward-defined (i.e., a macro definition can only reference a previously
defined macro definition), except for macro functions, where recursive
evaluation is supported; e.g., to
#define WW AA ,
AA must have
already been defined. Certain other directives are not supported, nor are most
standard pre-defined macros, and there are other limits on evaluation and line
length. Because the maintainers of
xa recognize that some files will
require more complicated preparsing than the built-in preprocessor can supply,
the preprocessor will accept
cpp(1)-style line/filename/flags output.
When these lines are seen in the input file,
xa will treat them as
cc would, except that flags are ignored.
xa does not accept
files on standard input for parsing reasons, so you should dump your
cpp(1) output to an intermediate temporary file, such as
- cc -E test.s > test.xa
xa test.xa
No special arguments need to be passed to
xa; the presence of
cpp(1) output is detected automatically.
Note that passing your file through
cpp(1) may interfere with
xa's
own preprocessor directives. In this case, to mask directives from
cpp(1), use the
-p option to specify an alternative character
instead of
#, such as the tilde (e.g.,
-p'~' ). With this option
specified, you can write
~include, for example, as an alternative to
#include (which is
still accepted by the
xa preprocessor, assuming any survive
cpp(1)). Any character can be used, although pathological choices
may lead to amusing and frustrating glitches during parsing. You can also use
this option to defer preprocessor directives that
cpp(1) may interpret
too early until the file actually gets to
xa itself for processing.
The following preprocessor directives are supported.
- #include "filename"
- Inserts the contents of file filename at this position. If the file
is not found, it is searched for using paths specified by the -I
command line option or the environment variable XAINPUT (q.v.).
When inserted, the file will also be parsed for preprocessor
directives.
- #echo comment
- Inserts comment comment into the errorlog file, specified with the
-e command line option.
- #print expression
- Computes the value of expression expression and prints it into the
errorlog file.
- #define DEFINE text
- Equates macro DEFINE with text text such that wherever
DEFINE appears in the assembly source, text is substituted
in its place (just like cpp(1) would do). In addition,
#define can specify macro functions like cpp(1) such that a
directive like #define mult(a,b) ((a)*(b)) would generate the
expected result wherever an expression of the form mult(a,b)
appears in the source. This can also be specified on the command line with
the -D option. The arguments of a macro function may be recursively
evaluated, unlike other #defines; the preprocessor will attempt to
re-evaluate any argument referencing another preprocessor definition up to
ten times before complaining.
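The directives above might be used together as in this sketch. BORDER is a hypothetical macro name; mult is the macro function from the description of #define. The #echo and #print lines only have a visible effect when an errorlog is requested with -e.

```asm
#define BORDER $d020
#define mult(a,b) ((a)*(b))
#echo building demo
#print mult(3,4)
        lda #mult(3,4)     ; operand expands to ((3)*(4)), that is 12
        sta BORDER
```

The parentheses in the macro body keep the expansion correct when an argument is itself an expression such as 1+2.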
The following directives are conditionals. If the conditional is not satisfied,
then the source code between the directive and its terminating
#endif
is expunged and not assembled. Up to fifteen levels of nesting are supported.
- #endif
- Closes a conditional block.
- #else
- Implements alternate path for a conditional block.
- #ifdef DEFINE
- True only if macro DEFINE is defined.
- #ifndef DEFINE
- The opposite; true only if macro DEFINE has not been previously
defined.
- #if expression
- True if expression expression evaluates to non-zero.
expression may reference other macros.
- #iflused label
- True if label label has been used (but not necessarily instantiated
with a value). This works on labels, not macros!
- #ifldef label
- True if label label is defined and assigned with a value.
This works on labels, not macros!
Unclosed conditional blocks at the end of included files generate warnings;
unclosed conditional blocks at the end of assembly generate an error.
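A minimal sketch of the macro conditionals, assuming two hypothetical macros: BORDER, given a default value only if it is not already defined, and DEBUG, which could be supplied on the command line with -DDEBUG=1.

```asm
#ifndef BORDER
#define BORDER $d020
#endif
#ifdef DEBUG
        inc BORDER
#else
        nop
#endif
```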
#iflused and
#ifldef are useful for building up a library based on
labels. For example, you might use something like this in your library's code:
- #iflused label
#ifldef label
#echo label already defined, library function label cannot be inserted
#else
label /* your code */
#endif
#endif
ENVIRONMENT
xa utilises the following environment variables, if they exist:
- XAINPUT
- Include file path; components should be separated by `,'.
- XAOUTPUT
- Output file path.
NOTES'N'BUGS
The R65C02 instructions
ina (often rendered
inc a) and
dea (
dec a) must be rendered as bare
inc and
dec instructions respectively.
Forward-defined labels -- that is, labels that are defined after the current
instruction is processed -- cannot be optimized into zero page instructions
even if the label does end up being defined as a zero page location, because
the assembler does not know the value of the label in advance during the first
pass when the length of an instruction is computed. On the second pass, a
warning will be issued when an instruction that could have been optimized
can't be because of this limitation. (Obviously, this does not apply to
branching or jumping instructions because they're not optimizable anyhow, and
those instructions that can
only take an 8-bit parameter will always be
cast to an 8-bit quantity.) If the label cannot otherwise be defined ahead
of the instruction, the backtick prefix
` may be used to force further
optimization no matter where the label is defined as long as the instruction
supports it. Indiscriminately forcing the issue can be fraught with peril,
however, and is not recommended; to discourage this, the assembler will
complain about its use in addressing mode situations where no ambiguity
exists, such as indirect indexed, branching and so on.
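The forward-reference behaviour described above can be sketched like this. The labels early and later are hypothetical, and the bytes in the comments are what the description predicts, not captured output.

```asm
early   = $10              ; defined before use
        lda early          ; known on the first pass, zero page form $a5 $10
        lda later          ; not yet known, absolute form $ad $20 $00 plus a warning
        lda `later         ; backtick forces the zero page form $a5 $20
later   = $20              ; defined after use
```

Where possible, moving the definition of later ahead of its first use avoids the need for the backtick altogether.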
Also, as a further consequence of the way optimization is managed, we repeat
that
all 24-bit quantities and labels that reference a 24-bit quantity
in 65816 mode, anteriorly declared or otherwise,
MUST be prepended with
the
@ prefix. Otherwise, the assembler will attempt to optimize to 16
bits, which may be undesirable.
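A short 65816 sketch, to be assembled with -w, showing the @ prefix together with the width pseudo-ops described under PSEUDO-OPS. The label far and its address are assumptions; the opcode bytes in the comments are the standard 65816 encodings.

```asm
far     = $012345          ; hypothetical 24-bit address
        rep #$20           ; clear the M flag so the accumulator is 16 bits wide
        .al                ; tell xa to assemble 16-bit accumulator immediates
        lda #$1234         ; $a9 $34 $12
        sep #$20           ; set the M flag again
        .as                ; back to 8-bit accumulator immediates
        lda @$012345       ; 24-bit operand, $af $45 $23 $01
        sta @far           ; the label also needs the prefix, $8f $45 $23 $01
```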
SEE ALSO
file65(1),
ldo65(1),
printcbm(1),
reloc65(1),
uncpk(1),
dxa(1)
AUTHOR
This manual page was written by David Weinehall <tao@acc.umu.se>, Andre
Fachat <fachat@web.de> and Cameron Kaiser <ckaiser@floodgap.com>.
Original xa package (C)1989-1997 Andre Fachat. Additional changes (C)1989-2009
Andre Fachat, Jolse Maginnis, David Weinehall, Cameron Kaiser. The official
maintainer is Cameron Kaiser.
WEBSITE
http://www.floodgap.com/retrotech/xa/