NAME
xa - 6502/R65C02/65816 cross-assembler
SYNOPSIS
xa [OPTION]... FILE
DESCRIPTION
xa is a multi-pass cross-assembler for the 8-bit processors in the 6502
series (such as the 6502, 65C02, 6504, 6507, 6510, 7501, 8500, 8501 and 8502),
the Rockwell R65C02, and the 16-bit 65816 processor. For a description of
syntax, see
ASSEMBLER SYNTAX further in this manual page.
OPTIONS
- -v
- Verbose output.
- -x
- Use old filename behaviour (overrides -o, -e
and -l). This option is now deprecated.
- -C
- No CMOS opcodes (default is to allow R65C02 opcodes)
- -W
- No 65816 opcodes (default).
- -w
- Allow 65816 opcodes.
- -B
- Show lines with block open/close (see
PSEUDO-OPS).
- -c
- Produce o65 object files instead of executable files (no
linking performed); files may contain undefined references.
- -o filename
- Set output filename. The default is a.o65; use the
special filename - to output to standard output.
- -e filename
- Set errorlog filename, default is none.
- -l filename
- Set labellist filename, default is none. This is the symbol
table and can be used by disassemblers such as dxa(1) to
reconstruct source.
- -r
- Add cross-reference list to labellist (requires
-l).
- -M
- Allow colons to appear in comments; for MASM compatibility.
This does not affect colon interpretation elsewhere.
- -R
- Start assembler in relocating mode.
- -Llabel
- Defines label as an absolute (but undefined) label
even when linking.
- -b? addr
- Set segment base for segment ? to address
addr. ? should be t, d, b or z for text, data, bss or zero
segments, respectively.
- -A addr
- Make text segment start at an address such that when the
file starts at address addr, relocation is not necessary. Overrides
-bt; other segments still have to be taken care of with -b.
- -G
- Suppress list of exported globals.
- -DDEF=TEXT
- Define a preprocessor macro on the command line (see
PREPROCESSOR).
- -I dir
- Add directory dir to the include path (before
XAINPUT; see ENVIRONMENT).
- -O charset
- Define the output charset for character strings. Currently
supported are ASCII (default), PETSCII (Commodore ASCII), PETSCREEN
(Commodore screen codes) and HIGH (set high bit on all characters).
- -p?
- Set the alternative preprocessor character to ?.
This is useful when you wish to use cpp(1) and the built-in
preprocessor at the same time (see PREPROCESSOR). Characters may
need to be quoted for your shell (example: -p'~' ).
- --help
- Show summary of options.
- --version
- Show version of program.
ASSEMBLER SYNTAX
An introduction to 6502 assembly language programming and mnemonics is beyond
the scope of this manual page. We invite you to investigate any number of the
excellent books on the subject; one useful title is "Machine Language For
Beginners" by Richard Mansfield (COMPUTE!), which covers the Atari, Commodore
and Apple 8-bit systems and is widely available on the used market.
xa supports both the standard NMOS 6502 opcodes and the Rockwell
CMOS opcodes used in the 65C02 (R65C02). With the
-w option,
xa
will also accept opcodes for the 65816. NMOS 6502 undocumented opcodes are
intentionally not supported, and should be entered manually using the
.byt pseudo-op (see PSEUDO-OPS). Because these undocumented instructions
conflict with the R65C02 and 65816 instruction sets, their use is
discouraged.
In general,
xa accepts the more-or-less standard 6502 assembler format as
popularised by MASM and TurboAssembler. Values and addresses can be expressed
either as literals, or as expressions; to wit,
- 123
- decimal value
- $234
- hexadecimal value
- &123
- octal
- %010110
- binary
- *
- current value of the program counter
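As an illustration of the literal notations listed above, the following Python sketch converts each form to an integer. It is not part of xa; the function name parse_literal is hypothetical, and the program counter symbol * is deliberately not handled.

```python
def parse_literal(text: str) -> int:
    """Convert a numeric literal in the notation listed above to an int."""
    prefixes = {"$": 16, "&": 8, "%": 2}  # hexadecimal, octal, binary
    base = prefixes.get(text[0])
    if base is None:
        return int(text, 10)  # no prefix: plain decimal
    return int(text[1:], base)


for literal in ("123", "$234", "&123", "%010110"):
    print(literal, "=", parse_literal(literal))
```

In decimal, $234 is 564, &123 is 83 and %010110 is 22.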
The ASCII value of any quoted character is inserted directly into the program
text (example:
"A" inserts the byte "A" into the
output stream); see also the
PSEUDO-OPS section. This is affected by
the currently selected character set, if any.
Labels define locations within the program text, just as in other
multi-pass assemblers. A label is defined by anything that is not an opcode;
for example, a line such as
- label1 lda #0
defines
label1 to be the current location of the program counter (thus
the address of the
LDA opcode). A label can be explicitly defined by
assigning it the value of an expression, such as
- label2 = $d000
which defines
label2 to be the address $d000, namely, the start of the
VIC-II register block on Commodore 64 computers. The program counter
*
is considered to be a special kind of label, and can be assigned to with
statements such as
- * = $c000
which sets the program counter to decimal location 49152. With the exception of
the program counter, labels cannot be assigned multiple times. To explicitly
declare redefinition of a label, place a - (dash) before it, e.g.,
- -label2 = $d020
which sets
label2 to the Commodore 64 border colour register. The scope
of a label is affected by the block it resides within (see
PSEUDO-OPS
for block instructions). A label may also be hard-specified with the
-L
command line option.
For those instructions where the accumulator is the implied argument (such as
asl and
lsr;
inc and
dec on R65C02; etc.), the
idiom of explicitly specifying the accumulator with
a is unnecessary as
the proper form will be selected if there is no explicit argument. In fact,
for consistency with label handling, if there is a label named
a, this
will actually generate code referencing that label as a memory location and
not the accumulator. Otherwise, the assembler will complain.
Labels and opcodes may take
expressions as their arguments to allow
computed values, and may themselves reference other labels and/or the program
counter. An expression such as
lab1+1 (which operates on the current
value of label
lab1 and increments it by one) may use the following
operators, given from highest to lowest priority:
- *
- multiplication (priority 10)
- /
- integer division (priority 10)
- +
- addition (priority 9)
- -
- subtraction (9)
- <<
- shift left (8)
- >>
- shift right (8)
- >= =>
- greater than or equal to (7)
- >
- greater than (7)
- <= =<
- less than or equal to (7)
- <
- less than (7)
- =
- equal to (6)
- <> ><
- does not equal (6)
- &
- bitwise AND (5)
- ^
- bitwise XOR (4)
- |
- bitwise OR (3)
- &&
- logical AND (2)
- ||
- logical OR (1)
Parentheses are valid. When redefining a label, combining an arithmetic or
bitwise operator with the = (equals) operator, as in +=, is also valid,
e.g.,
- -redeflabel += (label12/4)
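To make the priority table concrete, here is a minimal Python sketch of a precedence-climbing evaluator that follows the listed priorities. It is an illustration only, not xa's actual parser. Several points are assumptions: operators of equal priority are taken to associate left to right, comparisons and logical operators are taken to yield 1 for true and 0 for false, and operands are limited to decimal numbers, $hex numbers and labels supplied in a dictionary. The names evaluate, OPERATORS and TOKEN are hypothetical.

```python
import re

# (symbols, priority, function) for each binary operator in the table.
OPERATORS = [
    (("*",), 10, lambda a, b: a * b),
    (("/",), 10, lambda a, b: a // b),
    (("+",), 9, lambda a, b: a + b),
    (("-",), 9, lambda a, b: a - b),
    (("<<",), 8, lambda a, b: a << b),
    ((">>",), 8, lambda a, b: a >> b),
    ((">=", "=>"), 7, lambda a, b: int(a >= b)),
    ((">",), 7, lambda a, b: int(a > b)),
    (("<=", "=<"), 7, lambda a, b: int(a <= b)),
    (("<",), 7, lambda a, b: int(a < b)),
    (("=",), 6, lambda a, b: int(a == b)),
    (("<>", "><"), 6, lambda a, b: int(a != b)),
    (("&",), 5, lambda a, b: a & b),
    (("^",), 4, lambda a, b: a ^ b),
    (("|",), 3, lambda a, b: a | b),
    (("&&",), 2, lambda a, b: int(bool(a) and bool(b))),
    (("||",), 1, lambda a, b: int(bool(a) or bool(b))),
]
PRIORITY = {s: p for symbols, p, _ in OPERATORS for s in symbols}
APPLY = {s: f for symbols, _, f in OPERATORS for s in symbols}

# Two-character operators are listed before their one-character prefixes.
TOKEN = re.compile(
    r"\$[0-9a-fA-F]+|\d+|[A-Za-z_]\w*"
    r"|<<|>>|>=|=>|<=|=<|<>|><|&&|\|\||[-+*/<>=&^|()]"
)


def evaluate(text, labels=None):
    """Evaluate an expression using the priorities from the table."""
    labels = labels or {}
    tokens = TOKEN.findall(text)  # whitespace is skipped silently
    pos = 0

    def operand():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            value = expression(1)
            pos += 1  # skip the closing ")"
            return value
        if tok.startswith("$"):
            return int(tok[1:], 16)
        if tok.isdigit():
            return int(tok)
        return labels[tok]

    def expression(min_priority):
        nonlocal pos
        left = operand()
        while pos < len(tokens) and PRIORITY.get(tokens[pos], 0) >= min_priority:
            op = tokens[pos]
            pos += 1
            right = expression(PRIORITY[op] + 1)  # assumed left-associative
            left = APPLY[op](left, right)
        return left

    return expression(1)
```

With this sketch, evaluate("1+2*3") gives 7 because multiplication binds tighter than addition, while evaluate("(1+2)*3") gives 9.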
Normally,
xa attempts to ascertain the value of the operand and (when
referring to a memory location) use zero page, 16-bit or (for 65816) 24-bit
addressing where appropriate and where supported by the particular opcode.
This generates smaller and faster code, and is almost always preferable.
Nevertheless, you can use these prefix operators to force a particular rendering
of the operand. Those that generate an eight bit result can also be used in
8-bit addressing modes, such as immediate and zero page.
- <
- low byte of expression, e.g., lda #<vector
- >
- high byte of expression
- !
- in situations where the expression could be understood as
either an absolute or zero page value, do not attempt to optimize to a
zero page argument for those opcodes that support it (i.e., keep as 16 bit
word)
- @
- render as 24-bit quantity for 65816 (must specify -w
command-line option). This is required to specify any 24-bit
quantity!
- `
- force further optimization, even if the length of the
instruction cannot be reliably determined (see NOTES'N'BUGS)
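The choice between zero page and 16-bit addressing described above, and the effect of the ! prefix, can be sketched in Python as follows. This is a simplified model for the lda opcode only, not xa's implementation; assemble_lda and its force_absolute parameter (standing in for the ! prefix) are hypothetical names.

```python
LDA_ZERO_PAGE = 0xA5  # lda with a one-byte zero page address
LDA_ABSOLUTE = 0xAD   # lda with a two-byte absolute address


def assemble_lda(address, force_absolute=False):
    """Return the bytes for lda with a zero page or absolute operand."""
    if not 0 <= address <= 0xFFFF:
        raise ValueError("address does not fit in 16 bits")
    if address < 0x100 and not force_absolute:
        # Short form: the operand fits in zero page.
        return bytes([LDA_ZERO_PAGE, address])
    # Long form: low byte first, then high byte.
    return bytes([LDA_ABSOLUTE, address & 0xFF, address >> 8])
```

Here assemble_lda(0x20) yields the two-byte zero page form, while assemble_lda(0x20, force_absolute=True) keeps the operand as a 16-bit word, as ! would.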
Expressions can occur as arguments to opcodes or within the preprocessor (see
PREPROCESSOR for syntax). For example,
- lda label2+1
takes the value at
label2+1 (using our previous label's value, this would
be $d021), and will be assembled as
$ad $21 $d0 to disk. Similarly,
- lda #<label2
will take the lowest 8 bits of
label2 (i.e., $20), and assign them to the
accumulator (assembling the instruction as
$a9 $20 to disk).
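The arithmetic behind the < and > prefixes in the example above can be checked with a short Python sketch. It assumes label2 holds $d020 as in the preceding text; low_byte and high_byte are hypothetical helper names, not xa functions.

```python
LDA_IMMEDIATE = 0xA9  # opcode for lda #value


def low_byte(value):
    """Equivalent of the < prefix: lowest 8 bits."""
    return value & 0xFF


def high_byte(value):
    """Equivalent of the > prefix: bits 8 to 15."""
    return (value >> 8) & 0xFF


label2 = 0xD020
lda_low = bytes([LDA_IMMEDIATE, low_byte(label2)])    # lda #<label2
lda_high = bytes([LDA_IMMEDIATE, high_byte(label2)])  # lda #>label2
print(lda_low.hex(" "), "/", lda_high.hex(" "))       # a9 20 / a9 d0
```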
Comments are specified with a semicolon (;), such as
- ;this is a comment
They can also be specified in the C language style, using
/* */ and
// which are understood at the
PREPROCESSOR level (q.v.).
Normally, the colon (:) separates statements, such as
- label4 lda #0:sta $d020
or
- label2: lda #2
(note the use of a colon for specifying a label, similar to some other
assemblers, which
xa also understands with or without the colon). This
also applies to semicolon comments, such that
- ; a comment:lda #0
is understood as a comment followed by an opcode. To defeat this, use the
-M command line option to allow colons within comments. This does not
apply to
/* */ and
// comments, which are dealt with at the
preprocessor level (q.v.).
PSEUDO-OPS
Pseudo-ops are false opcodes used by the assembler to denote meta- or
inlined commands. Like most assemblers,
xa has a rich set.
- .byt value1,value2,value3,...
- Specifies a string of bytes to be directly placed into the
assembled object. The arguments may be expressions. Any number of bytes
can be specified.
- .asc "text1" ,"text2",...
- Specifies a character string which will be inserted into
the assembled object. Strings are understood according to the currently
specified character set; for example, if ASCII is specified, they will be
rendered as ASCII, and if PETSCII is specified, they will be translated
into the Commodore ASCII equivalent. Other non-standard ASCIIs
such as ATASCII for Atari computers should use the ASCII equivalent
characters; graphic and control characters should be specified explicitly
using .byt for the precise character you want. Note that when
specifying the argument of an opcode, .asc is not necessary; the
quoted character can simply be inserted (e.g., lda #"A"
), and is also affected by the current character set. Any number of
character strings can be specified.
.byt and
.asc are synonymous, so you can mix things such as
.byt $43, 22, "a character string" and get the expected
result. The string is subject to the current character set, but the remaining
bytes are inserted without modification.
- .aasc "text1" ,"text2",...
- Specifies a character string that is always rendered
in true ASCII regardless of the current character set. Like .asc,
it is synonymous with .byt.
- .word value1,value2,value3...
- Specifies a string of 16-bit words to be placed into the
assembled object in 6502 little-endian format (that is,
low-byte/high-byte). The arguments may be expressions. Any number of words
can be specified.
- .dsb length,fillbyte
- Specifies a data block; a total of length
repetitions of fillbyte will be inserted into the assembled object.
For example, .dsb 5,$10 will insert five bytes, each being 16
decimal, into the object. The arguments may be expressions.
- .bin offset,length,"filename"
- Inlines the binary file specified by filename, without further
interpretation, starting at offset offset and continuing for length
bytes. This allows you to insert data such as a previously
assembled object file or an image or other binary data structure, inlined
directly into this file's object. If length is zero, then the
length of filename, minus the offset, is used instead. The
arguments may be expressions.
- .(
- Opens a new block for scoping. Within a block, all labels
defined are local to that block and any sub-blocks, and go out of scope as
soon as the enclosing block is closed (i.e., lexically scoped). All labels
defined outside of the block are still visible within it. To explicitly
declare a global label within a block, precede the label with + or
precede it with & to declare it within the previous level only
(or globally if you are only one level deep). Sixteen levels of scoping
are permitted.
- .)
- Closes a block.
- .as .al .xs .xl
- Only relevant in 65816 mode (with the -w option
specified). These pseudo-ops set what size accumulator and X/Y-register
should be used for future instructions; .as and .xs set
8-bit operands for the accumulator and index registers, respectively, and
.al and .xl set 16-bit operands. These pseudo-ops on purpose
do not automatically issue sep and rep instructions to set
the specified width in the CPU; set the processor bits as you need, or
consider constructing a macro. .al and .xl generate errors
if -w is not specified.
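The byte layout produced by the data pseudo-ops .byt, .word and .dsb described earlier in this section can be modelled with the Python sketch below. It only illustrates the documented layout (raw bytes, little-endian words, repeated fill bytes); the function names are hypothetical, and how xa itself treats out-of-range values is not modelled.

```python
import struct


def emit_byt(*values):
    """Model of .byt: each value becomes one byte (0 to 255)."""
    return bytes(values)


def emit_word(*values):
    """Model of .word: each value becomes low byte, then high byte."""
    return b"".join(struct.pack("<H", v) for v in values)


def emit_dsb(length, fillbyte):
    """Model of .dsb: length repetitions of fillbyte."""
    return bytes([fillbyte]) * length


print(emit_word(0xD020).hex(" "))   # 20 d0
print(emit_dsb(5, 0x10).hex(" "))   # 10 10 10 10 10
```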
The following pseudo-ops apply primarily to relocatable .o65 objects. A full
discussion of the relocatable format is beyond the scope of this manpage, as
it is currently a format in flux. Documentation on the proposed v1.2 format is
in
doc/fileformat.txt within the
xa installation directory.
- .text .data .bss .zero
- These pseudo-ops switch between the different segments,
.text being the actual code section, .data being the data segment, .bss
being uninitialized label space for allocation and .zero being
uninitialized zero page space for allocation. In .bss and .zero, only
labels are evaluated. These pseudo-ops are valid in relative and absolute
modes.
- .align value
- Aligns the current segment to a byte boundary (2, 4 or 256)
as specified by value (and places it in the header when relative
mode is enabled). Other values generate an error.
- .fopt type,value1,value2,value3,...
- Acts like .byt/.asc except that the values are
embedded into the object file as file options. The argument type is
used to specify the file option being referenced. A table of these options
is in the relocatable o65 file format description. The remainder of the
options are interpreted as values to insert. Any number of values may be
specified, and may also be strings.
PREPROCESSOR
xa implements a preprocessor very similar to that of the C-language
preprocessor
cpp(1) and many oddiments apply to both. For example, as
in C, the use of
/* */ for comment delimiters is also supported in
xa, and so are comments using the double slash
//. The
preprocessor also supports continuation lines, i.e., lines ending with a
backslash (\); the following line is then appended to it as if there were no
dividing newline. This too is handled at the preprocessor level.
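The continuation-line rule just described can be sketched in a few lines of Python. This is an illustration of the rule as stated, not xa's preprocessor code; join_continuations is a hypothetical name, and the input lines are assumed to have had their newline characters already removed.

```python
def join_continuations(lines):
    """Append each line that follows a trailing backslash to its predecessor."""
    joined = []
    pending = ""
    for line in lines:
        if line.endswith("\\"):
            pending += line[:-1]  # drop the backslash, keep accumulating
        else:
            joined.append(pending + line)
            pending = ""
    if pending:
        joined.append(pending)  # file ended on a continuation line
    return joined
```

For example, the two source lines `#define SUM 1 + \` and `2` come out as the single line `#define SUM 1 + 2`.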
For reasons of memory and complexity, the full breadth of the cpp(1)
syntax is not supported. In particular, macro definitions may not be
forward-defined (i.e., a macro definition can only reference a previously
defined macro definition), except for macro functions, where recursive
evaluation is supported; e.g., to
#define WW AA ,
AA must have
already been defined. Certain other directives are not supported, nor are most
standard pre-defined macros, and there are other limits on evaluation and line
length. Because the maintainers of
xa recognize that some files will
require more complicated preparsing than the built-in preprocessor can supply,
the preprocessor will accept
cpp(1)-style line/filename/flags output.
When these lines are seen in the input file,
xa will treat them as
cc would, except that flags are ignored.
xa does not accept
files on standard input for parsing reasons, so you should dump your
cpp(1) output to an intermediate temporary file, such as
- cc -E test.s > test.xa
xa test.xa
No special arguments need to be passed to
xa; the presence of
cpp(1) output is detected automatically.
Note that passing your file through
cpp(1) may interfere with
xa's
own preprocessor directives. In this case, to mask directives from
cpp(1), use the
-p option to specify an alternative character
instead of
#, such as the tilde (e.g.,
-p'~' ). With this option
and argument specified, then instead of
#include, for example, you can
also use
~include, in addition to
#include (which will also
still be accepted by the
xa preprocessor, assuming any survive
cpp(1)). Any character can be used, although frankly pathologic choices
may lead to amusing and frustrating glitches during parsing. You can also use
this option to defer preprocessor directives that
cpp(1) may interpret
too early until the file actually gets to
xa itself for processing.
The following preprocessor directives are supported.
- #include "filename"
- Inserts the contents of file filename at this
position. If the file is not found, it is searched for using the paths specified
by the -I command line option or the environment variable
XAINPUT (q.v.). When inserted, the file will also be parsed for
preprocessor directives.
- #echo comment
- Inserts comment comment into the errorlog file,
specified with the -e command line option.
- #print expression
- Computes the value of expression expression and
prints it into the errorlog file.
- #define DEFINE text
- Equates macro DEFINE with text text such that
wherever DEFINE appears in the assembly source, text is
substituted in its place (just like cpp(1) would do). In addition,
#define can specify macro functions like cpp(1) such that a
directive like #define mult(a,b) ((a)*(b)) would generate the
expected result wherever an expression of the form mult(a,b)
appears in the source. This can also be specified on the command line with
the -D option. The arguments of a macro function may be recursively
evaluated, unlike other #defines; the preprocessor will attempt to
re-evaluate any argument referencing another preprocessor definition up to
ten times before complaining.
The following directives are conditionals. If the conditional is not satisfied,
then the source code between the directive and its terminating #endif is
expunged and not assembled. Up to fifteen levels of nesting are supported.
- #endif
- Closes a conditional block.
- #else
- Implements alternate path for a conditional block.
- #ifdef DEFINE
- True only if macro DEFINE is defined.
- #ifndef DEFINE
- The opposite; true only if macro DEFINE has not been
previously defined.
- #if expression
- True if expression expression evaluates to non-zero.
expression may reference other macros.
- #iflused label
- True if label label has been used (but not
necessarily instantiated with a value). This works on labels, not
macros!
- #ifldef label
- True if label label is defined and assigned
with a value. This works on labels, not macros!
Unclosed conditional blocks at the end of included files generate warnings;
unclosed conditional blocks at the end of assembly generate an error.
#iflused and
#ifldef are useful for building up a library based on
labels. For example, you might use something like this in your library's code:
- #iflused label
#ifldef label
#echo label already defined, library function label cannot be inserted
#else
label /* your code */
#endif
#endif
ENVIRONMENT
xa utilises the following environment variables, if they exist:
- XAINPUT
- Include file path; components should be separated by
`,'.
- XAOUTPUT
- Output file path.
NOTES'N'BUGS
The R65C02 instructions ina (often rendered inc a) and dea (dec a) must be
rendered as bare inc and dec instructions respectively.
Forward-defined labels -- that is, labels that are defined after the current
instruction is processed -- cannot be optimized into zero page instructions
even if the label does end up being defined as a zero page location, because
the assembler does not know the value of the label in advance during the first
pass when the length of an instruction is computed. On the second pass, a
warning will be issued when an instruction that could have been optimized
can't be because of this limitation. (Obviously, this does not apply to
branching or jumping instructions because they're not optimizable anyhow, and
those instructions that can only take an 8-bit parameter will always be
cast to an 8-bit quantity.) If the label cannot otherwise be defined ahead
of the instruction, the backtick prefix
` may be used to force further
optimization no matter where the label is defined as long as the instruction
supports it. Indiscriminately forcing the issue can be fraught with peril,
however, and is not recommended; to discourage this, the assembler will
complain about its use in addressing mode situations where no ambiguity
exists, such as indirect indexed, branching and so on.
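The forward-label limitation above can be pictured with a small Python model of a first pass that fixes instruction lengths as it goes. It is a deliberate simplification and not xa's algorithm: the program is a list of hypothetical tuples, every instruction is an lda of a label, and a label is considered known only if it was defined earlier in the list.

```python
def first_pass_sizes(program):
    """Return the length chosen for each lda in a single forward pass.

    program items are ("label", name, value) or ("lda", name).
    """
    known = {}
    sizes = []
    for item in program:
        if item[0] == "label":
            known[item[1]] = item[2]
        else:
            value = known.get(item[1])  # None if the label comes later
            zero_page = value is not None and value < 0x100
            sizes.append(2 if zero_page else 3)
    return sizes


program = [
    ("label", "early", 0x10),
    ("lda", "early"),          # value known: zero page form, 2 bytes
    ("lda", "late"),           # value unknown: 16-bit form, 3 bytes
    ("label", "late", 0x20),
]
print(first_pass_sizes(program))  # [2, 3]
```

Both labels are zero page locations, yet only the one defined before its use gets the short form; in this model the length is already fixed by the time the later definition is seen.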
Also, as a further consequence of the way optimization is managed, we repeat
that
all 24-bit quantities and labels that reference a 24-bit quantity
in 65816 mode, anteriorly declared or otherwise,
MUST be prepended with
the
@ prefix. Otherwise, the assembler will attempt to optimize to 16
bits, which may be undesirable.
SEE ALSO
file65(1),
ldo65(1),
printcbm(1),
reloc65(1),
uncpk(1),
dxa(1)
AUTHOR
This manual page was written by David Weinehall <tao@acc.umu.se>, Andre
Fachat <fachat@web.de> and Cameron Kaiser <ckaiser@floodgap.com>.
Original xa package (C)1989-1997 Andre Fachat. Additional changes (C)1989-2009
Andre Fachat, Jolse Maginnis, David Weinehall, Cameron Kaiser. The official
maintainer is Cameron Kaiser.
WEBSITE
http://www.floodgap.com/retrotech/xa/