NAME¶
Funfilters - Filtering Rows in a Table
SYNOPSIS¶
This document contains a summary of the user interface for filtering rows in
binary tables.
DESCRIPTION¶
Table filtering allows a program to select rows from an table (e.g., X\-ray
event list) by checking each row against one or more expressions involving the
columns in the table. When a table is filtered, only valid rows satisfying
these expressions are passed through for processing.
A filter expression is specified using bracket notation appended to the filename
of the data being processed:
foo.fits[pha==1&&pi==2]
It is also possible to put region specification inside a file and then pass the
filename in bracket notation:
foo.fits[@my.reg]
Filters must be placed after the extension and image section information, when
such information is present. The correct order is:
- •
- file[fileinfo,sectioninfo][filters]
- •
- file[fileinfo,sectioninfo,filters]
where:
- •
- file is the Funtools file name
- •
- fileinfo is an ARRAY, EVENT, FITS extension, or FITS index
- •
- sectioninfo is the image section to extract
- •
- filters are spatial region and table (row) filters to apply
See Funtools Files for more information on file and image section
specifications.
Filter Expressions
Table filtering can be performed on columns of data in a FITS binary table or a
raw event file. Table filtering is accomplished by means of
table filter
specifications. An table filter specification consists of one or more
filter expressions Filter specifications also can contain comments and
local/global processing directives.
More specifically, a filter specification consist of one or more lines
containing:
# comment until end of line
# include the following file in the table descriptor
@file
# each row expression can contain filters separated by operators
[filter_expression] BOOLOP [filter_expression2], ...
# each row expression can contain filters separated by the comma operator
[filter_expression1], [filter_expression2], ...
# the special row# keyword allows a range of rows to be processed
row#=m:n
# or a single row
row#=m
# regions are supported -- but are described elsewhere
[spatial_region_expression]
A single filter expression consists of an arithmetic, logical, or other
operations involving one or more column values from a table. Columns can be
compared to other columns, to header values, or to numeric constants. Standard
math functions can be applied to columns. Separate filter expressions can be
combined using boolean operators. Standard C semantics can be used when
constructing expressions, with the usual precedence and associativity rules
holding sway:
Operator Associativity
-------- -------------
() left to right
!! (logical not) right to left
! (bitwise not) - (unary minus) right to left
* / left to right
+ - left to right
< <= > >= left to right
== != left to right
& (bitwise and) left to right
^ (bitwise exclusive or) left to right
⎪ (bitwise inclusive or) left to right
&& (logical and) left to right
⎪⎪ (logical or) left to right
= right to left
For example, if energy and pha are columns in a table, then the following are
valid expressions:
pha>1
energy == pha
(pha>1) && (energy<=2)
max(pha,energy)>=2.5
Comparison values can be integers or floats. Integer comparison values can be
specified in decimal, octal (using '0' as prefix), hex (using '0x' as prefix)
or binary (using '0b' as prefix). Thus, the following all specify the same
comparison test of a status mask:
(status & 15) == 8 # decimal
(status & 017) == 010 # octal
(status & 0xf) == 0x8 # hex
(status & 0b1111) == 0b1000 # binary
The special keyword row# allows you to process a range of rows. When row# is
specified, the filter code skips to the designated row and only processes the
specified number of rows. The "*" character can be utilized as the
high limit value to denote processing of the remaining rows. Thus:
row#=100:109
processes 10 rows, starting with row 100 (counting from 1), while:
row#=100:*
specifies that all but the first 99 rows are to be processed.
Spatial region filtering allows a program to select regions of an image or rows
of a table (e.g., X\-ray events) using simple geometric shapes and boolean
combinations of shapes. For a complete description of regions, see Spatial
Region Filtering.
Separators Also Are Operators
As mentioned previously, multiple filter expressions can be specified in a
filter descriptor, separated by commas or new\-lines. When such a comma or
new-line separator is used, the boolean AND operator is automatically
generated in its place. Thus and expression such as:
pha==1,pi=2:4
is equivalent to:
(pha==1) && (pi>=2&&pi<=4)
[Note that the behavior of separators is different for filter expressions and
spatial region expressions. The former uses AND as the operator, while the
latter user OR. See Combining Region and Table Filters for more information
about these conventions and how they are treated when combined.]
Range Lists
Aside from the standard C syntax, filter expressions can make use of IRAF-style
range lists which specify a range of values. The syntax requires that
the column name be followed by an '=' sign, which is followed by one or more
comma-delimited range expressions of the form:
col = vv # col == vv in range
col = :vv # col <= vv in range
col = vv: # col >= vv in range
col = vv1:vv2 # vv1 <= col <= vv2 in range
The vv's above must be numeric constants; the right hand side of a range list
cannot contain a column name or header value.
Note that, unlike an ordinary comma separator, the comma separator used between
two or more range expressions denotes OR. Thus, when two or more range
expressions are combined with a comma separator, the resulting expression is a
shortcut for more complicated boolean logic. For example:
col = :3,6:8,10:
is equivalent to:
(col=6 && col =10)
Note also that the single-valued rangelist:
col = val
is equivalent to the C\-based filter expression:
col == val
assuming, of course, that val is a numeric constant.
Math Operations and Functions
It is permissible to specify C math functions as part of the filter syntax. When
the filter parser recognizes a function call, it automatically includes the
math.h and links in the C math library. Thus, it is possible to filter rows by
expressions such as these:
- •
- (pi+pha)>(2+log(pi)\-pha)
- •
- min(pi,pha)*14>x
- •
- max(pi,pha)==(pi+1)
- •
- feq(pi,pha)
- •
- div(pi,pha)>0
The function feq(a,b) returns true (1) if the difference between a and b (taken
as double precision values) is less than approximately 10E\-15. The function
div(a,b) divides a by b, but returns NaN (not a number) if b is 0. It is a
safe way to avoid floating point errors when dividing one column by another.
Include Files
The special
@filename directive specifies an include file containing
filter expressions. This file is processed as part of the overall filter
descriptor:
foo.fits[pha==1,@foo]
Header Parameters
The filter syntax supports comparison between a column value and a header
parameter value of a FITS binary tables (raw event files have no such header).
The header parameters can be taken from the binary table header or the primary
header. For example, assuming there is a header value MEAN_PHA in one of these
headers, you can select photons having exactly this value using:
- •
- pha==MEAN_PHA
Table filtering is more easily described by means of examples. Consider data
containing the following table structure:
- •
- double TIME
- •
- int X
- •
- int Y
- •
- short PI
- •
- short PHA
- •
- int DX
- •
- int DY
Tables can be filtered on these columns using IRAF/QPOE range syntax or any
valid C syntax. The following examples illustrate the possibilities:
- •
- pha=10
- •
- pha==10
select rows whose pha value is exactly 10
- •
- pha=10:50
select rows whose pha value is in the range of 10 to 50
- •
- pha=10:50,100
select rows whose pha value is in the range of 10 to 50 or is equal to
100
- •
- pha>=10 && pha<=50
select rows whose pha value is in the range of 10 to 50
- •
- pi=1,2&&pha>3
select rows whose pha value is 1 or 2 and whose pi value is 3
- •
- pi=1,2 ⎪⎪ pha>3
select rows whose pha value is 1 or 2 or whose pi value is 3
- •
- pha==pi+1
select rows whose pha value is 1 less than the pi value
- •
- (pha==pi+1) && (time>50000.0)
select rows whose pha value is 1 less than the pi value and whose time value
is greater than 50000
- •
- (pi+pha)>20
select rows in which the sum of the pi and pha values is greater than
20
- •
- pi%2==1
select rows in which the pi value is odd
Currently, integer range list limits cannot be specified in binary notation (use
decimal, hex, or octal instead). Please contact us if this is a problem.
SEE ALSO¶
See
funtools(7) for a list of Funtools help pages