Scroll to navigation

STILTS-SQLSKYMATCH(1) Stilts commands STILTS-SQLSKYMATCH(1)

NAME

stilts-sqlskymatch - Crossmatches table on sky position against SQL table

SYNOPSIS

stilts sqlskymatch [ifmt=<in-format>] [istream=true|false] [in=<table>] [icmd=<cmds>] [ocmd=<cmds>] [omode=out|meta|stats|count|checksum|cgi|discard|topcat|samp|tosql|gui] [out=<out-table>] [ofmt=<out-format>] [ra=<expr>] [dec=<expr>] [sr=<expr/deg>] [find=best|all|each] [usefoot=true|false] [footnside=<int-value>] [copycols=<colid-list>] [scorecol=<col-name>] [erract=abort|ignore|retry|retry<n>] [ostream=true|false] [fixcols=none|dups|all] [suffix0=<label>] [suffix1=<label>] [db=<jdbc-url>] [user=<value>] [password=<value>] [dbtable=<table-name>] [dbra=<sql-col>] [dbdec=<sql-col>] [dbunit=deg|rad] [tiling=hpx<K>|healpixnest<K>|healpixring<K>|htm<K>] [dbtile=<sql-col>] [selectcols=<sql-cols>] [where=<sql-condition>] [preparesql=true|false]

DESCRIPTION

sqlskymatch resembles coneskymatch, but instead of sending an HTTP query to a remote cone search service for each match (i.e. each row of the input table), it executes an SQL query directly. The query is a SELECT statement with a WHERE clause which makes restrictions on Right Ascension and Declination columns; the names of these columns must be given as parameters. The effect is that of a spatial join between a client-side table and a table stored in the database.

This command can only be used if you have access to an SQL database via JDBC. The details of how to configure a JDBC connection to a database are discussed in SUN/256 - obviously you will need a database to connect to and appropriate read permissions on it as well as the relevant drivers.

Note: this task was known as sqlcone in its experimental form in STILTS v1.3.

OPTIONS

Specifies the format of the input table as specified by parameter in. The known formats are listed in SUN/256. This flag can be used if you know what format your table is in. If it has the special value (auto) (the default), then an attempt will be made to detect the format of the table automatically. This cannot always be done correctly however, in which case the program will exit with an error explaining which formats were attempted. This parameter is ignored for scheme-specified tables.

If set true, the input table specified by the in parameter will be read as a stream. It is necessary to give the ifmt parameter in this case. Depending on the required operations and processing mode, this may cause the read to fail (sometimes it is necessary to read the table more than once). It is not normally necessary to set this flag; in most cases the data will be streamed automatically if that is the best thing to do. However it can sometimes result in less resource usage when processing large files in certain formats (such as VOTable). This parameter is ignored for scheme-specified tables.

The location of the input table. This may take one of the following forms:

  • A filename.
  • A URL.
  • The special value "-", meaning standard input. In this case the input format must be given explicitly using the ifmt parameter. Note that not all formats can be streamed in this way.
  • A scheme specification of the form :<scheme-name>:<scheme-args>.
  • A system command line with either a "<" character at the start, or a "|" character at the end ("<syscmd" or "syscmd|"). This executes the given pipeline and reads from its standard output. This will probably only work on unix-like systems.

In any case, compressed data in one of the supported compression formats (gzip, Unix compress or bzip2) will be decompressed transparently.

Specifies processing to be performed on the input table as specified by parameter in, before any other processing has taken place. The value of this parameter is one or more of the filter commands described in SUN/256. If more than one is given, they must be separated by semicolon characters (";"). This parameter can be repeated multiple times on the same command line to build up a list of processing steps. The sequence of commands given in this way defines the processing pipeline which is performed on the table.

Commands may alteratively be supplied in an external file, by using the indirection character '@'. Thus a value of "@filename" causes the file filename to be read for a list of filter commands to execute. The commands in the file may be separated by newline characters and/or semicolons, and lines which are blank or which start with a '#' character are ignored.

Specifies processing to be performed on the output table, after all other processing has taken place. The value of this parameter is one or more of the filter commands described in SUN/256. If more than one is given, they must be separated by semicolon characters (";"). This parameter can be repeated multiple times on the same command line to build up a list of processing steps. The sequence of commands given in this way defines the processing pipeline which is performed on the table.

Commands may alteratively be supplied in an external file, by using the indirection character '@'. Thus a value of "@filename" causes the file filename to be read for a list of filter commands to execute. The commands in the file may be separated by newline characters and/or semicolons, and lines which are blank or which start with a '#' character are ignored.

The mode in which the result table will be output. The default mode is out, which means that the result will be written as a new table to disk or elsewhere, as determined by the out and ofmt parameters. However, there are other possibilities, which correspond to uses to which a table can be put other than outputting it, such as displaying metadata, calculating statistics, or populating a table in an SQL database. For some values of this parameter, additional parameters (<mode-args>) are required to determine the exact behaviour.

Possible values are

  • out
  • meta
  • stats
  • count
  • checksum
  • cgi
  • discard
  • topcat
  • samp
  • tosql
  • gui

Use the help=omode flag or see SUN/256 for more information.

The location of the output table. This is usually a filename to write to. If it is equal to the special value "-" (the default) the output table will be written to standard output.

This parameter must only be given if omode has its default value of "out".

Specifies the format in which the output table will be written (one of the ones in SUN/256 - matching is case-insensitive and you can use just the first few letters). If it has the special value "(auto)" (the default), then the output filename will be examined to try to guess what sort of file is required usually by looking at the extension. If it's not obvious from the filename what output format is intended, an error will result.

This parameter must only be given if omode has its default value of "out".

Right ascension in degrees in the coordinate system for the position of each row of the input table. This may simply be a column name, or it may be an algebraic expression calculated from columns as explained in SUN/256. If left blank, an attempt is made to guess from UCDs, column names and unit annotations what expression to use.

Declination in degrees in the coordinate system for the position of each row of the input table. This may simply be a column name, or it may be an algebraic expression calculated from columns as explained in SUN/256. If left blank, an attempt is made to guess from UCDs, column names and unit annotations what expression to use.

Expression which evaluates to the search radius in degrees for the request at each row of the input table. This will often be a constant numerical value, but may be the name or ID of a column in the input table, or a function involving one.

Determines which matches are retained.

  • best: Only the matching query table row closest to the input table row will be output. Input table rows with no matches will be omitted. (Note this corresponds to the best1 option in the pair matching commands, and best1 is a permitted alias).
  • all: All query table rows which match the input table row will be output. Input table rows with no matches will be omitted.
  • each: There will be one output table row for each input table row. If matches are found, the closest one from the query table will be output, and in the case of no matches, the query table columns will be blank.

Determines whether an attempt will be made to restrict searches in accordance with available footprint information. If this is set true, then before any of the per-row queries are performed, an attempt may be made to acquire footprint information about the servce. If such information can be obtained, then queries which fall outside the footprint, and hence which are known to yield no results, are skipped. This can speed up the search considerably.

Currently, the only footprints available are those provided by the CDS MOC (Multi-Order Coverage map) service, which covers VizieR and a few other cone search services.

Determines the HEALPix Nside parameter for use with the MOC footprint service. This tuning parameter determines the resolution of the footprint if available. Larger values give better resolution, hence a better chance of avoiding unnecessary queries, but processing them takes longer and retrieving and storing them is more expensive.

The value must be a power of 2, and at the time of writing, the MOC service will not supply footprints at resolutions greater than nside=512, so it should be <=512.

Only used if usefoot=true.

List of columns from the input table which are to be copied to the output table. Each column identified here will be prepended to the columns of the combined output table, and its value for each row taken from the input table row which provided the parameters of the query which produced it. See SUN/256 for list syntax. The default setting is "*", which means that all columns from the input table are included in the output.

Gives the name of a column in the output table to contain the distance between the requested central position and the actual position of the returned row. The distance returned is an angular distance in degrees. If a null value is chosen, no distance column will appear in the output table.

Determines what will happen if any of the individual cone search requests fails. By default the task aborts. That may be the best thing to do, but for unreliable or poorly implemented services you may find that some searches fail and others succeed so it can be best to continue operation in the face of a few failures. The options are:

  • abort: Failure of any query terminates the task.
  • ignore: Failure of a query is treated the same as a query which returns no rows.
  • retry: Failed queries are retried until they succeed; an increasing delay is introduced for each failure. Use with care - if the failure is for some good, or at least reproducible reason this could prevent the task from ever completing.
  • retry<n>: Failed queries are retried at most a fixed number <n> of times; an increasing delay is introduced for each failure. If failures persist the task terminates.

If set true, this will cause the operation to stream on output, so that the output table is built up as the results are obtained from the cone search service. The disadvantage of this is that some output modes and formats need multiple passes through the data to work, so depending on the output destination, the operation may fail if this is set. Use with care (or be prepared for the operation to fail).

Determines how input columns are renamed before use in the output table. The choices are:

  • none: columns are not renamed
  • dups: columns which would otherwise have duplicate names in the output will be renamed to indicate which table they came from
  • all: all columns will be renamed to indicate which table they came from

If columns are renamed, the new ones are determined by suffix* parameters.

If the fixcols parameter is set so that input columns are renamed for insertion into the output table, this parameter determines how the renaming is done. It gives a suffix which is appended to all renamed columns from the input table.

If the fixcols parameter is set so that input columns are renamed for insertion into the output table, this parameter determines how the renaming is done. It gives a suffix which is appended to all renamed columns from the cone result table.

URL which defines a connection to a database. This has the form jdbc:<subprotocol>:<subname> - the details are database- and driver-dependent. Consult Sun's JDBC documentation and that for the particular JDBC driver you are using for details. Note that the relevant driver class will need to be on your classpath and referenced in the jdbc.drivers system property as well for the connection to be made.

User name for logging in to SQL database. Defaults to the current username.

Password for logging in to SQL database.

The name of the table in the SQL database which provides the remote data.

The name of a column in the SQL database table dbtable which gives the right ascension. Units are given by dbunit.

The name of a column in the SQL database table dbtable which gives the declination. Units are given by dbunit.

Units of the right ascension and declination columns identified in the database table. May be either deg[rees] (the default) or rad[ians].

Describes the sky tiling scheme that is in use. One of the following values may be used:

  • hpxK: alias for healpixnestK
  • healpixnestK: HEALPix using the Nest scheme at order K
  • healpixringK: HEALPix using the Ring scheme at order K
  • htmK: Hierarchical Triangular Mesh at level K

So for instance hpx5 or healpixnest5 would both indicate the HEALPix NEST tiling scheme at order 5.

At level K, there are 12*4^K HEALPix pixels, or 8*4^K HTM pixels on the sky. More information about these tiling schemes can be found at the HEALPix and HTM web sites.

The name of a column in the SQL database table dbtable which contains a sky tiling pixel index. The tiling scheme is given by the tiling parameter. Use of a tiling column is optional, but if present (and if the column is indexed in the database table) it may serve to speed up searches. Set to null if the database table contains no tiling column or if you do not wish to use one.

An SQL expression for the list of columns to be selected from the table in the database. A value of "*" retrieves all columns.

An SQL expression further limiting the rows to be selected from the database. This will be combined with the constraints on position implied by the cone search centres and radii. The value of this parameter should just be a condition, it should not contain the WHERE keyword. A null value indicates no additional criteria.

If true, the JDBC connection will use PreparedStatements for the SQL SELECTs otherwise it will use simple Statements. This is a tuning parameter and affects only performance. On some database/driver combinations it's a lot faster set false (the default); on others it may be faster, who knows?

SEE ALSO

stilts(1)

If the package stilts-doc is installed, the full documentation SUN/256 is available in HTML format:
file:///usr/share/doc/stilts/sun256/index.html

VERSION

STILTS version 3.4.9-debian

This is the Debian version of Stilts, which lack the support of some file formats and network protocols. For differences see
file:///usr/share/doc/stilts/README.Debian

AUTHOR

Mark Taylor (Bristol University)

Mar 2017