NAME¶
SQL::Parser -- validate and parse SQL strings
SYNOPSIS¶
use SQL::Parser; # CREATE A PARSER OBJECT
my $parser = SQL::Parser->new();
$parser->feature( $class, $name, $value ); # SET OR FIND STATUS OF
my $has_feature = $parser->feature( $class, $name ); # A PARSER FEATURE
$parser->dialect( $dialect_name ); # SET OR FIND STATUS OF
my $current_dialect = $parser->dialect; # A PARSER DIALECT
DESCRIPTION¶
SQL::Parser is part of the SQL::Statement distribution and, most interaction
with the parser should be done through SQL::Statement. The methods shown above
create and modify a parser object. To use the parser object to parse SQL and
to examine the resulting structure, you should use SQL::Statement.
Important Note: Previously SQL::Parser had its own hash-based interface
for parsing, but that is now deprecated and will eventually be phased out in
favor of the object-oriented parsing interface of SQL::Statement. If you are
unable to transition some features to the new interface or have concerns about
the phase out, please contact me. See "The Parse Structure" for
details of the now-deprecated hash method if you still need them.
METHODS¶
new()¶
Create a new parser object
use SQL::Parser;
my $parser = SQL::Parser->new();
The
new() method creates a SQL::Parser object which can then be used to
parse and validate the syntax of SQL strings. It takes two optional parameters
- 1) the name of the SQL dialect that will define the syntax rules for the
parser and 2) a reference to a hash which can contain additional attributes of
the parser. If no dialect is specified, 'AnyData' is the default.
use SQL::Parser;
my $parser = SQL::Parser->new( $dialect_name, \%attrs );
The dialect_name parameter is a string containing any valid dialect such as
'ANSI', 'AnyData', or 'CSV'. See the section on the
dialect() method
below for details.
The "attrs" parameter is a reference to a hash that can contain error
settings for the PrintError and RaiseError attributes.
An example:
use SQL::Parser;
my $parser = SQL::Parser->new('AnyData', {RaiseError=>1} );
This creates a new parser that uses the grammar rules
contained in the .../SQL/Dialects/AnyData.pm file and which
sets the RaiseError attribute to true.
dialect()¶
$parser->dialect( $dialect_name ); # load a dialect configuration file
my $dialect = $parser->dialect; # get the name of the current dialect
For example:
$parser->dialect('AnyData'); # loads the AnyData config file
print $parser->dialect; # prints 'AnyData'
The $dialect_name parameter may be the name of any dialect configuration file on
your system. Use the $parser->list('dialects') method to see a list of
available dialects. At a minimum it will include "ANSI",
"CSV", and "AnyData". For backwards compatibility 'Ansi'
is accepted as a synonym for 'ANSI', otherwise the names are case sensitive.
Loading a new dialect configuration file erases all current parser features and
resets them to those defined in the configuration file.
feature()¶
Features define the rules to be used by a specific parser instance. They are
divided into the following classes:
* valid_commands
* valid_options
* valid_comparison_operators
* valid_data_types
* reserved_words
Within each class a feature name is either enabled or disabled. For example,
under "valid_data_types" the name "BLOB" may be either
disabled or enabled. If it is not enabled (either by being specifically
disabled, or simply by not being specified at all) then any SQL string using
"BLOB" as a data type will throw a syntax error "Invalid data
type: 'BLOB'".
The
feature() method allows you to enable, disable, or check the status
of any feature.
$parser->feature( $class, $name, 1 ); # enable a feature
$parser->feature( $class, $name, 0 ); # disable a feature
my $feature = $parser->feature( $class, $name ); # return status of a feature
For example:
$parser->feature('reserved_words','FOO',1); # make 'FOO' a reserved word
$parser->feature('valid_data_types','BLOB',0); # disallow 'BLOB' as a
# data type
# determine if the LIKE
# operator is supported
my $LIKE = $parser->feature('valid_operators','LIKE');
See the section below on "Backwards Compatibility" for use of the
feature() method with SQL::Statement 0.1x style parameters.
Supported SQL syntax¶
The SQL::Statement distribution can be used to either just parse SQL statements
or to execute them against actual data. A broader set of syntax is supported
in the parser than in the executor. For example the parser allows you to
specify column constraints like PRIMARY KEY. Currently, these are ignored by
the execution engine. Likewise syntax such as RESTRICT and CASCADE on DROP
statements or LOCAL GLOBAL TEMPORARY tables in CREATE are supported by the
parser but ignored by the executor.
To see the list of Supported SQL syntax formerly kept in this pod, see
SQL::Statement.
Subclassing SQL::Parser¶
In the event you need to either extend or modify SQL::Parser's default behavior,
the following methods may be overridden:
- "$self-">"get_btwn($string)"
- Processes the BETWEEN...AND... predicates; default converts to 2 range
predicates.
- "$self-">"get_in($string)"
- Process the IN (...list...) predicates; default converts to a series of
OR'd '=' predicate, or AND'd '<>' predicates for NOT IN.
- "$self-">"transform_syntax($string)"
- Abstract method; default simply returns the original string. Called after
repl_btwn() and repl_in(), but before any further predicate
processing is applied. Possible uses include converting other predicate
syntax not recognized by SQL::Parser into user-defined functions.
The parse structure¶
This section outlines the
now-deprecated hash interface to the parsed
structure. It is included
for backwards compatibility only. You should
use the SQL::Statement object interface to the structure instead. See
SQL::Statement.
Parse Structures
Here are some further examples of the data structures returned by the
structure() method after a call to
parse(). Only specific
details are shown for each SQL instance, not the entire structure.
parse()
Once a SQL::Parser object has been created with the
new() method, the
parse() method can be used to parse any number of SQL strings. It takes
a single required parameter -- a string containing a SQL command. The SQL
string may optionally be terminated by a semicolon. The
parse() method
returns a true value if the parse is successful and a false value if the parse
finds SQL syntax errors.
Examples:
1) my $success = $parser->parse('SELECT * FROM foo');
2) my $sql = 'SELECT * FROM foo';
my $success = $parser->parse( $sql );
3) my $success = $parser->parse(qq!
SELECT id,phrase
FROM foo
WHERE id < 7
AND phrase <> 'bar'
ORDER BY phrase;
!);
4) my $success = $parser->parse('SELECT * FRoOM foo ');
In examples #1,#2, and #3, the value of $success will be true because the
strings passed to the
parse() method are valid SQL strings.
In example #4, however, the value of $success will be false because the string
contains a SQL syntax error ('FRoOM' instead of 'FROM').
In addition to checking the return value of
parse() with a variable like
$success, you may use the PrintError and RaiseError attributes as you would in
a DBI script:
* If PrintError is true, then SQL syntax errors will be sent as
warnings to STDERR (i.e. to the screen or to a file if STDERR
has been redirected). This is set to true by default which
means that unless you specifically turn it off, all errors
will be reported.
* If RaiseError is true, then SQL syntax errors will cause the
script to die, (i.e. the script will terminate unless wrapped
in an eval). This is set to false by default which means
that unless you specifically turn it on, scripts will
continue to operate even if there are SQL syntax errors.
Basically, you should leave PrintError on or else you will not be warned when an
error occurs. If you are simply validating a series of strings, you will want
to leave RaiseError off so that the script can check all strings regardless of
whether some of them contain SQL errors. However, if you are going to try to
execute the SQL or need to depend that it is correct, you should set
RaiseError on so that the program will only continue to operate if all SQL
strings use correct syntax.
IMPORTANT NOTE #1: The
parse() method only checks syntax, it does NOT
verify if the objects listed actually exist. For example, given the string
"SELECT model FROM cars", the
parse() method will report that
the string contains valid SQL but that will not tell you whether there
actually is a table called "cars" or whether that table contains a
column called 'model'. Those kinds of verifications are performed by the
SQL::Statement module, not by SQL::Parser by itself.
IMPORTANT NOTE #2: The
parse() method uses rules as defined by the
selected dialect configuration file and the
feature() method. This
means that a statement that is valid in one dialect may not be valid in
another. For example the 'CSV' and 'AnyData' dialects define 'BLOB' as a valid
data type but the 'ANSI' dialect does not. Therefore the statement 'CREATE
TABLE foo (picture BLOB)' would be valid in the first two dialects but would
produce a syntax error in the 'ANSI' dialect.
structure()
After a SQL::Parser object has been created and the
parse() method used
to parse a SQL string, the
structure() method returns the data
structure of that string. This data structure may be passed on to other
modules (e.g. SQL::Statement) or it may be printed out using, for example, the
Data::Dumper module.
The data structure contains all of the information in the SQL string as parsed
into its various components. To take a simple example:
$parser->parse('SELECT make,model FROM cars');
use Data::Dumper;
print Dumper $parser->structure;
Would produce:
$VAR1 = {
'column_defs' => [
{ 'type' => 'column',
'value' => 'make', },
{ 'type' => 'column',
'value' => 'model', },
],
'command' => 'SELECT',
'table_names' => [
'cars'
]
};
'SELECT make,model, FROM cars'
command => 'SELECT',
table_names => [ 'cars' ],
column_names => [ 'make', 'model' ],
'CREATE TABLE cars ( id INTEGER, model VARCHAR(40) )'
column_defs => {
id => { data_type => INTEGER },
model => { data_type => VARCHAR(40) },
},
'SELECT DISTINCT make FROM cars'
set_quantifier => 'DISTINCT',
'SELECT MAX (model) FROM cars'
set_function => {
name => 'MAX',
arg => 'models',
},
'SELECT * FROM cars LIMIT 5,10'
limit_clause => {
offset => 5,
limit => 10,
},
'SELECT * FROM vars ORDER BY make, model DESC'
sort_spec_list => [
{ make => 'ASC' },
{ model => 'DESC' },
],
"INSERT INTO cars VALUES ( 7, 'Chevy', 'Impala' )"
values => [ 7, 'Chevy', 'Impala' ],
SUPPORT¶
You can find documentation for this module with the perldoc command.
perldoc SQL::Parser
perldoc SQL::Statement
You can also look for information at:
- •
- RT: CPAN's request tracker
http://rt.cpan.org/NoAuth/Bugs.html?Dist=SQL-Statement
<http://rt.cpan.org/NoAuth/Bugs.html?Dist=SQL-Statement>
- •
- AnnoCPAN: Annotated CPAN documentation
http://annocpan.org/dist/SQL-Statement
<http://annocpan.org/dist/SQL-Statement>
- •
- CPAN Ratings
http://cpanratings.perl.org/s/SQL-Statement
<http://cpanratings.perl.org/s/SQL-Statement>
- •
- Search CPAN
http://search.cpan.org/dist/SQL-Statement/
<http://search.cpan.org/dist/SQL-Statement/>
Where can I go for help?¶
For questions about installation or usage, please ask on the dbi-users@perl.org
mailing list or post a question on PerlMonks
(<
http://www.perlmonks.org/>, where Jeff is known as jZed). Jens does
not visit PerlMonks on a regular basis.
If you have a bug report, a patch or a suggestion, please open a new report
ticket at CPAN (but please check previous reports first in case your issue has
already been addressed). You can mail any of the module maintainers, but you
are more assured of an answer by posting to the dbi-users list or reporting
the issue in RT.
Report tickets should contain a detailed description of the bug or enhancement
request and at least an easily verifiable way of reproducing the issue or fix.
Patches are always welcome, too.
Where can I go for help with a concrete version?¶
Bugs and feature requests are accepted against the latest version only. To get
patches for earlier versions, you need to get an agreement with a developer of
your choice - who may or not report the the issue and a suggested fix upstream
(depends on the license you have chosen).
Business support and maintenance¶
For business support you can contact Jens via his CPAN email address
rehsackATcpan.org. Please keep in mind that business support is neither
available for free nor are you eligible to receive any support based on the
license distributed with this package.
AUTHOR & COPYRIGHT¶
This module is
copyright (c) 2001,2005 by Jeff Zucker and
copyright (c) 2008,2010 by Jens Rehsack.
All rights reserved.
The module may be freely distributed under the same terms as Perl itself using
either the "GPL License" or the "Artistic License" as
specified in the Perl README file.
Jeff can be reached at: jzuckerATcpan.org Jens can be reached at:
rehsackATcpan.org or via dbi-devATperl.org