SELECT <field-list> FROM <table_def> [LEFT JOIN <table_def> ON [<table_ref>.]<key_field> = [<table_ref>.].<key_field>]* [WHERE <where-expr>] [ORDER BY <sort specification list>] <field-list> ::= <column-spec> [ { , <column-spec> }... ] <column-spec> ::= <field-spec> [ <as clause> ] | CAST ( <field-spec> AS <data type> ) [ <as clause> ] <field-spec> ::= [DISTINCT] <field_ref> | <field_func> ( [DISTINCT] <field-ref> ) | Count(*) <as clause> ::= [ AS ] <column_name> <data type> ::= character [ ( field_length ) ] | float [ ( field_length ) ] | numeric [ ( field_length [, field_precision ] ) ] | integer [ ( field_length ) ] | date [ ( field_length ) ] | time [ ( field_length ) ] | timestamp [ ( field_length ) ] <field-func> ::= AVG | MAX | MIN | SUM | COUNT <field_ref> ::= [<table_ref>.]field_name <sort specification list> ::= <sort specification> [ { <comma> <sort specification> }... ] <sort specification> ::= <sort key> [ <ordering specification> ] <sort key> ::= <field_ref> <ordering specification> ::= ASC | DESC <table_def> ::= ['<datasource name>'.]table_name [table_alias] <table_ref> ::= table_name | table_alias
SELECT * FROM polylayer
In this case all features are fetched from the layer named "polylayer", and all attributes of those features are returned. This is essentially equivelent to accessing the layer directly. In this example the "*" is the list of fields to fetch from the layer, with "*" meaning that all fields should be fetched.
This slightly more sophisticated form still pulls all features from the layer but the schema will only contain the EAS_ID and PROP_VALUE attributes. Any other attributes would be discarded.
SELECT eas_id, prop_value FROM polylayer
A much more ambitious SELECT, restricting the features fetched with a WHERE clause, and sorting the results might look like:
SELECT * from polylayer WHERE prop_value > 220000.0 ORDER BY prop_value DESC
This select statement will produce a table with just one feature, with one attribute (named something like "count_eas_id") containing the number of distinct values of the eas_id attribute.
SELECT COUNT(DISTINCT eas_id) FROM polylayer
A special form of the field list uses the DISTINCT keyword. This returns a list of all the distinct values of the named attribute. When the DISTINCT keyword is used, only one attribute may appear in the field list. The DISTINCT keyword may be used against any type of field. Currently the distinctness test against a string value is case insensitive in OGR SQL. The result of a SELECT with a DISTINCT keyword is a layer with one column (named the same as the field operated on), and one feature per distinct value. Geometries are discarded. The distinct values are assembled in memory, so alot of memory may be used for datasets with a large number of distinct values.
SELECT DISTINCT areacode FROM polylayer
There are also several summarization operators that may be applied to columns. When a summarization operator is applied to any field, then all fields must have summarization operators applied. The summarization operators are COUNT (a count of instances), AVG (numerical average), SUM (numericla sum), MIN (lexical or numerical minimum), and MAX (lexical or numerical maximum). This example produces a variety of sumarization information on parcel property values:
SELECT MIN(prop_value), MAX(prop_value), AVG(prop_value), SUM(prop_value), COUNT(prop_value) FROM polylayer WHERE prov_name = "Ontario"
As a special case, the COUNT() operator can be given a "*" argument instead of a field name which is a short form for count all the records though it would get the same result as giving it any of the column names. It is also possible to apply the COUNT() operator to a DISTINCT SELECT to get a count of distinct values, for instance:
SELECT COUNT(DISTINCT areacode) FROM polylayer
Field names can also be prefixed by a table name though this is only really meaningful when performing joins. It is further demonstrated in the JOIN section.
SELECT select *, OGR_STYLE AS 'STYLE' FROM polylayer
The field name alias can be used as the last operation in the column specification. Therefore we cannot rename the fields inside an operator, but we can rename whole column expression, like:
SELECT COUNT(areacode) AS 'count' FROM polylayer
We can optionally omit the AS keyword in the field name aliases, like:
SELECT *, OGR_STYLE 'STYLE' FROM polylayer
SELECT *, CAST(OGR_STYLE AS character(255)) FROM rivers
Currently casting to the following target types are supported:
Specifying the field_length and/or the field_precision is optional. Conversion to the 'integer list', 'double list' and 'string list' OGR data types are not supported, which doesn't conform to the SQL92 specification.
SELECT prop_value / area FROM invoices
A WHERE clause consists of a set of attribute tests. Each basic test is of the form fieldname operator value. The fieldname is any of the fields in the source layer. The operator is one of =, !=, <>, <, >, <=, >=, LIKE and ILIKE and IN.
Most of the operators are self explanitory, but is is worth nothing that != is the same as <>, the string equality is case insensitive, but the <, >, <= and >= operators are case sensitive. Both the LIKE and ILIKE operators are case insensitive.
The value argument to the LIKE operator is a pattern against which the value string is matched. In this pattern percent (%) matches any number of characters, and underscore ( _ ) matches any one character.
String Pattern Matches? ------ ------- -------- Alberta ALB% Yes Alberta _lberta Yes St. Alberta _lberta No St. Alberta %lberta Yes Robarts St. %Robarts% Yes 12345 123%45 Yes 123.45 12?45 No N0N 1P0 %N0N% Yes L4C 5E2 %N0N% No
The IN takes a list of values as it's argument and tests the attribute value for membership in the provided set.
Value Value Set Matches? ------ ------- -------- 321 IN (456,123) No "Ontario" IN ("Ontario","BC") Yes "Ont" IN ("Ontario","BC") No 1 IN (0,2,4,6) No
In addition to the above binary operators, there are additional operators for testing if a field is null or not. These are the IS NULL and IS NOT NULL operators.
Basic field tests can be combined in more complicated predicates using logical operators include AND, OR, and the unary logical NOT. Subexpressions should be bracketed to make precidence clear. Some more complicated predicates are:
SELECT * FROM poly WHERE (prop_value >= 100000) AND (prop_value < 200000) SELECT * FROM poly WHERE NOT (area_code LIKE "N0N%") SELECT * FROM poly WHERE (prop_value IS NOT NULL) AND (prop_value < 100000)
SELECT * FROM property WHERE class_code = 7 ORDER BY prop_value DESC SELECT * FROM property ORDER BY prop_value SELECT * FROM property ORDER BY prop_value ASC SELECT DISTINCT zip_code FROM property ORDER BY zip_code
Note that ORDER BY clauses cause two passes through the feature set. One to build an in-memory table of field values corresponded with feature ids, and a second pass to fetch the features by feature id in the sorted order. For formats which cannot efficiently randomly read features by feature id this can be a very expensive operation.
Sorting of string field values is case sensitive, not case insensitive like in most other parts of OGR SQL.
SELECT city.*, nation.name FROM city LEFT JOIN nation ON city.nation_id = nation.id
This query would result in a table with all the fields from the city table, and an additional "nation.name" field with the nation name pulled from the nation table by looking for the record in the nation table that has the "id" field with the same value as the city.nation_id field.
Joins introduce a number of additional issues. One is the concept of table qualifiers on field names. For instance, referring to city.nation_id instead of just nation_id to indicate the nation_id field from the city layer. The table name qualifiers may only be used in the field list, and within the ON clause of the join.
Wildcards are also somewhat more involved. All fields from the primary table (city in this case) and the secondary table (nation in this case) may be selected using the usual * wildcard. But the fields of just one of the primary or secondary table may be selected by prefixing the asterix with the table name.
The field names in the resulting query layer will be qualified by the table name, if the table name is given as a qualifier in the field list. In addition field names will be qualified with a table name if they would conflict with earlier fields. For instance, the following select would result might result in a results set with a name, nation_id, nation.nation_id and nation.name field if the city and nation tables both have the nation_id and name fieldnames.
SELECT * FROM city LEFT JOIN nation ON city.nation_id = nation.nation_id
On the other hand if the nation table had a continent_id field, but the city table did not, then that field would not need to be qualified in the result set. However, if the selected instead looked like the following statement, all result fields would be qualified by the table name.
SELECT city.*, nation.* FROM city LEFT JOIN nation ON city.nation_id = nation.nation_id
In the above examples, the nation table was found in the same datasource as the city table. However, the OGR join support includes the ability to join against a table in a different data source, potentially of a different format. This is indicated by qualifying the secondary table name with a datasource name. In this case the secondary datasource is opened using normal OGR semantics and utilized to access the secondary table untill the query result is no longer needed.
SELECT * FROM city
LEFT JOIN '/usr2/data/nation.dbf'.nation ON city.nation_id = nation.nation_id
While not necessarily very useful, it is also possible to introduce table aliases to simplify some SELECT statements. This can also be useful to disambiguate situations where ables of the same name are being used from different data sources. For instance, if the actual tables names were messy we might want to do something like:
SELECT c.name, n.name FROM project_615_city c
LEFT JOIN '/usr2/data/project_615_nation.dbf'.project_615_nation n
ON c.nation_id = n.nation_id
It is possible to do multiple joins in a single query.
SELECT city.name, prov.name, nation.name FROM city LEFT JOIN province ON city.prov_id = province.id LEFT JOIN nation ON city.nation_id = nation.id
SELECT FID, * FROM nation
SELECT * FROM nation WHERE OGR_GEOMETRY='POINT' OR OGR_GEOMETRY='POLYGON'
SELECT OGR_GEOM_WKT, * FROM nation
Using the OGR_GEOM_WKT and the LIKE operator in the WHERE clause we can get similar effect as using OGR_GEOMETRY:
SELECT OGR_GEOM_WKT, * FROM nation WHERE OGR_GEOM_WKT LIKE 'POINT%' OR OGR_GEOM_WKT LIKE 'POLYGON%'
The OGR_GEOM_AREA special field returns the area of the feature's geometry computed by the OGRSurface::get_Area() method. For OGRGeometryCollection and OGRMultiPolygon the value is the sum of the areas of its members. For non-surface geometries the returned area is 0.0.
For example, to select only polygon features larger than a given area:
SELECT * FROM nation WHERE OGR_GEOM_AREA > 10000000'
SELECT * FROM nation WHERE OGR_STYLE LIKE 'LABEL%'
CREATE INDEX ON nation USING nation_id
DROP INDEX ON nation USING nation_id DROP INDEX ON nation
OGRLayer * OGRDataSource::ExecuteSQL( const char *pszSQLCommand, OGRGeometry *poSpatialFilter, const char *pszDialect );
The pszDialect argument is in theory intended to allow for support of different command languages against a provider, but for now applications should always pass an empty (not NULL) string to get the default dialect.
The poSpatialFilter argument is a geometry used to select a bounding rectangle for features to be returned in a manner similar to the OGRLayer::SetSpatialFilter() method. It may be NULL for no special spatial restriction.
The result of an ExecuteSQL() call is usually a temporary OGRLayer representing the results set from the statement. This is the case for a SELECT statement for instance. The returned temporary layer should be released with OGRDataSource::ReleaseResultsSet() method when no longer needed. Failure to release it before the datasource is destroyed may result in a crash.