.TH "OSMIUM-EXPORT" "1" "1.13.1" "" ""
.SH NAME
.PP
osmium-export - export OSM data
.SH SYNOPSIS
.PP
\f[B]osmium export\f[R] [\f[I]OPTIONS\f[R]] \f[I]OSM-FILE\f[R]
.SH DESCRIPTION
.PP
The OSM data model with its nodes, ways, and relations is very different from the data model usually used for geodata with features having point, linestring, or polygon geometries (or their cousins, the multipoint, multilinestring, or multipolygon geometries).
.PP
The \f[B]export\f[R] command transforms OSM data into a more conventional GIS data model. Nodes will be translated into points and ways into linestrings or polygons (if they are closed ways). Multipolygon and boundary relations will be translated into multipolygons. This transformation is not lossless: in particular, information in non-multipolygon, non-boundary relations is lost.
.PP
All tags are preserved in this process. Note that most GIS formats (such as Shapefiles) do not support arbitrary tags. Transformation into other GIS formats will need extra steps mapping tags to a limited list of attributes. This is outside the scope of this command.
.PP
The \f[B]osmium export\f[R] command has to keep an index of the node locations in memory or in a temporary file on disk while doing its work. There are several different ways it can do that which have different advantages and disadvantages. The default is good enough for most cases, but see the \f[B]osmium-index-types\f[R](5) man page for details.
.PP
Objects with invalid geometries are silently omitted from the output. This is the case for ways with fewer than two nodes and for closed ways or relations that can\[cq]t be assembled into a valid (multi)polygon. See the options \f[B]--show-errors/-e\f[R] and \f[B]--stop-on-error/-E\f[R] for how to modify this behaviour.
.PP
The input file will be read twice (once for the relations, once for nodes and ways), so this command cannot read its input from STDIN.
.PP
This command will not work on full history files.
.PP
This command will work with negative IDs on OSM objects (for instance on files created with JOSM).
.SH OPTIONS
.TP
-c, --config=FILE
Read configuration from specified file.
.TP
-C, --print-default-config
Print the default config to STDOUT. Useful if you want to change it and not write the whole thing manually. If you use this option all other options are ignored.
.TP
-e, --show-errors
Output any geometry errors on STDERR. This includes ways with a single node or areas that can\[cq]t be assembled from multipolygon relations. This output is not suitable for automated use; other tools can create very detailed error reports that are better suited for that (see https://osmcode.org/osm-area-tools/).
.TP
-E, --stop-on-error
Usually geometry errors (due to missing node locations or broken polygons) are ignored and the features are omitted from the output. If this option is set, any error will immediately stop the program.
.TP
--geometry-types=TYPES
Specify the geometry types that should be written out. Usually all created geometries (points, linestrings, and (multi)polygons) are written to the output, but you can restrict the types using this option. TYPES is a comma-separated list of the types (\[lq]point\[rq], \[lq]linestring\[rq], and \[lq]polygon\[rq]).
.TP
-i, --index-type=TYPE
Set the index type. For details see the \f[B]osmium-index-types\f[R](5) man page.
.TP
-I, --show-index-types
Shows a list of available index types. For details see the \f[B]osmium-index-types\f[R](5) man page. If you use this option all other options are ignored.
.TP
-n, --keep-untagged
If this is set, features without any tags will be included in the exported data. By default these features will be omitted from the output. Tags here means the OSM tags (not attributes such as id, version, uid, \&...) that remain after the \f[B]exclude_tags\f[R] or \f[B]include_tags\f[R] settings have been applied.
.TP
-r, --omit-rs
Do not print the RS (0x1e, record separator) character when using the GeoJSON Text Sequence Format. Ignored for other formats. THIS OPTION IS DEPRECATED, PLEASE USE \[lq]-x print_record_separator=false\[rq] INSTEAD.
.TP
-u, --add-unique-id=TYPE
Add a unique ID to each feature. TYPE can be \f[I]counter\f[R], in which case the first feature will get ID 1, the next ID 2, and so on; the type of object does not matter in this case. Alternatively, TYPE can be \f[I]type_id\f[R], in which case the ID is a string: the first character is the type of object (`n' for nodes, `w' for linestrings created from ways, and `a' for areas created from ways and/or relations), followed by a unique ID based on the original OSM object ID(s). If the input file has negative IDs, this can create IDs such as `w-12'. In spaten exports the ID is written into the \[at]fid field. For \f[I]counter\f[R] the value will be an integer, for \f[I]type_id\f[R] it will be a string.
.TP
-x, --format-option=OPTION(=VALUE)
Set an output format option. The options available depend on the output format. See the \f[B]OUTPUT FORMAT OPTIONS\f[R] section for available options. If the VALUE is not set, the OPTION will be set to \[lq]true\[rq]. If needed you can specify this option multiple times to set several options. Options set on the command line overwrite options set in the config file.
.SH COMMON OPTIONS
.TP
-h, --help
Show usage help.
.TP
-v, --verbose
Set verbose mode. The program will output information about what it is doing to STDERR.
.TP
--progress
Show progress bar. Usually a progress bar is only displayed if STDOUT and STDERR are detected to be a TTY. With this option a progress bar is always shown. Note that a progress bar will never be shown when reading from STDIN or a pipe.
.TP
--no-progress
Do not show progress bar. Usually a progress bar is displayed if STDOUT and STDERR are detected to be a TTY. With this option the progress bar is suppressed.
Note that a progress bar will never be shown when reading from STDIN or a pipe.
.SH INPUT OPTIONS
.TP
-F, --input-format=FORMAT
The format of the input file(s). Can be used to set the input format if it can\[cq]t be autodetected from the file name(s). This will set the format for all input files; there is no way to set the format for some input files only. See \f[B]osmium-file-formats\f[R](5) or the libosmium manual for details.
.SH OUTPUT OPTIONS
.TP
-f, --output-format=FORMAT
The format of the output file. Can be used to set the output file format if it can\[cq]t be autodetected from the output file name. See the OUTPUT FORMATS section for a list of formats.
.TP
--fsync
Call fsync after writing the output file to force flushing buffers to disk.
.TP
-o, --output=FILE
Name of the output file. Default is `-' (STDOUT).
.TP
-O, --overwrite
Allow an existing output file to be overwritten. Normally \f[B]osmium\f[R] will refuse to write over an existing file.
.SH CONFIG FILE
.PP
The config file is in JSON format. The top level is an object which can contain the following optional names:
.IP \[bu] 2
\f[C]attributes\f[R]: An object specifying which attributes of OSM objects to export. See the ATTRIBUTES section.
.IP \[bu] 2
\f[C]format_options\f[R]: An object specifying output format options. The options available depend on the output format. See the \f[B]OUTPUT FORMAT OPTIONS\f[R] section for available options. These options can also be set using the command line option \f[B]--format-option/-x\f[R].
.IP \[bu] 2
\f[C]linear_tags\f[R]: An expression specifying tags that should be treated as linear tags. See below for details and also look at the AREA HANDLING section.
.IP \[bu] 2
\f[C]area_tags\f[R]: An expression specifying tags that should be treated as area tags. See below for details and also look at the AREA HANDLING section.
.IP \[bu] 2
\f[C]exclude_tags\f[R]: A list of tag expressions. Tags matching these expressions are excluded from the output.
See the FILTER EXPRESSIONS section.
.IP \[bu] 2
\f[C]include_tags\f[R]: A list of tag expressions. Tags matching these expressions are included in the output. See the FILTER EXPRESSIONS section.
.PP
The \f[C]area_tags\f[R] and \f[C]linear_tags\f[R] settings can have the following values:
.TP
true
All tags match. (An empty list \f[C][]\f[R] can also be used to mean the same, but this use is deprecated because it can be confusing.)
.TP
false
No tags match.
.TP
Array
The array contains one or more expressions as described in the FILTER EXPRESSIONS section.
.TP
null
If \f[C]area_tags\f[R] or \f[C]linear_tags\f[R] is set to null or not set at all, the inverse of the other setting is used. So if you do not set \f[C]linear_tags\f[R] but have some expressions in \f[C]area_tags\f[R], areas will be created for all objects matching those expressions and linestrings for everything else. This can be simpler, because you only have to keep one list, but in cases where an object can be interpreted as both an area and a linestring, only one interpretation will be used.
.PP
The \f[C]exclude_tags\f[R] and \f[C]include_tags\f[R] options are mutually exclusive. If you want to just exclude some tags but leave most tags untouched, use the \f[C]exclude_tags\f[R] setting. If you only want a defined list of tags, use \f[C]include_tags\f[R].
.PP
When no config file is specified, the following settings are used:
.IP
.nf
\f[C]
{
    \[dq]attributes\[dq]: {
        \[dq]type\[dq]: false,
        \[dq]id\[dq]: false,
        \[dq]version\[dq]: false,
        \[dq]changeset\[dq]: false,
        \[dq]timestamp\[dq]: false,
        \[dq]uid\[dq]: false,
        \[dq]user\[dq]: false,
        \[dq]way_nodes\[dq]: false
    },
    \[dq]format_options\[dq]: {
    },
    \[dq]linear_tags\[dq]: true,
    \[dq]area_tags\[dq]: true,
    \[dq]exclude_tags\[dq]: [],
    \[dq]include_tags\[dq]: []
}
\f[R]
.fi
.SH FILTER EXPRESSIONS
.PP
A filter expression specifies a tag or tags that should be matched in the data.
.PP
Some examples:
.TP
amenity
Matches all objects with the key \[lq]amenity\[rq].
.TP
highway=primary
Matches all objects with the key \[lq]highway\[rq] and value \[lq]primary\[rq].
.TP
highway!=primary
Matches all objects with the key \[lq]highway\[rq] and a value other than \[lq]primary\[rq].
.TP
type=multipolygon,boundary
Matches all objects with key \[lq]type\[rq] and value \[lq]multipolygon\[rq] or \[lq]boundary\[rq].
.TP
name,name:de=Kastanienallee,Kastanienstrasse
Matches any object with a \[lq]name\[rq] or \[lq]name:de\[rq] tag with the value \[lq]Kastanienallee\[rq] or \[lq]Kastanienstrasse\[rq].
.TP
addr:*
Matches all objects with any key starting with \[lq]addr:\[rq].
.TP
name=*Paris
Matches all objects with a name that contains the word \[lq]Paris\[rq].
.PP
If there is no equal sign (\[lq]=\[rq]) in the expression, only keys are matched and values can be anything. If there is an equal sign (\[lq]=\[rq]) in the expression, the key is to the left and the value to the right. An exclamation sign (\[lq]!\[rq]) before the equal sign means: a tag with that key, but not the value(s) to the right of the equal sign. A leading or trailing asterisk (\[lq]*\[rq]) can be used for substring or prefix matching, respectively. Commas (\[lq],\[rq]) can be used to separate several keys or values.
.PP
All filter expressions are case-sensitive. There is no way to escape the special characters such as \[lq]=\[rq], \[lq]*\[rq] and \[lq],\[rq]. You cannot mix comma-expressions and \[lq]*\[rq]-expressions.
.SH ATTRIBUTES
.PP
All OSM objects (nodes, ways, and relations) have \f[I]attributes\f[R]; areas inherit their attributes from the ways and/or relations they were created from.
The attributes known to \f[C]osmium export\f[R] are:
.IP \[bu] 2
\f[C]type\f[R] (`node', `way', or `relation')
.IP \[bu] 2
\f[C]id\f[R] (64-bit object ID)
.IP \[bu] 2
\f[C]version\f[R] (version number)
.IP \[bu] 2
\f[C]changeset\f[R] (changeset ID)
.IP \[bu] 2
\f[C]timestamp\f[R] (time of object creation in seconds since Jan 1 1970)
.IP \[bu] 2
\f[C]uid\f[R] (user ID)
.IP \[bu] 2
\f[C]user\f[R] (user name)
.IP \[bu] 2
\f[C]way_nodes\f[R] (ways only, array with node IDs)
.PP
For areas, the type will be \f[C]way\f[R] or \f[C]relation\f[R] if the area was created from a closed way or a multipolygon or boundary relation, respectively. The \f[C]id\f[R] for areas is the id of the closed way or the multipolygon or boundary relation.
.PP
By default the attributes will not be in the export, because they are not necessary for most uses of OSM data. If you are interested in some (or all) attributes, add an \f[C]attributes\f[R] object to the config file. Add a member for each attribute you are interested in. The value can be \f[C]false\f[R] (do not output this attribute), \f[C]true\f[R] (output this attribute with the attribute name prefixed by the \f[C]\[at]\f[R] sign), or any string, in which case the string will be used as the attribute name.
.PP
Depending on your choice of values for the \f[C]attributes\f[R] object, attributes can have the same name as tag keys. If this is the case, the conflicting tag is silently dropped. So if there is a tag \[lq]\[at]id=foo\[rq] and you have set \f[C]id\f[R] to \f[C]true\f[R] in the \f[C]attributes\f[R] object, the tag will not show up in the output.
.PP
Note that the \f[C]id\f[R] is not necessarily unique. Even the combination of \f[C]type\f[R] and \f[C]id\f[R] is not unique, because a way may end up in the output file as LineString and as (Multi)Polygon. See the \f[B]--add-unique-id/-u\f[R] option for a unique ID.
.SH AREA HANDLING
.PP
Multipolygon relations will be assembled into multipolygon geometries forming areas.
Some closed ways will also form areas. Here are the detailed rules:
.TP
Non-closed way
A non-closed way (with the last node location not the same as the first node location) is always (regardless of any tags) a linestring, not an area.
.TP
Relation
A relation tagged \f[C]type=multipolygon\f[R] or \f[C]type=boundary\f[R] is always (regardless of any tags) assembled into an area.
.TP
Closed way
For a closed way (with the last node location the same as the first node location) the tags are checked: If the way has an \f[C]area=yes\f[R] tag, an area is created. If the way has an \f[C]area=no\f[R] tag, a linestring is created. An \f[C]area\f[R] tag with a value other than \f[C]yes\f[R] or \f[C]no\f[R] is ignored. The configuration settings \f[C]area_tags\f[R] and \f[C]linear_tags\f[R] can be used to augment the area check. If any of the tags matches the \f[C]area_tags\f[R], an area is created. If any of the tags matches the \f[C]linear_tags\f[R], a linestring is created. If both match, both an area and a linestring are created. This is important because some objects have tags that make them both an area and a linestring.
.SH OUTPUT FORMATS
.PP
The following output formats are supported:
.IP \[bu] 2
\f[C]geojson\f[R] (alias: \f[C]json\f[R]): GeoJSON (RFC7946). The output file will contain a single \f[C]FeatureCollection\f[R] object. This is the default format.
.IP \[bu] 2
\f[C]geojsonseq\f[R] (alias: \f[C]jsonseq\f[R]): GeoJSON Text Sequence (RFC8142). Each line (beginning with an RS (0x1e, record separator) and ending in a linefeed character) contains one GeoJSON object. Used for streaming GeoJSON.
.IP \[bu] 2
\f[C]pg\f[R]: PostgreSQL COPY text format. One line per object containing the WGS84 geometry as WKB, the tags in JSON format and, optionally, more columns for id and attributes. You have to create the table manually, then use the PostgreSQL COPY command to import the data. Enable verbose output to see the SQL commands needed to create the table and load the data.
.IP \[bu] 2
\f[C]spaten\f[R]: Spaten, a binary format that is suitable for large data sets.
.IP \[bu] 2
\f[C]text\f[R] (alias: \f[C]txt\f[R]): A simple text format with the geometry in WKT format followed by the comma-delimited tags. This is mainly intended for debugging at the moment. THE FORMAT MIGHT CHANGE WITHOUT NOTICE!
.SH OUTPUT FORMAT OPTIONS
.IP \[bu] 2
\f[C]print_record_separator\f[R] (default: \f[C]true\f[R]). Set to \f[C]false\f[R] to not print the RS (0x1e, record separator) character when using the GeoJSON Text Sequence Format. Ignored for other formats.
.IP \[bu] 2
\f[C]tags_type\f[R] (default: \f[C]jsonb\f[R]). Set to \f[C]hstore\f[R] to use HSTORE format instead of JSON/JSONB when using the Pg Format. Ignored for other formats.
.SH DIAGNOSTICS
.PP
\f[B]osmium export\f[R] exits with exit code
.TP
0
if everything went all right,
.TP
1
if there was an error processing the data, or
.TP
2
if there was a problem with the command line arguments.
.SH MEMORY USAGE
.PP
\f[B]osmium export\f[R] will usually keep all node locations and all objects needed for assembling the areas in memory. For larger data files, this can need several tens of GBytes of memory. See the \f[B]osmium-index-types\f[R](5) man page for details.
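.PP
As a rough sketch of how memory use might be reduced, the node location index can be moved to a file on disk with the \f[B]--index-type/-i\f[R] option. The index type name \f[C]sparse_file_array\f[R] used here is an assumption taken from \f[B]osmium-index-types\f[R](5) and may not be available in every build, so list the available types first. The objects needed for assembling the areas are still kept in memory.
.IP
.nf
\f[C]
# List the index types this build supports (all other options are ignored)
osmium export --show-index-types

# Assumed index type name; replace it with one from the list above
osmium export data.osm.pbf -i sparse_file_array -o data.geojson
\f[R]
.fi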
.SH EXAMPLES
.PP
Export into GeoJSON format:
.IP
.nf
\f[C]
osmium export data.osm.pbf -o data.geojson
\f[R]
.fi
.PP
Use a config file and export into GeoJSON Text Sequence format:
.IP
.nf
\f[C]
osmium export data.osm.pbf -o data.geojsonseq -c export-config.json
\f[R]
.fi
.SH SEE ALSO
.IP \[bu] 2
\f[B]osmium\f[R](1), \f[B]osmium-file-formats\f[R](5), \f[B]osmium-index-types\f[R](5), \f[B]osmium-add-node-locations-to-ways\f[R](1)
.IP \[bu] 2
Osmium website (https://osmcode.org/osmium-tool/)
.IP \[bu] 2
GeoJSON (http://geojson.org/)
.IP \[bu] 2
RFC7946 (https://tools.ietf.org/html/rfc7946)
.IP \[bu] 2
RFC8142 (https://tools.ietf.org/html/rfc8142)
.IP \[bu] 2
Line delimited JSON (https://en.wikipedia.org/wiki/JSON_Streaming#Line_delimited_JSON)
.IP \[bu] 2
Spaten Geo Format (https://thomas.skowron.eu/spaten/)
.SH COPYRIGHT
.PP
Copyright (C) 2013\-2021 Jochen Topf.
.PP
License GPLv3+: GNU GPL version 3 or later. This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law.
.SH CONTACT
.PP
If you have any questions or want to report a bug, please go to https://osmcode.org/contact.html
.SH AUTHORS
Jochen Topf.