table of contents
- bookworm 3.6.7-8+b1
- testing 3.6.7-13
- unstable 3.6.7-15
- experimental 3.6.8-4
dsr2xml(1) | OFFIS DCMTK | dsr2xml(1) |
NAME¶
dsr2xml - Convert DICOM SR file and data set to XML
SYNOPSIS¶
dsr2xml [options] dsrfile-in [xmlfile-out]
DESCRIPTION¶
The dsr2xml utility converts the contents of a DICOM Structured Reporting (SR) document (file format or raw data set) to XML (Extensible Markup Language). The XML Schema dsr2xml.xsd does not yet follow any standard format. However, the dsr2xml application might be enhanced in this aspect in the future (e.g. by supporting HL7/CDA - Clinical Document Architecture).
If dsr2xml reads a raw data set (DICOM data without a file format meta-header) it will attempt to guess the transfer syntax by examining the first few bytes of the file. It is not always possible to correctly guess the transfer syntax and it is better to convert a data set to a file format whenever possible (using the dcmconv utility). It is also possible to use the -f and -t[ieb] options to force dsr2xml to read a dataset with a particular transfer syntax.
PARAMETERS¶
dsrfile-in DICOM SR input filename to be converted ('-' for stdin) xmlfile-out XML output filename (default: stdout)
OPTIONS¶
general options¶
-h --help
print this help text and exit
--version
print version information and exit
--arguments
print expanded command line arguments
-q --quiet
quiet mode, print no warnings and errors
-v --verbose
verbose mode, print processing details
-d --debug
debug mode, print debug information
-ll --log-level [l]evel: string constant
(fatal, error, warn, info, debug, trace)
use level l for the logger
-lc --log-config [f]ilename: string
use config file f for the logger
input options¶
input file format:
+f --read-file
read file format or data set (default)
+fo --read-file-only
read file format only
-f --read-dataset
read data set without file meta information input transfer syntax:
-t= --read-xfer-auto
use TS recognition (default)
-td --read-xfer-detect
ignore TS specified in the file meta header
-te --read-xfer-little
read with explicit VR little endian TS
-tb --read-xfer-big
read with explicit VR big endian TS
-ti --read-xfer-implicit
read with implicit VR little endian TS
processing options¶
error handling:
-Er --unknown-relationship
accept unknown/missing relationship type
-Ev --invalid-item-value
accept invalid content item value
(e.g. violation of VR or VM definition)
-Ec --ignore-constraints
ignore relationship content constraints
-Ee --ignore-item-errors
do not abort on content item errors, just warn
(e.g. missing value type specific attributes)
-Ei --skip-invalid-items
skip invalid content items (including sub-tree)
-Dv --disable-vr-checker
disable check for VR-conformant string values specific character set:
+Cr --charset-require
require declaration of extended charset (default)
+Ca --charset-assume [c]harset: string
assume charset c if no extended charset declared
+Cc --charset-check-all
check all data elements with string values
(default: only PN, LO, LT, SH, ST, UC and UT)
# this option is only used for the extended check whether
# the Specific Character Set (0008,0005) attribute should be
# present, but not for the conversion of unaffected element
# values to UTF-8 (e.g. element values with a VR of CS)
+U8 --convert-to-utf8
convert all element values that are affected
by Specific Character Set (0008,0005) to UTF-8
# requires support from an underlying character encoding
# library (see output of --version on which one is available)
output options¶
encoding:
+Ea --attr-all
encode everything as XML attribute
(shortcut for +Ec, +Er, +Ev and +Et)
+Ec --attr-code
encode code value, coding scheme designator
and coding scheme version as XML attribute
+Er --attr-relationship
encode relationship type as XML attribute
+Ev --attr-value-type
encode value type as XML attribute
+Et --attr-template-id
encode template id as XML attribute
+Ee --template-envelope
template element encloses content items
(requires +Wt, implies +Et) XML structure:
+Xs --add-schema-reference
add reference to XML Schema 'dsr2xml.xsd'
(not with +Ea, +Ec, +Er, +Ev, +Et, +Ee, +We)
+Xn --use-xml-namespace
add XML namespace declaration to root element writing:
+We --write-empty-tags
write all tags even if their value is empty
+Wi --write-item-id
always write item identifier
+Wt --write-template-id
write template identification information
NOTES¶
DICOM Conformance¶
The dsr2xml utility supports the following SOP Classes:
SpectaclePrescriptionReportStorage 1.2.840.10008.5.1.4.1.1.78.6 MacularGridThicknessAndVolumeReportStorage 1.2.840.10008.5.1.4.1.1.79.1 BasicTextSRStorage 1.2.840.10008.5.1.4.1.1.88.11 EnhancedSRStorage 1.2.840.10008.5.1.4.1.1.88.22 ComprehensiveSRStorage 1.2.840.10008.5.1.4.1.1.88.33 Comprehensive3DSRStorage 1.2.840.10008.5.1.4.1.1.88.34 ProcedureLogStorage 1.2.840.10008.5.1.4.1.1.88.40 MammographyCADSRStorage 1.2.840.10008.5.1.4.1.1.88.50 KeyObjectSelectionDocumentStorage 1.2.840.10008.5.1.4.1.1.88.59 ChestCADSRStorage 1.2.840.10008.5.1.4.1.1.88.65 XRayRadiationDoseSRStorage 1.2.840.10008.5.1.4.1.1.88.67 RadiopharmaceuticalRadiationDoseSRStorage 1.2.840.10008.5.1.4.1.1.88.68 ColonCADSRStorage 1.2.840.10008.5.1.4.1.1.88.69 ImplantationPlanSRStorage 1.2.840.10008.5.1.4.1.1.88.70 AcquisitionContextSRStorage 1.2.840.10008.5.1.4.1.1.88.71 SimplifiedAdultEchoSRStorage 1.2.840.10008.5.1.4.1.1.88.72 PatientRadiationDoseSRStorage 1.2.840.10008.5.1.4.1.1.88.73 PlannedImagingAgentAdministrationSRStorage 1.2.840.10008.5.1.4.1.1.88.74 PerformedImagingAgentAdministrationSRStorage 1.2.840.10008.5.1.4.1.1.88.75
Please note that currently only mandatory and some optional attributes are supported.
Character Encoding¶
The XML encoding is determined automatically from the DICOM attribute (0008,0005) 'Specific Character Set' using the following mapping:
ASCII (ISO_IR 6) => 'UTF-8' UTF-8 'ISO_IR 192' => 'UTF-8' ISO Latin 1 'ISO_IR 100' => 'ISO-8859-1' ISO Latin 2 'ISO_IR 101' => 'ISO-8859-2' ISO Latin 3 'ISO_IR 109' => 'ISO-8859-3' ISO Latin 4 'ISO_IR 110' => 'ISO-8859-4' ISO Latin 5 'ISO_IR 148' => 'ISO-8859-9' ISO Latin 9 'ISO_IR 203' => 'ISO-8859-15' Cyrillic 'ISO_IR 144' => 'ISO-8859-5' Arabic 'ISO_IR 127' => 'ISO-8859-6' Greek 'ISO_IR 126' => 'ISO-8859-7' Hebrew 'ISO_IR 138' => 'ISO-8859-8' Thai 'ISO_IR 166' => 'TIS-620' Japanese 'ISO 2022 IR 13\ISO 2022 IR 87' => 'ISO-2022-JP' Korean 'ISO 2022 IR 6\ISO 2022 IR 149' => 'ISO-2022-KR' Chinese 'ISO 2022 IR 6\ISO 2022 IR 58' => 'ISO-2022-CN' Chinese 'GB18030' => 'GB18030' Chinese 'GBK' => 'GBK'
If this DICOM attribute is missing in the input file, although needed, option --charset-assume can be used to specify an appropriate character set manually (using one of the DICOM defined terms). For reasons of backward compatibility with previous versions of this tool, the following terms are also supported and mapped automatically to the associated DICOM defined terms: latin-1, latin-2, latin-3, latin-4, latin-5, latin-9, cyrillic, arabic, greek, hebrew.
Option --convert-to-utf8 can be used to convert the DICOM file or data set to UTF-8 encoding prior to the conversion to XML format.
If no mapping is defined and option --convert-to-utf8 is not used, non-ASCII characters and those below #32 are stored as '&#nnn;' where 'nnn' refers to the numeric character code. This might lead to invalid character entity references (such as '' for ESC) and will cause most XML parsers to reject the document.
Error Handling¶
Please be careful with the processing options --unknown-relationship, --invalid-item-value, --ignore-constraints, --ignore-item-errors and --skip-invalid-items since they disable certain validation checks on the DICOM SR input file and, therefore, might result in non-standard conformant output. However, there might be reasons for using one or more of these options, e.g. in order to read and process an incorrectly encoded SR document.
Limitations¶
The XML Schema dsr2xml.xsd does not support all variations of the dsr2xml output format. However, the default output format (plus option --use-xml-namespace) should work.
LOGGING¶
The level of logging output of the various command line tools and underlying libraries can be specified by the user. By default, only errors and warnings are written to the standard error stream. Using option --verbose also informational messages like processing details are reported. Option --debug can be used to get more details on the internal activity, e.g. for debugging purposes. Other logging levels can be selected using option --log-level. In --quiet mode only fatal errors are reported. In such very severe error events, the application will usually terminate. For more details on the different logging levels, see documentation of module 'oflog'.
In case the logging output should be written to file (optionally with logfile rotation), to syslog (Unix) or the event log (Windows) option --log-config can be used. This configuration file also allows for directing only certain messages to a particular output stream and for filtering certain messages based on the module or application where they are generated. An example configuration file is provided in <etcdir>/logger.cfg.
COMMAND LINE¶
All command line tools use the following notation for parameters: square brackets enclose optional values (0-1), three trailing dots indicate that multiple values are allowed (1-n), a combination of both means 0 to n values.
Command line options are distinguished from parameters by a leading '+' or '-' sign, respectively. Usually, order and position of command line options are arbitrary (i.e. they can appear anywhere). However, if options are mutually exclusive the rightmost appearance is used. This behavior conforms to the standard evaluation rules of common Unix shells.
In addition, one or more command files can be specified using an '@' sign as a prefix to the filename (e.g. @command.txt). Such a command argument is replaced by the content of the corresponding text file (multiple whitespaces are treated as a single separator unless they appear between two quotation marks) prior to any further evaluation. Please note that a command file cannot contain another command file. This simple but effective approach allows one to summarize common combinations of options/parameters and avoids longish and confusing command lines (an example is provided in file <datadir>/dumppat.txt).
ENVIRONMENT¶
The dsr2xml utility will attempt to load DICOM data dictionaries specified in the DCMDICTPATH environment variable. By default, i.e. if the DCMDICTPATH environment variable is not set, the file <datadir>/dicom.dic will be loaded unless the dictionary is built into the application (default for Windows).
The default behavior should be preferred and the DCMDICTPATH environment variable only used when alternative data dictionaries are required. The DCMDICTPATH environment variable has the same format as the Unix shell PATH variable in that a colon (':') separates entries. On Windows systems, a semicolon (';') is used as a separator. The data dictionary code will attempt to load each file specified in the DCMDICTPATH environment variable. It is an error if no data dictionary can be loaded.
Depending on the command line options specified, the dsr2xml utility will attempt to load character set mapping tables. This happens when DCMTK was compiled with the oficonv library (which is the default) and the mapping tables are not built into the library (default when DCMTK uses shared libraries).
The mapping table files are expected in DCMTK's <datadir>. The DCMICONVPATH environment variable can be used to specify a different location. If a different location is specified, those mapping tables also replace any built-in tables.
FILES¶
<datadir>/dsr2xml.xsd - XML Schema file
SEE ALSO¶
COPYRIGHT¶
Copyright (C) 2000-2023 by OFFIS e.V., Escherweg 2, 26121 Oldenburg, Germany.
Tue Dec 19 2023 | Version 3.6.8 |