|simriscparams(7)||simrisc configuration file organization||simriscparams(7)|
NAME¶simriscparams - The description of the configuration files
DESCRIPTION¶This page describes the organization of the simrisc configuration files. These files are formatted like standard unix configuration files. Lines are interpreted after removing initial white-space (blanks and tabs). If a line ends in \ (a backslash), then the next line (initial white-space removed) is appended to the current line.
While processing the configuration files trailing blanks and information on lines starting at the first # character are removed.
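The line-handling rules above can be sketched in Python. This is only an illustration, not simrisc's own parser: the function name `logical_lines` is hypothetical, and joining continuation lines before stripping comments, as well as skipping lines that end up empty, are assumptions.

```python
def logical_lines(text):
    """Yield cleaned logical lines from configuration text (illustrative sketch)."""
    pending = ""                          # text collected from continuation lines
    for raw in text.splitlines():
        line = raw.lstrip(" \t")          # remove initial blanks and tabs
        if line.endswith("\\"):           # a trailing backslash continues the line
            pending += line[:-1]
            continue
        # remove everything from the first '#', then trailing blanks
        line = (pending + line).split("#", 1)[0].rstrip()
        pending = ""
        if line:                          # assumption: empty results are skipped
            yield line
    if pending:                           # the final line ended in a backslash
        tail = pending.split("#", 1)[0].rstrip()
        if tail:
            yield tail
```

For example, a `cases: \` line followed by an indented `100000` line is returned as the single logical line `cases: 100000`.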
Note that all parameter identifiers are interpreted case sensitively. E.g., Costs: is a different parameter than costs:. The numeric values used in this man-page are for illustration purposes only. Some restrictions apply though: standard deviations cannot be negative; proportions and probabilities must lie in the range 0..1; multiple probabilities (like the ones used for breast densities) must add up to 1; etc. Restrictions that apply are mentioned in the parameter descriptions below.
DEFAULT CONFIGURATION FILE¶A configuration file provided in the simrisc distribution is
Usually this file is unzipped to the ~/.config directory:
    gunzip < /usr/share/doc/simrisc/simrisc.gz > ~/.config/simrisc
whereafter ~/.config/simrisc can be edited to contain local modifications.
Various parameters specify probability distributions. Usually the Normal distribution is specified. The program also recognizes the LogNormal and Uniform distributions.
Parameter specifications start with keywords, followed by a colon. The keywords are listed in the following overview. The format of the specifications is also fixed, but empty lines and white space may be used to improve readability. Also, all characters from a # character until the end of the line are considered comment and are ignored.
Parameter specifications starting with uppercase letters (like Scenario:) specify (sub)sections and contain no additional specifications. Specifications starting with lowercase letters (like ageGroup:) are followed by actual parameter values.
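The uppercase/lowercase convention can be illustrated with a small Python sketch. The function name `classify` and the returned labels are hypothetical; only the rule itself (keyword, colon, case of the first letter) comes from the description above.

```python
def classify(line):
    """Split a cleaned line into (kind, keyword, remainder) (illustrative sketch)."""
    key, colon, rest = line.partition(":")    # the keyword ends at the first colon
    key = key.strip()
    if not colon or not key:
        raise ValueError(f"no 'keyword:' found in {line!r}")
    # Keywords are case sensitive: an initial uppercase letter marks a
    # (sub)section, an initial lowercase letter marks a parameter line.
    kind = "section" if key[0].isupper() else "parameter"
    return kind, key, rest.strip()
```

With this sketch `Costs:` is reported as a section, whereas `costs: 64` is reported as a parameter with the value text `64`.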
The configuration file must define all parameters of all configuration sections, but configuration parameters can be modified using a separate analysis file or using overriding command-line parameters.
Changes introduced in version 14.04.00¶
- Parameters affected by spread: true
Parameters that may vary are specified using triplets: value, spread and distribution. In all cases the spread values and distribution names are optional: they can both be omitted or both must be specified. If these parameters are not specified then their value parameter won’t vary if spread: true is specified;
- The Mammo, Tomo, and MRI modalities are provided with std.dev and distribution parameters for their Dose, M, Beta, Specificity, and Sensitivity parameters;
- When spread: true is specified the actually used and original parameter values are listed in a file, by default spread-$.txt, where $ is replaced by the loop iteration index. Use the option -s to specify a non-default filename (cf. simrisc(1));
- Age ranges no longer have trailing colons;
- The Case-specific data matrix defines an extra (18th) column, showing the results of the screening rounds for each simulated case;
- The order of the beir7 beta and eta parameters is reversed: eta is specified first, followed by beta. The spread and distribution parameters following beta apply to beta, and not to eta, which is a fixed value.
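The triplet rule from the first item in this list (a value, optionally followed by both a spread and a distribution name) can be sketched as follows. The function name `parse_triplet` and the use of `None` for omitted fields are assumptions made for this illustration; the distribution names are the ones listed earlier on this page.

```python
DISTRIBUTIONS = {"Normal", "LogNormal", "Uniform"}

def parse_triplet(fields):
    """Return (value, spread, dist) from 1 or 3 whitespace-separated fields."""
    if len(fields) == 1:                      # spread and distribution both omitted
        return float(fields[0]), None, None
    if len(fields) == 3:                      # both specified
        value, spread, dist = float(fields[0]), float(fields[1]), fields[2]
        if spread < 0:
            raise ValueError("spread cannot be negative")
        if dist not in DISTRIBUTIONS:
            raise ValueError(f"unknown distribution {dist!r}")
        return value, spread, dist
    # a spread without a distribution (or vice versa) is not allowed
    raise ValueError("specify either a value alone or value, spread and distribution")
```

A value parsed without spread and distribution is the case in which the parameter does not vary, even when spread: true is specified.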
The Scenario section¶This section starts with a line containing Scenario: and it defines some general parameters that are used during the simulation process. The default configuration file contains the following specifications:
- spread: false
when specified as true then parameter spreading is used;
- iterations: 1
the (positive) number of iterations used in a simulation loop;
- generator: random
in addition to random modes fixed and increasing are available.
This parameter specifies the way simrisc’s random number generators are initialized. When mode random is specified the random number generators are initialized using randomly selected seeds and seed (below) is not used. When mode fixed is used the random number generators are initialized with seed’s value. When mode increasing is used the seeds of the random number generators are incremented by a fixed increment at each iteration;
- seed: 1
the (positive) value to seed the random number generator with. This parameter is ignored when generator: random was specified;
- cases: 100000
the (positive) number of cases to simulate;
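The three generator modes can be illustrated by listing the seed that each iteration would use. This is a sketch under stated assumptions, not simrisc's implementation: the function name `iteration_seeds` is hypothetical, the size of the fixed increment is not given on this page (1 is assumed here), and it is assumed that mode increasing starts at seed's value.

```python
import random

def iteration_seeds(generator, seed, iterations, increment=1):
    """Return one seed per iteration for the given generator mode (sketch)."""
    if generator == "random":
        # seed is ignored; every iteration gets a randomly selected seed
        source = random.SystemRandom()
        return [source.randrange(1, 2**32) for _ in range(iterations)]
    if generator == "fixed":
        return [seed] * iterations            # the same seed at every iteration
    if generator == "increasing":
        # assumption: start at seed and add `increment` at each iteration
        return [seed + i * increment for i in range(iterations)]
    raise ValueError(f"unknown generator mode {generator!r}")
```

Under these assumptions `generator: increasing`, `seed: 1` and `iterations: 3` would use the seeds 1, 2 and 3.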
The Costs section¶This section starts with a line containing Costs: and it defines several parameters used for cost-calculations. Modality-specific cost parameters are specified at the Modalities section. The default configuration file contains the following specifications:
- biop: 176
the (positive) cost of performing a biopsy;
- diameters: 0: 6438 20: 7128 50: 7701
pairs of diameter: cost values specifying the treatment cost starting at the specified tumor diameter, up to the next pair’s diameter (if specified) or all diameters starting at the diameter specified at the last pair. The first diameter must be 0. The second value of each pair specifies the (non-negative) treatment cost for that diameter range.
the costs discount proportion starting at some age. This line is followed by two additional lines specifying the starting age and discount proportion:
    age:        50
    proportion: 0
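The diameters: lookup described above can be sketched in Python. The function names `parse_diameters` and `treatment_cost` are hypothetical, and requiring strictly increasing diameters is an assumption; the other checks follow the description of the parameter.

```python
from bisect import bisect_right

def parse_diameters(spec):
    """Parse '0: 6438 20: 7128 50: 7701' into [(diameter, cost), ...]."""
    tokens = spec.split()
    if len(tokens) % 2:
        raise ValueError("expected 'diameter: cost' pairs")
    pairs = [(float(tokens[i].rstrip(":")), float(tokens[i + 1]))
             for i in range(0, len(tokens), 2)]
    starts = [diameter for diameter, _ in pairs]
    if not pairs or starts[0] != 0:
        raise ValueError("the first diameter must be 0")
    if starts != sorted(set(starts)):         # assumption: strictly increasing
        raise ValueError("diameters must increase")
    if any(cost < 0 for _, cost in pairs):
        raise ValueError("treatment costs cannot be negative")
    return pairs

def treatment_cost(pairs, diameter):
    """Cost of the last pair whose start diameter is <= diameter."""
    if diameter < 0:
        raise ValueError("diameter cannot be negative")
    starts = [start for start, _ in pairs]
    return pairs[bisect_right(starts, diameter) - 1][1]
```

With the default values a tumor diameter of 19.9 falls in the range starting at 0, a diameter of 20 in the range starting at 20, and every diameter of 50 or more in the last range.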
The BreastDensities section¶This section starts with a line containing BreastDensities: and it defines breast density values for various age groups, covering ages 0 through the maximum age for simulated cases. The default configuration file contains the following specifications:
    # bi-rad:            a     b     c     d
    ageGroup:   0 - 40   0.05  0.30  0.48  0.17
    ageGroup:  40 - 50   0.06  0.34  0.47  0.13
    ageGroup:  50 - 60   0.08  0.50  0.37  0.05
    ageGroup:  60 - 70   0.15  0.53  0.29  0.03
    ageGroup:  70 - *    0.18  0.54  0.26  0.02
Age groups are half-open ranges: they start at their first ages, and end at (not including) their second ages. The first ages of subsequent age groups must be equal to the second ages of their previous age groups. For the last age group the specification * can be used, indicating that all ages at or above the last age group’s begin age are handled by that group.
For each age group the probabilities of the four bi-rad classifications must sum to 1.0.
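The age-group restrictions (connecting half-open ranges starting at age 0, and four probabilities summing to 1) can be checked with a sketch like the following. The names `parse_age_group` and `check_density_groups`, and the tolerance used when comparing the sum to 1, are assumptions made for this illustration.

```python
import math

DEFAULT_DENSITIES = [
    "ageGroup:  0 - 40  0.05 0.30 0.48 0.17",
    "ageGroup: 40 - 50  0.06 0.34 0.47 0.13",
    "ageGroup: 50 - 60  0.08 0.50 0.37 0.05",
    "ageGroup: 60 - 70  0.15 0.53 0.29 0.03",
    "ageGroup: 70 - *   0.18 0.54 0.26 0.02",
]

def parse_age_group(line):
    """Return (begin, end, values); '*' becomes an unbounded end age."""
    fields = line.partition(":")[2].split()   # e.g. ['0', '-', '40', '0.05', ...]
    begin = float(fields[0])
    end = math.inf if fields[2] == "*" else float(fields[2])
    return begin, end, [float(field) for field in fields[3:]]

def check_density_groups(groups):
    """Raise ValueError if the groups violate the restrictions described above."""
    previous_end = 0.0                        # the first group must start at age 0
    for begin, end, probs in groups:
        if begin != previous_end or not begin < end:
            raise ValueError(f"age group {begin} - {end} does not connect")
        if len(probs) != 4 or any(not 0 <= p <= 1 for p in probs):
            raise ValueError("four probabilities in the range 0..1 are required")
        if abs(sum(probs) - 1.0) > 1e-9:      # assumed tolerance for rounding
            raise ValueError("bi-rad probabilities must sum to 1")
        previous_end = end
```

Calling `check_density_groups([parse_age_group(line) for line in DEFAULT_DENSITIES])` raises no exception for the default values.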
The Modalities section¶This section starts with a line containing Modalities: and it specifies cancer-scanning modalities. Currently three modalities are supported: Mammo, Tomo and MRI.
Some modalities specify age groups, which are (like the age ranges used for BreastDensities) half-open ranges: they start at their first ages, and end at (not including) their second ages, while subsequent age ranges must connect. Also, the last age group may use the end-age specification *.
The default configuration file contains (below the line Modalities:) the following specifications (if modalities aren’t used their specifications are optional):
For the Mammo modality the costs, the radiation doses and M: parameter specifications per bi-rad category, the specificity probabilities for age groups, the parameters of the beta-function, and the systematic error probability must be specified.
The default configuration file contains (below the line Mammo:) the following specifications
    costs:              64
    #                   default:
    systematicError:    0.1
    Dose:
        #           mean    spread  dist
        bi-rad: a   3       1       Normal
        bi-rad: b   3       1       Normal
        bi-rad: c   3       1       Normal
        bi-rad: d   3       1       Normal
    M:
        #           proportion  spread  dist
        bi-rad: a   .061        .021    Normal
        bi-rad: b   .163        .045    Normal
        bi-rad: c   .400        .106    Normal
        bi-rad: d   .826        .088    Normal
    Beta:
        #       mean    spread  dist
        nr: 1   -4.38   .002    Normal
        nr: 2     .49   .0005   Normal
        nr: 3   -1.34   .0074   Normal
        nr: 4   -7.18   .0340   Normal
    Specificity:
        #           range     proportion  spread  dist
        ageGroup:   0 - 40    .961        .005    Normal
        ageGroup:   40 - *    .965        .005    Normal
For this modality the sensitivity is computed using the beta-function published by Isheden and Humphreys (2017, Statistical Methods in Medical Research, 28(3), 681-702). From a randomly generated probability and a case’s age the case’s bi-rad category is determined and that category is then used to select the m-parameter that is used in the beta-function;
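How the randomly generated probability is mapped to a bi-rad category is not spelled out on this page. The Python sketch below assumes a cumulative-probability lookup over the breast densities of the case's age group; the function name `birad_category` is hypothetical, and the beta-function itself is not reproduced here.

```python
import math

# (begin, end, probabilities for bi-rad a..d), default BreastDensities values
DENSITIES = [
    (0, 40, (0.05, 0.30, 0.48, 0.17)),
    (40, 50, (0.06, 0.34, 0.47, 0.13)),
    (50, 60, (0.08, 0.50, 0.37, 0.05)),
    (60, 70, (0.15, 0.53, 0.29, 0.03)),
    (70, math.inf, (0.18, 0.54, 0.26, 0.02)),
]
M = {"a": 0.061, "b": 0.163, "c": 0.400, "d": 0.826}   # default Mammo M: values

def birad_category(age, probability, densities=DENSITIES):
    """Pick a bi-rad category from an age and a probability in [0, 1)."""
    for begin, end, probs in densities:
        if begin <= age < end:                # half-open age range
            cumulative = 0.0
            for category, p in zip("abcd", probs):
                cumulative += p               # assumption: cumulative lookup
                if probability < cumulative:
                    return category
            return "d"                        # guards against rounding at 1.0
    raise ValueError(f"no age group covers age {age}")
```

Under this assumption a case aged 45 with probability 0.5 is assigned category c, so `M[birad_category(45, 0.5)]` selects the m-parameter 0.400.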
For the Tomo modality the costs, radiation doses per bi-rad category, sensitivity probabilities per bi-rad category, and specificity probabilities for age groups must be specified.
The default configuration file contains (below the line Tomo:) the following specifications:
    costs:      64
    Dose:
        #           mean    spread  dist
        bi-rad: a   3       1       Normal
        bi-rad: b   3       1       Normal
        bi-rad: c   3       1       Normal
        bi-rad: d   3       1       Normal
    Sensitivity:
        #           proportion  spread  dist
        bi-rad: a   .87         .05     Normal
        bi-rad: b   .84         .05     Normal
        bi-rad: c   .73         .05     Normal
        bi-rad: d   .65         .05     Normal
    Specificity:
        #           range     proportion  spread  dist
        ageGroup:   0 - 40    .961        .0025   Normal
        ageGroup:   40 - *    .965        .0025   Normal
For the MRI modality the costs, and the sensitivity and specificity probabilities must be specified.
The default configuration file contains (below the line MRI:) the following specifications:
    costs:          280
    #               proportion  spread  dist
    sensitivity:    .94         .005    Normal
    specificity:    .95         .005    Normal
The Screening section¶This section starts with a line containing Screening: and it defines the ages at which screenings are performed as well as the screening attendance rate. If no screening rounds should be used then specify a single round-specification line:
    round: none
Otherwise, each screening round is defined by the keyword round: followed by an age, which in turn is followed by a space-delimited list of at least one modality (currently Mammo, Tomo and MRI). The default configuration file contains (below the line Screening:) the following specifications:
    round: 50 Mammo
    round: 52 Mammo
    round: 54 Mammo
    round: 56 Mammo
    round: 58 Mammo
    round: 60 Mammo
    round: 62 Mammo
    round: 64 Mammo
    round: 66 Mammo
    round: 68 Mammo
    round: 70 Mammo
    round: 72 Mammo
    round: 74 Mammo
In addition to the round specification line(s) the Screening section specifies the attendance rate proportion. The default configuration file specifies:
    #               proportion:
    attendanceRate: .8
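The round: lines of this section can be read with a sketch like the one below. The function name `parse_rounds` and the returned list of (age, modalities) tuples are assumptions made for this illustration; the rules (an age followed by at least one of Mammo, Tomo or MRI, or a single round: none line) are the ones described above.

```python
MODALITIES = {"Mammo", "Tomo", "MRI"}

def parse_rounds(lines):
    """Return [(age, [modality, ...]), ...]; 'round: none' yields []."""
    specs = [line.partition(":")[2].split()
             for line in lines
             if line.partition(":")[0].strip() == "round"]
    if specs == [["none"]]:                   # a single line disabling screening
        return []
    rounds = []
    for fields in specs:
        if len(fields) < 2:
            raise ValueError("a round needs an age and at least one modality")
        age, modalities = float(fields[0]), fields[1:]
        unknown = set(modalities) - MODALITIES
        if unknown:
            raise ValueError(f"unknown modalities: {sorted(unknown)}")
        rounds.append((age, modalities))
    return rounds
```

For example, `parse_rounds(["round: 50 Mammo", "round: 52 Mammo MRI"])` describes two rounds, the second one using two modalities.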
The Tumor section¶This section starts with a line containing Tumor: and it defines the parameters specifying tumor characteristics. Several of the parameters in this section can be provided with a spread and distribution specification. When spread: true is specified then these spread and distribution specifications are used to apply statistical variations to these parameters.
Supported distributions are Normal, Uniform, and LogNormal. If value is the specified parameter value and spread is the specified spread, then the values that are actually used during the simulations are:
- when using the Normal distribution N(mean, stddev):
- when using the Uniform distribution U(begin, end):
U(value - spread / 2, value + spread / 2)
- when using the LogNormal distribution L(mean, stddev):
The spread parameters may not be negative. If spread is specified then the distribution must also be specified. If spread is not specified, then the value parameter won’t vary if spread: true is specified in the Scenario section.
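Drawing the actually used value can be sketched in Python. Only the Uniform expression is taken from the text above; using value and spread directly as the mean and standard deviation of the Normal distribution, and as the parameters of the LogNormal distribution, are assumptions of this sketch, as is the function name `varied_value`.

```python
import random

def varied_value(value, spread, dist, spread_enabled, rng):
    """Return the value used in a simulation iteration (illustrative sketch)."""
    if not spread_enabled or spread is None:
        return value                          # the parameter does not vary
    if spread < 0:
        raise ValueError("spread may not be negative")
    if dist == "Uniform":
        # from the text: U(value - spread / 2, value + spread / 2)
        return rng.uniform(value - spread / 2, value + spread / 2)
    if dist == "Normal":
        # assumption: mean = value, standard deviation = spread
        return rng.normalvariate(value, spread)
    if dist == "LogNormal":
        # assumption: value and spread parameterize the underlying normal
        return rng.lognormvariate(value, spread)
    raise ValueError(f"unknown distribution {dist!r}")
```

A call like `varied_value(0.51, 0.32, "Normal", True, random.Random(1))` then returns one possible beta value for an iteration, whereas the same call with `spread_enabled` set to `False` returns 0.51 unchanged.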
The Tumor: section has four subsections: beir7:, Growth:, Incidence:, and Survival:. They contain the following parameter specifications:
BEIR (tumor induction) parameters: only tumor induction type 7 (i.e., beir7) is used. The default configuration file contains this specification:
    #       eta     beta    spread  dist.
    beir7:  -2.0    0.51    0.32    Normal
If spread: true is specified then the actually used beta parameter is drawn from the specified distribution having the specified std.dev. (spread).
Tumor growth specifications consist of three elements: the start diameter, the self-detect parameters and the doubling time specifications.
The start parameter defines the start diameter of emerging tumors. The default configuration file contains the following specification:
Four parameters are used to determine the diameter at which self-detection is possible. These parameters are:
- the standard deviation (stdev, see below) used by the lognormal distribution to compute the diameter at which self-detection occurs. This parameter is required and cannot be negative;
- the mean (see below) used by the lognormal distribution. This parameter is required and cannot be negative. Its value will vary using the following two parameters if spread: true was specified;
- the spread (standard deviation) used by the distribution that is used to vary the mean if spread: true was specified. It can be omitted in which case the mean won’t vary;
- the distribution used to vary the mean. If the previous parameter is omitted then this parameter must also be omitted.
The actually used self-detect diameter is computed using:
diameter = L(mean, stdev)
The default configuration file contains these parameter specifications:
    #           stdev   value   spread  dist.
    selfDetect: .70     2.92    .084    Normal
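The two-stage computation of the self-detect diameter (first optionally varying the mean, then drawing from L(mean, stdev)) might be sketched as follows. It is assumed here that mean and stdev parameterize the underlying normal distribution, as Python's `random.lognormvariate` does, and that the mean is varied using a Normal distribution as in the default configuration; the function name `self_detect_diameter` is hypothetical.

```python
import math
import random

def self_detect_diameter(stdev, mean, spread=None, spread_enabled=False, rng=random):
    """Draw a self-detect diameter from L(mean, stdev) (illustrative sketch)."""
    if spread_enabled and spread is not None:
        # assumption: the mean varies as Normal(mean, spread), cf. the default 'Normal'
        mean = rng.normalvariate(mean, spread)
    # assumption: mean and stdev describe the logarithm of the diameter
    return rng.lognormvariate(mean, stdev)

rng = random.Random(7)
diameter = self_detect_diameter(0.70, 2.92, 0.084, spread_enabled=True, rng=rng)
median = math.exp(2.92)       # median of L(2.92, .70) under the stated assumption
```

The description of the doubling times below uses the same scheme, with one set of parameters per age group.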
Finally, the Growth: subsection also defines tumor doubling times for various age groups. Doubling times are computed like the self-detect diameters, i.e., using lognormal distributions. Thus, age groups are followed by four parameter specifications (of which the last two are optional): the standard deviation of the lognormal distribution, the mean value of the lognormal distribution, and the spread and name of the distribution that is used when spread: true was specified. The age groups must cover ages 0 through the maximum age for simulated cases, and are specified as described at section BreastDensities:. The default configuration file contains the following specifications:
    DoublingTime:
        #                   stdev   mean    spread  dist.
        ageGroup:  1 - 50   .61     4.38    .43     Normal
        ageGroup: 50 - 70   .26     5.06    .17     Normal
        ageGroup: 70 - *    .45     5.24    .23     Normal
Three carrier types are supported: Normal, BRCA1 and BRCA2, each having a probability of occurrence. The probabilities of the specified carriers must add up to 1. Each carrier is identified by its name (e.g., Normal:) followed by four parameter specifications:
- the probability that the carrier is observed;
- the standard deviation used when computing the risk of getting a tumor. As this standard deviation is used in the denominator of expressions it must be larger than zero.
- the lifetime risk: up to three parameters specifying a probability, optionally followed by the standard deviation and distribution that are used to vary the probability when spread: true is specified;
- the mean age: up to three parameters specifying the mean age, optionally followed by the standard deviation and distribution that are used to vary the mean age when spread: true is specified.
The default configuration file specifies the Normal carrier’s probability as 1, effectively suppressing the other carriers. The default configuration file contains (below the Incidence: parameter line) the following specifications:
    Normal:
        probability:    1
        stdDev:         21.1
        #               value   spread  distr.
        lifetimeRisk:   .226    .0053   Normal
        meanAge:        72.9    .552    Normal
    BRCA1:
        probability:    0
        stdDev:         16.51
        #               value   spread  distr.
        lifetimeRisk:   .96
        meanAge:        53.9
    BRCA2:
        probability:    0
        stdDev:         16.51
        #               value   spread  distr.
        lifetimeRisk:   .96
        meanAge:        53.9
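The restrictions on the carrier specifications (probabilities adding up to 1, standard deviations larger than zero) can be checked with a sketch like the following. The dictionary layout, the function name `check_carriers` and the rounding tolerance are assumptions made for this illustration.

```python
DEFAULT_CARRIERS = {
    "Normal": {"probability": 1, "stdDev": 21.1},
    "BRCA1": {"probability": 0, "stdDev": 16.51},
    "BRCA2": {"probability": 0, "stdDev": 16.51},
}

def check_carriers(carriers):
    """Raise ValueError if the carrier specifications violate the restrictions."""
    total = 0.0
    for name, spec in carriers.items():
        if not 0 <= spec["probability"] <= 1:
            raise ValueError(f"{name}: probability must lie in the range 0..1")
        if spec["stdDev"] <= 0:               # used in a denominator
            raise ValueError(f"{name}: stdDev must be larger than zero")
        total += spec["probability"]
    if abs(total - 1.0) > 1e-9:               # assumed tolerance for rounding
        raise ValueError("carrier probabilities must add up to 1")
```

The default values pass this check; changing BRCA1's probability to 0.1 without lowering Normal's probability would be rejected.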
Four types of survival parameters must be specified. Each is identified by its type (a through d), followed by a value and an (optional) spread and distribution, which are used when spread: true is specified. The default configuration file specifies:
    #           value       spread      dist:
    type: a     .00004475   .000004392  Normal
    type: b     1.85867     .0420       Normal
    type: c     -.271       .0101       Normal
    type: d     2.0167      .0366       Normal
PARAMETER RESPECIFICATION¶Parameters can be respecified by defining a separate parameter configuration file, by providing alternate parameter specifications in analyses: sections of the program’s input file, or by providing alternative parameter specifications as command-line arguments (cf. the simrisc(3) man-page).
- ~/.config/simrisc: the default location of the program’s configuration file;
- the simrisc distribution archive contains the default configuration file as simrisc-VERSION/stdconfig/simrisc, where VERSION is replaced by simrisc’s actual release version;
- when installing simrisc using Linux distribution archives (e.g., .deb files) the default configuration file is commonly available as /usr/share/doc/simrisc/simrisc.gz.
COPYRIGHT¶This is free software, distributed under the terms of the GNU General Public License (GPL).
AUTHOR¶Frank B. Brokken (email@example.com).