.TH "slurmdbd.conf" "5" "Slurm Configuration File" "August 2018" "Slurm Configuration File"
.SH "NAME"
slurmdbd.conf \- Slurm Database Daemon (SlurmDBD) configuration file
.SH "DESCRIPTION"
\fBslurmdbd.conf\fP is an ASCII file which describes Slurm Database Daemon
(SlurmDBD) configuration information.
The file location can be modified at system build time using the
DEFAULT_SLURM_CONF parameter or at execution time by setting the SLURM_CONF
environment variable.
.LP
The contents of the file are case insensitive except for the names of nodes
and files.
Any text following a "#" in the configuration file is treated as a comment
through the end of that line.
Changes to the configuration file take effect upon restart of SlurmDBD or
daemon receipt of the SIGHUP signal unless otherwise noted.
.LP
This file should exist only on the computer where SlurmDBD executes and should
be readable only by the user which executes SlurmDBD (e.g. "slurm").
If the slurmdbd daemon is started as user root and changes to another user ID,
the configuration file will initially be read as user root, but will be read
as the other user ID in response to a SIGHUP signal.
This file should be protected from unauthorized access since it contains a
database password.
The overall configuration parameters available include:
.TP
\fBArchiveDir\fR
If ArchiveScript is not set, the slurmdbd will generate a file that can be
read in at any time with sacctmgr load filename.
This directory is where the file will be placed after a purge event has
happened and archive for that element is set to true.
Default is /tmp.
The format of the file name is
.na
$ArchiveDir/$ClusterName_$ArchiveObject_archive_$BeginTimeStamp_$endTimeStamp
.ad
.TP
\fBArchiveEvents\fR
When purging events also archive them.
Boolean, yes to archive event data, no otherwise.
Default is no.
.TP
\fBArchiveJobs\fR
When purging jobs also archive them.
Boolean, yes to archive job data, no otherwise.
Default is no.
.TP
\fBArchiveResvs\fR
When purging reservations also archive them.
Boolean, yes to archive reservation data, no otherwise.
Default is no.
.TP
\fBArchiveScript\fR
This script can be executed every time a rollup happens (every hour, day and
month), depending on the Purge*After options.
This script is used to transfer accounting records out of the database into an
archive.
It is used in place of the internal process used to archive objects.
The script is executed with no arguments.
The following environment variables are set:
.RS
.TP
\fBSLURM_ARCHIVE_EVENTS\fR
1 for archive events, 0 otherwise.
.TP
\fBSLURM_ARCHIVE_LAST_EVENT\fR
Time of last event start to archive.
.TP
\fBSLURM_ARCHIVE_JOBS\fR
1 for archive jobs, 0 otherwise.
.TP
\fBSLURM_ARCHIVE_LAST_JOB\fR
Time of last job submit to archive.
.TP
\fBSLURM_ARCHIVE_STEPS\fR
1 for archive steps, 0 otherwise.
.TP
\fBSLURM_ARCHIVE_LAST_STEP\fR
Time of last step start to archive.
.TP
\fBSLURM_ARCHIVE_SUSPEND\fR
1 for archive suspend data, 0 otherwise.
.TP
\fBSLURM_ARCHIVE_TXN\fR
1 for archive transaction data, 0 otherwise.
.TP
\fBSLURM_ARCHIVE_USAGE\fR
1 for archive usage data, 0 otherwise.
.TP
\fBSLURM_ARCHIVE_LAST_SUSPEND\fR
Time of last suspend start to archive.
.RE
.TP
\fBArchiveSteps\fR
When purging steps also archive them.
Boolean, yes to archive step data, no otherwise.
Default is no.
.TP
\fBArchiveSuspend\fR
When purging suspend data also archive it.
Boolean, yes to archive suspend data, no otherwise.
Default is no.
.TP
\fBArchiveTXN\fR
When purging transaction data also archive it.
Boolean, yes to archive transaction data, no otherwise.
Default is no.
.TP
\fBArchiveUsage\fR
When purging usage data (Cluster, Association and WCKey) also archive it.
Boolean, yes to archive usage data, no otherwise.
Default is no.
.TP
\fBAuthInfo\fR
Additional information to be used for authentication of communications with
the Slurm control daemon (slurmctld) on each cluster.
The interpretation of this option is specific to the configured
\fBAuthType\fR.
In the case of \fIauth/munge\fR, this can be configured to use a Munge daemon
specifically configured to provide authentication between clusters while the
default Munge daemon provides authentication within a cluster.
In that case, this will specify the pathname of the socket to use.
By default this value is left unspecified, which results in the default
authentication mechanism being used.
.TP
\fBAuthType\fR
Define the authentication method for communications between Slurm components.
Acceptable values at present include "auth/none" and "auth/munge".
The default value is "auth/munge".
\fBDo not use "auth/none" if you desire any security\fR.
"auth/munge" indicates that LLNL's MUNGE system is to be used (this is the
supported authentication mechanism for Slurm; see
"https://dun.github.io/munge/" for more information).
SlurmDBD must be terminated prior to changing the value of \fBAuthType\fR and
later restarted.
.TP
\fBCommitDelay\fR
How many seconds between commits on a connection from a Slurmctld.
This speeds up inserts into the database dramatically.
If you are running a very high throughput of jobs you should consider setting
this.
In testing, 1 second improves the slurmdbd performance dramatically and
reduces overhead.
There is a small probability of data loss, since this creates a window in
which data not yet committed could be lost if the slurmdbd seg faults or exits
abnormally for any reason.
The risk is low but still present, and setting this may be the only way to run
in extremely heavy environments.
.TP
\fBDbdBackupHost\fR
The short, or long, name of the machine where the backup Slurm Database Daemon
is executed (i.e. the name returned by the command "hostname \-s").
This host must have access to the same underlying database specified by the
'Storage' options mentioned below.
.TP
\fBDbdAddr\fR
Name that \fBDbdHost\fR should be referred to in establishing a communications
path.
This name will be used as an argument to the gethostbyname() function for
identification.
For example, "elx0000" might be used to designate the Ethernet address for
node "lx0000".
By default the \fBDbdAddr\fR will be identical in value to \fBDbdHost\fR.
.TP
\fBDbdHost\fR
The short, or long, name of the machine where the Slurm Database Daemon is
executed (i.e. the name returned by the command "hostname \-s").
This value must be specified.
.TP
\fBDbdPort\fR
The port number that the Slurm Database Daemon (slurmdbd) listens to for work.
The default value is SLURMDBD_PORT as established at system build time.
If none is explicitly specified, it will be set to 6819.
This value must be equal to the \fBAccountingStoragePort\fR parameter in the
slurm.conf file.
.TP
\fBDebugFlags\fR
Defines specific subsystems which should provide more detailed event logging.
Multiple subsystems can be specified with comma separators.
Most DebugFlags will result in verbose logging for the identified subsystems
and could impact performance.
Valid subsystems available today (with more to come) include:
.RS
.TP 17
\fBDB_ARCHIVE\fR
SQL statements/queries when dealing with archiving and purging the database.
.TP
\fBDB_ASSOC\fR
SQL statements/queries when dealing with associations in the database.
.TP
\fBDB_EVENT\fR
SQL statements/queries when dealing with (node) events in the database.
.TP
\fBDB_JOB\fR
SQL statements/queries when dealing with jobs in the database.
.TP
\fBDB_QOS\fR
SQL statements/queries when dealing with QOS in the database.
.TP
\fBDB_QUERY\fR
SQL statements/queries when dealing with transactions and such in the database.
.TP
\fBDB_RESERVATION\fR
SQL statements/queries when dealing with reservations in the database.
.TP
\fBDB_RESOURCE\fR
SQL statements/queries when dealing with resources like licenses in the
database.
.TP
\fBDB_STEP\fR
SQL statements/queries when dealing with steps in the database.
.TP
\fBDB_USAGE\fR
SQL statements/queries when dealing with usage queries and inserts in the
database.
.TP
\fBDB_WCKEY\fR
SQL statements/queries when dealing with wckeys in the database.
.TP
\fBFEDERATION\fR
SQL statements/queries when dealing with federations in the database.
.RE
.TP
\fBDebugLevel\fR
The level of detail to provide the Slurm Database Daemon's logs.
The default value is \fBinfo\fR.
.RS
.TP 10
\fBquiet\fR
Log nothing
.TP
\fBfatal\fR
Log only fatal errors
.TP
\fBerror\fR
Log only errors
.TP
\fBinfo\fR
Log errors and general informational messages
.TP
\fBverbose\fR
Log errors and verbose informational messages
.TP
\fBdebug\fR
Log errors and verbose informational messages and debugging messages
.TP
\fBdebug2\fR
Log errors and verbose informational messages and more debugging messages
.TP
\fBdebug3\fR
Log errors and verbose informational messages and even more debugging messages
.TP
\fBdebug4\fR
Log errors and verbose informational messages and even more debugging messages
.TP
\fBdebug5\fR
Log errors and verbose informational messages and even more debugging messages
.RE
.TP
\fBDebugLevelSyslog\fR
The slurmdbd daemon will log events to the syslog file at the specified level
of detail.
If not set, the slurmdbd daemon will log to syslog at level \fBfatal\fR,
unless there is no \fBLogFile\fR and it is running in the background, in which
case it will log to syslog at the level specified by \fBDebugLevel\fR (at
\fBfatal\fR in the case that \fBDebugLevel\fR is set to \fBquiet\fR) or it is
run in the foreground, when it will be set to quiet.
.RS
.TP 10
\fBquiet\fR
Log nothing
.TP
\fBfatal\fR
Log only fatal errors
.TP
\fBerror\fR
Log only errors
.TP
\fBinfo\fR
Log errors and general informational messages
.TP
\fBverbose\fR
Log errors and verbose informational messages
.TP
\fBdebug\fR
Log errors and verbose informational messages and debugging messages
.TP
\fBdebug2\fR
Log errors and verbose informational messages and more debugging messages
.TP
\fBdebug3\fR
Log errors and verbose informational messages and even more debugging messages
.TP
\fBdebug4\fR
Log errors and verbose informational messages and even more debugging messages
.TP
\fBdebug5\fR
Log errors and verbose informational messages and even more debugging messages
.RE
.TP
\fBDefaultQOS\fR
When adding a new cluster this will be used as the QOS for the cluster unless
something is explicitly set by the admin with the create.
.TP
\fBLogFile\fR
Fully qualified pathname of a file into which the Slurm Database Daemon's logs
are written.
The default value is none (performs logging via syslog).
.br
See the section \fBLOGGING\fR in the slurm.conf man page if a pathname is
specified.
.TP
\fBLogTimeFormat\fR
Format of the timestamp in slurmdbd log files.
Accepted values are "iso8601", "iso8601_ms", "rfc5424", "rfc5424_ms", "clock",
"short", and "thread_id".
The values ending in "_ms" differ from the ones without in that fractional
seconds with millisecond precision are printed.
The default value is "iso8601_ms".
The "rfc5424" formats are the same as the "iso8601" formats except that the
timezone value is also shown.
The "clock" format shows a timestamp in microseconds retrieved with the C
standard clock() function.
The "short" format is a short date and time format.
The "thread_id" format shows the timestamp in the C standard ctime() function
form without the year but including the microseconds, the daemon's process ID
and the current thread ID.
.TP
\fBMaxQueryTimeRange\fR
Return an error if a query is against too large of a time span, to prevent
ill-formed queries from causing performance problems within SlurmDBD.
Default value is INFINITE which allows any queries to proceed.
Accepted time formats are the same as the MaxTime option in slurm.conf.
User \fBSlurmUser\fR and \fBroot\fR are exempt from this restriction.
Note that queries which attempt to return over 3GB of data will still fail to
complete with ESLURM_RESULT_TOO_LARGE.
.TP
\fBMessageTimeout\fR
Time permitted for a round\-trip communication to complete in seconds.
Default value is 10 seconds.
.TP
\fBParameters\fR
Contains arbitrary comma separated parameters used to alter the behavior of
the slurmdbd.
.RS
.TP
\fBPreserveCaseUser\fR
When defining users do not force lower case which is the default behavior.
.RE
.TP
\fBPidFile\fR
Fully qualified pathname of a file into which the Slurm Database Daemon may
write its process ID.
This may be used for automated signal processing.
The default value is "/var/run/slurmdbd.pid".
.TP
\fBPluginDir\fR
Identifies the places in which to look for Slurm plugins.
This is a colon\-separated list of directories, like the PATH environment
variable.
The default value is "/usr/local/lib/slurm".
.TP
\fBPrivateData\fR
This controls what type of information is hidden from regular users.
By default, all information is visible to all users.
User \fBSlurmUser\fR, \fBroot\fR, and users with AdminLevel=Admin can always
view all information.
Multiple values may be specified with a comma separator.
Acceptable values include:
.RS
.TP
\fBaccounts\fR
prevents users from viewing any account definitions unless they are
coordinators of them.
.TP
\fBevents\fR
prevents users from viewing event information unless they have operator status
or above.
.TP
\fBjobs\fR
prevents users from viewing job records belonging to other users unless they
are coordinators of the association running the job when using sacct.
.TP
\fBreservations\fR
restricts getting reservation information to users with operator status and
above.
.TP
\fBusage\fR
prevents users from viewing usage of any other user.
This applies to sreport.
.TP
\fBusers\fR
prevents users from viewing information of any user other than themselves.
This also makes it so users can only see associations they deal with.
Coordinators can see associations of all users they are coordinator of, but
can only see themselves when listing users.
.RE
.TP
\fBPurgeEventAfter\fR
Events happening on the cluster over this age are purged from the database.
This includes node down times and such.
The time is a numeric value and is a number of months.
If you want to purge more often you can include "hours", or "days" behind the
numeric value to get those more frequent purges (e.g. a value of "12hours"
would purge everything older than 12 hours).
The purge takes place at the start of each purge interval.
For example, if the purge time is 2 months, the purge would happen at the
beginning of each month.
If not set (default), then event records are never purged.
.TP
\fBPurgeJobAfter\fR
Individual job records over this age are purged from the database.
Aggregated information will be preserved to "PurgeUsageAfter".
The time is a numeric value and is a number of months.
If you want to purge more often you can include "hours", or "days" behind the
numeric value to get those more frequent purges (e.g. a value of "12hours"
would purge everything older than 12 hours).
The purge takes place at the start of each purge interval.
For example, if the purge time is 2 months, the purge would happen at the
beginning of each month.
If not set (default), then job records are never purged.
.TP
\fBPurgeResvAfter\fR
Individual reservation records over this age are purged from the database.
Aggregated information will be preserved to "PurgeUsageAfter".
The time is a numeric value and is a number of months.
If you want to purge more often you can include "hours", or "days" behind the
numeric value to get those more frequent purges (e.g. a value of "12hours"
would purge everything older than 12 hours).
The purge takes place at the start of each purge interval.
For example, if the purge time is 2 months, the purge would happen at the
beginning of each month.
If not set (default), then reservation records are never purged.
.TP
\fBPurgeStepAfter\fR
Individual job step records over this age are purged from the database.
Aggregated information will be preserved to "PurgeUsageAfter".
The time is a numeric value and is a number of months.
If you want to purge more often you can include "hours", or "days" behind the
numeric value to get those more frequent purges (e.g. a value of "12hours"
would purge everything older than 12 hours).
The purge takes place at the start of each purge interval.
For example, if the purge time is 2 months, the purge would happen at the
beginning of each month.
If not set (default), then job step records are never purged.
.TP
\fBPurgeSuspendAfter\fR
Records of individual suspend times for jobs over this age are purged from the
database.
Aggregated information will be preserved to "PurgeUsageAfter".
The time is a numeric value and is a number of months.
If you want to purge more often you can include "hours", or "days" behind the
numeric value to get those more frequent purges (e.g. a value of "12hours"
would purge everything older than 12 hours).
The purge takes place at the start of each purge interval.
For example, if the purge time is 2 months, the purge would happen at the
beginning of each month.
If not set (default), then suspend records are never purged.
.TP
\fBPurgeTXNAfter\fR
Records of individual transaction times for transactions over this age are
purged from the database.
The time is a numeric value and is a number of months.
If you want to purge more often you can include "hours", or "days" behind the
numeric value to get those more frequent purges (e.g. a value of "12hours"
would purge everything older than 12 hours).
The purge takes place at the start of each purge interval.
For example, if the purge time is 2 months, the purge would happen at the
beginning of each month.
If not set (default), then transaction records are never purged.
.TP
\fBPurgeUsageAfter\fR
Usage Records (Cluster, Association and WCKey) over this age are purged from
the database.
The time is a numeric value and is a number of months.
If you want to purge more often you can include "hours", or "days" behind the
numeric value to get those more frequent purges (e.g. a value of "12hours"
would purge everything older than 12 hours).
The purge takes place at the start of each purge interval.
For example, if the purge time is 2 months, the purge would happen at the
beginning of each month.
If not set (default), then usage records are never purged.
.TP
\fBSlurmUser\fR
The name of the user that the \fBslurmctld\fR daemon executes as.
This user must exist on the machine executing the Slurm Database Daemon and
have the same user ID as on the hosts on which \fBslurmctld\fR executes.
For security purposes, a user other than "root" is recommended.
The default value is "root".
.TP
\fBStorageHost\fR
Define the name of the host running the database where the data is stored.
Ideally this should be the host on which slurmdbd executes.
.TP
\fBStorageBackupHost\fR
Define the name of the backup host running the database where the data is
stored.
This can be viewed as a backup solution when the StorageHost is not
responding.
It is up to the backup solution to enforce the coherency of the accounting
information between the two hosts.
With clustered database solutions (active/passive HA), you would not need to
use this feature.
Default is none.
.TP
\fBStorageLoc\fR
Specify the name of the database as the location where accounting records are
written.
Defaults to "slurm_acct_db".
.TP
\fBStoragePass\fR
Define the password used to gain access to the database to store the job
accounting data.
The '#' character is not permitted in a password.
.TP
\fBStoragePort\fR
The port number that the Slurm Database Daemon (slurmdbd) uses to communicate
with the database.
.TP
\fBStorageType\fR
Define the accounting storage mechanism type.
Acceptable values at present include "accounting_storage/mysql".
The value "accounting_storage/mysql" indicates that accounting records should
be written to a MySQL or MariaDB database specified by the \fBStorageLoc\fR
parameter.
This value must be specified.
.TP
\fBStorageUser\fR
Define the name of the user used to connect to the database to store the job
accounting data.
.TP
\fBTCPTimeout\fR
Time permitted for TCP connection to be established.
Default value is 2 seconds.
.TP
\fBTrackWCKey\fR
Boolean yes or no.
Used to set display and track of the Workload Characterization Key.
Must be set to track wckey usage.
This must be set to generate rolled up usage tables from WCKeys.
NOTE: If TrackWCKey is set here and not in your various slurm.conf files, all
jobs will be attributed to their default WCKey.
.TP
\fBTrackSlurmctldDown\fR
Boolean yes or no.
If set, the slurmdbd will mark all idle resources on the cluster as down when
a slurmctld disconnects or is no longer reachable.
The default is no.
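.LP
As a hedged illustration of the \fBPurge*After\fR time formats and the
matching \fBArchive*\fR options described above, the fragment below is a
sketch only.
The retention periods and the /var/spool/slurm/archive directory are
assumptions chosen for illustration, not recommended values.
.LP
# A bare number is a count of months.
.br
PurgeJobAfter=12
.br
# A "days" or "hours" suffix purges more frequently.
.br
PurgeEventAfter=30days
.br
PurgeStepAfter=12hours
.br
# Archive job and step records when they are purged.
.br
ArchiveJobs=yes
.br
ArchiveSteps=yes
.br
# Hypothetical directory; the default is /tmp.
.br
ArchiveDir=/var/spool/slurm/archive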
.SH "EXAMPLE"
.LP
#
.br
# Sample /etc/slurmdbd.conf
.br
#
.br
ArchiveEvents=yes
.br
ArchiveJobs=yes
.br
ArchiveResvs=yes
.br
ArchiveSteps=no
.br
ArchiveSuspend=no
.br
ArchiveTXN=no
.br
ArchiveUsage=no
.br
#ArchiveScript=/usr/sbin/slurm.dbd.archive
.br
AuthInfo=/var/run/munge/munge.socket.2
.br
AuthType=auth/munge
.br
DbdHost=db_host
.br
DebugLevel=info
.br
PurgeEventAfter=1month
.br
PurgeJobAfter=12month
.br
PurgeResvAfter=1month
.br
PurgeStepAfter=1month
.br
PurgeSuspendAfter=1month
.br
PurgeTXNAfter=12month
.br
PurgeUsageAfter=24month
.br
LogFile=/var/log/slurmdbd.log
.br
PidFile=/var/tmp/jette/slurmdbd.pid
.br
SlurmUser=slurm_mgr
.br
StoragePass=shazaam
.br
StorageType=accounting_storage/mysql
.br
StorageUser=database_mgr
.SH "COPYING"
Copyright (C) 2008\-2010 Lawrence Livermore National Security.
Produced at Lawrence Livermore National Laboratory (cf, DISCLAIMER).
.br
Copyright (C) 2010\-2014 SchedMD LLC.
.LP
This file is part of Slurm, a resource management program.
For details, see .
.LP
Slurm is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free
Software Foundation; either version 2 of the License, or (at your option)
any later version.
.LP
Slurm is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
FOR A PARTICULAR PURPOSE.
See the GNU General Public License for more details.
.SH "FILES"
/etc/slurmdbd.conf
.SH "SEE ALSO"
.LP
\fBslurm.conf\fR(5),
\fBslurmctld\fR(8), \fBslurmdbd\fR(8),
\fBsyslog\fR(2)