.\" -*- nroff -*- .\" .\" s3backer - FUSE-based single file backing store via Amazon S3 .\" .\" Copyright 2008-2011 Archie L. Cobbs .\" .\" This program is free software; you can redistribute it and/or .\" modify it under the terms of the GNU General Public License .\" as published by the Free Software Foundation; either version 2 .\" of the License, or (at your option) any later version. .\" .\" This program is distributed in the hope that it will be useful, .\" but WITHOUT ANY WARRANTY; without even the implied warranty of .\" MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the .\" GNU General Public License for more details. .\" .\" You should have received a copy of the GNU General Public License .\" along with this program; if not, write to the Free Software .\" Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA .\" 02110-1301, USA. .\" .\" In addition, as a special exception, the copyright holders give .\" permission to link the code of portions of this program with the .\" OpenSSL library under certain conditions as described in each .\" individual source file, and distribute linked combinations including .\" the two. .\" .\" You must obey the GNU General Public License in all respects for all .\" of the code used other than OpenSSL. If you modify file(s) with this .\" exception, you may extend this exception to your version of the .\" file(s), but you are not obligated to do so. If you do not wish to do .\" so, delete this exception statement from your version. If you delete .\" this exception statement from all source files in the program, then .\" also delete it here. 
.\" .Dd September 7, 2009 .Dt S3BACKER 1 .Os .Sh NAME .Nm s3backer .Nd FUSE-based single file backing store via Amazon S3 .Sh SYNOPSIS .Nm s3backer .Bk -words .Op options .Ar bucket .Ar /mount/point .Ek .Pp .Nm s3backer .Bk -words .Fl \-test .Op options .Ar dir .Ar /mount/point .Ek .Pp .Nm s3backer .Bk -words .Fl \-erase .Op options .Ar bucket .Ek .Pp .Nm s3backer .Bk -words .Fl \-reset-mounted-flag .Op options .Ar bucket .Ek .Sh DESCRIPTION .Nm is a filesystem that contains a single file backed by the Amazon Simple Storage Service (Amazon S3). As a filesystem, it is very simple: it provides a single normal file having a fixed size. Underneath, the file is divided up into blocks, and the content of each block is stored in a unique Amazon S3 object. In other words, what .Nm provides is really more like an S3-backed virtual hard disk device, rather than a filesystem. .Pp In typical usage, a `normal' filesystem is mounted on top of the file exported by the .Nm filesystem using a loopback mount (or disk image mount on Mac OS X). .Pp This arrangement has several benefits compared to more complete S3 filesystem implementations: .Bl -tag -width xx .It o By not attempting to implement a complete filesystem, which is a complex undertaking and difficult to get right, .Nm can stay very lightweight and simple. Only three HTTP operations are used: GET, PUT, and DELETE. All of the experience and knowledge about how to properly implement filesystems that already exists can be reused. .It o By utilizing existing filesystems, you get full UNIX filesystem semantics. Subtle bugs or missing functionality relating to hard links, extended attributes, POSIX locking, etc. are avoided. .It o The gap between normal filesystem semantics and Amazon S3 ``eventual consistency'' is more easily and simply solved when one can interpret S3 objects as simple device blocks rather than filesystem objects (see below). 
.It o When storing your data on Amazon S3 servers, which are not under your control, the ability to encrypt and authenticate data becomes a critical issue. .Nm supports secure encryption and authentication. Alternately, the encryption capability built into the Linux loopback device can be used. .It o Since S3 data is accessed over the network, local caching is also very important for performance reasons. Since .Nm presents the equivalent of a virtual hard disk to the kernel, most of the filesystem caching can be done where it should be: in the kernel, via the kernel's page cache. However .Nm also includes its own internal block cache for increased performance, using asynchronous worker threads to take advantage of the parallelism inherent in the network. .El .Ss Consistency Guarantees Amazon S3 makes relatively weak guarantees relating to the timing and consistency of reads vs. writes (collectively known as ``eventual consistency''). .Nm includes logic and configuration parameters to work around these limitations, allowing the user to guarantee consistency to whatever level desired, up to and including 100% detection and avoidance of incorrect data. These are: .Bl -tag -width xx .It 1. .Nm enforces a minimum delay between consecutive PUT or DELETE operations on the same block. This ensures that Amazon S3 doesn't receive these operations out of order. .It 2. .Nm maintains an internal block MD5 checksum cache, which enables automatic detection and rejection of `stale' blocks returned by GET operations. .El .Pp This logic is configured by the following command line options: .Fl \-md5CacheSize , .Fl \-md5CacheTime , and .Fl \-minWriteDelay . .Ss Zeroed Block Optimization As a simple optimization, .Nm does not store blocks containing all zeroes; instead, they are simply deleted. Conversely, reads of non-existent blocks will contain all zeroes. In other words, the backed file is always maximally sparse. 
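.Pp
The zeroed block behavior described above can be illustrated with a minimal Python sketch. This is not the s3backer implementation: the SparseBlockStore class, its method names, and the in-memory dictionary standing in for the S3 bucket are all hypothetical and chosen only for illustration.

```python
# Hypothetical in-memory stand-in for the S3 bucket; not s3backer's actual code.
BLOCK_SIZE = 4096  # the default block size documented under --blockSize


class SparseBlockStore:
    """Stores only non-zero blocks; missing blocks read back as zeroes."""

    def __init__(self, block_size=BLOCK_SIZE):
        self.block_size = block_size
        self.objects = {}  # block number -> bytes, standing in for S3 objects

    def write_block(self, num, data):
        if len(data) != self.block_size:
            raise ValueError("data must be exactly one block long")
        if data.count(0) == len(data):
            # All-zero block: delete the object instead of storing it.
            self.objects.pop(num, None)
        else:
            self.objects[num] = bytes(data)

    def read_block(self, num):
        # A non-existent block is defined to contain all zeroes.
        return self.objects.get(num, bytes(self.block_size))


store = SparseBlockStore()
store.write_block(7, b"\x01" * BLOCK_SIZE)   # stored as an object
store.write_block(7, bytes(BLOCK_SIZE))      # zeroing the block deletes it
assert 7 not in store.objects
assert store.read_block(7) == bytes(BLOCK_SIZE)
```

Because a read of a missing block and a read of a zeroed block are indistinguishable, a store like this needs no initialization pass before first use.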
.Pp As a result, blocks do not need to be created before being used and no special initialization is necessary when creating a new filesystem. .Pp When the .Fl \-listBlocks flag is given, .Nm will list all existing blocks at startup so it knows ahead of time exactly which blocks are empty. .Ss File and Block Size Auto-Detection As a convenience, whenever the first block of the backed file is written, .Nm includes as meta-data (in the ``x-amz-meta-s3backer-filesize'' header) the total size of the file. Along with the size of the block itself, this value can be checked and/or auto-detected later when the filesystem is remounted, eliminating the need for the .Fl \-blockSize or .Fl \-size flags to be explicitly provided and avoiding accidental mis-interpretation of an existing filesystem. .Ss Block Cache .Nm includes support for an internal block cache to increase performance. The block cache is completely separate from the MD5 cache, which only stores MD5 checksums transiently and whose sole purpose is to mitigate ``eventual consistency''. The block cache is a traditional cache containing cached data blocks. When full, clean blocks are evicted as necessary in LRU order. .Pp Reads of cached blocks will return immediately with no network traffic. Writes to the cache also return immediately and trigger an asynchronous write operation to the network via a separate worker thread. Because the kernel typically writes blocks through FUSE filesystems one at a time, performing writes asynchronously allows .Nm to take advantage of the parallelism inherent in the network, vastly improving write performance. .Pp The block cache can be configured to store the cached data in a local file instead of in memory. This permits larger cache sizes and allows .Nm to reload cached data after a restart. Reloaded data is verified via MD5 checksum with Amazon S3 before reuse. 
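.Pp
The eviction rule for the block cache (only clean blocks are evicted, in LRU order) can be sketched in Python as follows. This is a simplified illustration, not the actual s3backer data structure: the BlockCache class and its methods are hypothetical, there are no worker threads, and raising an error when every cached block is dirty is an assumption standing in for a writer that would otherwise have to wait.

```python
from collections import OrderedDict


class BlockCache:
    """LRU cache that evicts only clean blocks (hypothetical sketch)."""

    def __init__(self, max_blocks):
        self.max_blocks = max_blocks
        self.entries = OrderedDict()  # block number -> [data, dirty flag]

    def _make_room(self):
        while len(self.entries) >= self.max_blocks:
            # Scan from least to most recently used for a clean victim.
            victim = next((n for n, e in self.entries.items() if not e[1]), None)
            if victim is None:
                # Assumption: a real cache would make the writer wait instead.
                raise RuntimeError("cache is full of dirty blocks")
            del self.entries[victim]

    def put(self, num, data, dirty):
        if num in self.entries:
            del self.entries[num]
        else:
            self._make_room()
        self.entries[num] = [data, dirty]  # newest entry goes to the MRU end

    def get(self, num):
        entry = self.entries.get(num)
        if entry is None:
            return None  # miss: the caller would fetch the block over the network
        self.entries.move_to_end(num)
        return entry[0]

    def mark_clean(self, num):
        # Called once the asynchronous write of this block has completed.
        self.entries[num][1] = False


cache = BlockCache(max_blocks=2)
cache.put(0, b"a", dirty=True)    # written locally, not yet uploaded
cache.put(1, b"b", dirty=False)   # fetched from the network
cache.put(2, b"c", dirty=False)   # evicts block 1, the oldest clean block
print(sorted(cache.entries))      # [0, 2]
```

Block 0 survives even though it is the least recently used entry, because a dirty block holds data that has not yet reached the backing store.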
.Pp The block cache is configured by the following command line options: .Fl \-blockCacheFile , .Fl \-blockCacheMaxDirty , .Fl \-blockCacheNoVerify , .Fl \-blockCacheSize , .Fl \-blockCacheSync , .Fl \-blockCacheThreads , .Fl \-blockCacheTimeout , and .Fl \-blockCacheWriteDelay . .Ss Read Ahead .Nm implements a simple read-ahead algorithm in the block cache. When a configurable number of blocks are read in order, block cache worker threads are awoken to begin reading subsequent blocks into the block cache. Read ahead continues as long as the kernel continues reading blocks sequentially. The kernel typically requests blocks one at a time, so having multiple worker threads already reading the next few blocks improves read performance by taking advantage of the parallelism inherent in the network. .Pp Note that the kernel implements a read ahead algorithm as well; its behavior should be taken into consideration. By default, .Nm passes the .Fl o Ar max_readahead=0 option to FUSE. .Pp Read ahead is configured by the .Fl \-readAhead and .Fl \-readAheadTrigger command line options. .Ss Encryption and Authentication .Nm supports encryption via the .Fl \-encrypt , .Fl \-password , and .Fl \-passwordFile flags. When encryption is enabled, SHA1 HMAC authentication is also automatically enabled, and .Nm rejects any blocks that are not properly encrypted and signed. .Pp Encrypting at the .Nm layer is preferable to encrypting at an upper layer (e.g., at the loopback device layer), because if the data .Nm sees is already encrypted it can't optimize away zeroed blocks or do meaningful compression. .Ss Compression .Nm supports block-level compression, which minimizes transfer time and storage costs. .Pp Compression is configured via the .Fl \-compress flag. Compression is automatically enabled when encryption is enabled. .Ss Read-Only Access An Amazon S3 account is not required in order to use .Nm . 
The filesystem must already exist and have S3 objects with ACLs configured for public read access (see .Fl \-accessType below); users should perform the loopback mount with the read-only flag (see .Xr mount 8 ) and provide the .Fl \-readOnly flag to .Nm . This mode of operation facilitates the creation of public, read-only filesystems. .Ss Simultaneous Mounts Although it functions over the network, the .Nm filesystem is not a distributed filesystem and does not support simultaneous read/write mounts. (This is not something you would normally do with a hard-disk partition either.) As a safety measure, .Nm attempts to detect this situation using an 'already mounted' flag in the data store, and will fail to start if it does. .Pp This detection may produce a false positive if a former .Nm process was not shut down cleanly; if so, the .Fl \-reset-mounted-flag flag can be used to reset the 'already mounted' flag. But see also BUGS below. .Ss Statistics File .Nm populates the filesystem with a human-readable statistics file. See .Fl \-statsFilename below. .Ss Logging In normal operation .Nm will log via .Xr syslog 3 . When run with the .Fl d or .Fl f flags, .Nm will log to standard error. .Sh OPTIONS Each command line flag has two forms, for example .Fl \-accessFile=FILE and .Fl o Ar accessFile=FILE . Only the first form is shown below. Either form may be used; both are equivalent. The second form allows mount options to be specified directly in .Pa /etc/fstab and passed seamlessly to .Nm by FUSE. .Bl -tag -width Ds .It Fl \-accessFile=FILE Specify a file containing `accessID:accessKey' pairs, one per line. Blank lines and lines beginning with a `#' are ignored. If no .Fl \-accessKey is specified, this file will be searched for the entry matching the access ID specified via .Fl \-accessId ; if neither .Fl \-accessKey nor .Fl \-accessId is specified, the first entry in this file will be used. Default value is .Pa $HOME/.s3backer_passwd . 
.It Fl \-accessId=ID Specify Amazon S3 access ID. Specify an empty string to force no access ID. If no access ID is specified (and none is found in the access file) then .Nm will still function, but only reads of publicly available filesystems will work. .It Fl \-accessKey=KEY Specify Amazon S3 access key. To avoid publicizing this secret via the command line, use .Fl \-accessFile instead of this flag. .It Fl \-accessType=TYPE Specify the Amazon S3 access privilege ACL type for newly written blocks. The value must be one of `private', `public-read', `public-read-write', or `authenticated-read'. Default is `private'. .It Fl \-accessEC2IAM=ROLE Download access credentials and security token in JSON document form from .Bk -words .Ar http://169.254.169.254/latest/meta-data/iam/security-credentials/ROLE .Ek every five minutes. .Pp This option allows S3 credentials to be provided automatically via the specified IAM role to .Nm when running on an Amazon EC2 instance. .It Fl \-authVersion=TYPE Specify how to authenticate requests. There are two supported authentication methods: .Ar aws2 is the original AWS authentication scheme. .Ar aws4 is the newer, recommended authentication scheme. .Pp .Ar aws4 is the default setting starting in version 1.4, and is required for certain non-US regions, while .Ar aws2 may still be required by some non-Amazon S3 providers. .It Fl \-baseURL=URL Specify the base URL, which must end in a forward slash. Default is `http://s3.amazonaws.com/'. .It Fl \-blockCacheFile=FILE Specify a file in which to store cached data blocks. Without this flag, the block cache lives entirely in process memory and the cached data disappears when .Nm is stopped. The file will be created if it doesn't exist. .Pp Cache files that have been created by previous invocations of .Nm are reusable as long as they were created with the same configured block size (if not, startup will fail). 
This is true even if .Nm was stopped abruptly, e.g., due to a system crash; however, this guarantee rests on the assumption that the filesystem containing the cache file will not reorder writes across calls to .Xr fsync 2 . .Pp If an existing cache is used but was created with a different size, .Nm will automatically expand or shrink the file at startup. When shrinking, blocks that don't fit in the new, smaller cache are discarded. This process also compacts the cache file to the extent possible. .Pp In any case, only clean cache blocks are recoverable after a restart. This means a system crash will cause dirty blocks in the cache to be lost (of course, that is the case with an in-memory cache as well). Use .Fl \-blockCacheWriteDelay to limit this window. .Pp By default, when having reloaded the cache from a cache file, .Nm will verify the MD5 checksum of each reloaded block with Amazon S3 before its first use. This verify operation does not require actually reading the block's data, and therefore is relatively quick. This guards against the cached data having unknowingly gotten out of sync since the cache file was last used, a situation that is otherwise impossible for .Nm to detect. .It Fl \-blockCacheMaxDirty=NUM Specify a limit on the number of dirty blocks in the block cache. When this limit is reached, subsequent write attempts will block until an existing dirty block is successfully written (and therefore becomes no longer dirty). This flag limits the amount of inconsistency there can be with respect to the underlying S3 data store. .Pp The default value is zero, which means no limit. .It Fl \-blockCacheNoVerify Disable the MD5 verification of blocks loaded from a cache file specified via .Fl \-blockCacheFile . Using this flag is dangerous; use only when you are sure the cached file is uncorrupted and the data it contains is up to date. .It Fl \-blockCacheSize=SIZE Specify the block cache size (in number of blocks). 
Each entry in the cache will consume approximately block size plus 20 bytes. A value of zero disables the block cache. Default value is 1000. .It Fl \-blockCacheThreads=NUM Set the size of the thread pool associated with the block cache (if enabled). This bounds the number of simultaneous writes that can occur to the network. Default value is 20. .It Fl \-blockCacheTimeout=MILLIS Specify the maximum time a clean entry can remain in the block cache before it will be forcibly evicted and its associated memory freed. A value of zero means there is no timeout; in this case, the number of entries in the block cache will never decrease, eventually reaching the maximum size configured by .Fl \-blockCacheSize and staying there. Configure a non-zero value if the memory usage of the block cache is a concern. Default value is zero (no timeout). .It Fl \-blockCacheWriteDelay=MILLIS Specify the maximum time a dirty block can remain in the block cache before it must be written out to the network. Blocks may be written sooner when there is cache pressure. A value of zero configures a ``write-through'' policy; greater values configure a ``write-back'' policy. Larger values increase performance when a small number of blocks are accessed repeatedly, at the cost of greater inconsistency with the underlying S3 data store. Default value is 250 milliseconds. .It Fl \-blockCacheSync Forces synchronous writes in the block cache layer. Instead of returning immediately and scheduling the actual write operation to happen later, write requests will not return until the write has completed. This flag is a stricter requirement than .Fl \-blockCacheWriteDelay=0 , which merely causes the writes to be initiated as soon as possible (but still after the write request returns). .Pp This flag requires .Fl \-blockCacheWriteDelay to be zero. Using this flag is likely to drastically reduce write performance. .It Fl \-blockSize=SIZE Specify the block size. 
This must be a power of two and should be a multiple of the kernel's native page size. The size may have an optional suffix 'K' for kilobytes, 'M' for megabytes, etc. .Pp .Nm supports partial block operations, though this forces a read before each write; use of the block cache and proper alignment of the .Nm block size with the intended use (e.g., the block size of the `upper' filesystem) will help minimize the extra reads. Note that even when filesystems are configured for large block sizes, the kernel will often still write page-sized blocks. .Pp .Nm will attempt to auto-detect the block size by reading block number zero at startup. If this option is not specified, the auto-detected value will be used. If this option is specified but disagrees with the auto-detected value, .Nm will exit with an error unless .Fl \-force is also given. If auto-detection fails because block number zero does not exist, and this option is not specified, then the default value of 4K (4096) is used. .It Fl \-cacert=FILE Specify SSL certificate file to be used when verifying the remote server's identity when operating over SSL connections. Equivalent to the .Fl \-cacert flag documented in .Xr curl 1 . .It Fl \-compress[=LEVEL] Compress blocks before sending them over the network. This should result in less network traffic (in both directions) and lower storage costs. .Pp The compression level is optional; if given, it must be between 1 (fast compression) and 9 (most compression), inclusive. If omitted, the default compression level is used. .Pp This flag only enables compression of newly written blocks; decompression is always enabled and applied when appropriate. Therefore, it is safe to switch this flag on or off between different invocations of .Nm on the same filesystem. .Pp This flag is automatically enabled when .Fl \-encrypt is used, though you may also specify .Fl \-compress=LEVEL to set a non-default compression level. 
.Pp When using an encrypted upper layer filesystem, this flag adds no value because the data will not be compressible. .It Fl \-directIO Disable kernel caching of the backed file. This will force the kernel to always pass reads and writes directly to .Nm . This reduces performance but also eliminates one source of inconsistency. .It Fl \-debug Enable logging of debug messages. Note that this flag is different from .Fl d , which is a flag to FUSE; however, the .Fl d FUSE flag implies this flag. .It Fl \-debug-http Enable printing of HTTP headers to standard output. .It Fl \-encrypt[=CIPHER] Enable encryption and authentication of block data. See your OpenSSL documentation for a list of supported ciphers; the default if no cipher is specified is AES-128 CBC. .Pp The encryption password may be supplied via one of .Fl \-password or .Fl \-passwordFile . If neither flag is given, .Nm will ask for the password at startup. .Pp Note: the actual key used is derived by hashing the password, the bucket name, the prefix name (if any), and the block number. Therefore, encrypted data cannot be ported to different buckets or prefixes. .Pp This flag implies .Fl \-compress . .It Fl \-erase Completely erase the file system by deleting all non-zero blocks, clear the 'already mounted' flag, and then exit. User confirmation is required unless the .Fl \-force flag is also given. Note, no simultaneous mount detection is performed in this case. .Pp This option implies .Fl \-listBlocks . .It Fl \-filename=NAME Specify the name of the backed file that appears in the .Nm filesystem. Default is `file'. .It Fl \-fileMode=MODE Specify the UNIX permission bits for the backed file that appears in the .Nm filesystem. Default is 0600, unless .Fl \-readOnly is specified, in which case the default is 0400. 
.It Fl \-force Proceed even if the value specified by .Fl \-blockSize or .Fl \-size disagrees with the auto-detected value, or .Nm detects that another .Nm instance is still mounted on top of the same S3 bucket (and prefix). In any of these cases, proceeding will lead to corrupted data, so the .Fl \-force flag should be avoided for normal use. .Pp The simultaneous mount detection can produce a false positive when a previous .Nm instance was not shut down cleanly. In this case, don't use .Fl \-force but rather run .Nm once with the .Fl \-reset-mounted-flag flag. .Pp If .Fl \-erase is given, .Fl \-force causes .Nm to proceed without user confirmation. .It Fl h Fl \-help Print a help message and exit. .It Fl \-initialRetryPause=MILLIS Specify the initial pause time in milliseconds before the first retry attempt after failed HTTP operations. Failures include network failures and timeouts, HTTP errors, and reads of stale data (i.e., MD5 mismatch); .Nm will make multiple retry attempts using an exponential backoff algorithm, starting with this initial retry pause time. Default value is 200ms. See also .Fl \-maxRetryPause . .It Fl \-insecure Do not verify the remote server's identity when operating over SSL connections. Equivalent to the .Fl \-insecure flag documented in .Xr curl 1 . .It Fl \-keyLength Override the length of the generated block encryption key. .Pp Versions of .Nm prior to 1.3.6 contained a bug where the length of the generated encryption key was fixed but system-dependent, causing it to be possibly incompatible on different systems for some ciphers. In version 1.3.6, this bug was corrected; however, in some cases this changed the generated key length, making the encryption no longer compatible with previously written data. This flag can be used to force the older, fixed key length. The value you want to use is whatever is defined for .Pa EVP_MAX_KEY_LENGTH on your system, typically 64. 
.Pp It is an error to specify a value smaller than the cipher's natural key length; however, a value of zero is allowed and is equivalent to not specifying anything. .It Fl \-listBlocks Perform a query at startup to determine which blocks already exist. This enables optimizations whereby, for each block that does not yet exist, reads return zeroes and zeroed writes are omitted, thereby eliminating any network access. This flag is useful when creating a new backed file, or any time it is expected that a large number of zeroed blocks will be read or written, such as when initializing a new filesystem. .Pp This flag will slow down startup in direct proportion to the number of blocks that already exist. .It Fl \-maxUploadSpeed=BITSPERSEC .It Fl \-maxDownloadSpeed=BITSPERSEC These flags set a limit on the bandwidth utilized for individual block uploads and downloads (i.e., the setting applies on a per-thread basis). The limits only apply to HTTP payload data and do not include any additional overhead from HTTP or TCP headers, etc. .Pp The value is measured in bits per second, and abbreviations like `256k', `1m', etc. may be used. By default, there is no fixed limit. .Pp Use of these flags may also require setting the .Fl \-timeout flag to a higher value. .It Fl \-maxRetryPause=MILLIS Specify the total amount of time in milliseconds .Nm should pause when retrying failed HTTP operations before giving up. Failures include network failures and timeouts, HTTP errors, and reads of stale data (i.e., MD5 mismatch); .Nm will make multiple retry attempts using an exponential backoff algorithm, up to this maximum total retry pause time. This value does not include the time it takes to perform the HTTP operations themselves (use .Fl \-timeout for that). Default value is 30000 (30 seconds). See also .Fl \-initialRetryPause . 
.It Fl \-minWriteDelay=MILLIS Specify a minimum time in milliseconds between the successful completion of a write and the initiation of another write to the same block. This delay ensures that S3 doesn't receive the writes out of order. This value must be set to zero when .Fl \-md5CacheSize is set to zero (MD5 cache disabled). Default value is 500ms. .It Fl \-md5CacheSize=SIZE Specify the size of the MD5 checksum cache (in number of blocks). If the cache is full when a new block is written, the write will block until there is room. Therefore, it is important to configure .Fl \-md5CacheTime and .Fl \-md5CacheSize according to the frequency of writes to the filesystem overall and to the same block repeatedly. Alternatively, a value equal to the number of blocks in the filesystem eliminates this problem but consumes the most memory when full (each entry in the cache is approximately 40 bytes). A value of zero disables the MD5 cache. Default value is 1000. .It Fl \-md5CacheTime=MILLIS Specify in milliseconds the time after a block has been successfully written for which the MD5 checksum of the block's contents should be cached, for the purpose of detecting stale data during subsequent reads. A value of zero means `infinite' and provides a guarantee against reading stale data; however, you should only do this when .Fl \-md5CacheSize is configured to be equal to the number of blocks; otherwise deadlock will (eventually) occur. This value must be at least as big as .Fl \-minWriteDelay . This value must be set to zero when .Fl \-md5CacheSize is set to zero (MD5 cache disabled). Default value is 10 seconds. .Pp The MD5 checksum cache is not persisted across restarts. Therefore, to ensure the same eventual consistency protection while .Nm is not running, you must delay at least .Fl \-md5CacheTime milliseconds between stopping and restarting .Nm . .It Fl \-noAutoDetect Disable block and file size auto-detection at startup. 
If this flag is given, then the block size defaults to 4096 and the .Fl \-size flag is required. .It Fl \-password=PASSWORD Supply the password for encryption and authentication as a command-line parameter. .It Fl \-passwordFile=FILE Read the password for encryption and authentication from (the first line of) the specified file. .It Fl \-prefix=STRING Specify a prefix to prepend to the resource names within the bucket that identify each block. By using different prefixes, multiple independent .Nm disks can live in the same S3 bucket. .Pp The default prefix is the empty string. .It Fl \-quiet Suppress progress output during initial startup. .It Fl \-readAhead=NUM Configure the number of blocks of read ahead. This determines how many blocks will be read into the block cache ahead of the last block read by the kernel when read ahead is active. This option has no effect if the block cache is disabled. Default value is 4. .It Fl \-readAheadTrigger=NUM Configure the number of blocks that must be read consecutively before the read ahead algorithm is triggered. Once triggered, read ahead will continue as long as the kernel continues reading blocks sequentially. This option has no effect if the block cache is disabled. Default value is 2. .It Fl \-readOnly Assume the filesystem is going to be mounted read-only, and return .Er EROFS in response to any attempt to write. This flag also changes the default mode of the backed file from 0600 to 0400 and disables the MD5 checksum cache. .It Fl \-region=REGION Specify an AWS region. This flag changes the default base URL to include the region name and automatically sets the .Fl \-vhost flag. .It Fl \-reset-mounted-flag Reset the 'already mounted' flag on the underlying S3 data store. .Pp .Nm detects simultaneous mounts by checking a special flag. If a previous invocation of .Nm was not shut down cleanly, the flag may not have been cleared. Running .Nm .Fl \-erase or .Nm .Fl \-reset-mounted-flag will clear it manually. But see also BUGS below. 
.Pp .It Fl \-rrs Deprecated; equivalent to .Fl \-storageClass=REDUCED_REDUNDANCY . .It Fl \-size=SIZE Specify the size (in bytes) of the backed file to be exported by the filesystem. The size may have an optional suffix 'K' for kilobytes, 'M' for megabytes, 'G' for gigabytes, 'T' for terabytes, 'E' for exabytes, 'Z' for zettabytes, or 'Y' for yottabytes. .Nm will attempt to auto-detect the file size by reading block number zero. If this option is not specified, the auto-detected value will be used. If this option is specified but disagrees with the auto-detected value, .Nm will exit with an error unless .Fl \-force is also given. .It Fl \-ssl Equivalent to .Bk -words .Fl \-baseURL .Ar https://s3.amazonaws.com/ .Ek .It Fl \-statsFilename=NAME Specify the name of the human-readable statistics file that appears in the .Nm filesystem. A value of empty string disables the appearance of this file. Default is `stats'. .It Fl \-storageClass=TYPE Specify storage class. .Pp Valid values are: .Pa STANDARD , .Pa STANDARD_IA , and .Pa REDUCED_REDUNDANCY . .Pp The default is .Pa STANDARD . .It Fl \-test Operate in local test mode. Filesystem blocks are stored as regular files in the directory .Ar dir . No network traffic occurs. .Pp Note if .Ar dir is a relative pathname (and .Fl f is not given) it will be resolved relative to the root directory. .It Fl \-timeout=SECONDS Specify a time limit in seconds for one HTTP operation attempt. This limits the entire operation including connection time (if not already connected) and data transfer time. The default is 30 seconds; this value may need to be adjusted upwards to avoid premature timeouts on slower links and/or when using a large number of block cache worker threads. .Pp See also .Fl \-maxRetryPause . .It Fl \-version Output version and exit. .It Fl \-vhost Force virtual hosted style requests. 
For example, this will cause .Nm to use the URL .Pa http://mybucket.s3.amazonaws.com/path/uri instead of .Pa http://s3.amazonaws.com/mybucket/path/uri . .Pp This flag is required when S3 buckets have been created with location constraints (for example `EU buckets'). Put another way, this flag is required for buckets defined outside of the US region. This flag is automatically set when the .Fl \-region flag is used. .El .Pp In addition, .Nm accepts all of the generic FUSE options as well. Here is a partial list: .Bl -tag -width Ds .It Fl o Ar uid=UID Override the user ID of the backed file, which defaults to the current user ID. .It Fl o Ar gid=GID Override the group ID of the backed file, which defaults to the current group ID. .It Fl o Ar sync_read Do synchronous reads. .It Fl o Ar max_readahead=NUM Set maximum read-ahead (in bytes). .It Fl f Run in the foreground (do not fork). Causes logging to be sent to standard error. .It Fl d Enable FUSE debug mode. Implies .Fl f . .It Fl s Run in single-threaded mode. .El .Pp In addition, .Nm passes the following flags which are optimized for .Nm to FUSE (unless overridden by the user on the command line): .Pp .Bl -tag -width Ds -compact .It Fl o Ar kernel_cache .It Fl o Ar fsname=/ .It Fl o Ar subtype=s3backer .It Fl o Ar use_ino .It Fl o Ar entry_timeout=31536000 .It Fl o Ar negative_timeout=31536000 .It Fl o Ar max_readahead=0 .It Fl o Ar attr_timeout=0 .It Fl o Ar default_permissions .It Fl o Ar allow_other .It Fl o Ar nodev .It Fl o Ar nosuid .El .Sh FILES .Bl -tag -compact -width Ds .It Pa $HOME/.s3backer_passwd Contains Amazon S3 `accessID:accessKey' pairs. .El .Sh SEE ALSO .Xr curl 1 , .Xr losetup 8 , .Xr mount 8 , .Xr umount 8 , .Xr fusermount 8 . 
.Rs .%T "s3backer: FUSE-based single file backing store via Amazon S3" .%O https://github.com/archiecobbs/s3backer .Re .Rs .%T "Amazon Simple Storage Service (Amazon S3)" .%O http://aws.amazon.com/s3 .Re .Rs .%T "FUSE: Filesystem in Userspace" .%O http://fuse.sourceforge.net/ .Re .Rs .%T "MacFUSE: A User-Space File System Implementation Mechanism for Mac OS X" .%O http://code.google.com/p/macfuse/ .Re .Rs .%T "FUSE for OS X" .%O https://osxfuse.github.io/ .Re .Rs .%T "Google Search for `linux page cache'" .%O http://www.google.com/search?q=linux+page+cache .Re .Sh BUGS Due to a design flaw in FUSE, an unmount of the .Nm filesystem will complete successfully before .Nm has finished writing back all dirty blocks. Therefore, when using the block cache, attempts to remount the same bucket and prefix may fail with an 'already mounted' error while the former .Nm process finishes flushing its cache. Before assuming a false positive and using .Fl \-reset-mounted-flag , ensure that any previous .Nm process attached to the same bucket and prefix has exited. See issue #40 for details. .Pp For cache space efficiency, .Nm uses 32-bit values to index individual blocks. Therefore, the block size must be increased beyond the default 4K when very large filesystems (greater than 16 terabytes) are created. .Pp .Nm should really be implemented as a device rather than a filesystem. However, this would require writing a kernel module instead of a simple user-space daemon, because Linux does not provide a user-space API for devices like it does for filesystems with FUSE. Implementing .Nm as a filesystem and then using the loopback mount is a simple workaround. .Pp On Mac OS X, the kernel imposes its own timeout (600 seconds) on FUSE operations, and automatically unmounts the filesystem when this limit is reached. This can happen when a combination of .Fl \-maxRetryPause and/or .Fl \-timeout settings allow HTTP retries to take longer than this value. 
A warning is emitted on startup in this case. .Pp Filesystem size is limited by the maximum allowable size of a single file. .Pp The default block size of 4k is non-optimal from a compression and cost perspective. Typically, users will want a larger value to maximize compression and minimize transaction costs, e.g., 1m. .Sh AUTHOR .An Archie L. Cobbs Aq archie@dellroad.org