.\" Automatically generated by Podwrapper::Man 1.40.2 (Pod::Simple 3.35) .\" .\" Standard preamble: .\" ======================================================================== .de Sp \" Vertical space (when we can't use .PP) .if t .sp .5v .if n .sp .. .de Vb \" Begin verbatim text .ft CW .nf .ne \\$1 .. .de Ve \" End verbatim text .ft R .fi .. .\" Set up some character translations and predefined strings. \*(-- will .\" give an unbreakable dash, \*(PI will give pi, \*(L" will give a left .\" double quote, and \*(R" will give a right double quote. \*(C+ will .\" give a nicer C++. Capital omega is used to do unbreakable dashes and .\" therefore won't be available. \*(C` and \*(C' expand to `' in nroff, .\" nothing in troff, for use with C<>. .tr \(*W- .ds C+ C\v'-.1v'\h'-1p'\s-2+\h'-1p'+\s0\v'.1v'\h'-1p' .ie n \{\ . ds -- \(*W- . ds PI pi . if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch . if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\" diablo 12 pitch . ds L" "" . ds R" "" . ds C` "" . ds C' "" 'br\} .el\{\ . ds -- \|\(em\| . ds PI \(*p . ds L" `` . ds R" '' . ds C` . ds C' 'br\} .\" .\" Escape single quotes in literal strings from groff's Unicode transform. .ie \n(.g .ds Aq \(aq .el .ds Aq ' .\" .\" If the F register is >0, we'll generate index entries on stderr for .\" titles (.TH), headers (.SH), subsections (.SS), items (.Ip), and index .\" entries marked with X<> in POD. Of course, you'll have to process the .\" output yourself in some meaningful fashion. .\" .\" Avoid warning from groff about undefined register 'F'. .de IX .. .nr rF 0 .if \n(.g .if rF .nr rF 1 .if (\n(rF:(\n(.g==0)) \{\ . if \nF \{\ . de IX . tm Index:\\$1\t\\n%\t"\\$2" .. . if !\nF==2 \{\ . nr % 0 . nr F 2 . \} . \} .\} .rr rF .\" ======================================================================== .\" .IX Title "guestfs-internals 1" .TH guestfs-internals 1 "2019-02-07" "libguestfs-1.40.2" "Virtualization Support" .\" For nroff, turn off justification. 
Always turn off hyphenation; it makes .\" way too many mistakes in technical documents. .if n .ad l .nh .SH "NAME" guestfs\-internals \- architecture and internals of libguestfs .SH "DESCRIPTION" .IX Header "DESCRIPTION" This manual page is for hackers who want to understand how libguestfs works internally. This is just a description of how libguestfs works now, and it may change at any time in the future. .SH "ARCHITECTURE" .IX Header "ARCHITECTURE" Internally, libguestfs is implemented by running an appliance (a special type of small virtual machine) using \fBqemu\fR\|(1). Qemu runs as a child process of the main program. .PP .Vb 10 \& ┌───────────────────┐ \& │ main program │ \& │ │ \& │ │ child process / appliance \& │ │ ┌──────────────────────────┐ \& │ │ │ qemu │ \& ├───────────────────┤ RPC │ ┌─────────────────┐ │ \& │ libguestfs ◀╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍▶ guestfsd │ │ \& │ │ │ ├─────────────────┤ │ \& └───────────────────┘ │ │ Linux kernel │ │ \& │ └────────┬────────┘ │ \& └───────────────│──────────┘ \& │ \& │ virtio\-scsi \& ┌──────┴──────┐ \& │ Device or │ \& │ disk image │ \& └─────────────┘ .Ve .PP The library, linked to the main program, creates the child process and hence the appliance in the \*(L"guestfs_launch\*(R" in \fBguestfs\fR\|(3) function. .PP Inside the appliance is a Linux kernel and a complete stack of userspace tools (such as \s-1LVM\s0 and ext2 programs) and a small controlling daemon called \*(L"guestfsd\*(R". The library talks to \&\*(L"guestfsd\*(R" using remote procedure calls (\s-1RPC\s0). There is a mostly one-to-one correspondence between libguestfs \s-1API\s0 calls and \s-1RPC\s0 calls to the daemon. Lastly the disk image(s) are attached to the qemu process which translates device access by the appliance’s Linux kernel into accesses to the image. .PP A common misunderstanding is that the appliance \*(L"is\*(R" the virtual machine. 
Although the disk image you are attached to might also be used by some virtual machine, libguestfs doesn't know or care about this. (But you will care if both libguestfs’s qemu process and your virtual machine are trying to update the disk image at the same time, since this usually results in massive disk corruption). .SH "STATE MACHINE" .IX Header "STATE MACHINE" libguestfs uses a state machine to model the child process: .PP .Vb 10 \& | \& guestfs_create / guestfs_create_flags \& | \& | \& _\|_\|_\|_V_\|_\|_\|_\|_ \& / \e \& | CONFIG | \& \e_\|_\|_\|_\|_\|_\|_\|_\|_\|_/ \& ^ ^ \e \& | \e \e guestfs_launch \& | _\e_\|_V_\|_\|_\|_\|_\|_ \& | / \e \& | | LAUNCHING | \& | \e_\|_\|_\|_\|_\|_\|_\|_\|_\|_\|_/ \& | / \& | guestfs_launch \& | / \& _\|_|_\|_\|_\|_V \& / \e \& | READY | \& \e_\|_\|_\|_\|_\|_\|_\|_/ .Ve .PP The normal transitions are (1) \s-1CONFIG\s0 (when the handle is created, but there is no child process), (2) \s-1LAUNCHING\s0 (when the child process is booting up), (3) \s-1READY\s0 meaning the appliance is up, actions can be issued to, and carried out by, the child process. .PP The guest may be killed by \*(L"guestfs_kill_subprocess\*(R" in \fBguestfs\fR\|(3), or may die asynchronously at any time (eg. due to some internal error), and that causes the state to transition back to \s-1CONFIG.\s0 .PP Configuration commands for qemu such as \*(L"guestfs_set_path\*(R" in \fBguestfs\fR\|(3) can only be issued when in the \s-1CONFIG\s0 state. .PP The \s-1API\s0 offers one call that goes from \s-1CONFIG\s0 through \s-1LAUNCHING\s0 to \&\s-1READY.\s0 \*(L"guestfs_launch\*(R" in \fBguestfs\fR\|(3) blocks until the child process is \&\s-1READY\s0 to accept commands (or until some failure or timeout). \&\*(L"guestfs_launch\*(R" in \fBguestfs\fR\|(3) internally moves the state from \s-1CONFIG\s0 to \&\s-1LAUNCHING\s0 while it is running. .PP \&\s-1API\s0 actions such as \*(L"guestfs_mount\*(R" in \fBguestfs\fR\|(3) can only be issued when in the \s-1READY\s0 state. 
These \s-1API\s0 calls block waiting for the command to be carried out. There are no non-blocking versions, and no way to issue more than one command per handle at the same time. .PP Finally, the child process sends asynchronous messages back to the main program, such as kernel log messages. You can register a callback to receive these messages. .SH "INTERNALS" .IX Header "INTERNALS" .SS "\s-1APPLIANCE BOOT PROCESS\s0" .IX Subsection "APPLIANCE BOOT PROCESS" This process has evolved and continues to evolve. The description here corresponds only to the current version of libguestfs and is provided for information only. .PP In order to follow the stages involved below, enable libguestfs debugging (set the environment variable \f(CW\*(C`LIBGUESTFS_DEBUG=1\*(C'\fR). .IP "Create the appliance" 4 .IX Item "Create the appliance" \&\f(CW\*(C`supermin \-\-build\*(C'\fR is invoked to create the kernel, a small initrd and the appliance. .Sp The appliance is cached in \fI/var/tmp/.guestfs\-<\s-1UID\s0>\fR (or in another directory if \f(CW\*(C`LIBGUESTFS_CACHEDIR\*(C'\fR or \f(CW\*(C`TMPDIR\*(C'\fR are set). .Sp For a complete description of how the appliance is created and cached, read the \fBsupermin\fR\|(1) man page. .IP "Start qemu and boot the kernel" 4 .IX Item "Start qemu and boot the kernel" qemu is invoked to boot the kernel. .IP "Run the initrd" 4 .IX Item "Run the initrd" \&\f(CW\*(C`supermin \-\-build\*(C'\fR builds a small initrd. The initrd is not the appliance. The purpose of the initrd is to load enough kernel modules in order that the appliance itself can be mounted and started. .Sp The initrd is a cpio archive called \&\fI/var/tmp/.guestfs\-<\s-1UID\s0>/appliance.d/initrd\fR. 
.Sp When the initrd has started you will see messages showing that kernel modules are being loaded, similar to this: .Sp .Vb 4 \& supermin: ext2 mini initrd starting up \& supermin: mounting /sys \& supermin: internal insmod libcrc32c.ko \& supermin: internal insmod crc32c\-intel.ko .Ve .IP "Find and mount the appliance device" 4 .IX Item "Find and mount the appliance device" The appliance is a sparse file containing an ext2 filesystem which contains a familiar (although reduced in size) Linux operating system. It would normally be called \&\fI/var/tmp/.guestfs\-<\s-1UID\s0>/appliance.d/root\fR. .Sp The regular disks being inspected by libguestfs are the first devices exposed by qemu (eg. as \fI/dev/vda\fR). .Sp The last disk added to qemu is the appliance itself (eg. \fI/dev/vdb\fR if there was only one regular disk). .Sp Thus the final job of the initrd is to locate the appliance disk, mount it, and switch root into the appliance, and run \fI/init\fR from the appliance. .Sp If this works successfully you will see messages such as: .Sp .Vb 5 \& supermin: picked /sys/block/vdb/dev as root device \& supermin: creating /dev/root as block special 252:16 \& supermin: mounting new root on /root \& supermin: chroot \& Starting /init script ... .Ve .Sp Note that \f(CW\*(C`Starting /init script ...\*(C'\fR indicates that the appliance's init script is now running. .IP "Initialize the appliance" 4 .IX Item "Initialize the appliance" The appliance itself now initializes itself. This involves starting certain processes like \f(CW\*(C`udev\*(C'\fR, possibly printing some debug information, and finally running the daemon (\f(CW\*(C`guestfsd\*(C'\fR). .IP "The daemon" 4 .IX Item "The daemon" Finally the daemon (\f(CW\*(C`guestfsd\*(C'\fR) runs inside the appliance. If it runs you should see: .Sp .Vb 1 \& verbose daemon enabled .Ve .Sp The daemon expects to see a named virtio-serial port exposed by qemu and connected on the other end to the library. 
.Sp The daemon connects to this port (and hence to the library) and sends a four byte message \f(CW\*(C`GUESTFS_LAUNCH_FLAG\*(C'\fR, which initiates the communication protocol (see below). .SS "\s-1COMMUNICATION PROTOCOL\s0" .IX Subsection "COMMUNICATION PROTOCOL" Don’t rely on using this protocol directly. This section documents how it currently works, but it may change at any time. .PP The protocol used to talk between the library and the daemon running inside the qemu virtual machine is a simple \s-1RPC\s0 mechanism built on top of \s-1XDR\s0 (\s-1RFC 1014, RFC 1832, RFC 4506\s0). .PP The detailed format of structures is in \fIcommon/protocol/guestfs_protocol.x\fR (note: this file is automatically generated). .PP There are two broad cases. Ordinary functions, which have no \&\f(CW\*(C`FileIn\*(C'\fR or \f(CW\*(C`FileOut\*(C'\fR parameters, are handled with very simple request/reply messages. Functions that have any \&\f(CW\*(C`FileIn\*(C'\fR or \f(CW\*(C`FileOut\*(C'\fR parameters use the same request and reply messages, but these may also be followed by files sent using a chunked encoding. .PP \fI\s-1ORDINARY FUNCTIONS\s0 (\s-1NO FILEIN/FILEOUT PARAMS\s0)\fR .IX Subsection "ORDINARY FUNCTIONS (NO FILEIN/FILEOUT PARAMS)" .PP For ordinary functions, the request message is: .PP .Vb 4 \& total length (header + arguments, \& but not including the length word itself) \& struct guestfs_message_header (encoded as XDR) \& struct guestfs_<foo>_args (encoded as XDR) .Ve .PP The total length field allows the daemon to allocate a fixed size buffer into which it slurps the rest of the message. As a result, the total length is limited to \f(CW\*(C`GUESTFS_MESSAGE_MAX\*(C'\fR bytes (currently 4MB), which means the effective size of any request is limited to somewhere under this size. .PP Note also that many functions don’t take any arguments, in which case the \f(CW\*(C`guestfs_\f(CIfoo\f(CW_args\*(C'\fR is completely omitted. 
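.PP
The request framing described above can be sketched roughly as follows. This is only an illustration, not the library's code: the helper names \f(CW\*(C`frame_message\*(C'\fR and \f(CW\*(C`split_message\*(C'\fR are hypothetical, the header and arguments are treated as opaque byte strings that have already been XDR-encoded elsewhere, the length word is assumed to be a 4 byte big-endian unsigned integer (as XDR encodes unsigned ints), and the limit is assumed to be 4 * 1024 * 1024 bytes based on the "currently 4MB" note.

```python
import struct
from typing import Tuple

# Assumed value, taken from the "currently 4MB" note in the text.
GUESTFS_MESSAGE_MAX = 4 * 1024 * 1024


def frame_message(header: bytes, args: bytes = b"") -> bytes:
    """Prefix an already XDR-encoded header (plus optional args) with a length word.

    Hypothetical helper: args is empty for functions that take no arguments,
    mirroring the case where the args structure is omitted entirely.
    """
    body = header + args
    if len(body) > GUESTFS_MESSAGE_MAX:
        raise ValueError("message exceeds GUESTFS_MESSAGE_MAX")
    # The length counts header + args, but not the length word itself.
    return struct.pack(">I", len(body)) + body


def split_message(buf: bytes) -> Tuple[bytes, bytes]:
    """Return (body, remaining bytes) for one framed message at the start of buf."""
    if len(buf) < 4:
        raise ValueError("short read: no length word")
    (length,) = struct.unpack(">I", buf[:4])
    if length > GUESTFS_MESSAGE_MAX:
        raise ValueError("length word exceeds GUESTFS_MESSAGE_MAX")
    if len(buf) < 4 + length:
        raise ValueError("short read: incomplete message body")
    return buf[4:4 + length], buf[4 + length:]
```

Because the receiver knows the body size from the length word before it reads the body, it can reject an oversized message without buffering it, which is the point of capping the total length.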
.PP The header contains the procedure number (\f(CW\*(C`guestfs_proc\*(C'\fR) which is how the receiver knows what type of args structure to expect, or none at all. .PP For functions that take optional arguments, the optional arguments are encoded in the \f(CW\*(C`guestfs_\f(CIfoo\f(CW_args\*(C'\fR structure in the same way as ordinary arguments. A bitmask in the header indicates which optional arguments are meaningful. The bitmask is also checked to see if it contains bits set which the daemon does not know about (eg. if more optional arguments were added in a later version of the library), and this causes the call to be rejected. .PP The reply message for ordinary functions is: .PP .Vb 4 \& total length (header + ret, \& but not including the length word itself) \& struct guestfs_message_header (encoded as XDR) \& struct guestfs_<foo>_ret (encoded as XDR) .Ve .PP As above the \f(CW\*(C`guestfs_\f(CIfoo\f(CW_ret\*(C'\fR structure may be completely omitted for functions that return no formal return values. .PP As above the total length of the reply is limited to \&\f(CW\*(C`GUESTFS_MESSAGE_MAX\*(C'\fR. .PP In the case of an error, a flag is set in the header, and the reply message is slightly changed: .PP .Vb 4 \& total length (header + error, \& but not including the length word itself) \& struct guestfs_message_header (encoded as XDR) \& struct guestfs_message_error (encoded as XDR) .Ve .PP The \f(CW\*(C`guestfs_message_error\*(C'\fR structure contains the error message as a string. .PP \fI\s-1FUNCTIONS THAT HAVE FILEIN PARAMETERS\s0\fR .IX Subsection "FUNCTIONS THAT HAVE FILEIN PARAMETERS" .PP A \f(CW\*(C`FileIn\*(C'\fR parameter indicates that we transfer a file \fIinto\fR the guest. The normal request message is sent (see above). However this is followed by a sequence of file chunks. 
.PP .Vb 7 \& total length (header + arguments, \& but not including the length word itself, \& and not including the chunks) \& struct guestfs_message_header (encoded as XDR) \& struct guestfs_<foo>_args (encoded as XDR) \& sequence of chunks for FileIn param #0 \& sequence of chunks for FileIn param #1 etc. .Ve .PP The \*(L"sequence of chunks\*(R" is: .PP .Vb 7 \& length of chunk (not including length word itself) \& struct guestfs_chunk (encoded as XDR) \& length of chunk \& struct guestfs_chunk (encoded as XDR) \& ... \& length of chunk \& struct guestfs_chunk (with data.data_len == 0) .Ve .PP The final chunk has the \f(CW\*(C`data_len\*(C'\fR field set to zero. Additionally a flag is set in the final chunk to indicate either successful completion or early cancellation. .PP At time of writing there are no functions that have more than one FileIn parameter. However this is (theoretically) supported, by sending the sequence of chunks for each FileIn parameter one after another (from left to right). .PP Both the library (sender) \fIand\fR the daemon (receiver) may cancel the transfer. The library does this by sending a chunk with a special flag set to indicate cancellation. When the daemon sees this, it cancels the whole \s-1RPC,\s0 does \fInot\fR send any reply, and goes back to reading the next request. .PP The daemon may also cancel. It does this by writing a special word \&\f(CW\*(C`GUESTFS_CANCEL_FLAG\*(C'\fR to the socket. The library listens for this during the transfer, and if it gets it, it will cancel the transfer (it sends a cancel chunk). The special word is chosen so that even if cancellation happens right at the end of the transfer (after the library has finished writing and has started listening for the reply), the \*(L"spurious\*(R" cancel flag will not be confused with the reply message. .PP This protocol allows the transfer of arbitrary sized files (no 32 bit limit), and also files where the size is not known in advance (eg. from pipes or sockets). 
However the chunks are rather small (\f(CW\*(C`GUESTFS_MAX_CHUNK_SIZE\*(C'\fR), so that neither the library nor the daemon need to keep much in memory. .PP \fI\s-1FUNCTIONS THAT HAVE FILEOUT PARAMETERS\s0\fR .IX Subsection "FUNCTIONS THAT HAVE FILEOUT PARAMETERS" .PP The protocol for FileOut parameters is exactly the same as for FileIn parameters, but with the roles of daemon and library reversed. .PP .Vb 7 \& total length (header + ret, \& but not including the length word itself, \& and not including the chunks) \& struct guestfs_message_header (encoded as XDR) \& struct guestfs_<foo>_ret (encoded as XDR) \& sequence of chunks for FileOut param #0 \& sequence of chunks for FileOut param #1 etc. .Ve .PP \fI\s-1INITIAL MESSAGE\s0\fR .IX Subsection "INITIAL MESSAGE" .PP When the daemon launches it sends an initial word (\f(CW\*(C`GUESTFS_LAUNCH_FLAG\*(C'\fR) which indicates that the guest and daemon are alive. This is what \*(L"guestfs_launch\*(R" in \fBguestfs\fR\|(3) waits for. .PP \fI\s-1PROGRESS NOTIFICATION MESSAGES\s0\fR .IX Subsection "PROGRESS NOTIFICATION MESSAGES" .PP The daemon may send progress notification messages at any time. These are distinguished by the normal length word being replaced by \&\f(CW\*(C`GUESTFS_PROGRESS_FLAG\*(C'\fR, followed by a fixed size progress message. .PP The library turns them into progress callbacks (see \&\*(L"\s-1GUESTFS_EVENT_PROGRESS\*(R"\s0 in \fBguestfs\fR\|(3)) if there is a callback registered, or discards them if not. .PP The daemon self-limits the frequency of progress messages it sends (see \f(CW\*(C`daemon/proto.c:notify_progress\*(C'\fR). Not all calls generate progress messages. .SS "\s-1FIXED APPLIANCE\s0" .IX Subsection "FIXED APPLIANCE" When libguestfs (or libguestfs tools) are run, they search a path looking for an appliance. The path is built into libguestfs, or can be set using the \f(CW\*(C`LIBGUESTFS_PATH\*(C'\fR environment variable. 
.PP Normally a supermin appliance is located on this path (see \&\*(L"\s-1SUPERMIN APPLIANCE\*(R"\s0 in \fBsupermin\fR\|(1)). libguestfs reconstructs this into a full appliance by running \f(CW\*(C`supermin \-\-build\*(C'\fR. .PP However, a simpler \*(L"fixed appliance\*(R" can also be used. libguestfs detects this by looking for a directory on the path containing all the following files: .IP "\(bu" 4 \&\fIkernel\fR .IP "\(bu" 4 \&\fIinitrd\fR .IP "\(bu" 4 \&\fIroot\fR .IP "\(bu" 4 \&\fI\s-1README\s0.fixed\fR (note that it \fBmust\fR be present as well) .PP If the fixed appliance is found, libguestfs skips supermin entirely and just runs the virtual machine (using qemu or the current backend, see \*(L"\s-1BACKEND\*(R"\s0 in \fBguestfs\fR\|(3)) with the kernel, initrd and root disk from the fixed appliance. .PP Thus the fixed appliance can be used when a platform or a Linux distribution does not support supermin. You build the fixed appliance on a platform that does support supermin using \&\fBlibguestfs\-make\-fixed\-appliance\fR\|(1), copy it over, and use that to run libguestfs. .SH "SEE ALSO" .IX Header "SEE ALSO" \&\fBguestfs\fR\|(3), \&\fBguestfs\-hacking\fR\|(1), \&\fBguestfs\-examples\fR\|(3), \&\fBlibguestfs\-test\-tool\fR\|(1), \&\fBlibguestfs\-make\-fixed\-appliance\fR\|(1), http://libguestfs.org/. .SH "AUTHORS" .IX Header "AUTHORS" Richard W.M. Jones (\f(CW\*(C`rjones at redhat dot com\*(C'\fR) .SH "COPYRIGHT" .IX Header "COPYRIGHT" Copyright (C) 2009\-2019 Red Hat Inc. .SH "LICENSE" .IX Header "LICENSE" This library is free software; you can redistribute it and/or modify it under the terms of the \s-1GNU\s0 Lesser General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version. 
.PP This library is distributed in the hope that it will be useful, but \&\s-1WITHOUT ANY WARRANTY\s0; without even the implied warranty of \&\s-1MERCHANTABILITY\s0 or \s-1FITNESS FOR A PARTICULAR PURPOSE.\s0 See the \s-1GNU\s0 Lesser General Public License for more details. .PP You should have received a copy of the \s-1GNU\s0 Lesser General Public License along with this library; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, \s-1MA 02110\-1301 USA\s0 .SH "BUGS" .IX Header "BUGS" To get a list of bugs against libguestfs, use this link: https://bugzilla.redhat.com/buglist.cgi?component=libguestfs&product=Virtualization+Tools .PP To report a new bug against libguestfs, use this link: https://bugzilla.redhat.com/enter_bug.cgi?component=libguestfs&product=Virtualization+Tools .PP When reporting a bug, please supply: .IP "\(bu" 4 The version of libguestfs. .IP "\(bu" 4 Where you got libguestfs (eg. which Linux distro, compiled from source, etc) .IP "\(bu" 4 Describe the bug accurately and give a way to reproduce it. .IP "\(bu" 4 Run \fBlibguestfs\-test\-tool\fR\|(1) and paste the \fBcomplete, unedited\fR output into the bug report.