NAME¶
gearmand - Gearmand Documentation,
http://gearman.info/
SYNOPSIS¶
General options
- -b [ --backlog ] arg (=32)
- Number of backlog connections for listen.
- --check-args
- Check command line and configuration file arguments and then
exit.
- -d [ --daemon ]
- Daemon, detach and run in the background.
- -f [ --file-descriptors ] arg
- Number of file descriptors to allow for the process (total
connections will be slightly less). Default is max allowed for user.
- -h [ --help ]
- Print this help menu.
- -j [ --job-retries ] arg (=0)
- Number of attempts to run the job before the job server
removes it. This is helpful to ensure a bad job does not crash all
available workers. Default is no limit.
- -l [ --log-file ] arg
- Log file to write errors and information to. Turning this
option on also forces the first verbose level to be enabled.
- -L [ --listen ] arg
- Address the server should listen on. Default is
INADDR_ANY.
- -p [ --port ] arg (=4730)
- Port the server should listen on.
- -P [ --pid-file ] arg
- File to write process ID out to.
- -r [ --protocol ] arg
- Load protocol module.
- -R [ --round-robin ]
- Assign work in round-robin order per worker connection. The
default is to assign work in the order of functions added by the
worker.
- -q [ --queue-type ] arg
- Persistent queue type to use.
- -t [ --threads ] arg (=4)
- Number of I/O threads to use. Default=4.
- -u [ --user ] arg
- Switch to given user after startup.
- -v [ --verbose ] arg (=v)
- Increase verbosity level by one.
- -V [ --version ]
- Display the version of gearmand and exit.
- -w [ --worker-wakeup ] arg (=0)
- Number of workers to wake up for each job received. The
default is to wake up all available workers.
HTTP:
- --http-port arg (=8080)
- Port to listen on.
sqlite
- --libsqlite3-db arg
- Database file to use.
- --libsqlite3-table arg (=gearman_queue)
- Table to use.
Memcached(libmemcached)
- --libmemcached-servers arg
- List of Memcached servers to use.
Drizzle/MySQL(libdrizzle)
- -host arg
- Host of server.
- -port arg
- Port of server. (by default Drizzle)
- -uds arg
- Unix domain socket for server.
- -user arg
- User name for authentication.
- -password arg
- Password for authentication.
- -db arg
- Schema/Database to use.
- -mysql arg
- Use MySQL protocol.
Postgres
- --libpq-conninfo arg
- PostgreSQL connection information string.
- --libpq-table arg (=queue)
- Table to use.
tokyocabinet
- --libtokyocabinet-file arg
- File name of the database. [see: man tcadb, tcadbopen() for
name guidelines]
- --libtokyocabinet-optimize
- Optimize database on open. [default=true]
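As an illustration of how the options above combine, the following Python sketch assembles an argument list for starting gearmand. The helper name build_gearmand_argv, the file paths, and the queue module name "libsqlite3" are assumptions made for the example; only the flags themselves come from the synopsis above, and nothing is executed.

```python
def build_gearmand_argv(port=4730, threads=4, daemon=False,
                        log_file=None, queue_type=None, queue_options=None):
    """Return an argv list for gearmand using flags from the synopsis."""
    argv = ["gearmand", "--port", str(port), "--threads", str(threads)]
    if daemon:
        argv.append("--daemon")
    if log_file is not None:
        argv += ["--log-file", log_file]
    if queue_type is not None:
        argv += ["--queue-type", queue_type]
        # Module-specific options such as --libsqlite3-db are appended after it.
        for flag, value in (queue_options or {}).items():
            argv += [flag, value]
    return argv


argv = build_gearmand_argv(
    daemon=True,
    log_file="/tmp/gearmand.log",    # hypothetical path
    queue_type="libsqlite3",         # assumed module name; check gearmand --help
    queue_options={"--libsqlite3-db": "/tmp/gearman.db"},  # hypothetical path
)
print(" ".join(argv))
```

The resulting list could be handed to a process launcher such as subprocess.Popen; building it as a list rather than a single string avoids shell quoting problems.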
DESCRIPTION¶
Gearman provides a generic application framework to farm out work to other
machines or processes that are better suited to do the work. It allows you to
do work in parallel, to load balance processing, and to call functions between
languages. It can be used in a variety of applications, from high-availability
web sites to the transport of database replication events. In other words, it
is the nervous system for how distributed processing communicates. A few
strong points about Gearman:
- Open Source - It's free! (in both meanings of the word)
Gearman has an active open source community that is easy to get involved
with if you need help or want to contribute.
- Multi-language - There are interfaces for a number of
languages, and this list is growing. You also have the option to write
heterogeneous applications with clients submitting work in one language
and workers performing that work in another.
- Flexible - You are not tied to any specific design pattern.
You can quickly put together distributed applications using any model you
choose, one of those options being Map/Reduce.
- Fast - Gearman has a simple protocol and interface with a
new optimized server in C to minimize your application overhead.
- Embeddable - Since Gearman is fast and lightweight, it is
great for applications of all sizes. It is also easy to introduce into
existing applications with minimal overhead.
- No single point of failure - Gearman can not only help
scale systems, but can do it in a fault tolerant way.
Thread Model¶
The -t option to gearmand allows you to specify multiple I/O threads; this is
enabled by default. There are currently three types of threads in the job
server:
- Listening and management thread - only one
- I/O thread - can have many
- Processing thread - only one
When -t 0 is given, all three thread types run within a single thread. When
-t 1 is given, there is a thread for listening/management and a thread for
I/O and processing. When -t 2 is given, there is a thread for each type of
thread above. For all -t option values above 2, more I/O threads are created.
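The mapping from -t values to thread layout described above can be summarised in a small lookup. This is only a restatement of the text in Python; the function name thread_layout is hypothetical, and the sketch deliberately gives no exact thread counts for values above 2, since the text only says that more I/O threads are created.

```python
def thread_layout(t):
    """Describe which thread types share a thread for a given -t value."""
    if t < 0:
        raise ValueError("-t must not be negative")
    if t == 0:
        return "single thread: listening/management, I/O and processing together"
    if t == 1:
        return "two threads: listening/management, plus one for I/O and processing"
    if t == 2:
        return "one thread per type: listening/management, I/O, processing"
    return "one thread per type, plus additional I/O threads"


for t in (0, 1, 2, 4):
    print(t, "->", thread_layout(t))
```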
The listening and management thread is mainly responsible for accepting new
connections and assigning those connections to an I/O thread (if there are
many). It also coordinates startup and shutdown within the server. This thread
will have an instance of libevent for managing socket events and signals on an
internal pipe. This pipe is used to wake up the thread or to coordinate
shutdown.
The I/O thread is responsible for doing the read and write system calls on the
sockets and initial packet parsing. Once the packet has been parsed, it is put
into an asynchronous queue for the processing thread (each thread has its own
queue so there is very little contention). Each I/O thread has its own
instance of libevent for managing socket events and signals on an internal
pipe like the listening thread.
The processing thread should have no system calls within it (except for the
occasional brk() for more memory), and manages the various lists and hash
tables used for tracking unique keys, job handles, functions, and job queues.
All packets that need to be sent back to connections are put into an
asynchronous queue for the I/O thread. The I/O thread will pick these up and
send them back over the connected socket. All packets flow through the
processing thread since it contains the information needed to process the
packets. This is due to the complex nature of the various lists and hash
tables. If multiple threads were modifying them the locking overhead would
most likely cause worse performance than having it in a single thread (and
would also complicate the code). In the future more work may be pushed to the
I/O threads, and the processing thread can retain minimal functionality to
manage those tables and lists. So far this has not been a significant
bottleneck: a 16 core Intel machine is able to process upwards of 50k jobs per
second.
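The hand-off between an I/O thread and the processing thread can be pictured with two queues. The Python sketch below is a toy model under stated assumptions, not the server's C implementation: the packet format ("function workload"), the reply text, and the name run_pipeline are all invented for the example, and a list stands in for the socket. What it does show is the design point made above: only the processing thread touches the job table, so the table needs no lock.

```python
import queue
import threading


def run_pipeline(raw_packets):
    """Pass packets through one I/O thread and one processing thread."""
    to_processing = queue.Queue()  # I/O thread -> processing thread
    to_io = queue.Queue()          # processing thread -> I/O thread
    jobs = {}                      # touched only by the processing thread, so no lock
    sent = []                      # stands in for writes to the connected socket

    def io_thread():
        # Read side: do the initial parsing, then hand off for processing.
        for raw in raw_packets:
            function, _, workload = raw.partition(" ")
            to_processing.put((function, workload))
        to_processing.put(None)    # sentinel: no more input
        # Write side: send back whatever the processing thread queued.
        while True:
            reply = to_io.get()
            if reply is None:
                break
            sent.append(reply)

    def processing_thread():
        handle = 0
        while True:
            packet = to_processing.get()
            if packet is None:
                break
            handle += 1
            jobs[handle] = packet
            to_io.put("queued handle=%d" % handle)
        to_io.put(None)            # sentinel: no more replies

    threads = [threading.Thread(target=io_thread),
               threading.Thread(target=processing_thread)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return sent, jobs


sent, jobs = run_pipeline(["reverse Hello", "resize photo.png"])
print(sent)
print(jobs)
```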
Persistent Queues¶
Inside the Gearman job server, all job queues are stored in memory. This means
if a server restarts or crashes with pending jobs, they will be lost and are
never run by a worker. Persistent queues were added to allow background jobs
to be stored in an external durable queue so they may live between server
restarts and crashes. The persistent queue is only enabled for background jobs
because foreground jobs have an attached client. If a job server goes away,
the client can detect this and restart the foreground job somewhere else (or
report an error back to the original caller). Background jobs on the other
hand have no attached client and are simply expected to be run when submitted.
The persistent queue works by calling a module callback function right before
putting a new job in the internal queue for pending jobs to be run. This
allows the module to store the job about to be run in some persistent way so
that it can later be replayed during a restart. Once it is stored through the
module, the job is put onto the active runnable queue, waking up available
workers if needed. Once the job has been successfully completed by a worker,
another module callback function is called to notify the module the job is
done and can be removed. If a job server crashes or is restarted between these
two calls for a job, the job is reloaded during the next job server start.
When the job server starts up, it will call a replay callback function in the
module to provide a list of all jobs that were not complete. This is used to
populate the internal memory queue of jobs to be run. Once this replay is
complete, the job server finishes its initialization and the jobs are now
runnable once workers connect (the queue should be in the same state as when
it crashed). These jobs are removed from the persistent queue when completed
as normal. NOTE: Deleting jobs from the persistent queue storage will not
remove them from the in-memory queue while the server is running.
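The add, done, and replay callbacks described above can be sketched as follows. This is a minimal illustration, assuming a made-up table layout and the hypothetical class names SqliteQueueStore and JobServer; it is not the interface or schema of any real gearmand queue module. An in-memory SQLite connection stands in for a durable file so the example is self-contained.

```python
import sqlite3


class SqliteQueueStore:
    """Hypothetical durable store exposing add, done and replay callbacks."""

    def __init__(self, connection):
        self.db = connection
        self.db.execute("CREATE TABLE IF NOT EXISTS queue "
                        "(unique_key TEXT PRIMARY KEY, function TEXT, data TEXT)")

    def add(self, unique_key, function, data):
        self.db.execute("INSERT INTO queue VALUES (?, ?, ?)",
                        (unique_key, function, data))
        self.db.commit()

    def done(self, unique_key):
        self.db.execute("DELETE FROM queue WHERE unique_key = ?", (unique_key,))
        self.db.commit()

    def replay(self):
        return self.db.execute(
            "SELECT unique_key, function, data FROM queue ORDER BY rowid").fetchall()


class JobServer:
    def __init__(self, store):
        self.store = store
        # Startup: repopulate the in-memory queue from the durable store.
        self.pending = list(store.replay())

    def submit_background(self, unique_key, function, data):
        self.store.add(unique_key, function, data)         # persist first
        self.pending.append((unique_key, function, data))  # then make runnable

    def complete(self, unique_key):
        self.pending = [job for job in self.pending if job[0] != unique_key]
        self.store.done(unique_key)


connection = sqlite3.connect(":memory:")  # a real store would use a file on disk
server = JobServer(SqliteQueueStore(connection))
server.submit_background("a", "reverse", "Hello")
server.submit_background("b", "reverse", "world")
server.complete("a")

# Simulate a crash: the in-memory queue is dropped, the store survives.
restarted = JobServer(SqliteQueueStore(connection))
print(restarted.pending)
```

Note that the store is written before the job becomes runnable and cleared only after completion, which is why a crash between the two calls leaves the job available for replay.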
The queues are implemented using a modular interface so it is easy to add new
data stores for the persistent queue.
A persistent queue module is enabled by passing the -q or --queue-type
option to gearmand. Run gearmand --help to see which queue modules are
supported on your system. If you are missing options for one you would like to
use, you will need to install any dependencies and then recompile the gearmand
package.
Extended Protocols¶
The protocol plugin interface allows you to take over the packet send and
receive functions, allowing you to pack the buffers as required by the
protocol. The core read and write functions can (and should) be used by the
protocol plugin.
HTTP¶
This protocol plugin allows you to map HTTP requests to Gearman jobs. It only
provides client job submission currently, but it may be extended to support
other request types in the future. The plugin can handle both GET and POST
data, the latter being used to send a workload to the job server. The URL
being requested is translated into the function being called.
For example, the request:
POST /reverse HTTP/1.1
Content-Length: 12
Hello world!
is translated into a job submission request for the function
“reverse” with the workload “Hello world!”. The server
responds with:
HTTP/1.0 200 OK
X-Gearman-Job-Handle: H:lap:4
Content-Length: 12
Server: Gearman/0.8
!dlrow olleH
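The translation from request to job can be mimicked in a few lines of Python. This sketch only illustrates the mapping described above (URL path to function name, POST body to workload); the function name request_to_job and the returned dictionary are assumptions for the example and do not reflect the plugin's internal code.

```python
def request_to_job(raw):
    """Split a raw HTTP request into a Gearman function name and workload."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("ascii").split("\r\n")
    method, path, _version = lines[0].split(" ")
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    length = int(headers.get("content-length", "0"))
    return {
        "method": method,
        "function": path.lstrip("/"),  # the URL becomes the function name
        "workload": body[:length],     # POST data becomes the workload
    }


request = (b"POST /reverse HTTP/1.1\r\n"
           b"Content-Length: 12\r\n"
           b"\r\n"
           b"Hello world!")
job = request_to_job(request)
print(job["function"], job["workload"])
```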
The following headers can be passed to change the behavior of the job:
* X-Gearman-Unique: <unique key>
* X-Gearman-Background: true
* X-Gearman-Priority: <high|low>
For example, to run a low priority background job, the following request can be
sent:
POST /reverse HTTP/1.1
Content-Length: 12
X-Gearman-Background: true
X-Gearman-Priority: low
Hello world!
The response for this request will not have any data associated with it since it
was a background job:
HTTP/1.0 200 OK
X-Gearman-Job-Handle: H:lap:6
Content-Length: 0
Server: Gearman/0.8
The HTTP protocol should be considered experimental.
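From the client side, a request carrying the headers listed above could be assembled as in the following Python sketch. The helper name build_job_request is hypothetical, and the request mirrors the examples above exactly, including the absence of a Host header; a stricter HTTP/1.1 client would normally add one. The bytes are only built here, not sent.

```python
def build_job_request(function, workload, unique=None,
                      background=False, priority=None):
    """Build a raw HTTP/1.1 POST that submits workload to function."""
    if priority not in (None, "high", "low"):
        raise ValueError("priority must be 'high' or 'low'")
    lines = ["POST /%s HTTP/1.1" % function,
             "Content-Length: %d" % len(workload)]
    if unique is not None:
        lines.append("X-Gearman-Unique: %s" % unique)
    if background:
        lines.append("X-Gearman-Background: true")
    if priority is not None:
        lines.append("X-Gearman-Priority: %s" % priority)
    # A blank line separates the headers from the workload.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + workload


request = build_job_request("reverse", b"Hello world!",
                            background=True, priority="low")
print(request.decode("ascii"))
```

The returned bytes could be written to a TCP socket connected to the port given by --http-port.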
HOME¶
To find out more information please check:
http://gearman.info/
SEE ALSO¶
gearman(1) gearadmin(1) libgearmand(3)
AUTHOR¶
Data Differential
http://www.datadifferential.com/
COPYRIGHT¶
2012, Data Differential,
http://www.datadifferential.com/