.TH "Parmap" 3 2014-10-09 OCamldoc "" .SH NAME Parmap \- Module Parmap: efficient parallel map, fold and mapfold on lists and arrays on multicores. .SH Module Module Parmap .SH Documentation .sp Module .BI "Parmap" : .B sig end .sp Module .B Parmap : efficient parallel map, fold and mapfold on lists and arrays on multicores\&. .sp All the primitives allow to control the granularity of the parallelism via an optional parameter .B chunksize : if .B chunksize is omitted, the input sequence is split evenly among the available cores; if .B chunksize is specified, the input data is split in chunks of size .B chunksize and dispatched to the available cores using an on demand strategy that ensures automatic load balancing\&. .sp A specific primitive .B array_float_parmap is provided for fast operations on float arrays\&. .sp .sp .sp .sp .PP .B === .B Setting and getting the default value for ncores .B === .PP .I val set_default_ncores : .B int -> unit .sp .sp .I val get_default_ncores : .B unit -> int .sp .sp .PP .B === .B Sequence type, subsuming lists and arrays .B === .PP .I type .B 'a .I sequence = | L .B of .B 'a list | A .B of .B 'a array .sp .sp .PP .B === The parmapfold, parfold and parmap generic functions, for efficiency reasons, .B convert the input data into an array internally, so we provide the \&'a sequence type .B to allow passing an array directly as input\&. .B If you want to perform a parallel map operation on an array, use array_parmap or array_float_parmap instead\&. === .PP .PP .B === .B Parallel mapfold .B === .PP .I val parmapfold : .B ?init:(int -> unit) -> .B ?finalize:(unit -> unit) -> .B ?ncores:int -> .B ?chunksize:int -> .B ('a -> 'b) -> .B 'a sequence -> ('b -> 'c -> 'c) -> 'c -> ('c -> 'c -> 'c) -> 'c .sp .B parmapfold ~ncores:n f (L l) op b concat computes .B List\&.fold_right op (List\&.map f l) b by forking .B n processes on a multicore machine\&. You need to provide the extra .B concat operator to combine the partial results of the fold computed on each core\&. If \&'b = \&'c, then .B concat may be simply .B op \&. The order of computation in parallel changes w\&.r\&.t\&. sequential execution, so this function is only correct if .B op and .B concat are associative and commutative\&. If the optional .B chunksize parameter is specified, the processes compute the result in an on\-demand fashion on blocks of size .B chunksize \&. .B parmapfold ~ncores:n f (A a) op b concat computes .B Array\&.fold_right op (Array\&.map f a) b .sp .sp .PP .B === .B Parallel fold .B === .PP .I val parfold : .B ?init:(int -> unit) -> .B ?finalize:(unit -> unit) -> .B ?ncores:int -> .B ?chunksize:int -> .B ('a -> 'b -> 'b) -> 'a sequence -> 'b -> ('b -> 'b -> 'b) -> 'b .sp .B parfold ~ncores:n op (L l) b concat computes .B List\&.fold_right op l b by forking .B n processes on a multicore machine\&. You need to provide the extra .B concat operator to combine the partial results of the fold computed on each core\&. If \&'b = \&'c, then .B concat may be simply .B op \&. The order of computation in parallel changes w\&.r\&.t\&. sequential execution, so this function is only correct if .B op and .B concat are associative and commutative\&. If the optional .B chunksize parameter is specified, the processes compute the result in an on\-demand fashion on blocks of size .B chunksize \&. .B parfold ~ncores:n op (A a) b concat similarly computes .B Array\&.fold_right op a b \&. 
.sp
.sp
.PP
.B ===
.B Parallel map
.B ===
.PP
.I val parmap :
.B ?init:(int -> unit) ->
.B ?finalize:(unit -> unit) ->
.B ?ncores:int -> ?chunksize:int -> ('a -> 'b) -> 'a sequence -> 'b list
.sp
.B parmap ~ncores:n f (L l)
computes
.B List\&.map f l
by forking
.B n
processes on a multicore machine\&.
.B parmap ~ncores:n f (A a)
computes
.B Array\&.map f a
by forking
.B n
processes on a multicore machine\&. If the optional
.B chunksize
parameter is specified, the processes compute the result in an on\-demand fashion on blocks of size
.B chunksize
; this provides automatic load balancing for unbalanced computations, but the order of the result is no longer guaranteed to be preserved\&.
.sp
.sp
.PP
.B ===
.B Parallel iteration
.B ===
.PP
.I val pariter :
.B ?init:(int -> unit) ->
.B ?finalize:(unit -> unit) ->
.B ?ncores:int -> ?chunksize:int -> ('a -> unit) -> 'a sequence -> unit
.sp
.B pariter ~ncores:n f (L l)
computes
.B List\&.iter f l
by forking
.B n
processes on a multicore machine\&.
.B pariter ~ncores:n f (A a)
computes
.B Array\&.iter f a
by forking
.B n
processes on a multicore machine\&. If the optional
.B chunksize
parameter is specified, the processes perform the computation in an on\-demand fashion on blocks of size
.B chunksize
; this provides automatic load balancing for unbalanced computations\&.
.sp
.sp
.PP
.B ===
.B Parallel mapfold, indexed
.B ===
.PP
.I val parmapifold :
.B ?init:(int -> unit) ->
.B ?finalize:(unit -> unit) ->
.B ?ncores:int ->
.B ?chunksize:int ->
.B (int -> 'a -> 'b) ->
.B 'a sequence -> ('b -> 'c -> 'c) -> 'c -> ('c -> 'c -> 'c) -> 'c
.sp
Like parmapfold, but the map function receives the index of the mapped element as an extra argument\&.
.sp
.sp
.PP
.B ===
.B Parallel map, indexed
.B ===
.PP
.I val parmapi :
.B ?init:(int -> unit) ->
.B ?finalize:(unit -> unit) ->
.B ?ncores:int ->
.B ?chunksize:int -> (int -> 'a -> 'b) -> 'a sequence -> 'b list
.sp
Like parmap, but the map function receives the index of the mapped element as an extra argument\&.
.sp
.sp
.PP
.B ===
.B Parallel iteration, indexed
.B ===
.PP
.I val pariteri :
.B ?init:(int -> unit) ->
.B ?finalize:(unit -> unit) ->
.B ?ncores:int ->
.B ?chunksize:int -> (int -> 'a -> unit) -> 'a sequence -> unit
.sp
Like pariter, but the iterated function receives the index of the sequence element as an extra argument\&.
.sp
.sp
.PP
.B ===
.B Parallel map on arrays
.B ===
.PP
.I val array_parmap :
.B ?init:(int -> unit) ->
.B ?finalize:(unit -> unit) ->
.B ?ncores:int -> ?chunksize:int -> ('a -> 'b) -> 'a array -> 'b array
.sp
.B array_parmap ~ncores:n f a
computes
.B Array\&.map f a
by forking
.B n
processes on a multicore machine\&. If the optional
.B chunksize
parameter is specified, the processes compute the result in an on\-demand fashion on blocks of size
.B chunksize
; this provides automatic load balancing for unbalanced computations, but the order of the result is no longer guaranteed to be preserved\&.
.sp
.sp
.PP
.B ===
.B Parallel map on arrays, indexed
.B ===
.PP
.I val array_parmapi :
.B ?init:(int -> unit) ->
.B ?finalize:(unit -> unit) ->
.B ?ncores:int -> ?chunksize:int -> (int -> 'a -> 'b) -> 'a array -> 'b array
.sp
Like array_parmap, but the map function receives the index of the mapped element as an extra argument\&.
.sp
.sp
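.PP
As a minimal sketch (illustrative values only; it assumes the parmap package is linked), the map primitives above can be used as follows:
.PP
.RS
.nf
(* Double every element of a list; with ~chunksize the chunks are
   dispatched on demand, so the order of the result list may differ
   from the order of the input. *)
let doubled : int list =
  Parmap.parmap ~ncores:4 ~chunksize:2
    (fun x -> 2 * x)
    (Parmap.L [1; 2; 3; 4; 5; 6; 7; 8])

(* Map over an array directly, obtaining an array back. *)
let squares : int array =
  Parmap.array_parmap ~ncores:4 (fun x -> x * x) [| 1; 2; 3; 4 |]

(* Indexed variant: pair each element with its position. *)
let indexed : (int * string) list =
  Parmap.parmapi ~ncores:2 (fun i s -> (i, s)) (Parmap.L ["a"; "b"; "c"])
.fi
.RE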
.PP
.B ===
.B Parallel map on float arrays
.B ===
.PP
.I exception WrongArraySize
.sp
.sp
.I type buf
.sp
.sp
.I val init_shared_buffer :
.B float array -> buf
.sp
.B init_shared_buffer a
creates a new memory\-mapped shared buffer big enough to hold a float array of the size of
.B a
\&. This buffer can be reused in a series of calls to
.B array_float_parmap
, avoiding the cost of reallocating it each time\&.
.sp
.sp
.I val array_float_parmap :
.B ?init:(int -> unit) ->
.B ?finalize:(unit -> unit) ->
.B ?ncores:int ->
.B ?chunksize:int ->
.B ?result:float array ->
.B ?sharedbuffer:buf -> ('a -> float) -> 'a array -> float array
.sp
.B array_float_parmap ~ncores:n f a
computes
.B Array\&.map f a
by forking
.B n
processes on a multicore machine, preallocating the resulting array as shared memory; this allows significantly more efficient computation than calling the generic array_parmap function\&. If the optional
.B chunksize
parameter is specified, the processes compute the result in an on\-demand fashion on blocks of size
.B chunksize
; this provides automatic load balancing for unbalanced computations, *and* the order of the result is still guaranteed to be preserved\&.
.sp
If you already have an array in which to store the result, you can squeeze out some more CPU cycles by passing it as the optional parameter
.B result
: this avoids the creation of a result array, which can be costly for very large data sets\&. Raises
.B WrongArraySize
if
.B result
is too small to hold the data\&.
.sp
It is possible to share the same preallocated shared memory space across calls, by initialising the space with
.B init_shared_buffer a
and passing the result as the optional
.B sharedbuffer
parameter to each subsequent call to
.B array_float_parmap
\&. Raises
.B WrongArraySize
if
.B sharedbuffer
is too small to hold the input data\&.
.sp
.sp
.PP
.B ===
.B Parallel map on float arrays, indexed
.B ===
.PP
.I val array_float_parmapi :
.B ?init:(int -> unit) ->
.B ?finalize:(unit -> unit) ->
.B ?ncores:int ->
.B ?chunksize:int ->
.B ?result:float array ->
.B ?sharedbuffer:buf -> (int -> 'a -> float) -> 'a array -> float array
.sp
Like array_float_parmap, but the map function receives the index of the mapped element as an extra argument\&.
.sp
.sp
.PP
.B ===
.B Debugging
.B ===
.PP
.I val debugging :
.B bool -> unit
.sp
Enable or disable debugging code in the library; default: false\&.
.sp
.sp
.PP
.B ===
.B Helper function for redirection of stdout and stderr
.B ===
.PP
.I val redirect :
.B ?path:string -> id:int -> unit
.sp
Helper function that redirects stdout and stderr to files located in the directory
.B path
, with names of the form
.B stdout\&.NNN
and
.B stderr\&.NNN
, where NNN is the id of the core in use\&. Useful when writing initialisation functions to be passed as the
.B init
argument to the parallel combinators\&. The default value for
.B path
is /tmp/\&.parmap\&.PPPP, with PPPP the process id of the main program\&.
.PP
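.PP
As an illustration (a sketch only: the core count, chunk size and the log directory /tmp/parmap_logs are arbitrary example values, and the directory is assumed to exist), init_shared_buffer, array_float_parmap and redirect can be combined as follows:
.PP
.RS
.nf
(* Compute square roots over a float array, reusing one shared
   buffer across two calls and redirecting the output of each
   worker to per-core files. *)
let () =
  let input = Array.init 100_000 float_of_int in
  let buf = Parmap.init_shared_buffer input in
  let roots =
    Parmap.array_float_parmap
      ~init:(fun id -> Parmap.redirect ~path:"/tmp/parmap_logs" ~id)
      ~ncores:4 ~chunksize:10_000 ~sharedbuffer:buf
      sqrt input
  in
  (* Reuse the same shared buffer for a second pass over the input. *)
  let halved =
    Parmap.array_float_parmap ~ncores:4 ~sharedbuffer:buf
      (fun x -> x /. 2.0) input
  in
  Printf.printf "roots.(81) = %f, halved.(10) = %f" roots.(81) halved.(10);
  print_newline ()
.fi
.RE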