.TH "Hashtbl" 3o 2023-09-20 OCamldoc "OCaml library" .SH NAME Hashtbl \- Hash tables and hash functions. .SH Module Module Hashtbl .SH Documentation .sp Module .BI "Hashtbl" : .B sig end .sp Hash tables and hash functions\&. .sp Hash tables are hashed association tables, with in\-place modification\&. .sp .sp .sp .PP .SS Generic interface .PP .I type .B ('a, 'b) .I t .sp The type of hash tables from type .ft B \&'a .ft R to type .ft B \&'b .ft R \&. .sp .I val create : .B ?random:bool -> int -> ('a, 'b) t .sp .ft B Hashtbl\&.create n .ft R creates a new, empty hash table, with initial size .ft B n .ft R \&. For best results, .ft B n .ft R should be on the order of the expected number of elements that will be in the table\&. The table grows as needed, so .ft B n .ft R is just an initial guess\&. .sp The optional .ft B ~ .ft R .ft B random .ft R parameter (a boolean) controls whether the internal organization of the hash table is randomized at each execution of .ft B Hashtbl\&.create .ft R or deterministic over all executions\&. .sp A hash table that is created with .ft B ~ .ft R .ft B random .ft R set to .ft B false .ft R uses a fixed hash function ( .ft B Hashtbl\&.hash .ft R ) to distribute keys among buckets\&. As a consequence, collisions between keys happen deterministically\&. In Web\-facing applications or other security\-sensitive applications, the deterministic collision patterns can be exploited by a malicious user to create a denial\-of\-service attack: the attacker sends input crafted to create many collisions in the table, slowing the application down\&. .sp A hash table that is created with .ft B ~ .ft R .ft B random .ft R set to .ft B true .ft R uses the seeded hash function .ft B Hashtbl\&.seeded_hash .ft R with a seed that is randomly chosen at hash table creation time\&. In effect, the hash function used is randomly selected among .ft B 2^{30} .ft R different hash functions\&. All these hash functions have different collision patterns, rendering ineffective the denial\-of\-service attack described above\&. However, because of randomization, enumerating all elements of the hash table using .ft B Hashtbl\&.fold .ft R or .ft B Hashtbl\&.iter .ft R is no longer deterministic: elements are enumerated in different orders at different runs of the program\&. .sp If no .ft B ~ .ft R .ft B random .ft R parameter is given, hash tables are created in non\-random mode by default\&. This default can be changed either programmatically by calling .ft B Hashtbl\&.randomize .ft R or by setting the .ft B R .ft R flag in the .ft B OCAMLRUNPARAM .ft R environment variable\&. .sp .B "Before4.00.0" the .ft B ~ .ft R .ft B random .ft R parameter was not present and all hash tables were created in non\-randomized mode\&. .sp .I val clear : .B ('a, 'b) t -> unit .sp Empty a hash table\&. Use .ft B reset .ft R instead of .ft B clear .ft R to shrink the size of the bucket table to its initial size\&. .sp .I val reset : .B ('a, 'b) t -> unit .sp Empty a hash table and shrink the size of the bucket table to its initial size\&. .sp .B "Since" 4.00.0 .sp .I val copy : .B ('a, 'b) t -> ('a, 'b) t .sp Return a copy of the given hashtable\&. .sp .I val add : .B ('a, 'b) t -> 'a -> 'b -> unit .sp .ft B Hashtbl\&.add tbl key data .ft R adds a binding of .ft B key .ft R to .ft B data .ft R in table .ft B tbl .ft R \&. Previous bindings for .ft B key .ft R are not removed, but simply hidden\&. That is, after performing .ft B Hashtbl\&.remove .ft R .ft B tbl key .ft R , the previous binding for .ft B key .ft R , if any, is restored\&. (Same behavior as with association lists\&.) .sp .I val find : .B ('a, 'b) t -> 'a -> 'b .sp .ft B Hashtbl\&.find tbl x .ft R returns the current binding of .ft B x .ft R in .ft B tbl .ft R , or raises .ft B Not_found .ft R if no such binding exists\&. .sp .I val find_opt : .B ('a, 'b) t -> 'a -> 'b option .sp .ft B Hashtbl\&.find_opt tbl x .ft R returns the current binding of .ft B x .ft R in .ft B tbl .ft R , or .ft B None .ft R if no such binding exists\&. .sp .B "Since" 4.05 .sp .I val find_all : .B ('a, 'b) t -> 'a -> 'b list .sp .ft B Hashtbl\&.find_all tbl x .ft R returns the list of all data associated with .ft B x .ft R in .ft B tbl .ft R \&. The current binding is returned first, then the previous bindings, in reverse order of introduction in the table\&. .sp .I val mem : .B ('a, 'b) t -> 'a -> bool .sp .ft B Hashtbl\&.mem tbl x .ft R checks if .ft B x .ft R is bound in .ft B tbl .ft R \&. .sp .I val remove : .B ('a, 'b) t -> 'a -> unit .sp .ft B Hashtbl\&.remove tbl x .ft R removes the current binding of .ft B x .ft R in .ft B tbl .ft R , restoring the previous binding if it exists\&. It does nothing if .ft B x .ft R is not bound in .ft B tbl .ft R \&. .sp .I val replace : .B ('a, 'b) t -> 'a -> 'b -> unit .sp .ft B Hashtbl\&.replace tbl key data .ft R replaces the current binding of .ft B key .ft R in .ft B tbl .ft R by a binding of .ft B key .ft R to .ft B data .ft R \&. If .ft B key .ft R is unbound in .ft B tbl .ft R , a binding of .ft B key .ft R to .ft B data .ft R is added to .ft B tbl .ft R \&. This is functionally equivalent to .ft B Hashtbl\&.remove .ft R .ft B tbl key .ft R followed by .ft B Hashtbl\&.add .ft R .ft B tbl key data .ft R \&. .sp .I val iter : .B ('a -> 'b -> unit) -> ('a, 'b) t -> unit .sp .ft B Hashtbl\&.iter f tbl .ft R applies .ft B f .ft R to all bindings in table .ft B tbl .ft R \&. .ft B f .ft R receives the key as first argument, and the associated value as second argument\&. Each binding is presented exactly once to .ft B f .ft R \&. .sp The order in which the bindings are passed to .ft B f .ft R is unspecified\&. However, if the table contains several bindings for the same key, they are passed to .ft B f .ft R in reverse order of introduction, that is, the most recent binding is passed first\&. .sp If the hash table was created in non\-randomized mode, the order in which the bindings are enumerated is reproducible between successive runs of the program, and even between minor versions of OCaml\&. For randomized hash tables, the order of enumeration is entirely random\&. .sp The behavior is not specified if the hash table is modified by .ft B f .ft R during the iteration\&. .sp .I val filter_map_inplace : .B ('a -> 'b -> 'b option) -> ('a, 'b) t -> unit .sp .ft B Hashtbl\&.filter_map_inplace f tbl .ft R applies .ft B f .ft R to all bindings in table .ft B tbl .ft R and update each binding depending on the result of .ft B f .ft R \&. If .ft B f .ft R returns .ft B None .ft R , the binding is discarded\&. If it returns .ft B Some new_val .ft R , the binding is update to associate the key to .ft B new_val .ft R \&. .sp Other comments for .ft B Hashtbl\&.iter .ft R apply as well\&. .sp .B "Since" 4.03.0 .sp .I val fold : .B ('a -> 'b -> 'c -> 'c) -> ('a, 'b) t -> 'c -> 'c .sp .ft B Hashtbl\&.fold f tbl init .ft R computes .ft B (f kN dN \&.\&.\&. (f k1 d1 init)\&.\&.\&.) .ft R , where .ft B k1 \&.\&.\&. kN .ft R are the keys of all bindings in .ft B tbl .ft R , and .ft B d1 \&.\&.\&. dN .ft R are the associated values\&. Each binding is presented exactly once to .ft B f .ft R \&. .sp The order in which the bindings are passed to .ft B f .ft R is unspecified\&. However, if the table contains several bindings for the same key, they are passed to .ft B f .ft R in reverse order of introduction, that is, the most recent binding is passed first\&. .sp If the hash table was created in non\-randomized mode, the order in which the bindings are enumerated is reproducible between successive runs of the program, and even between minor versions of OCaml\&. For randomized hash tables, the order of enumeration is entirely random\&. .sp The behavior is not specified if the hash table is modified by .ft B f .ft R during the iteration\&. .sp .I val length : .B ('a, 'b) t -> int .sp .ft B Hashtbl\&.length tbl .ft R returns the number of bindings in .ft B tbl .ft R \&. It takes constant time\&. Multiple bindings are counted once each, so .ft B Hashtbl\&.length .ft R gives the number of times .ft B Hashtbl\&.iter .ft R calls its first argument\&. .sp .I val randomize : .B unit -> unit .sp After a call to .ft B Hashtbl\&.randomize() .ft R , hash tables are created in randomized mode by default: .ft B Hashtbl\&.create .ft R returns randomized hash tables, unless the .ft B ~random:false .ft R optional parameter is given\&. The same effect can be achieved by setting the .ft B R .ft R parameter in the .ft B OCAMLRUNPARAM .ft R environment variable\&. .sp It is recommended that applications or Web frameworks that need to protect themselves against the denial\-of\-service attack described in .ft B Hashtbl\&.create .ft R call .ft B Hashtbl\&.randomize() .ft R at initialization time\&. .sp Note that once .ft B Hashtbl\&.randomize() .ft R was called, there is no way to revert to the non\-randomized default behavior of .ft B Hashtbl\&.create .ft R \&. This is intentional\&. Non\-randomized hash tables can still be created using .ft B Hashtbl\&.create ~random:false .ft R \&. .sp .B "Since" 4.00.0 .sp .I val is_randomized : .B unit -> bool .sp Return .ft B true .ft R if the tables are currently created in randomized mode by default, .ft B false .ft R otherwise\&. .sp .B "Since" 4.03.0 .sp .I val rebuild : .B ?random:bool -> ('a, 'b) t -> ('a, 'b) t .sp Return a copy of the given hashtable\&. Unlike .ft B Hashtbl\&.copy .ft R , .ft B Hashtbl\&.rebuild .ft R .ft B h .ft R re\-hashes all the (key, value) entries of the original table .ft B h .ft R \&. The returned hash table is randomized if .ft B h .ft R was randomized, or the optional .ft B random .ft R parameter is true, or if the default is to create randomized hash tables; see .ft B Hashtbl\&.create .ft R for more information\&. .sp .ft B Hashtbl\&.rebuild .ft R can safely be used to import a hash table built by an old version of the .ft B Hashtbl .ft R module, then marshaled to persistent storage\&. After unmarshaling, apply .ft B Hashtbl\&.rebuild .ft R to produce a hash table for the current version of the .ft B Hashtbl .ft R module\&. .sp .B "Since" 4.12.0 .sp .I type statistics = { num_bindings : .B int ; (* Number of bindings present in the table\&. Same value as returned by .ft B Hashtbl\&.length .ft R \&. *) num_buckets : .B int ; (* Number of buckets in the table\&. *) max_bucket_length : .B int ; (* Maximal number of bindings per bucket\&. *) bucket_histogram : .B int array ; (* Histogram of bucket sizes\&. This array .ft B histo .ft R has length .ft B max_bucket_length + 1 .ft R \&. The value of .ft B histo\&.(i) .ft R is the number of buckets whose size is .ft B i .ft R \&. *) } .sp .B "Since" 4.00.0 .sp .I val stats : .B ('a, 'b) t -> statistics .sp .ft B Hashtbl\&.stats tbl .ft R returns statistics about the table .ft B tbl .ft R : number of buckets, size of the biggest bucket, distribution of buckets by size\&. .sp .B "Since" 4.00.0 .sp .PP .SS Hash tables and Sequences .PP .I val to_seq : .B ('a, 'b) t -> ('a * 'b) Seq.t .sp Iterate on the whole table\&. The order in which the bindings appear in the sequence is unspecified\&. However, if the table contains several bindings for the same key, they appear in reversed order of introduction, that is, the most recent binding appears first\&. .sp The behavior is not specified if the hash table is modified during the iteration\&. .sp .B "Since" 4.07 .sp .I val to_seq_keys : .B ('a, 'b) t -> 'a Seq.t .sp Same as .ft B Seq\&.map fst (to_seq m) .ft R .sp .B "Since" 4.07 .sp .I val to_seq_values : .B ('a, 'b) t -> 'b Seq.t .sp Same as .ft B Seq\&.map snd (to_seq m) .ft R .sp .B "Since" 4.07 .sp .I val add_seq : .B ('a, 'b) t -> ('a * 'b) Seq.t -> unit .sp Add the given bindings to the table, using .ft B Hashtbl\&.add .ft R .sp .B "Since" 4.07 .sp .I val replace_seq : .B ('a, 'b) t -> ('a * 'b) Seq.t -> unit .sp Add the given bindings to the table, using .ft B Hashtbl\&.replace .ft R .sp .B "Since" 4.07 .sp .I val of_seq : .B ('a * 'b) Seq.t -> ('a, 'b) t .sp Build a table from the given bindings\&. The bindings are added in the same order they appear in the sequence, using .ft B Hashtbl\&.replace_seq .ft R , which means that if two pairs have the same key, only the latest one will appear in the table\&. .sp .B "Since" 4.07 .sp .PP .SS Functorial interface .PP .PP The functorial interface allows the use of specific comparison and hash functions, either for performance/security concerns, or because keys are not hashable/comparable with the polymorphic builtins\&. .sp For instance, one might want to specialize a table for integer keys: .EX .ft B .br \& module IntHash = .br \& struct .br \& type t = int .br \& let equal i j = i=j .br \& let hash i = i land max_int .br \& end .br \& .br \& module IntHashtbl = Hashtbl\&.Make(IntHash) .br \& .br \& let h = IntHashtbl\&.create 17 in .br \& IntHashtbl\&.add h 12 "hello" .br \& .ft R .EE .sp This creates a new module .ft B IntHashtbl .ft R , with a new type .ft B \&'a .br \& IntHashtbl\&.t .ft R of tables from .ft B int .ft R to .ft B \&'a .ft R \&. In this example, .ft B h .ft R contains .ft B string .ft R values so its type is .ft B string IntHashtbl\&.t .ft R \&. .sp Note that the new type .ft B \&'a IntHashtbl\&.t .ft R is not compatible with the type .ft B (\&'a,\&'b) Hashtbl\&.t .ft R of the generic interface\&. For example, .ft B Hashtbl\&.length h .ft R would not type\-check, you must use .ft B IntHashtbl\&.length .ft R \&. .PP .I module type HashedType = .B sig end .sp The input signature of the functor .ft B Hashtbl\&.Make .ft R \&. .sp .I module type S = .B sig end .sp The output signature of the functor .ft B Hashtbl\&.Make .ft R \&. .sp .I module Make : .B functor (H : HashedType) -> sig end .sp Functor building an implementation of the hashtable structure\&. The functor .ft B Hashtbl\&.Make .ft R returns a structure containing a type .ft B key .ft R of keys and a type .ft B \&'a t .ft R of hash tables associating data of type .ft B \&'a .ft R to keys of type .ft B key .ft R \&. The operations perform similarly to those of the generic interface, but use the hashing and equality functions specified in the functor argument .ft B H .ft R instead of generic equality and hashing\&. Since the hash function is not seeded, the .ft B create .ft R operation of the result structure always returns non\-randomized hash tables\&. .sp .I module type SeededHashedType = .B sig end .sp The input signature of the functor .ft B Hashtbl\&.MakeSeeded .ft R \&. .sp .B "Since" 4.00.0 .sp .I module type SeededS = .B sig end .sp The output signature of the functor .ft B Hashtbl\&.MakeSeeded .ft R \&. .sp .B "Since" 4.00.0 .sp .I module MakeSeeded : .B functor (H : SeededHashedType) -> sig end .sp Functor building an implementation of the hashtable structure\&. The functor .ft B Hashtbl\&.MakeSeeded .ft R returns a structure containing a type .ft B key .ft R of keys and a type .ft B \&'a t .ft R of hash tables associating data of type .ft B \&'a .ft R to keys of type .ft B key .ft R \&. The operations perform similarly to those of the generic interface, but use the seeded hashing and equality functions specified in the functor argument .ft B H .ft R instead of generic equality and hashing\&. The .ft B create .ft R operation of the result structure supports the .ft B ~ .ft R .ft B random .ft R optional parameter and returns randomized hash tables if .ft B ~random:true .ft R is passed or if randomization is globally on (see .ft B Hashtbl\&.randomize .ft R )\&. .sp .B "Since" 4.00.0 .sp .PP .SS The polymorphic hash functions .PP .I val hash : .B 'a -> int .sp .ft B Hashtbl\&.hash x .ft R associates a nonnegative integer to any value of any type\&. It is guaranteed that if .ft B x = y .ft R or .ft B Stdlib\&.compare x y = 0 .ft R , then .ft B hash x = hash y .ft R \&. Moreover, .ft B hash .ft R always terminates, even on cyclic structures\&. .sp .I val seeded_hash : .B int -> 'a -> int .sp A variant of .ft B Hashtbl\&.hash .ft R that is further parameterized by an integer seed\&. .sp .B "Since" 4.00.0 .sp .I val hash_param : .B int -> int -> 'a -> int .sp .ft B Hashtbl\&.hash_param meaningful total x .ft R computes a hash value for .ft B x .ft R , with the same properties as for .ft B hash .ft R \&. The two extra integer parameters .ft B meaningful .ft R and .ft B total .ft R give more precise control over hashing\&. Hashing performs a breadth\-first, left\-to\-right traversal of the structure .ft B x .ft R , stopping after .ft B meaningful .ft R meaningful nodes were encountered, or .ft B total .ft R nodes (meaningful or not) were encountered\&. If .ft B total .ft R as specified by the user exceeds a certain value, currently 256, then it is capped to that value\&. Meaningful nodes are: integers; floating\-point numbers; strings; characters; booleans; and constant constructors\&. Larger values of .ft B meaningful .ft R and .ft B total .ft R means that more nodes are taken into account to compute the final hash value, and therefore collisions are less likely to happen\&. However, hashing takes longer\&. The parameters .ft B meaningful .ft R and .ft B total .ft R govern the tradeoff between accuracy and speed\&. As default choices, .ft B Hashtbl\&.hash .ft R and .ft B Hashtbl\&.seeded_hash .ft R take .ft B meaningful = 10 .ft R and .ft B total = 100 .ft R \&. .sp .I val seeded_hash_param : .B int -> int -> int -> 'a -> int .sp A variant of .ft B Hashtbl\&.hash_param .ft R that is further parameterized by an integer seed\&. Usage: .ft B Hashtbl\&.seeded_hash_param meaningful total seed x .ft R \&. .sp .B "Since" 4.00.0 .sp