Hashmap and Hashset
Monkey Forums/Monkey Code/Hashmap and Hashset
| ||
In an attempt to learn monkey I created a HashMap class and a HashSet class that extends it. I am putting them here because--in my empirical testing--I noticed both classes consistently outperformed the Red-Black binary tree based Map and Set that come with Monkey, especially for sizes > 1000. Which makes sense given the asymptotic complexity of both. Following are the times I saw in bench marking. Diddy's RealMillisecs was used to time the benchmarks. All tests were done in release mode. HTML 500: HashSet<string>: 2 StringSet: 2 HTML 5000 HashSet<string>: 5 StringSet: 13 HTML 500000: HashSet<string>: 1037 StringSet: 2039 (GLFW was too fast to see a notable difference till the larger test case. Also the timer wasn't fine-grained enough) GLFW 5000000: HashSet<string>: 7000 StringSet: 21000 Feel free to use the code, change it or ignore it as you see fit. I hereby release it under public domain. Class Hasher<T> Method hashCode:int(i:T) Return 0 End Method equals:Bool(i:T, i2:T) Return False End End Class IntHasher Extends Hasher<int> Method hashCode:int(i:int) Return i End Method equals:Bool(i:int, i2:int) Return i = i2 End End Class StringHasher Extends Hasher<String> Method hashCode:int(i:String) Local hash:Int = 1 Local prime:Int = 31 For Local j:Int = 0 To i.Length() -1 hash = prime * hash + i[j] Next Return hash End Method equals:Bool(i:String, i2:String) Return i.Compare(i2) = 0 End End ' A hashtable element represents an entry in the hash map/table. ' Works like a singly linked list to deal with hash collisions. Class HTElem<K, V> Field key:K Field value:V Field nxt:HTElem<K, V> Method New(k:K, v:V, n:HTElem<K, V>) key = k value = v nxt = n End End ' A hash map class that uses chained bucketing to deal with hash collisions. ' K and V represent the classes of the key and value items in the hash map. ' To use this class one will have to provide a 'Hasher' object defined above. Class HashMap<K, V> Private Field table:HTElem<K, V>[] ' The array that backs the hash table Field numElems:Int = 0 ' Number of elements in the hash table Const MAX_LOAD:Float = 1.75 ' When the hashtable exceedes this density it will resize itself Const GROWTH_FACTOR:= 3 ' The multiple to with the table resizes Const DEFAULT_HT_SIZE:= 11 ' The default hash table size Const SHRINK_FACTOR:= 0.5 Const MIN_LOAD:= 0.3 Field hasher:Hasher<K> ' The hasher object that provides equals and hashCode methods ' Resizes the hash map while keeping all its entries. Method resize:Void(size:int) Local tmp:HTElem<K, V>[] = table table = New HTElem<K, V>[size] For Local i:int = 0 Until tmp.Length Local curr:= tmp[i] While (curr <> Null) Self.add(curr.key, curr.value) curr = curr.nxt Wend Next End ' Gets the location for a key in the current table Method getLoc:Int(key:K) Return Abs(hasher.hashCode(key) Mod table.Length) End Public ' ------Constructors------- ' Create a new hashmap with a specific size Method New(h:Hasher<K>, size:Int) table = New HTElem<K, V>[size] hasher = h End ' Create a default hashmap Method New(h:Hasher<K>) table = New HTElem<K, V>[DEFAULT_HT_SIZE] hasher = h End ' Check if the hashmap contains a specific element Method contains:Bool(key:K) Local location:Int = Self.getLoc(key) Local curr:HTElem<K, V> = table[location] While (curr <> Null) If (hasher.equals(curr.key, key)) Return True EndIf curr = curr.nxt Wend Return False End Method get:V(key:K) Local location:Int = Self.getLoc(key) Local curr:HTElem<K, V> = table[location] While (curr <> Null) If (hasher.equals(curr.key, key)) Return curr.value EndIf curr = curr.nxt Wend Throw New Throwable End ' Add a new key/value pair Method add:Void(key:K, value:V) Local location:Int = Self.getLoc(key) Local tst:= table[location] table[location] = New HTElem<K, V>(key, value, table[location]) Self.numElems += 1 If (Self.numElems / Float(table.Length) >= MAX_LOAD) Self.resize(table.Length * GROWTH_FACTOR) EndIf tst = table[location] End ' Remove all instances of a key Method remove:Void(key:K) Local location:Int = Self.getLoc(key) Local curr:HTElem<K, V> = table[location] Local prev:HTElem<K, V> = Null While (curr <> Null) If (hasher.equals(curr.key, key)) If (prev <> Null) prev.nxt = curr.nxt Else table[location] = table[location].nxt EndIf EndIf prev = curr curr = curr.nxt Wend End End ' A set that is backed by a hashmap Class HashSet<T> Extends HashMap<T, Object> Method New(h:Hasher<K>, size:Int) Super.New(h, size) End Method New(h:Hasher<K>) Super.New(h) End Method add:Void(key:T, value:Object = Null) Super.add(key, Null) End Method remove:bool(key:T) Return Super.remove(key) End Method contains:bool(key:T) Return Super.contains(key) End End Here is an example of how one might use the code: Local h:HashSet<String> = New HashSet<String>(New StringHasher()) h.add("hello hashset!") You will have to define your own Hasher classes for different classes. However I might be able to make a generic one using reflection. When using these classes with your own objects, or for any other type I haven't made a Hasher object for, you need to make sure you define a good hashCode method that spreads out entries evenly over the hash table. I suggest following the method used by StringHasher in the above code for every field element you want to use to form the hash code. The hash map will automatically resize itself. However using the right constructor you can give it a starting size other than the default. This can be useful when you know roughly how many elements you will be housing in the hash map and will optimize performance a little. The code has been moderately tested for speed and correctness on the HTML5 and GLFW targets. Feel free to report any bugs, ideas or improvements. |
| ||
Looks neat, though I'll admit that my current roguelike project is the first time I've ever used a Map or a Set for anything! |
| ||
Thanks! I've had to use Maps and Sets for a lot of things. I think of them like a nice swiss army knife for lots of applications. |
| ||
I saw these back when you first posted them and didn't get around to testing them until now (5 months later...). Interestingly, on a small scale, both hash map and map typically ran at the same speed. It wasn't until very large sized maps that the speed difference became noticeable. These are the results from two of my tests (red = hash multimap; blue = multimap) ![]() ![]() All-in-all though, the (A)verage and (T)otal usually leaned towards hash map (most platforms, except flash, which was indecisive). Thank you for making these, 'cause they do help with performance in certain cases where you need to squeeze out a bit more without causing major slowdowns. |