Hash Tables in Javascript

Introduction

A hash table is a permutation of an associative array (i.e. name => value pairs). If you use PHP, then you are very familiar with this type of data structure already since all PHP arrays are associative. (That's how I became familiar with them).
In JavaScript, all non-scalar objects behave as associative arrays. The object can use other objects as keys. However, the length of the associative array is not tracked (like with index-based arrays) and there is the potential that the keys can conflict with built-in members (such as those added to the object prototype) and custom members such as the length property or convenience methods. We'll explore an approach that can avoid these shortcomings.
A short example of an associative array (hash table) in JavaScript is as follows:
var h = new Object(); // or just {}
h['one'] = 1;
h['two'] = 2;
h['three'] = 3;

// show the values stored
for (var k in h) {
    // use hasOwnProperty to filter out keys from the Object.prototype
    if (h.hasOwnProperty(k)) {
        alert('key is: ' + k + ', value is: ' + h[k]);
    }
}
        
TIP: Feel free to replace alert() with console.log() if it's supported by your browser (Chrome, Firefox with FireBug, etc).
Just as in PHP, the 'foreach' contruct is used to run through the array, doing something for each key => value pair. But what if we wanted to know the size?
alert('size of hash table ' + h.length);
        
Hmm. This reference to the length property gives the somewhat unexpected value of 'undefined'. That's because in Javascript the length property of an object is not incremented when new properties (keys) are added (meaning it doesn't really behave like a hash table).

Fundamentals

After a brief discussion of fundamentals we will begin to focus on the core problem. In JavaScript, every non-scalar variable is an object. Okay, so what does this mean? Well, essentially, this means it has a constructor, methods and properties. A property is just a variable that is owned by the object and thus local to that object. A property is accessed using the syntax:
h.one
        
where one is the property and the '.' symbol signifies we are talking about the property of the object obj. It can also be assigned in the same way:
h.one = 1;
        
The above example could be alternatively executed as:
for (var k in h) {
    if (h.hasOwnProperty(k)) {
        alert('key is: ' + k + ', value is: ' + eval('h.' + k));
    }
}
        
WARNING: We are using eval() here only for demonstration purposes to show that direct property reference (obj.one) produces the same result at bracket notation (obj['one']).
It's also possible to add properties to the object whose type is an array, which mixes assignment to the indexed values of the array and the members of the object.
var a = new Array(); // or just []
a[0] = 0
a['one'] = 1;
a['two'] = 2;
a['three'] = 3;

for (var k in a) {
    if (a.hasOwnProperty(k)) {
        alert('key is: ' + k + ', value is: ' + a[k]);
    }
}
alert(a.length);
        
Oddly, the length is reported to be 1, yet four keys are printed. That's because we are manipulating both the array elements and the underlying object. That's just the flexibility of JavaScript shining through (and confusing us). The square bracket notation (i.e., []) is overloaded. When the key is numeric, we are assigning elements to the array. Otherwise, we are assigning members to the object.
Since each object has default properties that are accessed using this very same syntax, such as length and constructor, consider the case where the key in the hash is the same as one of these properties. This situation highlights the fundamental problem with associative arrays in JavaScript. It should be clear now why the length property is not set when we make an associative array data structure. By treating the object as an associative array by adding non-integer keys, you are manipulating the underlying JavaScript object, which is not tracked by the length (the length is for the indexed key values only).

Constructing a HashTable Class

In Javascript, we can create our own classes. So what we are going to do is create a HashTable() class that can maintain an associative array, but isolate API methods from data keys (no conflicts).
You might thinking, "okay, so we make a class, but how do we get around the conflicting properties problem?". Easy, we make a property which itself is an associative array and call it items. Then, we can use any key we want, and store the data about the array in other properties. The trick is to move the data part of the associative array inside of a property of the class. The following listing is the HashTable() object definition:
function HashTable(obj)
{
    this.length = 0;
    this.items = {};
    for (var p in obj) {
        if (obj.hasOwnProperty(p)) {
            this.items[p] = obj[p];
            this.length++;
        }
    }
}
        
NOTE: You should select another name for this class if you are using a JavaScript library that already defines this function.
Let's break this down a bit. Right off the bat, we create a length property, which will just be 0 to start with. Additionally, we accept an object in our constructor. Next we populate the internal items with the key => value pairs of the object passed to the constructor and increment the length coorespondingly. A typical call to create a HashTable() object would use the following syntax:
var h = new HashTable({one: 1, two: 2, three: 3, "i'm no 4": 4});
        
Already it must be nice to see a contructor...so much easier to add data to the structure!
Now, as you may recall before, we couldn't have any properties or methods in our associative array (because they could conflict with the keys of the data), so besides a 'foreach' construct, there was not much we could do with our associative array. Now that they are separated, we have the ability to add methods and properties, let's get started!
function HashTable(obj)
{
    this.length = 0;
    this.items = {};
    for (var p in obj) {
        if (obj.hasOwnProperty(p)) {
            this.items[p] = obj[p];
            this.length++;
        }
    }

    this.setItem = function(key, value)
    {
        var previous = undefined;
        if (this.hasItem(key)) {
            previous = this.items[key];
        }
        else {
            this.length++;
        }
        this.items[key] = value;
        return previous;
    }

    this.getItem = function(key) {
        return this.hasItem(key) ? this.items[key] : undefined;
    }

    this.hasItem = function(key)
    {
        return this.items.hasOwnProperty(key);
    }
   
    this.removeItem = function(key)
    {
        if (this.hasItem(key)) {
            previous = this.items[key];
            this.length--;
            delete this.items[key];
            return previous;
        }
        else {
            return undefined;
        }
    }

    this.keys = function()
    {
        var keys = [];
        for (var k in this.items) {
            if (this.hasItem(k)) {
                keys.push(k);
            }
        }
        return keys;
    }

    this.values = function()
    {
        var values = [];
        for (var k in this.items) {
            if (this.hasItem(k)) {
                values.push(this.items[k]);
            }
        }
        return values;
    }

    this.each = function(fn) {
        for (var k in this.items) {
            if (this.hasItem(k)) {
                fn(k, this.items[k]);
            }
        }
    }

    this.clear = function()
    {
        this.items = {}
        this.length = 0;
    }
}
        
NOTE: Alternatively, you could refer directly to the constructor argument, obj, rather than copying it to the member property, items. It's up to you.
Let's put the HashTable() through some exercise.
var h = new HashTable({one: 1, two: 2, three: 3, "i'm no 4": 4});

alert('original length: ' + h.length);
alert('value of key "one": ' + h.getItem('one'));
alert('has key "foo"? ' + h.hasItem('foo'));
alert('previous value of key "foo": ' + h.setItem('foo', 'bar'));
alert('length after setItem: ' + h.length);
alert('value of key "foo": ' + h.getItem('foo'));
alert('value of key "i'm no 4": ' + h.getItem("i'm no 4"));
h.clear();
alert('length after clear: ' + h.length);
        
These calls should produce the following output:
  • original length: 4
  • value of key "one": 1
  • has key "foo"? false
  • previous value of key "foo": undefined
  • length after setItem: 5
  • value of key "foo": bar
  • value of key "i'm no 4": 4
  • length after clear: 0
Let's find out how this all works.

Understanding the Implementation

We now have lots of useful methods! In JavaScript, any variable can be a reference to a function, so to add methods to the class, the easiest way to do it is to just write the function and then assign it to a property in the class. Okay, so you may be thinking, "But I can't have the same property name as a method name." That's right, another limitation of JavaScript objects is that methods are properties. However, in most cases, it won't be a problem because method names should be 'behavior' names and properties should be 'state' names.
In order to access the underlying items, we added the methods 'setItem', 'getItem', 'hasItem', 'keys', 'values' to set and retrieve data and 'remoteItem' and 'clear' methods to flush out the data. For now we will refer to each key => value pair as an item. Not that 'getItem' is added merely for completeness. You could just acccess the items property direct to retrieve values by key (which would be slightly faster anyway). The 'hasItem' method uses the 'hasOwnProperty' method to check that a key belong to the items object and isn't a function that has been added to the prototype of Object (which is done by libraries such as prototype.js).
The most important role of our methods is to keep the length property up to date. As we can see, it takes a lot of work out of our job and we can use these nice methods to work easily with our hash. Just like a HashTable in Java, the return value is a reference to the item in the HashTable() that was replaced:
alert("Previous value: " + h.setItem('foobar', 'hey'));
        
If you now want to iterate through the HashTable() like we did the object in the very beginning, you may do so using several different approaches:
Iterating the items, filtering out members inherited from the Object.prototype:
for (var k in h.items) {
    if (h.hasItem(k)) {
        alert('key is: ' + k + ', value is: ' + h.items[k]);
    }
}
        
Iterating the entries using each: (notice we don't have to use hasOwnProperty in this case)
h.each(function(k, v) {
    alert('key is: ' + k + ', value is: ' + v);
});
        
Iterating the collection of keys:
for (var i = 0, keys = h.keys(), len = keys.length; i < len; i++) {
    alert('key is: ' + keys[i] + ', value is: ' + h.getItem(keys[i]));
}
        
Iterating the collection of values:
for (var i = 0, v = h.values(), len = v.length; i < len; i++) {
    alert('value is: ' + v[i]);
}
        
You can also find out the size of the hash table:
alert('size of hash table: ' + h.length);
        

No comments:

Post a Comment

Genuine websites to earn money.

If you are interested in PTC sites then this article is for you. I have personally tried many of the sites and found that the best thing ...