Performance Optimization of Arrays - Part II
Hello Friends,
Hope you have read part I of this topic which I posted few days ago. If not then I would suggest going through that first as this post is just a continuation of that one.
So coming to the point, here is the second reason -
2. “JavaScript arrays are sparse arrays” – JavaScript allows arrays to be sparse therefore, when you write arrayObj = new Array(100); unlike C runtime, JScript doesn’t allocate any slot/memory for those 100 entries up front. It allocates slot for an index only when an index is to be populated with a value, for example…
arrayObj = new Array(100);
arrayObj[10] = 10;
In this example, the array size is 100, but in the actual physical memory there is only one slot allocated for this array. Remaining 99 slots don’t exist at all.
Wondering how this factor contributes to the bad performance? Not allocating memory for all the slots up front is good design. Isn’t it?
Well, JScript always assumed that all arrays are sparse. So even if you had a fully dense array in your code, JScript runtime would treat it like a sparse array only. So if you are going to do a pop() operation on a dense JScript array, JScript just won’t go and delete the last indexed entry and update the length attribute. It does something which is extremely performant for sparse arrays, but equally under-performant for dense arrays. Let‘s see what it is all about.
Internally, for each Array object, JScript runtime maintains a table (different from HashTable, let’s call it TrackerTable) which is nothing but a list of pointers to actual entries in the hash table. So if you just create an Array Object of size 100, and add 10 entries (indexed or named), the TrackerTable will have 10 pointers pointing to actual entries in the hash table. Let’s take one simple example…
arrayObj = new Array(100);
arrayObj[10] = 10;
arrayObj.length = 90;
In this example, I just create an Array of size 100, and populated index 10. Next I reduce length of array to 90. Since the length has been reduced, entries from 90 to 99 have to be deleted. The ideal thing to do would be to delete <key, value> pair from hash table for key = 90 to 99. That means 10 operations on Hash table. JScript is smart here and saves 9 out of 10 operations.
What JScript actually does is that it goes to the TrackerTable, iterates through it, for every entry it checks if it falls between deletion range (90-99 in this case), if yes then delete it. In above example, TrackerTable had only one entry. So instead of 10 operations on hash table, JScript does only one iteration and performs better.
However what if the array was a dense array and all the indexed entries from 0 to 100 were populated? Unfortunately in this case too, JScript would follow the same logic. Therefore it would do 100 iterations over TrackerTable and end up doing 90 more operations.
How it was fixed– As I said in the last post, in IE8 JScript we have the mechanism in place to decide whether an array is a sparse or dense. So we have changed the implementation to take sparse path only when the array is actually sparse. That means for operations on dense arrays, we don’t use TrackerTable anymore. We directly go to the right storage and get the things done in fast manner.
So this is all I had to tell about array improvements. Hope you got an idea of what was/is happening under the hood? If not, do leave a note and I would try to address that. And yes, I have lot many interesting things to share with you all, so keep checking this blog.
-JP
Comments
Anonymous
April 09, 2008
The comment has been removedAnonymous
April 09, 2008
@julian - There is no change in array behavior. Only the under-the-hood implementation has changed. Yes we are aware of Array-Subclass issue. Thanks for reporting it. -JPAnonymous
April 10, 2008
@Jaiprakash: I think you mean "It is still NOT a Native JavaScript Object" If it were a native JavaScript object, subclassing and accessing the hierarchy constructor info would be possible. @JScript team: If you can work on making the Array a native JS object for IE8, you would win over many developers! Currently things like this just make Firefox, Safari, Konqueror, Opera, Camino, etc. far superior browsers.Anonymous
April 10, 2008
@steve: What does it being a native JavaScript Object or not have to do with being able to do with what you want to do? The main issue is that there is a catch-all setter function - in ECMA-262 3ed parlance a custom [[Put]] (P, V) method - which is not a prototype property, it's an instance property. The ECMAScript spec doesn't allow inheriting that behaviour! There's also minor problems in the Array prototype methods, but those are ignorable for most cases. Yes, it'd be great if one could inherit the catch-all setter using prototypes, but that issue has nothing to do with whether Arrays are true "Native JavaScript Objects" or not. @Jaiprakash: Regarding that bug i pointed out in the comments of Part I (http://blogs.msdn.com/jscript/archive/2008/03/25/performance-optimization-of-arrays-part-i.aspx#8369217) - just wanted to make sure that you noted that Array.prototype.push is not the only Array.prototype method that is affected. All Array.prototype methods that can extend the length property are affected by similar bugs when the resulting array length would exceed 2^32.Anonymous
April 15, 2008
@liorean The reason why I want a native Array object (which JScript does NOT offer), is that I want to add methods to Arrays that I use as a List (key==integer index), but not to Arrays that are used as Collections (key==random index). For example an .indexOf(value) method that returned the (first) index of a given value would be very handy. In other browsers, I can subclass the Array, and make a "List" Object... that I can then add this method to. I can also "cast" it back and forth to/from a True JavaScript Array as and when I feel like it. I appreciate that MS has fixed the major performance issues with Arrays, but I would also like them to fix the fact that it (as are many other objects), not a Subclass-able object. But i digress.Anonymous
April 16, 2008
@steve - Unfortunately it is a bug in our implementation that we don’t update “length” property properly when a user defined object’s prototype is set to Array.prototype and any of the array operations are performed on instances of such objects. However I don't think that just because of this one bug, conclusion should be drawn that Array object in Jscript is not a "native javascript object". Jscript Array object is indeed a Native JavaScript object. All operations, which are expected to work on a JavaScript Array object (x = new Array() or x = []), work perfectly in our implementation.
- JP
- Anonymous
July 28, 2008
@Steve Not being able to subclass Array is not a bug for the exact reason Liorean stated. It is not a bug. @Jaiprakash it is a bug in the JScript implementation that the the "length" property is not updated on a native EcmaScript object. It does not have anything to do with the "user defined object’s prototype being set to Array.prototype." In fact, Array .prototype.push clearly states: "NOTE The push function is intentionally generic; it does not require that its this value be an Array object. Therefore it can be transferred to other kinds of objects for use as a method. Whether the push function can be applied successfully to a host object is implementation-dependent." Making such claims that "the length property is not updated properly when a user defined object’s prototype is set to Array.prototype" is misleading. It seems to be entering into the cloudy thinking of how web developers imagine JavaScript works and how they want IE to work. It is not relevant to standards. We can see in the following example that calling push on a non-array object will not result in the "length" property being properly updated. This seems to be a bug in JScript's implementation of the push() method, and I have annotated where I think JScript is going wrong further down below. <script> var p = {}; Array.prototype.push.call(p, "one slice"); document.write("p.length: " + p.length + "<br>"); document.write("p[0] " + p[0] + "<br>"); </script> Results: IE8b1: | p.length: undefined | p[0] undefined FF3, Safari3, Opera9: | p.length: 1 | p[0] one slice To see what the expected behavior should be, lets look at Array.prototype.push:
15.4.4.7 Array.prototype.push ... 1. Call the [[Get]] method of this object with argument "length". 2. Let n be the result of calling ToUint32(Result(1)). 3. Get the next argument in the argument list; if there are no more arguments, go to step 7. 4. Call the [[Put]] method of this object with arguments ToString(n) and Result(3). 5. Increase n by 1. 6. Go to step 3. 7. Call the [[Put]] method of this object with arguments "length" and n. 8. Return n.
| 1. Call the [[Get]] method of this object with argument "length". undefined | 2. Let n be the result of calling ToUint32(Result(1)) That would result in ToUint32(undefined). This behavior is specified:
9.6 ToUint32: (Unsigned 32 Bit Integer) The operator ToUint32 converts its argument to one of 232 integer values in the range 0 through 232-1, inclusive. This operator functions as follows:
- Call ToNumber on the input argument.
- If Result(1) is NaN, +0, -0, +∞, or -∞, return +0.
- Compute sign(Result(1)) * floor(abs(Result(1))).
- Compute Result(3) modulo 232 ; that is, a finite integer value k of Number type with positive sign and less than 232 in magnitude such the mathematical difference of Result(3) and k is mathematically an integer multiple of 232 .
- Return Result(4).
| 1. Call ToNumber on the input argument. Now we have to look at ToNumber:-
9.3 ToNumber The operator ToNumber converts its argument to a value of type Number according to the following table: Input Type Result Undefined NaN ...
ToNumber returns NaN to ToUint32.
- Call ToNumber on the input argument.
- If Result(1) is NaN, +0, -0, +∞, or -∞, return +0.
So ToUint32 returns +0 to Array.prototype.push
15.4.4.7 Array.prototype.push 1. Call the [[Get]] method of this object with argument "length".
result: undefined
2. Let n be the result of calling ToUint32(Result(1)).
n = 0
3. Get the next argument in the argument list; if there are no more arguments, go to step 7.
"one slice"
4. Call the [[Put]] method of this object with arguments ToString(n) and Result(3).
result: this["0"] = "one slice"
5. Increase n by 1.
n = 1;
6. Go to step 3.
go to step 7.
7. Call the [[Put]] method of this object with arguments "length" and n.
result: this["length"] = 1
8. Return n.
In this example, p does not initially have a "length" property. When Array.prototype.push is called, p.length is undefined. This value, undefined, is passed to ToUint32, which calls ToNumber(undefined). ToNumber returns NaN to ToUint32, which returns +0 to step 2 of push(). Garrett
Anonymous
July 28, 2008
The comment has been removedAnonymous
July 29, 2008
The comment has been removedAnonymous
August 15, 2008
@Jaiprakash: Garrett is correct. That Array.prototype.push doesn't work on non-arrays is a broken implementation i.e. a bug. That the [[Put]] (P, V) method of Array instances is not inherited is on the other side not a broken implementation. The ECMAScript spec does not provide any behaviour by which it would be inherited. It is thus not a bug.Anonymous
August 18, 2008
Thank you, David. The "bug" that Microsoft acknowledged is bogus. The real problem is with the methods in Array. Array.prototype.slice.call({}); "undefined" in IE8. This is a spec violation. In IE8b1, I cannot call any array methods in a generic context, using a NodeList as the thisArg. For example: Array.prototype.slice.call(document.childNodes); IE8 : "Error: JScript Object Expected" Has Microsoft considered writing test suites?Anonymous
October 28, 2008
The comment has been removedAnonymous
October 28, 2008
Could you provide little more details, like how the array is accessed, from JScript code, VBSCript or an ActiveX? How are you populating the array, direct assignment or some builtin array operation like push()? It would be great if you could post a repro. -JPAnonymous
October 30, 2008
This array is only accessed from jscript. here is my sample code for adding and finding values in that array this.JSHash = new Array(); this.length = 0; this.blnOrderedHash=false; // for making sorted entries this.add = function addHashNode(key, Value){ key = "key_" + key; oldObject = this.JSHash[key]; if (typeof (oldObject) == "undefined"|| oldObject == null) this.length++; this.JSHash[key]=Value; } this.remove= function removeHashNode(key){ key = "key_" + key; oldObject = this.JSHash[key]; if (typeof (oldObject) != "undefined" || oldObject != null) { this.length--; delete this.JSHash[key]; } else return; if(this.blnOrderedHash==true){ var temp = new Array(); for(var key in this.JSHash) { temp[key]=this.JSHash[key]; } this.JSHash = temp; } } this.find = function findHashNode(key){ key = "key_" + key; Returned_object_from_hash = this.JSHash[key]; if (typeof (Returned_object_from_hash) == "undefined") Returned_object_from_hash = null; return Returned_object_from_hash; }Anonymous
November 02, 2008
The comment has been removedAnonymous
November 03, 2008
This array is primarly used to save IFRAME objects. Arrayz find method method takes a string value as key and returns IFRAME obect. Some times this IFRAME object returns as strings containing garbage value. The size of the array usually ranges from 30 to 40 objectsAnonymous
March 02, 2009
I'm guessing that the heuristic for detecting sparse/dense arrays has the number 32 in it. The following code fails, i.e. it doesn't delete the splic()ed element, even though the length does get decremented correctly. var L=33,a=[],b=[]; for (var i=0;i<L; i++) a[i]=i; b.push.apply(b,a); b.splice(L-1,1); b[L-1]==undefined Pasting this into the IE8 console gives false where as it should give true if the element were correctly removed. If you start with L=32 instead, you'll correctly get true. Because of this I'm guessing that there is a bug one of the two ways splice() is implemented. -Jim PS @JSBlog, please file this bug if you can figure out how; I sure couldn't.Anonymous
March 04, 2009
@Jim: Thanks for reporting this issue. We have filed a bug and fixed this issue in the latest IE8 builds. GauravS [MSFT]Anonymous
January 11, 2010
So does this mean that for "dense" arrays, operations like splice would result in moving all subsequent array elements and potentially cause performance issues esp. when operating near index 0 of large dense arrays?Anonymous
April 25, 2010
@Jaiprakash: Garrett is correct. That Array.prototype.push doesn't work on non-arrays is a broken implementation i.e. a bug. That the [[Put]] (P, V) method of Array instances is not inherited is on the other side not a broken implementation. The ECMAScript spec does not provide any behaviour by which it would be inherited. It is thus not a bug.