I found in BananaBread that if I use only part of a buffer (index and element), the total size of the buffer matters. If I use 1K of a 10K buffer, or the same 1K of a 1MB buffer, the frame rate changes quite a lot. Apologies for not having a good testcase; here is the BananaBread code: http://syntensity.com/static/bb/bb2.tar.bz2 The code there uses 2MB buffers for all temporary data, and is quite slow because of that. The code can be modified so that it uses the smallest buffer possible for the data: search for ceilPower2, and delete GL.immediate.MAX_TEMP_BUFFER_SIZE;// in that function's definition. It will then not use a 2MB buffer; instead it rounds up to the next power of 2, so it uses small buffers when possible. (It has to pregenerate buffers of all of those sizes.) Note that this change only affects the size of the buffer used. We still upload the same amount of data using glBufferSubData. But if the buffer we use is large, we end up with slower frame rates. I suppose it is possible this is a GL driver issue. For example, if glBufferSubData reads the buffer back from the GPU, modifies part of it, and pushes the whole thing back up, that would explain the slowness. I have anecdotally heard that some mobile GPU drivers used to do things like that with glTexSubImage.
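To make the sizing change concrete, here is a rough sketch of the power-of-2 rounding described above. It is not the actual BananaBread/Emscripten code: the body of ceilPower2, the helper bucketSizes, and the constant value are my assumptions, written only to illustrate picking the smallest pregenerated buffer that fits the data.

```javascript
// Hypothetical sketch; names other than ceilPower2 are illustrative only.
// Assumed cap on temporary buffers (2MB, as described in the comment above).
const MAX_TEMP_BUFFER_SIZE = 2 * 1024 * 1024;

// Round a byte count up to the next power of two (minimum 1).
function ceilPower2(bytes) {
  let size = 1;
  while (size < bytes) size *= 2;
  return size;
}

// List the bucket sizes that would have to be pregenerated up to maxBytes.
// In a real WebGL app each entry would get its own gl.createBuffer() and
// gl.bufferData(target, size, gl.DYNAMIC_DRAW) call at startup.
function bucketSizes(maxBytes) {
  const sizes = [];
  for (let size = 1; size <= maxBytes; size *= 2) sizes.push(size);
  return sizes;
}

// Pick the temp buffer size for an upload of `bytes` bytes.
function tempBufferSizeFor(bytes) {
  if (bytes > MAX_TEMP_BUFFER_SIZE) {
    throw new RangeError("upload larger than the largest temp buffer");
  }
  return ceilPower2(bytes);
}
```

With this, a 1000-byte upload would go into a 1024-byte buffer instead of the 2MB one, while the amount of data passed to glBufferSubData stays the same.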
I expect that you're running into the known perf issue that WebGL drawElements is hard to optimize, as it has to validate that all the indices being used are in range given the vertex attrib arrays. I have a plan to fix that; it's in bug 732660. It could use help from someone more versed in fancy data structures than me. Regarding your theory that glBufferSubData might be slow on some drivers in the same way that glTexSubImage2D is, that's very plausible as well, and it would be nice to check in a profile whether that's what's happening or whether the WebGL-specific issue is the determining factor. If the WebGL-specific validation is the cause, then WebGLBuffer::FindMaxElementInSubArray will show prominently in the profile.
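For readers unfamiliar with the validation in question, the check can be sketched roughly as below. This is only an illustration of the idea (scan the indices a draw call will use and compare the largest against the number of available vertices); it is not the Gecko implementation, and the function names and signatures here are assumptions.

```javascript
// Hypothetical sketch of drawElements index validation; not Gecko's code.

// Return the largest index in indices[first .. first + count - 1],
// or -1 if the range is empty. This is a plain linear scan.
function findMaxElementInSubArray(indices, first, count) {
  let max = -1;
  for (let i = first; i < first + count; i++) {
    if (indices[i] > max) max = indices[i];
  }
  return max;
}

// A draw call is valid only if the requested range lies inside the index
// buffer and every index in it refers to an existing vertex.
function validateDrawElements(indices, first, count, vertexCount) {
  if (first < 0 || count < 0 || first + count > indices.length) return false;
  return findMaxElementInSubArray(indices, first, count) < vertexCount;
}
```

A scan like this has to be repeated whenever the index data changes, which is why frequent glBufferSubData updates followed by draw calls can make it show up in a profile.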
Can you retry now that bug 732660 has landed?
That link is stale, I'm afraid (I deleted that entire server), so I no longer have the testcase from before. Testing on a new build, I don't see much difference when doubling the temp buffer size, so I think that's a good sign.