Comments for Benchmarking Is Hard: Reddit Edition
But it is a cost you pay only once in a while.
According to the Wikipedia article you linked, the common DNS TTL is 24 hours. DNS is just not that slow compared to everything else.
Take OpenDNS for example:
Cached query: dig @208.67.222.222 dojotoolkit.org
  Trial 1: Query time: 7 msec
  Trial 2: Query time: 8 msec

Uncached query: dig @208.67.222.222 fkadsjkfjksadjfkjasdf.com
  Trial 1: Query time: 118 msec
  Trial 2: Query time: 7 msec
As you can see, DNS query times become quickly negligible.
I am willing to bet that most people visit the same few websites over and over again rather than 100 different websites that don't have their DNS cached.
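A comparison like the dig trials above can also be reproduced from Python by timing repeated lookups; with a real hostname the first call may go out to the network while repeats are usually served from the OS or resolver cache. This is just a sketch ("localhost" is used only so it runs without network access):

```python
import socket
import time

def time_lookup(hostname):
    """Resolve a hostname and return the elapsed time in milliseconds."""
    start = time.perf_counter()
    socket.getaddrinfo(hostname, 80)
    return (time.perf_counter() - start) * 1000.0

# With a real hostname (e.g. dojotoolkit.org) the first call can take
# ~100 ms uncached while repeats return in single-digit milliseconds,
# mirroring the cached vs. uncached dig trials above.
first = time_lookup("localhost")
second = time_lookup("localhost")
print(f"first: {first:.2f} ms, second: {second:.2f} ms")
```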
I can't speak for - and won't speak to - Apple's marketing. What I can say is that if you look through the Chrome Blog (http://chrome.blogspot.com/), you'll see that what's claimed with respect to JS speed is only ever that: JS speed. Does it matter in the real world? Depends on your app. Systems like GMail get a lot more out of the performance improvements in V8 than, say, the Google home page does. I'll agree that the media tend to distort what's claimed, to everyone's great frustration, given the care with which the results are gathered and presented.
You might uncharitably argue that this distortion is somehow a "win" for Chrome marketing, but I can assure you that inaccuracy hurts everyone, particularly since what matters at the end of the day to end-users is how fast the web feels to them. No one installs a browser and keeps using it based on a benchmark. Placebo effects wear off, so there's no gain in a quick-hit PR win that isn't backed up by your day-to-day experience of the browser.
All of that speaks to why I praised the IE test methodology (while noting their stale results). Getting a reliable measure of end-user perceived performance is good for everyone, and I don't think you'll find anyone on the Chrome team arguing to the contrary. All this post is arguing is that the presented benchmark doesn't even get close to that mark.
Regards
I think it's legit to include pathological cases in a benchmark...if things are bottlenecks, well, then they're bottlenecks. What's difficult about it is that no argument is made for why the site in question is representative of some broad class of pathological sites that hurt you in the real world. It might be, but who knows? That's enough (to my mind) to eliminate it from a list which might have been otherwise populated by, say, the Alexa Top 100 sites or something.
The biggest failings to my mind are around methodology. The site appears to be down for me, but IIRC, the times reported didn't include a number of test runs, median and mean values, a discussion of how outliers were handled or diagnosed (discarded? included and reported as standard deviation?), or any real attempt to eliminate contention on local disk (since the disk was being used both by the browsers for caching and storage and by the server or file-based I/O used to run the tests, after all).
Lastly, it didn't test released versions of browsers and it didn't discuss how error related to extensions might be eliminated or accounted for.
Maybe what the world needs is a "how to produce and present credible benchmarks" document?
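For what it's worth, the statistical reporting described above (number of runs, median and mean, outliers flagged rather than silently dropped) only takes a few lines. A minimal sketch, assuming nothing about the original harness; the trial numbers are hypothetical:

```python
import statistics

def summarize(times_ms, outlier_sigma=2.0):
    """Summarize benchmark trials: report run count, mean, median,
    standard deviation, and flag (rather than silently discard) any
    trial more than outlier_sigma standard deviations from the mean."""
    mean = statistics.mean(times_ms)
    median = statistics.median(times_ms)
    stdev = statistics.stdev(times_ms) if len(times_ms) > 1 else 0.0
    outliers = [t for t in times_ms
                if stdev and abs(t - mean) > outlier_sigma * stdev]
    return {"runs": len(times_ms), "mean": mean, "median": median,
            "stdev": stdev, "outliers": outliers}

# Ten hypothetical page-load trials (ms), one disk-contention spike.
result = summarize([812, 798, 805, 3100, 790, 801, 808, 795, 803, 799])
print(result)
```

Note how the median (802 ms) tells a very different story from the mean (over 1000 ms) once a single contention spike sneaks in; that's exactly why reporting only one number is misleading.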
Regards
I will say, though, that slow-motion benchmarks of a browser are useless, because I can tell just by trying to interact with the page before onload fires that the whole browser is likely to be locked up. This is not a very useful standard for seeing how fast a browser paints a portion of a page.
As for "local dehydrated versions of tests": the point of a test is to isolate the variables that would be likely to cause high variance.
Your browser has nothing to do with the speed of your network or the server. Page loading times will often far exceed DNS resolution times, because the DNS entries for most sites you visit will already be cached.
I find that I can already interact with the page after the onload event has fired. Isn't this good enough?
I didn't notice any instance where the page became unresponsive after onload.
-"Your methodology doesn’t actually test to find out when (in the course of page loading) one can begin to meaningfully interact with the page, so making claims about when things are “locked up” or not doesn’t hold water"
I used both DOMContentLoaded and onreadystatechange (for IE), and neither of them allowed for any meaningful interaction with the page (despite their descriptions). Therefore I decided to go with onload, because it is the standard event whose definition implies an interactive page.
-"One of the major improvements in Chrome’s initial release was an implementation of DNS pre-fetching. It has a large impact on real-world performance."
It only has a one time impact. Once you've visited the page, your DNS cache is likely to hold the query for future reference. Assuming you are loading a new page, the page loading times are likely to be far higher than DNS. The DNS protocol is very lightweight compared to HTTP.
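One way to see how lightweight the DNS protocol is: a complete DNS query for a typical hostname fits in a few dozen bytes, smaller than the request headers alone of a typical HTTP fetch. A sketch (the header field values and the sample HTTP request are illustrative, not from any real capture):

```python
import struct

def dns_query_size(hostname):
    """Size in bytes of a minimal DNS query packet for hostname:
    a 12-byte header, length-prefixed QNAME labels, and 4 bytes
    of QTYPE/QCLASS."""
    qname = b"".join(bytes([len(p)]) + p.encode()
                     for p in hostname.split(".")) + b"\x00"
    header = struct.pack(">HHHHHH", 0x1234, 0x0100, 1, 0, 0, 0)
    return len(header + qname + b"\x00\x01\x00\x01")

# A bare-bones HTTP request for comparison (real browsers send
# considerably more header data than this).
http_request = (b"GET / HTTP/1.1\r\nHost: dojotoolkit.org\r\n"
                b"User-Agent: Mozilla/5.0\r\nAccept: text/html\r\n\r\n")
print(dns_query_size("dojotoolkit.org"))  # 33 bytes
print(len(http_request))
```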
-"Similarly, IE 8 and Chrome 2.0 both implemented concurrent script downloading, dramatically improving page loading performance for pages (like Facebook) that are script heavy and include many resources"
I think all the modern browsers have concurrent connections by now.
-"If the goal is to test an isolated component of a browser (like the single-function benchmarks you derided)"
My goal was to create the most meaningful browser benchmark to date. Unfortunately, you cannot have a meaningful benchmark across a network when the network delays can be as big as the page loading times.
If there was a way to standardize an internet connection, I would do it. But I really don't think you would find a big difference even if you stored the pages on a LAN to emulate the internet.
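Standing up a local server to take the network out of the equation is cheap, for what it's worth. A sketch using only the Python standard library (not anything from the original test setup): serve the current directory over loopback and time a fetch.

```python
import http.server
import socketserver
import threading
import time
import urllib.request

# Serve the current directory on an OS-assigned loopback port;
# fetching from 127.0.0.1 removes internet latency and DNS entirely.
handler = http.server.SimpleHTTPRequestHandler
server = socketserver.TCPServer(("127.0.0.1", 0), handler)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

start = time.perf_counter()
resp = urllib.request.urlopen(f"http://127.0.0.1:{port}/")
body = resp.read()
elapsed_ms = (time.perf_counter() - start) * 1000.0
print(f"status {resp.status}, {len(body)} bytes in {elapsed_ms:.1f} ms")
server.shutdown()
```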
http://blog.chromium.org/2008/09/dns-prefetching-or-pre-resolving.html
The graphs are in there for a reason.
And then reading up on DNS TTLs:
http://en.wikipedia.org/wiki/Time_to_live
Regards
On your dig against reddit: This point was actually mentioned in the benchmark post's discussion but downvoted into oblivion by a troll. http://www.reddit.com/r/programming/comments/8vd9s/finally_a_browser_benchmark_that_tests_real/c0ak3au
I think it's totally fair to argue over the relative merit of the V8 and Sun Spider benchmarks as they relate to a real-world web workload. What I meant by that statement wasn't that they measured everything, but rather that they purport to measure specific things and do it well. They use a strong methodology, are reproducible, and don't make claims about what they test that can't be backed up by the tests themselves. In that sense, they're much better than the test under discussion.
Regards
A couple of quick points:
- Noting that things are "locked up" before onload says nothing about what might be causing a page to be unresponsive after onload. Your methodology doesn't catch any of those issues.
- Nearly all browsers implement a "progressive rendering" algorithm that causes the page to take longer to finally load and render, but provides an interactive UI in the interim. Your methodology doesn't actually test to find out when (in the course of page loading) one can begin to meaningfully interact with the page, so making claims about when things are "locked up" or not doesn't hold water.
- One of the major improvements in Chrome's initial release was an implementation of DNS pre-fetching. It has a large impact on real-world performance. Similarly, IE 8 and Chrome 2.0 both implemented concurrent script downloading, dramatically improving page loading performance for pages (like Facebook) that are script heavy and include many resources.
- Using local pages eliminates any potential differences in the effect of network-level request parallelism. For example, IE 7 (not tested) allows 2 network connections per host (via HTTP 1.1), whereas IE 8 bumps the limit to 8. This has large implications for page loading performance when using CDNs.
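As a rough illustration of why the per-host connection limit matters, here's a toy model of my own (ignoring bandwidth, pipelining, and connection setup) in which resources from a single host download in waves of size equal to the connection limit:

```python
import math

def load_time_ms(num_resources, per_host_limit, rtt_ms=100):
    """Toy model: num_resources same-host fetches proceed in waves of
    per_host_limit parallel connections, each wave costing one RTT."""
    waves = math.ceil(num_resources / per_host_limit)
    return waves * rtt_ms

# 24 small resources served from a single CDN host:
print(load_time_ms(24, 2))  # IE 7-style limit: 12 waves -> 1200 ms
print(load_time_ms(24, 8))  # IE 8-style limit:  3 waves ->  300 ms
```

Even this crude model shows a 4x swing from the connection limit alone, which is exactly the kind of difference that local-file testing hides.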
If the goal is to test the real world performance of browsers against pages as browsers load them, you should test that. If the goal is to test an isolated component of a browser (like the single-function benchmarks you derided), then you should make clear what parts of the browser you're attempting to stress and eliminate sources of error.
Regards
Credibility that is not at all deserved, considering that these tests were created specifically to perform well in specific browsers. They basically test tiny parts of JS and are not relevant to overall performance at all, especially since JS makes up only a tiny part of even the most JS-heavy sites today.
Sites using the canvas tag throw a hard punch to IE because it doesn't directly support it.
It seems to be the RIA apps and other leading-edge work that IE falls apart on. But I guess Microsoft's solution there would be Silverlight, not IE8.