Our performance tests currently rely on metrics accessible by web content (Date.now, events, setTimeout, etc.). Many of these sources of information are loosely defined or just lies, implemented that way to improve performance of real web pages. Performance tests built on these metrics are useful in their own right, but it's very hard to use them to measure what users actually perceive: pixels appearing on screen. The problem is made even harder by process separation, GPU rendering, async scrolling and animation, and async rerendering (Fennec).
Chris Jones, original specification