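I have worked on several web-based projects, but I never thought much about the loading and execution sequence of an ordinary web page. Now I need to know the details. It is hard to find answers on Google or SO, so I created this question.

A sample page looks like this: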

<html>
 <head>
  <script src="jquery.js" type="text/javascript"></script>
  <script src="abc.js" type="text/javascript"></script>
  <link rel="stylesheet" type="text/css" href="abc.css">
  <style>h2{font-weight:bold;}</style>
  <script>
  $(document).ready(function(){
     $("#img").attr("src", "kkk.png");
  });
  </script>
 </head>
 <body>
    <img id="img" src="abc.jpg" style="width:400px;height:300px;"/>
    <script src="kkk.js" type="text/javascript"></script>
 </body>
</html>

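So here are my questions:

How does this page load?
What is the sequence of the loading?
When is the JS code executed (inline and external)?
When is the CSS executed (applied)?
When does $(document).ready get executed?
Will abc.jpg be downloaded, or does the browser just download kkk.png?

I have the following understanding:

The browser loads the HTML (DOM) first.
The browser starts to load the external resources from top to bottom, line by line.
If a <script> is met, loading is blocked until the JS file has been loaded and executed, and then continues.
Other resources (CSS/images) are loaded in parallel and executed if needed (like CSS).

Or is it like this:

The browser parses the HTML (DOM) and collects the external resources in an array or stack-like structure. After the HTML is loaded, the browser starts to load the external resources in that structure in parallel and executes them, until all resources are loaded. Then the DOM is changed according to the user's behavior, depending on the JS.

Can anyone give a detailed explanation of what happens once you have received the response for an HTML page? Does this vary between browsers? Is there any reference material on this question?

Thanks.

EDIT:

I ran an experiment in Firefox with Firebug. It shows the following image: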


Current answer

Open your page in Firefox and get the HTTPFox add-on. It will tell you everything you need.

Found this on archivist.incutio.com:

http://archivist.incutio.com/viewlist/css-discuss/76444

When you first request a page, your browser sends a GET request to the server, which returns the HTML to the browser. The browser then starts parsing the page (possibly before all of it has been returned).

When it finds a reference to an external entity such as a CSS file, an image file, a script file, a Flash file, or anything else external to the page (either on the same server/domain or not), it prepares to make a further GET request for that resource.

However the HTTP standard specifies that the browser should not make more than two concurrent requests to the same domain. So it puts each request to a particular domain in a queue, and as each entity is returned it starts the next one in the queue for that domain.

The time it takes for an entity to be returned depends on its size, the load the server is currently experiencing, and the activity of every single machine between the machine running the browser and the server. The list of these machines can in principle be different for every request, to the extent that one image might travel from the USA to me in the UK over the Atlantic, while another from the same server comes out via the Pacific, Asia and Europe, which takes longer. So you might get a sequence like the following, where a page has (in this order) references to three script files, and five image files, all of differing sizes:

GET script1 and script2; queue request for script3 and images1-5.
script2 arrives (it's smaller than script1): GET script3, queue images1-5.
script1 arrives; GET image1, queue images2-5.
image1 arrives, GET image2, queue images3-5.
script3 fails to arrive due to a network problem - GET script3 again (automatic retry).
image2 arrives, script3 still not here; GET image3, queue images4-5.
image3 arrives; GET image4, queue image5, script3 still on the way.
image4 arrives, GET image5;
image5 arrives.
script3 arrives.

In short: any old order, depending on what the server is doing, what the rest of the Internet is doing, and whether or not anything has errors and has to be re-fetched. This may seem like a weird way of doing things, but it would quite literally be impossible for the Internet (not just the WWW) to work with any degree of reliability if it wasn't done this way.

Also, the browser's internal queue might not fetch entities in the order they appear in the page - it's not required to by any standard.

(Oh, and don't forget caching, both in the browser and in caching proxies used by ISPs to ease the load on the network.)
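
The per-domain queue described in the quote can be illustrated with a small simulation. This is only a sketch under stated assumptions: simulateQueue, the resource names, and the durations are all made up, time is virtual rather than measured, and the retry of a failed request is left out. The cap of two comes from the quoted text, not from any particular browser.

```javascript
// Toy simulation of a download queue for ONE host with a concurrency cap.
// Resource names and durations (in ms) are invented for illustration.
function simulateQueue(resources, maxConcurrent) {
  const pending = resources.slice(); // not yet requested, in page order
  const inFlight = [];               // entries of { name, finishAt }
  const log = [];
  let now = 0;

  // Start requests until the cap is reached or nothing is left to request.
  function fill() {
    while (inFlight.length < maxConcurrent && pending.length > 0) {
      const next = pending.shift();
      inFlight.push({ name: next.name, finishAt: now + next.duration });
      log.push({ time: now, event: "GET", name: next.name });
    }
  }

  fill();
  while (inFlight.length > 0) {
    // Jump virtual time forward to the earliest completion.
    inFlight.sort((a, b) => a.finishAt - b.finishAt);
    const done = inFlight.shift();
    now = done.finishAt;
    log.push({ time: now, event: "arrived", name: done.name });
    fill();
  }
  return log;
}

const page = [
  { name: "script1", duration: 50 },
  { name: "script2", duration: 20 },
  { name: "script3", duration: 40 },
  { name: "image1", duration: 30 },
  { name: "image2", duration: 10 },
];

const log = simulateQueue(page, 2);
const arrivalOrder = log
  .filter((entry) => entry.event === "arrived")
  .map((entry) => entry.name);
console.log(arrivalOrder.join(", "));
// prints: script2, script1, script3, image2, image1
```

Even in this simplified model, the arrival order differs from the order in the page, because a small resource requested later can finish before a large one requested earlier.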

Other answers

1) The HTML is downloaded.

2) The HTML is parsed progressively. When the parser reaches a request for an asset, the browser attempts to download that asset. By default, most HTTP servers and most browsers process only two requests in parallel. IE can be reconfigured to download an unlimited number of assets in parallel; Steve Souders has been able to download over 100 requests in parallel on IE. The exception is that script requests block parallel asset requests in IE. This is why it is highly recommended to put all JavaScript in external JavaScript files and to place the request just before the closing body tag in the HTML.

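3) Once the HTML is parsed, the DOM is rendered. In nearly all user agents, CSS is rendered in parallel with the rendering of the DOM. It is therefore strongly recommended to put all CSS code in external CSS files that are requested as high as possible in the <head></head> section of the document. Otherwise the page is rendered up to the position of the CSS request in the DOM, and then rendering starts over from the top.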

4) Only after the DOM is completely rendered and requests for all assets in the page are either resolved or time out does JavaScript execute from the onload event. IE7, and I am not sure about IE8, does not time out assets quickly if an HTTP response is not received from the asset request. This means an asset requested by JavaScript inline to the page, that is JavaScript written into HTML tags that is not contained in a function, can prevent the execution of the onload event for hours. This problem can be triggered if such inline code exists in the page and fails to execute due to a namespace collision that causes a code crash.
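
Step 4 can be illustrated with a toy model of when the load event is able to fire. This is only a sketch of the rule as stated in this answer, not browser code: loadEventTime, the asset names, the 30-second timeout, and the simplification that every asset is requested at time zero are all assumptions.

```javascript
// Toy model, not browser code: every asset is requested at t = 0 and either
// responds at `respondsAt` (ms) or never responds (null) and hits the timeout.
function loadEventTime(domRenderedAt, assets, timeoutMs) {
  const settledAt = assets.map((asset) =>
    asset.respondsAt === null ? timeoutMs : Math.min(asset.respondsAt, timeoutMs)
  );
  // The load event waits for the DOM and for the slowest asset to settle.
  return Math.max(domRenderedAt, ...settledAt);
}

const healthy = loadEventTime(
  120,
  [{ name: "abc.jpg", respondsAt: 300 }, { name: "kkk.js", respondsAt: 80 }],
  30000
);
const stalled = loadEventTime(
  120,
  [{ name: "abc.jpg", respondsAt: 300 }, { name: "tracker.js", respondsAt: null }],
  30000
);
console.log(healthy, stalled); // prints: 300 30000
```

A single asset that never answers pushes the load event out to the timeout, which is the failure mode this answer describes for IE7.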

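Of the steps above, the most CPU-intensive is the parsing of the DOM/CSS. If you want your page to be processed faster, write efficient CSS by eliminating redundant instructions and consolidating CSS instructions into the fewest possible element references. Reducing the number of nodes in your DOM tree will also produce faster rendering.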

Keep in mind that each asset you request from your HTML or even from your CSS/JavaScript assets is requested with a separate HTTP header. This consumes bandwidth and requires processing per request. If you want to make your page load as fast as possible then reduce the number of HTTP requests and reduce the size of your HTML. You are not doing your user experience any favors by averaging page weight at 180k from HTML alone. Many developers subscribe to some fallacy that a user makes up their mind about the quality of content on the page in 6 nanoseconds and then purges the DNS query from his server and burns his computer if displeased, so instead they provide the most beautiful possible page at 250k of HTML. Keep your HTML short and sweet so that a user can load your pages faster. Nothing improves the user experience like a fast and responsive web page.
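
The per-request cost can be put into rough numbers. The sketch below is back-of-the-envelope arithmetic only: estimateTransfer, the 700 bytes of headers per request, and the icon sizes are assumed values chosen for illustration, not measurements.

```javascript
// Rough estimate of bytes on the wire: bodies plus a fixed header cost per request.
// headerBytesPerRequest is an assumed figure, not a measured one.
function estimateTransfer(assets, headerBytesPerRequest) {
  const bodyBytes = assets.reduce((sum, asset) => sum + asset.bytes, 0);
  const headerBytes = assets.length * headerBytesPerRequest;
  return {
    requests: assets.length,
    bodyBytes,
    headerBytes,
    totalBytes: bodyBytes + headerBytes,
  };
}

// Ten hypothetical 2 KB icons requested separately...
const separateIcons = Array.from({ length: 10 }, (_, i) => ({
  name: "icon" + i + ".png",
  bytes: 2048,
}));
// ...versus the same pixels combined into one hypothetical 20 KB image.
const combinedIcons = [{ name: "icons.png", bytes: 20480 }];

const separate = estimateTransfer(separateIcons, 700);
const combined = estimateTransfer(combinedIcons, 700);
console.log(separate.totalBytes, combined.totalBytes); // prints: 27480 21180
```

The model ignores latency, which is usually the larger cost of extra requests, so it understates the benefit of combining assets.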

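Dynatrace AJAX Edition shows you the exact sequence of page loading, parsing, and execution.

If you are asking this question because you want to speed up your web site, check out Yahoo's page on best practices for speeding up your web site. It lists many best practices for making a site faster.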

The chosen answer does not seem to apply to modern browsers, at least not to Firefox 52. What I observed is that the requests for resources such as CSS and JavaScript are issued before the HTML parser reaches the element, for example:

<html>
 <head>
  <!-- prints the date before parsing and blocks HTML parsing -->
  <script>
    console.log("start: " + (new Date()).toISOString());
    for(var i=0; i<1000000000; i++) {};
  </script>
  <script src="jquery.js" type="text/javascript"></script>
  <script src="abc.js" type="text/javascript"></script>
  <link rel="stylesheet" type="text/css" href="abc.css">
  <style>h2{font-weight:bold;}</style>
  <script>
  $(document).ready(function(){
     $("#img").attr("src", "kkk.png");
  });
  </script>
 </head>
 <body>
    <img id="img" src="abc.jpg" style="width:400px;height:300px;"/>
    <script src="kkk.js" type="text/javascript"></script>
 </body>
</html>

What I found is that the start times of the requests for the CSS and JavaScript resources were not blocked by the long-running inline script. It looks like Firefox scans the HTML and identifies key resources (image resources are not included) before it starts to parse the HTML.
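
The kind of scan this answer describes can be sketched as follows. This is a toy illustration of the idea only, not how Firefox implements it: scanForEarlyFetches is a hypothetical helper, it uses regular expressions where a real browser uses a proper tokenizer, it only understands double-quoted attributes, and skipping images simply mirrors the observation above.

```javascript
// Toy scan: collect URLs of external scripts and stylesheets from raw HTML
// text without executing or fully parsing anything. Images are skipped on purpose.
function scanForEarlyFetches(html) {
  const urls = [];
  const tagPattern = /<(script|link)\b([^>]*)>/gi;
  let match;
  while ((match = tagPattern.exec(html)) !== null) {
    const tag = match[1].toLowerCase();
    const attrs = match[2];
    const attrName = tag === "script" ? "src" : "href";
    const value = new RegExp(attrName + '\\s*=\\s*"([^"]*)"', "i").exec(attrs);
    if (!value) continue; // inline script, or a link without a URL
    if (tag === "link" && !/rel\s*=\s*"stylesheet"/i.test(attrs)) continue;
    urls.push(value[1]);
  }
  return urls;
}

const sampleHtml = [
  "<html><head>",
  "<script>for (var i = 0; i < 1e9; i++) {}</script>",
  '<script src="jquery.js" type="text/javascript"></script>',
  '<script src="abc.js" type="text/javascript"></script>',
  '<link rel="stylesheet" type="text/css" href="abc.css">',
  "</head><body>",
  '<img id="img" src="abc.jpg">',
  '<script src="kkk.js" type="text/javascript"></script>',
  "</body></html>",
].join("\n");

const earlyFetches = scanForEarlyFetches(sampleHtml);
console.log(earlyFetches.join(", "));
// prints: jquery.js, abc.js, abc.css, kkk.js
```

Because the scan never runs the inline loop, it can list every external script and stylesheet immediately, which is consistent with the request start times not being delayed by the blocking script.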
