Let's have a look at what happens when we download a web page. First, the browser sends an HTTP GET request to the server with the web page URL. The server sends back a response whose HTTP header is typically about 1KB in size. It can be more or less depending on many factors, but let's assume that it is 1KB. The page contents are sent back in the response body. Let's not forget that all this happens over a network connection, which adds its own overhead.
The page is loaded into the browser and the browser starts rendering the HTML contents. The page references a number of external resource files, such as CSS style sheets, JavaScript scripts and images. Let's say that there are 20 external resources. That means that for every single resource the browser will initiate a network connection with the server, which can be on the other side of the globe, send an HTTP request, receive a response containing the resource file, decode the response and load the resource file into the browser. As you have probably noticed, there is a significant overhead: in our case, at least 20KB of headers alone, plus additional delays due to network latency.
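To make the arithmetic concrete, here is a rough back-of-the-envelope sketch. It assumes 1KB of response header per resource, as above, and a round trip of 100 ms per request fetched one after another. The 100 ms figure and the function name are assumptions made for illustration only; real browsers fetch in parallel and reuse connections, so treat the latency as a worst case.

```python
def per_request_overhead(resource_count, header_kb=1.0, round_trip_ms=100.0):
    """Estimate header size and sequential latency for separately fetched resources.

    header_kb and round_trip_ms are illustrative assumptions, not measurements.
    """
    header_total_kb = resource_count * header_kb
    latency_total_ms = resource_count * round_trip_ms
    return header_total_kb, latency_total_ms


header_total, latency_total = per_request_overhead(20)
print(f"{header_total:.0f} KB of headers, {latency_total:.0f} ms of round trips")
# prints: 20 KB of headers, 2000 ms of round trips
```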
Wouldn’t it be wonderful to load the entire page with all of its external files inside a single compressed file? There are a number of benefits to this approach:
- The number of round-trips is significantly reduced.
- Payload overhead is reduced.
- The browser doesn’t have to process as many responses.
- Network-related delays are reduced.
- The more files there are in the archive, the higher the level of compression that can typically be achieved.
- The web server doesn’t have to write a log entry for every single request.
- Web server administrators are happier because the web server logs are easier to read and take up less space.
- Many involved parties are happier because less network bandwidth is utilized.
- The end user is happier because the web page loads faster.
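As a minimal sketch of the idea, the snippet below packs a page and its resources into one compressed archive and unpacks it again. It assumes the package is a gzip-compressed tar file, which is only a stand-in: the actual HTZ container format is not defined here. The function names and the sample file contents are made up for the example.

```python
import io
import tarfile


def build_package(files):
    """Pack a mapping of {path: bytes} into one gzip-compressed tar archive."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for path, data in sorted(files.items()):
            info = tarfile.TarInfo(name=path)
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


def read_package(package_bytes):
    """Unpack an archive produced by build_package back into {path: bytes}."""
    with tarfile.open(fileobj=io.BytesIO(package_bytes), mode="r:gz") as archive:
        return {
            member.name: archive.extractfile(member).read()
            for member in archive.getmembers()
            if member.isfile()
        }


# Hypothetical page with two external resources.
page = {
    "index.html": b"<html><head><link rel='stylesheet' href='core/css/style.css'></head></html>",
    "core/css/style.css": b"body { margin: 0; }",
    "core/images/logo.gif": b"GIF89a",
}

package = build_package(page)          # one payload, one response
print(read_package(package) == page)   # True
```

A single stream compressed as a whole is what lets redundancy across files be exploited; formats that compress each entry separately would not benefit in the same way.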
Sounds very promising. However, subsequent requests for other pages can mean that the same common contents are packaged up and sent again. With today's HTML pages, the browser caches the files it received previously and re-uses the cached resources without downloading the common files again. Therefore, the compressed file approach can actually become slower and use more bandwidth.
So, how can we solve the issue? By adding a manifest to the compressed file that references an external HTZ package containing the common files. The browser can then cache the common package separately and skip it when loading later pages.
But that is not the extent of the possibilities. Let's say web site statistics show that users who come to the site typically visit pages A, B and C. The package manifest can be smart enough to include all three pages in a single package, so the browser wouldn’t even have to request pages B and C after downloading page A. This can be especially useful for small-screen browsers, such as smart phones. Multiple pages can together represent what a big-screen browser shows on a single screen. The multi-part contents (e.g. text, images, drawings) can be delivered as a sequence of pages inside a single package. Conversely, a big-screen browser can simply assemble the multi-part contents into a single page.
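A small sketch of how a browser might use such a manifest. The manifest layout, the "requires" key and the package names (index.htz, about.htz, common.htz) are all hypothetical; nothing here is a defined HTZ format.

```python
def missing_packages(manifest, cache):
    """Return the external packages named in the manifest that are not cached yet."""
    return [name for name in manifest.get("requires", []) if name not in cache]


# Hypothetical manifests for two pages that share the same common package.
index_manifest = {"package": "index.htz", "files": ["index.html"], "requires": ["common.htz"]}
about_manifest = {"package": "about.htz", "files": ["about.html"], "requires": ["common.htz"]}

cache = set()

first = missing_packages(index_manifest, cache)
print(first)   # ['common.htz']
cache.update(first)

second = missing_packages(about_manifest, cache)
print(second)  # []
```

On the first page the common package is downloaded once; on the second page only the small page-specific package travels over the network.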
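The multi-page idea can be sketched with a toy browser that keeps every page of a downloaded package and only goes to the network for pages it does not already hold. The class, the fetch function and the page names are assumptions made up for this illustration.

```python
class ToyBrowser:
    """Serves pages from an already-downloaded package whenever it can."""

    def __init__(self, fetch_package):
        self._fetch_package = fetch_package  # callable: page name -> {page name: html}
        self._pages = {}
        self.requests_made = 0

    def open(self, page):
        if page not in self._pages:
            self.requests_made += 1
            self._pages.update(self._fetch_package(page))
        return self._pages[page]


def fetch_package(page):
    # Hypothetical server: a request for A, B or C returns a package with all three.
    return {"A.html": "<p>A</p>", "B.html": "<p>B</p>", "C.html": "<p>C</p>"}


browser = ToyBrowser(fetch_package)
for name in ("A.html", "B.html", "C.html"):
    browser.open(name)

print(browser.requests_made)  # 1
```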
Now, let's talk about the web server. How do we manage packages on the server?
A package can represent a folder on the web server. The key is to have a package configuration file in the folder. Without such a file in the folder, the parent folder's configuration is used, and so on up the folder tree.
For example, let's have a look at the following folder structure:
site\
- index.html
- index.html.config
- license.txt
- site.config
- web.config
site\core\
- core.config
site\core\code\
site\core\code\controls\
- controls.config
- multi-select-dropbox-autocomplete.htz
- time-period.htz
- xsl-view.htz
site\core\css\
- style.css
site\core\images\
- arrow1.gif
- banner.gif
- body_bg.gif
- footer_bg.gif
- logo.gif
In the root folder "site", web.config contains the global web site configuration. site.config matches the name of the folder and holds folder-specific configuration, and index.html.config holds configuration that is specific to the index.html web page.
Similarly, the site\core\core.config and site\core\code\controls\controls.config files contain folder-specific configuration that supersedes the parent folder's configuration.
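The lookup described above can be sketched as a walk from the requested file up to the site root. This is only one possible reading: the function name is made up, and treating web.config as the least specific entry is an assumption, since the exact precedence and merge rules are not spelled out here.

```python
from pathlib import PureWindowsPath


def config_chain(resource, existing):
    """List the config files that apply to `resource`, most specific first.

    `existing` is the set of config file paths present on the server.
    """
    resource = PureWindowsPath(resource)
    # Nearest folder first; the trailing "." entry has an empty name and is dropped.
    folders = [folder for folder in resource.parents if folder.name]
    candidates = [resource.with_name(resource.name + ".config")]      # file-specific
    candidates += [folder / (folder.name + ".config") for folder in folders]  # folder-specific
    if folders:
        candidates.append(folders[-1] / "web.config")                 # site-wide (assumed last)
    return [str(path) for path in candidates if str(path) in existing]


existing = {
    r"site\index.html.config",
    r"site\site.config",
    r"site\web.config",
    r"site\core\core.config",
    r"site\core\code\controls\controls.config",
}

print(config_chain(r"site\index.html", existing))
# ['site\\index.html.config', 'site\\site.config', 'site\\web.config']

print(config_chain(r"site\core\css\style.css", existing))
# ['site\\core\\core.config', 'site\\site.config', 'site\\web.config']
```

The css folder has no configuration file of its own, so style.css falls back to core.config from the parent folder, as described above.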