August 20, 2011

HTZ HyperText Packages

Let's have a look at what happens when we download a web page. First, the browser sends an HTTP GET request to the server with the web page URL. The server sends back an HTTP response header that is typically about 1KB in size; it can be more or less depending on many factors, but let's hypothetically assume it is 1KB. The page contents are sent back in the response body. Let's not forget that all of this happens over a network connection, so there is network overhead as well. The page is loaded into the browser, and the browser starts rendering the HTML contents.
Now, the page references a number of external resource files: CSS style sheets, JavaScript files, images, etc. Let's say the number of external resources is 20. That means that for every single resource the browser will initiate a network connection with the server, which can be on the other side of the globe, send an HTTP request, receive a response containing the resource file, decode the response and load the resource file into the browser. As you have probably noticed, there is significant overhead: in our case, at least 20KB of headers alone, plus additional delays due to network latency.
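To make the overhead concrete, here is roughly what a single request/response exchange looks like (the host name, headers and values below are illustrative, not taken from a real trace); a similar exchange is repeated for every external resource the page references:

  GET /index.html HTTP/1.1
  Host: www.example.com
  User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:6.0) Gecko/20100101 Firefox/6.0
  Accept: text/html,application/xhtml+xml,*/*
  Accept-Encoding: gzip, deflate
  Accept-Language: en-US
  Cookie: sessionid=...

  HTTP/1.1 200 OK
  Date: Sat, 20 Aug 2011 12:00:00 GMT
  Server: Apache
  Content-Type: text/html; charset=UTF-8
  Content-Length: 5120
  Cache-Control: max-age=3600

  <html> ...page contents... </html>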
Wouldn't it be wonderful to load the entire page, with all of its external files, inside a single compressed file? There are a number of benefits to this approach (a hypothetical package layout is sketched after the list):
  • The number of round-trips is significantly reduced.
  • Payload overhead is reduced.
  • The browser doesn't have to process as many responses.
  • Network-related delays are reduced.
  • The more files there are in the archive, the higher the level of compression that can typically be achieved.
  • The web server doesn't have to create a log entry for every single request.
  • Web server administrators are happier because the web server logs are easier to read and take up less space.
  • Many involved parties are happier because less network bandwidth is utilized.
  • The end user is happier because the web page loads faster.
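As a sketch of the idea, a page with a style sheet, a script and a few images (the file names below are hypothetical) could be delivered as a single compressed package along these lines:

  index.htz
  - manifest          (describes the package contents)
  - index.html
  - css/style.css
  - js/site.js
  - images/logo.gif
  - images/banner.gif

That is one request and one response instead of six, with all of the files compressed together.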

Sounds very promising. However, subsequent requests to load other pages can also mean that the same contents will be packaged up and re-sent again. With individual HTML pages as they exist today, the browser caches the files it received in a previous session and re-uses the cached resources without re-downloading the common files. Therefore, the compressed file approach can actually become slower and utilize more bandwidth.
So, how can we solve the issue? By adding a manifest to the compressed file that defines a reference to an external HTZ package containing the common files.
But, that is not the extent of the possibilities. Let's say the web site statistics show that users who come to the site typically visit pages A, B and C. The package manifest can be smart enough to include all three pages in a single package, so the browser wouldn't even have to request pages B and C after downloading page A. This can be especially useful for small-screen browsers, such as smartphones: several small-screen pages can together represent the contents of a single big-screen page. The multi-part contents (e.g. text, images, drawings) can be delivered as a sequence of pages inside a single package, or the big-screen browser can simply assemble the multi-part contents into a single page.
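Since HTZ is only a concept at this point, there is no defined manifest format; purely as a sketch (the element names, attribute names and file names below are all made up), a manifest combining the two ideas - a reference to a shared package of common files and several related pages bundled together - might look something like this:

  pages-abc.htz/manifest:
  <htzManifest>
    <!-- common files live in a separate, cacheable package and are not re-sent -->
    <externalPackage href="/packages/common.htz" />
    <!-- related pages bundled together, so B and C never trigger extra round-trips -->
    <page src="a.html" />
    <page src="b.html" />
    <page src="c.html" />
  </htzManifest>

The browser would download pages-abc.htz once, check whether common.htz is already in its cache, and fetch it only if it is not.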

Now, let's talk about the web server. How do we manage packages on the server?
A package can represent a folder on the web server. The key is to have a package configuration file in the folder. Without a configuration file in the folder, the parent folder's configuration will be used, and so on...
For example, let's have a look at the following folder structure:

site\
- index.html
- index.html.config
- license.txt
- site.config
- web.config
site\core\
- core.config
site\core\code\
site\core\code\controls\
- controls.config
- multi-select-dropbox-autocomplete.htz
- time-period.htz
- xsl-view.htz
site\core\css\
- style.css
site\core\images\
- arrow1.gif
- banner.gif
- body_bg.gif
- footer_bg.gif
- logo.gif

In the root folder "site", web.config contains the web site's global configuration. site.config matches the name of the folder and holds folder-specific configuration, and index.html.config contains configuration specific to the index.html page.
Similarly, the site\core\core.config and site\core\code\controls\controls.config files contain folder-specific configurations that supersede the parent folder's configuration.
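There is no defined format for these configuration files either; purely as a sketch (the element names and settings below are assumptions, not a specification), a folder-level file such as controls.config could look like this:

  site\core\code\controls\controls.config:
  <packageConfig>
    <!-- overrides whatever was inherited from core.config and site.config -->
    <packaging enabled="true" compression="max" />
    <!-- everything in this folder is delivered as part of the controls packages -->
    <package include="*.htz" />
    <!-- shared resources are referenced from a common package instead of being duplicated -->
    <externalPackage href="/site/core/common.htz" />
  </packageConfig>

A page under site\core\code\controls\ would be served according to this file, while a file under site\core\css\, which has no configuration file of its own, would fall back to core.config and then to site.config.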

August 18, 2011

WebIL - the future Assembly language for the Web?

Some people have expressed the opinion that JavaScript is the Assembly language of the web. I actually partially agree with the statement. It is far from Assembly language both syntactically and functionally, but "JavaScript is as low-level as a web programming language goes."
Despite the fact that, especially in the last few years, it has become much more developer-friendly (thank you John Resig for the wonderful jQuery library!) and faster with the improved browser engines, it still has many flaws (http://c2.com/cgi/wiki?JavaScriptFlaws). Just to name a few (a short example follows the list):
  • Dynamically-typed variables often cause run-time errors
  • Language syntax and the implementation of many data types (e.g. arrays, date/time, numeric types) often do not follow the conventions of the most common programming languages (C-like languages such as C/C++, C#, Java)
  • Very limited number of built-in functions. There is no common framework that is supported by all browsers. In most cases, you need to download external libraries or create your own to implement commonly-used functionality.
  • Browsers may interpret the same script differently. This is not necessarily caused by non-conformance to the ECMA specification, but by different ways of manipulating the DOM.
  • No ways, or rather awkward ways, to manage performance, memory utilization and garbage collection
  • No easy ways to log and report scripting errors
  • No easy ways to ensure source code protection, code trust and security
  • JavaScript is not a true object-oriented programming language. People who can write JavaScript classes are often considered JavaScript gurus :-)
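To illustrate a couple of these points in code (the variable and function names are mine, purely for illustration):

  // Dynamic typing: mixing a string and a number is perfectly legal,
  // so the bug shows up only at run time, as wrong output rather than an error.
  var quantity = "2";           // came from a form field, so it is a string
  var total = quantity + 3;     // "23" (string concatenation), not 5

  // "Classes" are simulated with constructor functions and prototypes.
  function Point(x, y) {
      this.x = x;
      this.y = y;
  }
  Point.prototype.distanceToOrigin = function () {
      return Math.sqrt(this.x * this.x + this.y * this.y);
  };

  var p = new Point(3, 4);
  alert(total);                   // shows "23"
  alert(p.distanceToOrigin());    // shows 5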
So, how can we solve this? My idea is to implement a browser engine that would execute WebIL - Web Intermediate Language, just like C# or VB.Net code is first compiled into MSIL (Microsoft Intermediate Language) and then compiled into OS-specific code at run time (JIT - Just-In-Time compilation). Here are a few advantages of this concept:
  • The WebIL is language-agnostic. The code can be written in JavaScript, C#, Java, VB, F#, Python, Ruby, UNI (yoU Name It :-)), and then compiled into WebIL by a compiler/interpreter. So, the existing JavaScript libraries can be relatively easily converted into WebIL libraries.
  • No need to maintain separate client and server code bases. A common library can be compiled into platform-specific code and into WebIL with very little or no modification.
  • The WebIL should be standardized and browser-agnostic. The same WebIL code should execute identically in any browser, whether it runs on a desktop or a mobile device.
  • Core libraries (frameworks) can be delivered with the browser, with updates and extensions hosted in centralized locations and cached by the browser.
  • WebIL libraries can potentially be even smaller than the current JavaScript libraries because core frameworks will implement the most common functionality.
  • Common coding and design patterns can now be implemented in WebIL.
  • Developer productivity can potentially increase significantly. Learning and transition for programmers and web developers is going to be easier.
  • Solutions can be coded more efficiently.

ECMA published the CLI (Common Language Infrastructure) specification ECMA-335 rev 5 in December of 2010 (http://www.ecma-international.org/publications/standards/Ecma-335.htm). Even though it is most likely the specification for .Net 5.0, I do not see a reason why a subset of it cannot be used for WebIL. In fact, in version 4 of the Framework Microsoft already created Client Profiles.
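Purely to illustrate the idea (WebIL does not exist, so the instruction set in the comments below is made up, loosely modeled on ECMA-335 CIL), here is a trivial JavaScript function and the kind of stack-based intermediate code it might compile down to:

  function add(a, b) {
      return a + b;
  }

  // Hypothetical WebIL for the function above (modeled on CIL):
  //   .method static float64 add(float64 a, float64 b)
  //   {
  //       ldarg.0    // push argument a onto the evaluation stack
  //       ldarg.1    // push argument b
  //       add        // pop both, push a + b
  //       ret        // return the value on top of the stack
  //   }

Any language with a compiler that targets WebIL could produce the same intermediate code, which is what would make the approach language-agnostic.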

So, I am wondering, is Microsoft already working on something that is similar to the WebIL concept? They have the power to revolutionize the web.