July 10, 2015 at 7:34 am #10712illaParticipant
Characteristics of Web Programming Languages
Just as there is a diversity of programming languages available and suitable for conventional programming tasks, there is a diversity of languages available and suitable for Web programming. There is no reason to believe that any one language will completely monopolize the Web programming scene, although the varying availability and suitability of the current offerings is likely to favor some over others. Java is both available and generally suitable, but not all application developers are likely to prefer it over languages more similar to what they currently use, or, in the case of non-programmers, over higher level languages and tools. This is OK because there is no real reason why we must converge on a single programming language for the Web any more than we must converge on a single programming language in any other domain.
The Web does, however, place some specific constraints on our choices: the ability to deal with a variety of protocols and formats (e.g. graphics) and programming tasks; performance (both speed and size); safety; platform independence; protection of intellectual property; and the basic ability to deal with other Web tools and languages. These issues are not independent of one another. A choice which seemingly is optimal in one dimension may be sub-optimal or worse in another.
Formats and protocols. The wide variety of computing, display, and software platforms found among clients necessitates a strategy in which the client plays a major role in the decision about how to process and/or display retrieved information, or in which servers must be capable of driving these activities on all potential clients. Since the latter is not practical, a suite of Web protocols covering addressing conventions, presentation formats, and handling of foreign formats has been created to allow interoperability [Berners-Lee, CACM, Aug. 1994].
HTML (HyperText Markup Language) is the basic language understood by all WWW (World Wide Web) clients. Unmodified HTML can execute on a PC under Windows or OS/2, on a Mac, or on a Unix workstation. HTML is simple enough that nearly anyone can write an HTML document, and it seems almost everyone is doing so.
HTML was developed as part of the WWW at CERN by Tim Berners-Lee, who is now Director of the World Wide Web Consortium (W3C) at MIT’s Laboratory for Computer Science. Refinement of HTML continues at W3C, with standardization via the Internet Engineering Task Force (IETF) of the Internet Society. HTML descended from SGML (Standard Generalized Markup Language), the ISO standard language for text. SGML is in widespread use by the US Government and the publishing industry for representing documents. HTML applies SGML principles to the WWW. As such, it implements a semantic subset of SGML with similar syntax.
HTML is a markup language rather than a complete programming language. An HTML document (program) is ASCII text with embedded instructions (markups) which affect the way the text is displayed. The basic model for HTML execution is to fetch a document by its name (e.g. URL), interpret the HTML and display the document, possibly fetching additional HTML documents in the process, and possibly leaving hot areas in the displayed document that, if selected by the user, can accept user input and/or cause additional HTML documents to be fetched by URL. HTML applications, or what we might consider the HTML equivalent of an application, consist of a collection of related web pages managed by a single HTTP server (HTTP is the TCP/IP-based protocol that defines the interaction of WWW clients and servers). This is an oversimplification, but the model is simple, and the language is simple, and that is one of its strengths.
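The interpret step of the model above can be sketched in a few lines of Python. This is only an illustration, not how any particular browser works: it uses the standard library's `html.parser` module, and the class name `LinkCollector`, the function `interpret`, the sample page, and the example.org URL are all hypothetical. The sketch separates the text that would be displayed from the URLs behind the hot areas, which are the documents a client could fetch next.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects displayable text and the URLs behind hot areas (anchors)."""

    def __init__(self):
        super().__init__()
        self.text_parts = []  # text that would be shown to the user
        self.links = []       # URLs that could be fetched next

    def handle_starttag(self, tag, attrs):
        # An <a href="..."> element is a hot area pointing at another document.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        # Keep only non-blank runs of text.
        if data.strip():
            self.text_parts.append(data.strip())


def interpret(document):
    """Interpret one HTML document: return (displayed text, follow-up URLs)."""
    collector = LinkCollector()
    collector.feed(document)
    collector.close()
    return " ".join(collector.text_parts), collector.links


# Hypothetical page content and URL, used only for illustration.
PAGE = (
    "<html><body><h1>Home</h1>"
    '<p>See the <a href="http://example.org/survey.html">survey</a>.</p>'
    "</body></html>"
)

text, links = interpret(PAGE)
print(text)
print(links)
```

A real client would then fetch each collected URL over HTTP when the user selects the corresponding hot area; that step is left out here so the sketch runs without a network connection.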
As HTML moves through the standardization process, and is extended by various vendors, it loses some of its simplicity, but it remains a useful language. The Web programmer generally finds HTML lacking in only two areas: its performance in certain types of applications, and the ability to program certain common tasks.
The remainder of the paper: (a) discusses the issues involved in meeting the performance and expressibility goals while still providing safety, platform independence, and the ability to interact with a variety of formats, protocols, tools, and languages; (b) identifies design alternatives addressing these issues; and (c) discusses a variety of Web programming languages in this context.
Power. HTML is limited in its computational power. This is intentional in its design, as it prevents the execution of dangerous programs on the client machine. However, Web programmers, as they have become more sophisticated in their applications, have increasingly been hamstrung by these limits. Tasks that cannot be coded in HTML must be executed either on the server in some other language, or on the client in a program in some other language downloaded from a server. Both solutions are awkward for the programmer, often produce a sub-optimal segmentation of an application across program modules, both client and server, and reintroduce safety considerations.
Performance. Because of an HTML program’s limited functionality, and the resulting shift of computational load to the server, certain types of applications perform poorly, especially in the context of clients connected to the Internet over low-bandwidth dialup lines (28.8 Kbps or less). The performance problems arise from two sources: (a) a highly interactive application must frequently hit the server across this low-bandwidth line, which can dramatically and, at times, unacceptably slow observed performance; and (b) requiring all computation to be done on the server increases the load on the server, thereby reducing the observed performance of its clients.
Today, most users have fairly capable client machines which can accept a larger share of the computational load than HTML allows. For example, an Internet-based interactive game or simulation can be a frustrating experience for users with low speed connections, and can overwhelm the server that hosts it. If you were the developer of such a game, you’d be inclined to push more of the functionality to the client, but, since HTML limits the possibilities, another route to supporting computation on the client must be found. The developer might make an executable client program available to users, which would be invoked via the HTML browser, but users might only be willing to accept such programs if they trust the source (e.g. a major vendor), as such programs are a potential safety concern. Also, users don’t want to be continuously downloading client programs to be able to access web pages, so this solution has real practical limitations considering the size and dynamism of the Web. If safe, powerful, high-performance programs could be automatically downloaded to client platforms, in much the same way as HTML pages, the problem would be solved.
When code is to be executed on a client, there are two main considerations: what gets shipped and what gets executed. There are three main alternatives for each of these: source code, a partially compiled intermediate format (e.g. byte code), and binary code. Because compilation can take place on the client, what is shipped is not necessarily what is executed.
Byte code, according to measurements presented at the JavaOne conference, can be 2-3x smaller than comparable binary code, so its transfer can be considerably faster, which is especially noticeable over low speed lines. Since transfer time is significant in the Web, this is a major advantage. Source code is also compact. Execution performance clearly favors binary code over byte code, and byte code over source code. In general, binary code executes 10 to 100 times faster than byte code. Most Java VM developers are developing JIT (Just In Time) compilers to get the benefits of byte code size and binary speed. Java byte codes are downloaded over the net and compiled to native binary on the local platform. The binary is then executed, and, possibly, cached for later executions.
It should be clear that any combination of these strategies could be used in the implementation of any particular Web programming language, and there is in fact wide variation among the systems actually surveyed.
Platform independence. Given the diversity of operating systems and hardware platforms currently in use on the Web, a great efficiency results from only dealing with a single form of an application. The success of HTML has proven this, and Java has seconded it. The ability to deliver a platform-independent application is of great appeal to developers, who spend a large portion of their resources developing and maintaining versions of their products for the different hardware/software platform combinations. With Java, one set of sources and one byte-compiled executable can be maintained for all hardware/software platforms.
While platform independence has long been a goal of language developers, the need to squeeze every last ounce of performance from software has often made this impractical to maintain, at least at the level of executable code. However, in the Web this concern becomes less important because transfer time is now a significant component of performance and can dominate execution time.
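A back-of-the-envelope calculation shows how transfer time can dominate. The sketch below combines figures quoted earlier in this paper (a 28.8 Kbps dialup line, byte code 2-3x smaller than binary, binary 10 to 100 times faster than byte code) with two numbers that are purely assumed for illustration: a 300 KB binary and 0.5 seconds of native execution time. The function name `total_seconds` is likewise hypothetical.

```python
LINK_BITS_PER_SECOND = 28_800  # 28.8 Kbps dialup line


def total_seconds(size_bytes, exec_seconds, link_bps=LINK_BITS_PER_SECOND):
    """Observed time = time to transfer the code + time to execute it."""
    transfer_seconds = size_bytes * 8 / link_bps
    return transfer_seconds + exec_seconds


# Assumed values, chosen only for illustration.
BINARY_SIZE = 300 * 1024          # hypothetical 300 KB native binary
NATIVE_EXEC_SECONDS = 0.5         # hypothetical native execution time

# Ratios taken from the ranges quoted in the text.
BYTE_CODE_SIZE = BINARY_SIZE / 2.5   # byte code is 2-3x smaller
BYTE_CODE_SLOWDOWN = 10              # byte code is 10-100x slower

binary_total = total_seconds(BINARY_SIZE, NATIVE_EXEC_SECONDS)
byte_code_total = total_seconds(
    BYTE_CODE_SIZE, NATIVE_EXEC_SECONDS * BYTE_CODE_SLOWDOWN
)

print(f"binary:    {binary_total:.1f} s")
print(f"byte code: {byte_code_total:.1f} s")
```

Under these assumed numbers the binary takes roughly 86 seconds end to end and the byte code roughly 39 seconds, even though the byte code executes ten times more slowly. The conclusion is sensitive to the assumptions: a longer-running computation or a faster link shifts the balance back toward binary code.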
Platform independence can be achieved by shipping either byte code or source code. One advantage of shipping byte code over source code is that a plethora of source languages would require the client machines to maintain many compilers and/or interpreters for the source languages, while fewer byte code formats would require fewer virtual machines.
Preserving intellectual property. Although not currently discussed much as an issue, the ability to download safe, portable applets in some form less than source code is an additional advantage to developers who wish to protect their intellectual property. Looking at someone else’s script or source to see how they do something and just tweaking it a little or copying a piece of it to do the same thing in one’s own program doesn’t feel like stealing. But if one has to go to the effort of reverse engineering byte or binary code, it becomes more obvious that this code is someone else’s intellectual property. For the vast majority of honest people on the Web, this subtle reminder may be enough. For some of the minority, the effort involved in reverse engineering may serve as a sufficient deterrent.
Safety. Viruses have proven that executing binary code acquired from an untrusted, or even moderately trusted, source is dangerous. Code that is downloaded or uploaded from random sites on the web should not be allowed to damage the user’s local environment. Downloading binary code compiled from conventional languages is clearly unsafe, due to the power of the languages. Even if such languages were constrained to some ostensibly safe subset, there is no way to verify that only the safe subset was used or that the compiler used was trustworthy (after all, it is under someone else’s control).
HTML proved that downloading source code in a safe language and executing it with a trusted interpreter was safe. You can’t infect a client with a virus by fetching and displaying an HTML document (although you certainly can fetch a file with a virus in it, which could then be activated by executing the file, something which is not supported directly by HTML, although some browsers allow it). However, HTML is not sufficiently powerful. A middle ground is being sought in which the downloaded program is less limited in its capabilities than HTML and more limited than a conventional language. Even though HTML has limited power, the general idea behind HTML, that of a somewhat limited language interpreted by a trusted client-side interpreter, has been widely adopted with more powerful languages and interpreters.
Some languages achieve relative safety by executing byte-code compiled programs in a relatively safe runtime environment (a virtual machine). Yet other languages are fully interpreted on the client by an interpreter provided by the language developer. In either case relative safety can be achieved because the interpreter or virtual machine can make safety checks that are impossible to make statically at compile-time. Note that safety can only be provided by the interpreter or virtual machine, not by the language or the language’s compiler.
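The kind of runtime check a virtual machine can make is easy to show with a toy example. The sketch below is an assumption-laden illustration, not a description of the Java virtual machine: the class `SandboxVM`, the exception `SafetyViolation`, and the four opcodes are all hypothetical. It shows a VM refusing unknown instructions, out-of-range memory addresses, and stack underflow at the moment a downloaded program attempts them, regardless of which compiler produced the program.

```python
class SafetyViolation(Exception):
    """Raised when a downloaded program attempts something the VM forbids."""


class SandboxVM:
    """Toy stack machine that checks every instruction before executing it."""

    def __init__(self, memory_cells=8):
        self.memory = [0] * memory_cells

    def _check_address(self, address):
        # Reject anything outside the program's own memory, including the
        # negative indexes Python would otherwise silently accept.
        if not isinstance(address, int) or not 0 <= address < len(self.memory):
            raise SafetyViolation(f"address out of range: {address!r}")

    @staticmethod
    def _check_stack(stack, needed):
        if len(stack) < needed:
            raise SafetyViolation("stack underflow")

    def run(self, program):
        stack = []
        for op, arg in program:
            if op == "push":
                stack.append(arg)
            elif op == "store":
                self._check_address(arg)
                self._check_stack(stack, 1)
                self.memory[arg] = stack.pop()
            elif op == "load":
                self._check_address(arg)
                stack.append(self.memory[arg])
            elif op == "add":
                self._check_stack(stack, 2)
                stack.append(stack.pop() + stack.pop())
            else:
                # Only whitelisted opcodes are ever executed.
                raise SafetyViolation(f"illegal opcode: {op!r}")
        return stack[-1] if stack else None


# A well-behaved program: memory[0] = 2, then compute memory[0] + 3.
GOOD = [("push", 2), ("store", 0), ("load", 0), ("push", 3), ("add", None)]
# A misbehaving program: tries to write outside its memory.
BAD = [("push", 1), ("store", 99)]

print(SandboxVM().run(GOOD))
try:
    SandboxVM().run(BAD)
except SafetyViolation as error:
    print("rejected:", error)
```

Because the checks live in the VM, they hold even if the byte code was produced by a faulty or malicious compiler, which is the point made in the preceding paragraph.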
Building a secure virtual machine is a non-trivial task. (See Java Security: From HotJava to Netscape and Beyond for a detailed analysis of how safe Java and the Java virtual machine really are.) Not many virtual machines are needed, since a single virtual machine can be the target of many languages.
This is not to say that lack of safety or platform independence disqualifies a language for a role in web application development, but for dynamic applications likely to be downloaded from untrusted sources with current browsers and executed locally on mainstream platforms, a safe and platform-independent executable is highly desirable. At the other extreme, the interpreters and runtimes that execute such programs are likely to be developed using unsafe languages, and platform-dependent executables will be distributed by their developers. For programs intended for execution on servers, there is some value to safety and platform independence, but not to the same degree as on clients.
Conclusions. HTML is proving insufficient by itself to develop the myriad Web-based applications envisioned. When HTML is extended by server and client programs, the task is feasible, yet awkward and sub-optimal in terms of performance and safety. The ability to easily develop sophisticated Web-based applications optimally segmented between client and server in the context of the heterogeneous and dynamic environment of the Web, while not compromising safety, performance, or intellectual property, is the goal of current efforts. The first significant result of those efforts is Java, a C++-derived language with capabilities specialized for Web-based application development. Java is compiled by the developer to a platform-independent byte code format, with byte codes downloadable via HTML browsers to the client, and interpreted by a virtual machine which can guarantee its safety. Sun is working to improve the safety, performance, comprehensiveness, and ubiquity of Java, and the industry appears to be accepting their approach. Others, especially other language developers, vendors, and users, are taking similar approaches to developing Web-based applications in their languages, by supporting safe client-side execution in some manner, including targeting the Java Virtual Machine.
While Java certainly has the edge at the moment, a belief which was reinforced by the 5000+ attendance figure at the JavaOne conference in May 1996, we believe there is room for more than one winner, and that an end result somewhat broader than just Java would be in the best interest of developers and users alike.
Safety is the biggest issue. The safety of a program is a function of the safety of the environment in which it executes, which is just another program. At some level, the user must acquire a potentially unsafe program from a trusted source. At present, we acquire Netscape, Java, and Windows from (relatively) trusted sources. Because there must be a trusted environment in which to execute safe, platform-independent programs and because users are only likely to trust a limited number of big name sources for that trusted environment, there has been speculation that diversity, including diversity in Web programming language choices, would be reduced. While this could become true, it now appears unlikely because language developers are proving that they can retarget their programming language to someone else’s execution environment. A more reasonable view of the future is a full diversity of programming languages supported by a few trusted execution engines. At present, most efforts are targeting Java’s Virtual Machine (VM), mainly because it is widely distributed with Netscape and is being licensed by other browser vendors. It’s possible that the Java VM ends up being the one trusted execution environment, but it will probably be one of several general purpose execution environments that, together with many special purpose environments, will be distributed by trusted sources. An ideal outcome might be industry-wide standardization on a trusted virtual machine specification and validation of implementations by an industry group such as X/Open. Regardless of how it occurs, we do not think diversity of programming language alternatives will be reduced in the long term. However, it is likely that we will see some narrowing of our choices in the short term as language developers adapt their existing offerings to this new area and develop new ones.
The rest of this document surveys languages and interfaces being used for Web programming, attempting to provide a snapshot of the direction each language is taking to meet the needs of Web programmers, and of its current status.