eXtropia: the open web technology company

Intro to the Web Application Development Environment
Introduction to Server-Side Processing  

[The Middleware Layer]

You have now seen how user requests are gathered by the GUI Layer and how they are sent across the wire to a web server by the Communication Layer. What happens next?

In nearly all web applications, a web server sits at the other end of the wire from the web browser. The web server is the entry point to the Middleware Layer.

The purpose of the Middleware Layer is to accept incoming requests and process them, using the resources provided by the web server, the machine that the web server runs on, or by the network of servers and resources that the web server is connected to.

Consider the most basic function of a web server: distributing HTML files. In this case, a web browser requests a given HTML file from the web server. In response, the web server will find the given file on the local (or networked) file system and send it back to the browser.
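
The file lookup described above can be sketched in a few lines of Java. This is only a minimal illustration, not how any particular web server is implemented; the class name StaticFileSketch and the method fetch are hypothetical names chosen for this example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;

class StaticFileSketch {
    // Maps a request path such as "/index.html" to a file under docRoot.
    // Returns the file's bytes, or empty if it is missing or outside the tree.
    static Optional<byte[]> fetch(Path docRoot, String requestPath) throws IOException {
        Path root = docRoot.toAbsolutePath().normalize();
        // Strip the leading "/" so the path resolves relative to the root.
        String relative = requestPath.startsWith("/") ? requestPath.substring(1) : requestPath;
        Path target = root.resolve(relative).normalize();
        // Refuse paths like "/../secret" that escape the document tree.
        if (!target.startsWith(root) || !Files.isRegularFile(target)) {
            return Optional.empty();
        }
        return Optional.of(Files.readAllBytes(target));
    }
}
```

A real server would also add HTTP headers and a status code; the sketch only shows the mapping from a requested path to a file in the web document tree.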

Of course, the web server has at its disposal a whole world of resources above and beyond file systems. Web servers, in fact, are in close contact with all sorts of resources including data stores, applications, business objects, operating system resources, authentication services (server-based or directory services like LDAP/ADSI) and more.

A good Middleware Layer web application developer will have her fingers in every pie in an organization and will know what resources she can rip off from the work of other developers.

Of course, the next question is how.

Granted, the web server has the ability to grab files from a web document tree. That is built-in web server functionality. But how can a web server get to all those other resources?

CGI (Common Gateway Interface)  
Well, the most basic tool to access system resources is CGI (Common Gateway Interface). CGI is a service provided by all web servers that allows you to 1) create an executable script that the web server may call on demand, 2) pass incoming HTTP GET or POST data to the CGI script, and 3) filter CGI generated answers back to the browser.
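
To make those three steps concrete, here is a minimal sketch of a CGI program. It assumes the web server passes request details in the standard CGI environment variables REQUEST_METHOD and QUERY_STRING and reads the program's standard output; the class name CgiSketch and the method respond are hypothetical names used only for this illustration.

```java
import java.util.Map;

class CgiSketch {
    // Builds the CGI response from the environment the web server sets up.
    static String respond(Map<String, String> env) {
        String method = env.getOrDefault("REQUEST_METHOD", "GET");
        String query = env.getOrDefault("QUERY_STRING", "");
        StringBuilder out = new StringBuilder();
        // The header block ends with a blank line; the body follows it.
        out.append("Content-type: text/plain\n\n");
        out.append("Method: ").append(method).append('\n');
        out.append("Query: ").append(query).append('\n');
        return out.toString();
    }

    public static void main(String[] args) {
        // When run by the web server, the real environment supplies the request.
        System.out.print(respond(System.getenv()));
    }
}
```

Keeping the logic in respond rather than in main is a design choice made here so the behavior can be checked without a web server.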

You can think of a CGI script as taking the place of the HTML file in the standard web server activity. However, the key difference is dynamism. Rather than containing a pre-defined set of text (like an HTML document) which can only change when the author edits it, a CGI script can dynamically generate any information it is programmed to generate.

[CGI versus HTML]

A simple example would be a CGI-generated clock that would always show the current time when it is loaded. To do the same thing with HTML, an HTML author would have to edit the HTML document every second with the new time.

A CGI script, on the other hand, can access the time/date resources of the operating system it runs on and independently output the current time whenever it is called without the author ever having to do anything.

[CGI Time]

If you know Perl, here is the code for that CGI script:

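           #!/usr/bin/perl
           use strict;
           use Time::localtime;
           # The blank line after the header ends the HTTP headers.
           print "Content-type: text/html\n\n";
           my $time = localtime;
           # year() counts from 1900; mon() counts from 0.
           print "Today is: " .
                  ($time->year() + 1900) . "/" .
                  ($time->mon() + 1) . "/" .
                   $time->mday() . "\n";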

Of course, CGI is not a programming language. CGI is an "interface". It defines a way of bridging the web server to the back end resources. It does not say anything about how that bridging will be implemented. In fact, CGI applications can be written in just about any programming language in use today.

NOTE: If you need to handle the data coming in from and going out to a web browser, your best bet is to use CGI.pm, written by Lincoln Stein. Like ASP, CGI.pm packages up requests and responses into easy-to-use objects.

So why exactly do most web developers choose to use the programming language Perl for their CGI applications? Could one use another language like C, C++, AppleScript or Visual Basic instead?

This is a good and extremely frequently asked question. In fact, CGI applications can be written in any programming language that is able to accept and process input, and that is able to output the results of that processing.

However, for most of the CGI applications on the web, Perl has been by far the best choice for two main reasons: 1) Perl is the right tool for the job and 2) Perl is easy.

Perl is the Right Tool for the Job  
Perl does not attempt to be a super language. In fact, compared to some more robust languages out there, Perl solves only a few, though crucial, problems.

Fortunately however, because Perl is specialized, it does the jobs it sets out to do exceptionally well. And, better yet, the limited set of problems that Perl can solve happens to fit very well with the demands of CGI.

Perl and CGI are simply a match made in heaven. Common Gateway Interface (CGI), as its name implies, provides a "gateway" between a human user with unexpected and complex needs, and a powerful, command/logic oriented server. As a gateway, CGI must perform one task very well. It must translate.

All CGI applications must translate the needs of clients into server requests, and translate server replies into meaningful and well-presented "answers". This can be quite a chore, since computers and humans typically speak very different languages.

As such, CGI must be adept at manipulating text (be it the text input by the client or the text received from the server). A CGI application must be able to take "strings" of data and translate them from one language to another constantly and quickly.

As it so happens, Perl has a wide variety of tools designed to manipulate strings of data. It is, in fact, one of the best languages around for string manipulation.
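
As a small illustration of the kind of translation described above, here is a minimal sketch that decodes a URL-encoded form submission into name/value pairs. It is written in Java only to match the Java samples in the Server-Side Java section of this tutorial; the class name FormDecoderSketch and the sample field names are assumptions made for the example.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

class FormDecoderSketch {
    // Turns "name=Sherlock+Holmes&city=London%2C+UK" into a map of fields.
    static Map<String, String> decode(String encoded) {
        Map<String, String> fields = new LinkedHashMap<>();
        if (encoded == null || encoded.isEmpty()) {
            return fields;
        }
        for (String pair : encoded.split("&")) {
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            // URLDecoder turns "+" into a space and "%2C" into a comma.
            fields.put(URLDecoder.decode(name, StandardCharsets.UTF_8),
                       URLDecoder.decode(value, StandardCharsets.UTF_8));
        }
        return fields;
    }
}
```

The sketch assumes Java 10 or later for the Charset overload of URLDecoder.decode.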

What's more, CGI must serve as gateway not just for one client and one server but for as many types of clients and as many types of servers as possible. It must be more than a bilingual translator, it must be multilingual, providing translation services between dozens of browser types, server types and operating systems.

Again Perl shines. Fortunately, Perl is highly portable. Due to the hard work and good intentions of many net hackers, Perl has been ported to just about every operating system you would want to run a Web server on.

Finally, Perl's weaknesses are not so negative in a web environment.

Most CGI tasks, which are ultimately at the mercy of bandwidth speed, do not demand much gusto from the application. While compiled languages may boast ten times the power and speed of Perl, with complex functions up the yazoo, using one is like bringing in the hydrogen bomb to kill an ant. Perl is simply the right tool for the job. Other languages are typically overkill.

Perl is Easy  
Perl is also easy to understand. Because Perl is an interpreted language, for example, there is no separate compile step and no illegible machine-code binary. What you see is what you get. The code that is run by the web server is the code that you see in your text editor window.

Since Perl is simple in design, it is also easy to modify, maintain and customize (which is really where the cost of software comes from anyway). That is, because Perl source code is so legible, it is very easy for one to pick up a script and quickly modify it to solve similar or new problems. Perl is a cut and paste language and program logic is easily transferred and manipulated between projects.

The benefit of this, of course, is that Perl is supported by a wide body of CGI freelance programmers. Unlike the more cryptic languages like C++ or Visual Basic, Perl is accessible to anthropology majors and computer science majors alike. In fact, newsgroups like comp.lang.perl are often too prolific to frequent on a regular basis. The Perl community is thriving and thanks to the web, expanding rapidly.

Thus, since so many people can write and modify Perl, it is very easy for you to find someone to do it for you cheaply and to do it well. You need not kneel at the mercy of the few reclusive wizards of other arcane languages who confidently slide on the curves of supply and demand. There is an abundance of qualified, starving students with skills enough to solve most of your programming needs for very cheap rates, especially if they are given working code to modify rather than asked to write one from scratch.

SSI (Server Side Includes)  
There are some problems with CGI of course. Perhaps the most serious problem is speed. Every time the web server gets a CGI request, it needs to execute the CGI application.

What's worse, if you are using a web server into which the Perl interpreter has not already been embedded, you will need to load the Perl interpreter every time. If you begin to get thousands of requests per second, this could quickly cause your web services to grind to a halt.

One way to get around this problem is to embed the processing into the web server itself. Rather than rely on another layer, most web servers provide several ways to extend the web server itself; to add logic and processing power.

The earliest technology to take advantage of this idea was SSI (Server Side Includes). The concept of SSI is simple. An application developer codes special tags into her HTML document. Those special tags are understood by the web server and can be translated on the fly by the web server as the HTML document passes through on its way to the browser.

[SSI Action]

WARNING: NCSA notes that having the server parse documents is "a double edged sword. It can be costly for heavily loaded servers to perform parsing of files while sending them. Further, it can be considered a security risk to have average users executing commands as the server's User."

All SSI directives are formatted as SGML comments within an HTML document and thus look something like the following:

          <!--#command tag1="value1" tag2="value2" -->

There are several possible commands that you can use including the following:

config Controls various aspects of the file parsing. There are three valid tags:

  • errmsg controls what error message is sent back to the client if a problem occurs while parsing the document.
  • timefmt defines the format a server should use when providing dates. These formats follow the standard UNIX formatting rules. Thus, <!--#config timefmt="%A, %B %d, %Y" --> would give you something like Thursday, August 12, 1999. Other useful formatting codes include:

    %% - %
    %a - Day of the week abbreviation (Like TUE)
    %A - Full name of day of the week (Like SUNDAY)
    %w - Number of day of the week. Don't forget that Sunday is day 0.
    %b - Month abbreviation (Like MAR)
    %B - Full name of month (Like March)
    %d - The day of the month (01-31)
    %e - The day of the month (1-31)
    %H - Hour of day (00-23)
    %I - Hour of day (01-12)
    %j - Day in the year (001-366)
    %M - Minute (00-59)
    %p - AM or PM
    %S - Second (00-61)
    %y - Last two digits of the year (00-99)
    %Y - The year (Like 1999)
    %Z - The time zone (Like PST)

  • sizefmt defines the format a server should use when displaying the size of a file.
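
As a rough cross-check of the timefmt codes listed above, the following sketch produces the same kind of date string in Java. Java's String.format uses its own conversions (%tA, %tB, %td, %tY) rather than the UNIX codes, so this is an analogy rather than what an SSI-enabled server runs; the class name TimeFmtSketch is hypothetical.

```java
import java.time.LocalDate;
import java.util.Locale;

class TimeFmtSketch {
    // Java's %tA, %tB, %td and %tY play the roles of the UNIX codes
    // %A, %B, %d and %Y; "<" reuses the previous argument.
    static String longDate(LocalDate date) {
        return String.format(Locale.US, "%tA, %<tB %<td, %<tY", date);
    }
}
```

Calling TimeFmtSketch.longDate(LocalDate.of(1999, 8, 12)) returns "Thursday, August 12, 1999".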

include Inserts the text of a separate document into the parsed document. For example, if you had a menu bar on every page, rather than coding the same HTML into every page on your site (Q: what happens if you change the menu? A: You'd have to recode every document!) you might write the HTML for the menu bar in a separate file and use SSI to reference it in multiple documents using something like:

          <!--#include virtual="menu.html" -->

The include command accepts two tags:

  • virtual defines a virtual path to a document on the server.
  • file gives a pathname relative to the current directory.

echo Prints the value of one of the 6 special include variables:

  • DOCUMENT_NAME: The current filename.
  • DOCUMENT_URI: The virtual path to this document.
  • QUERY_STRING_UNESCAPED: The unescaped version of any search query the client sent, with all shell-special characters escaped with \.
  • DATE_LOCAL: The current date, local time zone. (This variable is subject to the timefmt parameter to the config command)
  • DATE_GMT: Same as DATE_LOCAL but in Greenwich mean time.
  • LAST_MODIFIED: The last modification date of the current document. (Subject to timefmt like the others)

Note that any dates are formatted according to timefmt if set by config. Also note that the only valid tag to this command is var, whose value is the name of the variable you wish to echo.

fsize Prints the size of the specified file. The valid tags for this command are the same as those of the include command and the resulting format of this command can be defined by the sizefmt parameter in the config command.

flastmod Prints the last modification date of the specified file, subject to the formatting defined by the timefmt parameter in config. Valid tags are the same as with the include command.

exec Executes a given shell command or CGI script. Valid tags include:

  • cmd will execute the given string using the operating system shell on which the web server is running.
  • cgi will execute the given CGI script and include its output.

For example, to display the current date on an HTML page (if your web server is running on UNIX) you might have something like:

          <TITLE>Date Test</TITLE>
          The date is:
          <!--#exec cmd="date" -->

The user would never see the tag because the web server would dynamically perform the work and the substitution before it went out.
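
The substitution step can be sketched as a simple search-and-replace over the outgoing page. The example below handles only the echo command and is an illustration of the idea, not the parser of any real web server; the class name SsiEchoSketch, the method expand, and the "(none)" fallback text are assumptions made for this sketch.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SsiEchoSketch {
    // Matches directives of the form <!--#echo var="NAME" -->
    private static final Pattern ECHO =
        Pattern.compile("<!--#echo\\s+var=\"([A-Z_]+)\"\\s*-->");

    // Replaces each echo directive with the value of the named variable.
    static String expand(String html, Map<String, String> vars) {
        Matcher m = ECHO.matcher(html);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = vars.getOrDefault(m.group(1), "(none)");
            // quoteReplacement keeps "$" and "\" in values from being misread.
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

Text outside the directives passes through untouched, which is why the browser only ever sees ordinary HTML.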

NOTE: Most web servers have SSI disabled by default because of the security risks so if you want to use it you'll have to get in touch with the server administrator and have them turn on SSI.

[SSI Action]

SSI with Proprietary Tag Sets  
Taking off with the idea of SSI, several vendors, including the makers of Cold Fusion and NetObjects, designed custom web servers with incredible SSI functionality. These third-party web servers provided a huge API offering a host of server-embedded resources that app developers could use to make their web pages dynamic. Those resources extended far beyond the limited set of commands offered by the operating system.

They also provided a huge number of formatting options, including complex tabular display.

Cold Fusion, perhaps the best-known SSI-based application server, offers a set of over 70 custom "CFML" tags that cover most, if not all, of your average needs on the custom Cold Fusion Web Server. Cold Fusion also allows you to set name/value pairs in your HTML.

ASP (Active Server Pages)  
However, there is a serious limitation to any SSI-based technology. Specifically, developers are limited to the range of commands/tags offered by the SSI-enabled web server provider.

For example, what happens if you need to define operations not supported by the operating system (traditional SSI) or the custom web server (Cold Fusion-style SSI)? What happens when you need to code your own tag logic?

What developers need is a way to embed dynamically interpreted code into HTML that can be processed by the web server on demand. That way every web site can develop its own set of custom tags.

What was needed was a hybrid CGI-SSI animal.

And thus was born ASP. ASP is a server extension of the IIS web server released by Microsoft. (By the way, Apache does have MOD_ASP at this stage so that you can code ASP pages on Apache web servers, and third-party vendors provide ASP functionality for other non-IIS servers.) ASP allows developers to code custom tags in JavaScript (JScript) or VBScript. These tags are interpreted by IIS before the pages are sent out.

At about this time, by the way, Apache and ActiveState were making embedded Perl interpreters a reality. This meant that the overhead of loading the Perl interpreter on every request was no longer required. Perl with mod_perl is comparable in speed to ASP and provides much the same functionality.

An ASP page at its core is simply a text file that has been named using the extension .asp and which contains HTML and scripting. Scripting, usually in VBScript, provides a means to embed programmatic logic into HTML files that will be dynamically interpreted as the HTML page goes through the web server. It also provides access to any server-side object.

NOTE: Like all server-side technologies, functionality provided by ASP is completely cross-browser. All processing is done on the server side and the results of processing are displayed as plain HTML or images. Thus, a web developer can easily use the power of Excel or PowerPoint on the server side to generate graphs and charts that can be seen by a user running a UNIX-based web browser.

Like SSI, ASP provides a means to specify a "tag" with instructions that should be interpreted by the web server. However, unlike SSI, ASP has a robust set of objects that you can use to do serious programming. It also gives you the ability to instantiate server side resources (any COM component). Consider the following simple ASP page:

          <%@ Language="VBScript" %>
          <TITLE>Test ASP</TITLE>
          <% Response.Write("Hello cyberspace") %>

[Simple ASP]

In the above example, you saw the "Response" object being used to print a message out to the web browser. ASP has a whole set of objects for the convenience of the application developer. These objects conveniently cover all key aspects of creating dynamic web pages.

The basic object hierarchy builds off the Scripting Context Object (which you will never really use itself) and looks something like the following:

	Scripting Context

COM and Active-X  
The real power of ASP, however, stems from the Microsoft COM architecture that magically breaks everything in the Microsoft universe down into reusable components with well-defined and easy-to-use interfaces.

Using COM (or its web catch-phrase alias, Active-X), a developer can bring the entire power of Microsoft to bear in any web application. You can instantiate IE to parse your XML, ask Excel to output dynamic graphs, or tell Outlook to send email for you. Everything in Microsoft is an object and everything can be spoken to using a standard interface from your web page.

COM works by creating objects that have a standard interface such that they can be used by any COM-aware program.

However, to understand COM, it is best to step back and look at the history of the Microsoft architecture.

Back in the 80s, the Microsoft architecture was application-centric. That is, every application in the Microsoft universe worked independently. As a result, each application saved its data in its own special format.

For example, you could not read a WordPerfect document in Microsoft Word. Each word processor program worked in its own way, and exchanging data between them was problematic, if possible at all. Users had to rely on special bridge programs or clunky export and import functions that often did a mediocre job.

Consider how hard it used to be to import a spreadsheet into a word processor. You would probably have to export the spreadsheet as plain text, import it into a word processor, and then type in the tabs yourself to make the data presentable. Worse yet, not only did you lose any cool features like column addition, but if you changed any data in the spreadsheet, you would have to go through the same painful process of conversion.

Microsoft was never far behind market demand of course, and quickly modified their architecture towards data-centricity and away from application-centricity.

In a data-centric universe, instead of focussing on applications and files specific to those applications, users could think of "documents". Documents could contain any type of "object" including text, sound, animation, spreadsheets, or even types that did not yet exist.

[The Document]

Data centricity requires several things of applications that deal with documents. Specifically, there needs to be a way for applications to:

  1. display objects with structures unknown to the application.
  2. load and save documents containing objects with structures unknown to the application.
  3. provide editing functionality for objects with structures unknown to the application.
  4. execute commands that manipulate objects with structures unknown to the application.
  5. support the drag and drop of objects with structures unknown to the application.

In a perfect data-centric world, users would never again need to worry about applications and application specific files. When users opened a document, the operating system would automatically run the associated application in order to present the requested object.

Any document might include several embedded objects from completely separate applications. Sound familiar? This is the Web. In fact, "Active-X" is simply Microsoft's implementation of the data-centric architectural model.

Of course, the evolution of Active-X has been a long process.

In fact, Active-X was born many years ago in the guise of Dynamic Data Exchange (DDE). DDE was a Microsoft technology that 1) allowed applications to communicate with each other (exchange data) and 2) provided a means for applications to execute commands in other applications.

Unfortunately, DDE was slow, difficult to use, and pretty limited.

Fairly soon after DDE was released, Microsoft rendered it obsolete with OLE 1.0 (Object Linking and Embedding). OLE 1.0 defined the essential "compound document" and specified the standard way for an application to work with a compound document.

Any application could display a document consisting of many different types of objects and you could double click on the object in order to edit it using the native application.

Of course, OLE was only a first step. Real data-centric architecture was approached with OLE 2.0, which was based on COM objects.

COM objects provided:

  1. A common way for apps to access and perform operations on objects
  2. A mechanism for keeping track of whether an object is in use and deleting it if it is no longer needed
  3. A standard error reporting mechanism and set of error codes and values
  4. A mechanism for apps to exchange objects
  5. A way to identify objects and associate objects with apps that understand how these objects are implemented.
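
The first two items in the list above can be pictured with a loose Java analogy. This is not the COM API itself (COM is defined at the binary level and is normally used from C++ or Visual Basic); the names ComLike, Printable and Memo are hypothetical and exist only in this sketch.

```java
import java.util.Optional;

// A common way to ask any object which operations it supports,
// plus reference counting to track whether it is still in use.
interface ComLike {
    <T> Optional<T> queryInterface(Class<T> type);
    int addRef();
    int release();
}

interface Printable {
    String print();
}

class Memo implements ComLike, Printable {
    private int refs = 1;              // the creator holds the first reference
    private boolean destroyed = false;

    public <T> Optional<T> queryInterface(Class<T> type) {
        if (type.isInstance(this)) {
            addRef();                  // each interface handed out counts as a reference
            return Optional.of(type.cast(this));
        }
        return Optional.empty();       // the object does not support that interface
    }

    public int addRef() { return ++refs; }

    public int release() {
        if (--refs == 0) {
            destroyed = true;          // nobody is using the object any more
        }
        return refs;
    }

    public String print() { return "memo"; }

    boolean isDestroyed() { return destroyed; }
}
```

An application that knows nothing about Memo can still ask it for Printable, use it, and release it when done.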

When the internet hit big, Microsoft effectively took COM and renamed it Active-X so that the company would seem cutting edge and internet-focussed.

Server-Side Java  
Another alternative to using CGI or SSI/ASP technology is to create services on the web server machine that are capable of handling web input, processing that input, and returning the results back through the web server. One of the preferred languages for doing so is Java (though C++ or VB would also work for servers).

There are two primary ways to do this: 1) Java Servers and 2) Java Servlets.

Implementing a server in Java is fairly trivial since the language was designed with the network in mind. Essentially you create sockets for each client and then process the data coming in through the socket.

On a very simplistic level, you might have something like:

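	// Requires: import java.io.*; import java.net.*;
	try {
		ServerSocket socket = new ServerSocket(1969);
		Socket s = socket.accept();   // wait for a client to connect
		PrintStream ps = new PrintStream(s.getOutputStream());
		ps.println("Hello Cyberspace");
		ps.close();                   // flush the reply and close the connection
	} catch (IOException e) {
		System.err.println("Server error: " + e);
	}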

Once your server is written, you can start it up and it should wait patiently for clients to connect to it. Connection code might be as simple as the following:

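	// Requires: import java.io.*; import java.net.*;
	try {
		Socket s = new Socket("www.mydomain.com", 1969);
		// BufferedReader replaces the deprecated DataInputStream.readLine()
		BufferedReader in = new BufferedReader(new InputStreamReader(s.getInputStream()));
		String serverResponse = in.readLine();
		s.close();
	} catch (IOException e) {
		System.err.println("Client error: " + e);
	}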
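
To try the exchange without a second machine, the two fragments can be combined into one self-contained sketch that runs the server and the client in the same program. It assumes the loopback interface and an operating-system-assigned port (port 0) instead of www.mydomain.com and port 1969; the class name HelloLoopbackSketch is hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

class HelloLoopbackSketch {
    // Starts a one-shot server, connects to it, and returns the greeting.
    static String roundTrip() throws Exception {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Port 0 asks the operating system for any free port.
        try (ServerSocket server = new ServerSocket(0, 1, loopback)) {
            Thread serverThread = new Thread(() -> {
                try (Socket s = server.accept();
                     PrintStream ps = new PrintStream(s.getOutputStream())) {
                    ps.println("Hello Cyberspace");
                } catch (IOException e) {
                    e.printStackTrace();
                }
            });
            serverThread.start();

            // The client side: connect, read one line, and hang up.
            try (Socket s = new Socket(loopback, server.getLocalPort());
                 BufferedReader in = new BufferedReader(
                         new InputStreamReader(s.getInputStream()))) {
                String line = in.readLine();
                serverThread.join();
                return line;
            }
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip());
    }
}
```

A production server would loop on accept() and hand each client to its own thread; the sketch serves a single client to stay short.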

However, most people actually implement server-side Java with Servlets. Servlets allow you to embed a Java server service through the web server. You can think of them as server-side applets. The web server will load and execute them the same way a browser would an applet.

The lifecycle of a servlet execution would be something like:

  1. Web browser makes an HTTP request, specifying the servlet to be used to handle the request
  2. The web server passes off the request to the servlet. If the servlet has not already been loaded into memory, the web server will do so (that way future requests can be handled immediately).
  3. The servlet responds by sending data back to the client via the web server.
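
The load-once behavior in step 2 can be simulated in plain Java. The sketch below is not the real servlet API (real servlets extend javax.servlet.http.HttpServlet and run inside a servlet-enabled web server); MiniServlet and MiniContainer are hypothetical names invented here to show the lifecycle only.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Stand-in for a servlet: takes a request, returns a response.
interface MiniServlet {
    String service(String request);
}

// Stand-in for the web server's servlet support.
class MiniContainer {
    private final Map<String, Supplier<MiniServlet>> registered = new HashMap<>();
    private final Map<String, MiniServlet> loaded = new HashMap<>();
    private int loads = 0;

    void register(String name, Supplier<MiniServlet> factory) {
        registered.put(name, factory);
    }

    String handle(String name, String request) {
        MiniServlet servlet = loaded.get(name);
        if (servlet == null) {
            Supplier<MiniServlet> factory = registered.get(name);
            if (factory == null) {
                return "404 Not Found";
            }
            // First request: load the servlet and keep it in memory.
            servlet = factory.get();
            loaded.put(name, servlet);
            loads++;
        }
        // Later requests reuse the loaded instance immediately.
        return servlet.service(request);
    }

    int loadCount() {
        return loads;
    }
}
```

However many requests arrive for the same servlet name, loadCount() stays at 1, which is the source of the speed advantage over launching a new CGI process per request.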

As you can see, this works much like CGI and ASP. In fact, like ASP or CGI with an embedded Perl interpreter, servlets work very fast because they are loaded into the web server.

Servlets (like Servers) have the extra benefit that, since they run on the server side, the security and communications restrictions typical of Java applet programming do not apply. Servlets are freed of the Java sandbox.

Servlets can also maintain state, are platform independent, and extremely extensible as they are written in object oriented Java.

The only catch is that you need to use a web server that knows about servlets. The most popular include Java Web Server (JWS), Apache (using JServ), and O'Reilly WebSite. However, even if you don't have a servlet enabled web server, many web servers have third party add-ons which handle servlets on behalf of the web server. These include ServletExec and JRun.

Distributed Resources: DCOM, CORBA, RMI  
Finally, it is important to think about the middle layer as a network of resources that are tied together by network protocols and distributed objects. DCOM (Microsoft), CORBA (the OMG's open, language-neutral standard, commonly used from C++) and RMI (Java) essentially provide access to objects no matter where they are on the network.
