WebWare 2.0 White paper
Note  
This document is the technical piece of a two-part set. For the whole story, you may wish to start with eXtropia: A Case Study in Open Source Software.

Also, check out the code as it comes out!

It has been five years since Gunther Birznieks and Selena Sol started designing open source web applications for Selena Sol's Script Archive, and almost five months since the founding of eXtropia.

In that time, we have learned a great deal about how to design an extensible application architecture.

In this document, we will review the current architecture of eXtropia's WebWare Suite as well as present a historical overview of the evolution of this application development model.

What exactly are Web Applications?  
Web-based applications are computer programs that execute in a web server environment. An example of such an application would be an online store accessed via Netscape Navigator or Internet Explorer.

Amazon.com is a high profile example of this. Amazon has a proprietary "Web Store" application that they use to sell books and compact discs online.

Built on the foundations of the World Wide Web, such applications can be run anywhere in the world at any time and are completely cross platform. Web applications provide a rich interactive environment through which the user can further define their unique online experience. Without web applications to breathe life and provide user interaction, a web page is limited to displaying static electronic text and images.

The Generic Web Application  
Regardless of the specific tasks they perform, all web applications do the same things generically. Specifically, all web applications must do the following:

  1. Get data from a user on the web - Traditionally, getting the user data involves creating and serving a user interface such as a Java GUI or an HTML/DHTML form. The user interface submits user-supplied data by sending it, via GET or POST requests, to the web server that is serving the user interface. The web server will then pass the data to a server-processing agent (application) such as a CGI script, a Java Servlet, or a server-integrated API script such as Cold Fusion or mod_perl.

    Typically, the developer must code the user interface and be able to parse the data coming from the user interface to the server-processing agent.

  2. Validate the user's data - Once the data has been handed off to the server-processing agent, the agent must check the data submitted to make sure it is valid. Such validation might include making sure a date is a valid date (i.e. not Oct 34,-1000), making sure a price is valid (i.e. not $123.98ASDF-1), or making sure the incoming data is safe for processing (e.g. not exec `rm *.*`;). An agent might also communicate with other processing agents such as a credit card validation service.

    Typically, the developer must define the logic of validation and embed that into the web application.

  3. Process that data - Once the data has been validated, the agent must process it. Processing often involves 1) data storage and retrieval and 2) inter-application communication.

    1. Data Storage and Retrieval - Often a web application must access data from some data source like a Relational Database Management System (RDBMS) or a local file on the web server. Web applications usually need to be able to read and write to these data sources.

    2. Inter-Application Communication - Web applications also need to be able to work with other resources such as email, fax-gateways, file locking mechanisms, encryption applications and even other web servers.

    Typically, the programmer must spend most of her time defining the logic of the work flow. This is the piece that is most often thought of as the web application. For example you might specify that a data source should be opened, a specific data row should be selected based on a given keyword and search description, and the row should be updated based on the submission of a Structured Query Language (SQL) statement.

  4. Respond to the client who submitted the request in the first place. Usually, the developer will code the server agent to send a response to the client based on the processing. This might be as simple as a dynamically generated thank you note HTML file or e-mail receipt, or as complex as a formatted report with images generated on-the-fly.

In performing these generic functions, a web application should be

  1. Secure - Both the privacy of the data and the access to supporting server resources must be secure.
  2. Scalable - The application must be able to serve one client at a time or ten thousand clients at the same time without a noticeable degradation of service.
  3. Fast - The execution of the application must appear rapid to the user even within the context of clogged Internet bandwidth.
  4. User-Friendly - The application must be so simple to use that a user on the web should need no instructions, or only minimal instructions, in order to perform the task they want to complete.
  5. Maintenance-Friendly - Because web application services must change rapidly, the application must be built so that it can be modified, fixed, or maintained with minimal outlay of time and money.
  6. Reusable - The cost of reinventing agent processing for each task is too great. Processing agent technology must be reusable between projects if it is to be affordable.

As you can see, designing a web application represents quite a bit of design and coding.
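
To make these four generic steps concrete, here is a minimal sketch of such an application as a Perl CGI script. The field names, the regular expression, and the comments.log data file are all invented for illustration; only the overall pattern is the point.

    #!/usr/bin/perl -w
    use strict;
    use CGI;                                 # parses GET/POST name/value pairs

    my $query = new CGI;

    # 1. Get data from the user on the web
    my $email   = $query->param ('email')   || "";
    my $comment = $query->param ('comment') || "";

    # 2. Validate the user's data
    my $error = "";
    $error = "A valid e-mail address is required."
        unless ($email =~ /^[\w.-]+\@[\w.-]+$/);

    # 3. Process the data (here, append it to a flat file)
    unless ($error)
        {
        if (open (LOG, ">>comments.log"))
            {
            print LOG "$email\t$comment\n";
            close (LOG);
            }
        else
            {
            $error = "Could not open the data file.";
            }
        }

    # 4. Respond to the client
    print $query->header;
    print $error ? "<H2>$error</H2>" : "<H2>Thank you for your comments!</H2>";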

WebWare 1.0: The Old Framework  
In 1994, Selena Sol and Gunther Birznieks were doing just that - writing lots and lots of code! By the time they had completed their third proprietary web store, they realized that this web application development model needed to be streamlined.

To do so, the two began to write modularized, generic web application "cores" that could be more easily molded and remolded from one project to another.

Rather than adopting proprietary languages or environments such as Cold Fusion or NetObjects, the two chose Perl and Java. These languages were selected because of their ease of use and widespread appeal. Ease of use was important because they wanted the applications to be accessible to both beginning and advanced developers; widespread appeal was important so that the applications could benefit as wide an audience as possible.

NOTE: Now we want to be VERY clear that we have nothing against proprietary solutions for web application development. For instance, we think Cold Fusion is an excellent product by itself. And coding in a proprietary environment may often be the best solution to a specific problem.

However, when we were designing our code, we were interested in writing code that would be useful to the masses of web users who might not have access to these, often expensive, technologies.

Such clients might include "Joe Web" who has a standard internet account hosted by ANY_ISP_USA. ANY_ISP_USA probably offers Joe CGI functionality but not a personal RDB or application development environment like Cold Fusion.

But this arrangement made sense for Joe: not only could he not afford such extras, he would not know what to do with them. Joe is in the business of selling widgets on the web, not programming or database administration. Joe should have a solution that meets his needs.

We also wanted to develop an open standards protocol for developing applications that could be used as a springboard for other developers not associated with us. We hoped to provide a foundation upon which others could build without fear of us, or anyone else, pulling the code out from under them.

This process of generalization and modularization involved three stages:

  • Extracting user interface code into HTML configuration files
  • Extracting Programmatic Intelligence into application setup files
  • Documenting the code to death
Extracting the UI Code  
Essentially, any code that sent GUI (HTML or Java) data to the user was extracted from the application and placed in a GUI configuration file.

The great benefit of this was that when the GUI code was separated away from the main code, a user unfamiliar with application programming, but familiar with basic GUI programming, such as HTML, could modify the user interface code without worrying about breaking the program.

After all, GUI code tends to change more often between sites than other types of code. Every web site tends to have their own specific look-and-feel.

In addition, separating out the GUI allowed users to apply bug fixes without disturbing their customized GUI changes. Since program logic was separated from GUI logic, a fix to the program logic did not require the user to redo all their GUI/HTML changes every time a new version came out.

Here is an excerpt from one such GUI configuration file:

                    
    sub required_fields_error_message
      {
      print "Content-type: text/html\n\n"; 
      print qq~
      <HTML>
      <HEAD>
      <TITLE>Error in Processing Form - 
             Required Fields</TITLE>
      </HEAD>
      <BODY BGCOLOR = "FFFFFF" TEXT = "000000">

      <BLOCKQUOTE>

      <H2>
      Whoops, I'm sorry, the following fields 
      are required:

      <UL>
      <LI>Name
      <LI>Email
      <LI>Comments
      </UL>

      Please click the "back" button on your 
      browser or click <A HREF =
      "$url_of_the_form">here</A> to go back and 
      make sure you fill out all
      the required information.
      </H2>

      </BLOCKQUOTE>
      </BODY>
      </HTML>~;
      }

You can see how HTML-like the code appears. We found that by extracting the code in this way, users were much less intimidated about making changes.

This separate GUI code was imported into the main application code using Perl's require or use keywords. Then, the GUI-specific routines could be called from there.
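
For example, the main application might pull in the GUI configuration file and call the routine shown above like this (the filename is hypothetical):

    # Import the GUI configuration file (filename is illustrative)
    require "form.gui.lib";

    # ...later, when a required field is missing...
    &required_fields_error_message;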

Extracting Programming Logic  
Extracting implementation-specific programming logic was the next step in the evolution of the eXtropia scripts. We knew that we would have to provide a host of services for each application that could be turned on or off depending on what services each installation would support. To do that we needed to write the methods into the base code and then provide the user with "switches" in an application setup file.

A user need only specify the general work flow of the application by answering "questions". Below is an excerpt from such a file:

     $should_i_email_orders = "yes";
     $should_i_use_pgp = "no";
     $should_i_append_a_database = "yes";

The actual code would check for each case and act accordingly using "if tests" such as:

if ($should_i_email_orders eq "yes")
    {
    # go ahead and mail the order
    }
else
    {
    # don't mail diddly squat
    }

The trick was to predict all the myriad ways the script might need to function, build in that generic code, and finally provide "should_i" options in the configuration file to activate the relevant code in the main application.

Documentation  
Finally, the code was documented extensively so that a developer could easily make site-specific modifications. We commented the code so well that a number of people said they had learned to program in large part by studying the comments and the accompanying code. When more seasoned programmers began to grouse that the scripts were more comments than code, we figured we had accomplished our goal. Note that we did not want to be in the business of installations and customizations; we wanted to provide extensible source code so that others could do that.
The Result: Limits of the Model  
We had a great deal of success with this streamlined web application development model. Thousands of sites implemented the code and found that they never needed to do anything beyond editing the HTML in the GUI definition file. However, within a couple of years, chinks in the model became more apparent.

For one, it was hard to organize group programming projects such that add-ons were easily transferred to the entire group. The code needed to be far more "Lego-ized" for that to happen. That is, there needed to be a way to easily disassemble code and re-assemble it in different configurations with very little effort. Similarly, new and more efficient routines should be easy to "pop" into existing applications without breaking the old code that depended on them.

We had modularized the GUI and implementation specific setup information, but we still needed to modularize the internal generic algorithms. Once these algorithms were modularized, a more efficient, secure and robust algorithm could instantly replace an older algorithm in the main code without breaking any code that used that routine. Likewise, services such as data access could be more transparent, so that a user could easily move the code back and forth between databases such as Sybase or Oracle.

Moving to Object Oriented Design with Perl 5 and Java Servlets  
When Object Oriented Perl 5 became ubiquitous and Server Side Java/Servlets became a reality, it was clear which direction the development framework would have to go.

Object Oriented Design (OOD) was the answer!

So what exactly is OOD and how does it help solve the problems we just discussed?

Well, OOD is based on the concept of objects. You can think of an object as a little "black box" of functionality that accepts some standardized input and produces some standardized output or behavior.

"Black Box" is an engineering term used to describe a thing that is encapsulated, or has the property of being "plug and play". A black box provides some service in such a way that the system architect (such as a programmer) need not know anything about the internals of the object to use it. An object just "plugs-in" to an existing system.

What is an example of an object that we can sink our teeth into?

Think of a telephone. Do you know how it works--the details of the circuitry from end to end? Probably not. However, whether or not you know how the phone or the underlying telecommunications systems work, you can still call your mom on Christmas.

This is because a phone is a black box. It accepts a phone number as input and returns a phone connection as output. All the magic of creating phone connections in the international telecommunication network is handled invisibly by the phone object (and a bunch of other helper objects--networks, switches, etc.--that it depends on). All you need to be concerned with is the protocol for picking up the receiver, dialing, and what you are going to say.

So, in summary, objects have the following attributes:

  • Objects provide "plug-and-play" functionality.
    They "hide" the internal machinery of how they do their job. Users need not understand these internals in order to use the object. In computer-science literature, this property goes by the fancier name "encapsulation."
  • Objects accept standardized input.
    Objects have an API (Application Programmers Interface), and all objects of the same type share the same interface. The computer-science jargon for this property is that these objects "implement" the same interface.
  • Objects produce some desired output or behavior.
    Unlike the inputs and the interface, this doesn't have to be the same for all objects of the same type. For example, a cellular phone contacts the telephone network by transmitting radio waves, rather than using copper wires, but the intended outcome--contacting your mom--is the same. The computer-science term for this is "polymorphism."

Essentially, the API defines a kind of contract: if you follow the rules, and provide the object with data in the right form, it will do what you ask. In the case of the phone, you use the standard interface (dial or keypad) to ask the object to do something (make a call), with some data you provide (the phone number), and the object performs the action you requested.

Software objects work the same way. When you call one of the routines in the interface, your data gets "magically" transformed inside the object, and then the transformed data is returned to you, or the action you requested is performed.
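
In Perl 5 terms, an object is simply a data structure "blessed" into a package of routines. Here is a minimal, hypothetical Telephone object to illustrate the idea; the package and method names are invented for this example.

    package Telephone;

    # Constructor: returns a new Telephone object
    sub new
        {
        my $class = shift;
        my $self  = {};                  # the internals are hidden in here
        return bless ($self, $class);
        }

    # The public interface: give it a number, get back a connection
    sub dial
        {
        my ($self, $number) = @_;
        # ...all the switching magic would be hidden in here...
        return "Connected to $number";
        }

    package main;

    my $phone = new Telephone;
    print $phone->dial ("555-1234"), "\n";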

How OOD Solves Problems of the Old Model  
There are three primary reasons that objects are an excellent tool for large, complex programming projects... particularly ones that must be frequently changed, and where code reuse is important.

First, since you need not concern yourself with the internals of objects in order to use them, you can create complex programs built on a library of objects without needing to be an expert in each area of the program.

For example, if you want to incorporate database access into a program, and you can use a pre-written database connectivity object to do it, you needn't worry about how database connectivity actually works. You just let the object, designed by a database connectivity specialist, do it for you.

Using objects allows you to focus on the work flow of a program rather than the nitty gritty of particular algorithms. Well chosen object and variable names actually allow the programmer to program in terms of real world objects. Not strings and arrays and hashes, but messages and shopping carts and users. This makes it easier to write the code, and even more importantly, easier to read it.

Second, an object-oriented framework makes it very easy to divide development work among community members. Objects can be developed independently and submitted to the common pool of objects. Different objects can be written to interface with various tools on various platforms. And as long as all objects of the same type conform to the same standard API, they can be plugged into other people's work with little or no effort.

Finally, objects can be modified (made more efficient, secure, and robust) without breaking all the code that uses them. Since the internals are hidden anyway, the client code (the code using an object) does not know or care that the implementation is changed. So long as the API stays the same, the client code is happy.

The Existing Perl Modules  
Soon after the release of Perl 5, the community was blessed with a flood of excellent objects (Perl Modules) that could be used in support of web-based applications. Among these modules were the CGI, LWP and DBI modules.

The CGI module takes care of organizing form name/value pairs and HTML output, as well as other useful features such as file uploading, error handling, and manipulation of the environment variables.
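
For instance, reading a form field and printing a well-formed response takes only a few lines:

    use CGI;

    my $query = new CGI;
    my $name  = $query->param ('name');       # a form name/value pair
    print $query->header,                     # the Content-type header
          $query->start_html ('Hello'),
          "Hello, $name!",
          $query->end_html;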

The LWP (Library for Web Programming) module makes networking simple. A standard API allows your Perl script or CGI application to take advantage of all the network services available to a web browser, including FTP, Gopher, HTTP, local files, and HTTPS (Secure Socket Layer) connections.
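
For example, fetching a document over HTTP is nearly a one-liner with the LWP::Simple front end (the URL below is just a placeholder):

    use LWP::Simple;

    # Fetch a document over the network just as a browser would
    my $page = get ("http://www.some.server/index.html");
    print defined ($page) ? $page : "Could not fetch the page.\n";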

Finally, the DBI module gives your CGI application access to almost any commercial, shareware, or freeware database that is used in support of web applications. Without knowing anything about database technology, a developer can use the simple interface to access and manipulate the most complex of databases. Further, since the DBI interface ensures that all of the database drivers provide the same interface, if the backend database were suddenly changed from one database to another--for example, from Oracle to Sybase--no modification would be necessary in the client code. You would just need to plug in a different DBD (database driver) module.
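
For example, in the DBI code below the only Oracle-specific piece is the data source string handed to connect(); pointing it at a Sybase driver instead leaves the rest of the code untouched (the connection details and table are, of course, made up):

    use DBI;

    # Only this connect string names the backend database
    my $dbh = DBI->connect ("dbi:Oracle:mydb", "user", "password")
        or die $DBI::errstr;

    my $sth = $dbh->prepare ("SELECT name, price FROM products");
    $sth->execute;
    while (my ($name, $price) = $sth->fetchrow_array)
        {
        print "$name costs $price\n";
        }
    $sth->finish;
    $dbh->disconnect;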

We will discuss interface modules of this sort in depth a little later.

Proposed eXtropia Extensions  
The CGI, LWP, and DBI modules are all fantastic tools for web application development. However, they are not enough. In fact, for the most part, these three modules are more useful at a deep infrastructure level. Most real-world application code deals with higher order issues (or should, at any rate).

With 5 years of experience designing generic, open source applications behind us, we are in a unique position to know what other black boxes must be developed for truly efficient web application development. In addition, the valuable feedback and ideas we have gathered from users and developers in all types of businesses during that time allow us to create a framework that is likely to be extensible to any type of business in which these components will be used.

Coding of several modules began in November 1998.

The new modules include:

eXtropia Modules
          Datasource
                   File
                   DBI
                   HTTP
                   FTP
                   FileSystem
                   DBM
                   XML
          Mail 
                   SendMail
                   SMTP
                   Blat
                   Postie
          Encryption
                   PGP
                   Crypt
                   MD5
          Authentication
                   Server
                   CGI
          Session
                   Cookie
                   HiddenField
          Search
                   Dynamic
                   Indexed
          DataHandler
                   American
          Application

All of these modules take advantage of object-oriented concepts, including encapsulation, interfaces, inheritance and polymorphism. These concepts are explored in more depth in the next section. Let's look at interfaces first.

It is especially important to note that not all these modules will be written by us. The Comprehensive Perl Archive Network (CPAN) contains a rich set of modules. In the best object-oriented tradition, where appropriate, we will use these modules rather than building our own entirely from scratch.
Interfaces  
An interface is a type of middleware which, in the case of WebWare, is used to connect application code to helper applications in a way that removes all proprietary, helper-application-specific references from the application code.

Interfaces are needed because at different stages in its lifetime, a single application must speak to many other helper applications (email, encryption, database1, database2, database3), all of which may speak different languages or idioms.

Without an interface, an application developer MUST recode the application every time the helper application changes (a pain in the butt) or provide the application with the knowledge to speak to every single proprietary helper application it will ever come across (impossible)....

For example, suppose you write a CGI application which talks to an Oracle database....

    DB APP with ORACLE specific connection code -> ORACLE

Now, suppose that later you are hired to do the same project for a client, but the new client uses SYBASE instead of ORACLE or perhaps your company changes from ORACLE to SYBASE....what would you have to do?

Well, given this application design, you would have to recode the application

    DB APP with SYBASE specific connection code -> SYBASE

Now suppose the next client wants to talk to a flat file on a UNIX box over a network (perhaps using LWP to a secure server). That might be a real pain in the butt with lots of new code....

     DB APP with new LWP and flat file access code ->
        network available file 

Not only would this take a lot of recoding, but the new application would almost assuredly look totally different from the original.

Further, the progression from project one through project three would have introduced all sorts of little spaghetti code eddies into your code. As such, the code will be difficult to maintain or document, and it will become progressively harder to transfer from you to another application developer who has similar, but not quite the same, needs--or even to transfer between your own projects.

The idea of distributed, reusable programming is shot down at this point. We can all share a few algorithms and the basic set of ideas, but we cannot easily share complex solutions.

This problem is solved by using an interface.

An interface sits between an application and its helpers (like ORACLE and SYBASE in the example above). Its job is to provide a single API to the app developer and to handle the translation to the myriad of helper apps.

Thus, for example, the application developer need not worry what helpers the application uses. The application developer simply speaks to the interface in the standard API defined syntax and the interface takes care of all the back-end proprietary mumbo jumbo.

Thus, continuing our example from above, you write a single database access script and it suddenly works with absolutely no changes on ORACLE, SYBASE, or across a network to a file!

     DB APP with standard Interface API calls -->
        Oracle Datasource driver translates calls to ORACLE-speak --> 
        ORACLE
     same DB APP with standard Interface API calls --> 
        Sybase Datasource driver translates calls to SYBASE-speak --> 
        SYBASE
     same DB APP with standard Interface API calls --> 
        FTP Datasource driver translates call and manages network
        connection --> 
        File over a network
How does an Interface work?  
So how does the interface perform its magic?

Well, first, the interface defines an API (Application Programmer Interface) such as

     createDatasource() 
     search()
     getAllRecords() 
     closeDatasource() 

Your code, the application code, simply calls these methods to perform the required actions. Except in the single line of code where the Datasource is created, there is no need to specify what type of Datasource is being used, because the Datasource is a black box. The application developer need only speak in the language defined by the API. The interface "abstracts away," or hides, the nitty gritty.

Now that the interface can rely on a standardized set of inputs, it must figure out how to deal with all of the helper applications.

Extracting proprietary code from the application and dumping it into the interface would help, but eventually you would still have spaghetti code. What is needed is a way to abstract the process even further so that all proprietary code is isolated and independent--so that it is plug-and-playable.

To do this, the interface relies on a horde of "drivers". These drivers provide the connection between the interface and the helper applications. This is where all the proprietary connection code is located.

Each driver knows how to speak two languages: the language the interface speaks to it, and the language it speaks to a single helper application.

An ORACLE driver, for example, would never be responsible for talking to SYBASE. Yet both the ORACLE and SYBASE drivers can speak to the interface.

And that is it...

How do Interfaces help you?  
So where are we after all this theory? We now have four (or more) pieces where we once had two.

Originally we had 1) an application and 2) helper application(s)

Now we have 1) an application, 2) an interface, 3) driver(s) and 4) helper application(s).

How is this better? It seems more complex!

Well, the first reason it is better is that proprietary code is removed from everything but the drivers...thus we never mix code and create spaghetti. Proprietary stuff is isolated from everything else, so it cannot cause harm.

Also if the application developer wishes to use a different helper application, she need not worry about coding new things in the application...all she needs to do is specify a different driver.

Switching from an ORACLE backend to a DBM backend requires changing one line of code!

Switching from the DBM application to one based on distributed inventory flat files on multiple servers across a network requires changing that same single line of code!
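
As a rough sketch of what that one line might look like: only the method names below come from the API described earlier; the constructor arguments are invented purely for illustration.

    # The only backend-specific line: name the driver (arguments are illustrative)
    my $ds = createDatasource (-TYPE => 'DBI', -DSN => 'dbi:Oracle:mydb');

    # Switching backends might mean changing just that one line, e.g.
    # my $ds = createDatasource (-TYPE => 'DBM',  -FILE => 'products.dbm');
    # my $ds = createDatasource (-TYPE => 'File', -URL  => 'ftp://some.server/products.dat');

    # The rest of the application speaks only the standard Datasource API
    my @records = $ds->search (-KEYWORD => 'widget');
    $ds->closeDatasource ();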

This will increase your productivity between projects immensely. What's more, suppose you have never worked with INFORMIX even though you know ORACLE and MySQL, and your boss wants an INFORMIX-backed database manager--just like the one you did for ORACLE--on his desktop by the end of the business day.

No problem. If you have already written the interface driven application code you are probably already there. You are not responsible for writing the Informix code...That driver can be downloaded from a library at eXtropia and can be plugged into your app with one line of code!

In the case of this database example, since the eXtropia Datasource Interface talks to DBI for handling RDBMS systems, you can easily switch back and forth between a vast number of databases on the market as freeware, shareware or payware.

The Datasource Module  
One of the first interfaces we realized that we needed to create was the Datasource.

But why create a Datasource module? Why not just use DBI, which provides exactly the kind of database-independent interface that we described above? The eXtropia Datasource interface is a simpler, higher-level interface, and includes some features specifically tailored to CGI applications. It also provides access to other sources of information that are not commonly considered databases, such as file systems, FTP sites, web pages, etc. It is lightweight, only loading the code necessary for the task at hand.

Specifically, we thought, data should be able to be drawn from many sources besides standard databases. In the real-world, many web sites do not have access to databases. These sites, hosted on ISPs, have only a CGI-flatfile capability at best. Their ISPs will not provide them database services. And even if they did, such services may well be more costly than the typical web site owner can afford.

Further, the administration of a database may be more work than the typical website owner wants to deal with.

Finally, for many data storage needs, a full-fledged database may be overkill. A typical mailing list, for example, may get only 20 or 30 signups per week. Why set up a heavy-duty database engine like Oracle for a mere 20 or 30 additions per week?

In an increasingly connected world, data should be accessible across the network as easily as local data is. Data should be accessible via FTP, HTTP or even via another CGI script on a different server. Where the data is located should be irrelevant and transparent to the application developer.

Thus, the catalog for an "outlet" web store might contain products from 10 or 12 other web stores across the net. The outlet store must be able to grab product information from a dozen networked locations and be able to present them in a single interface.

For this reason, we devised the Datasource module for accessing data in a manner that is suitable for web applications.

The Datasource module wraps around and incorporates the basic functionality of DBI but adds the ability to think of a data source not as a database but as any data structure, such as a local flat file, a CGI script, a URL-accessed flatfile, or even hierarchical/object-based systems such as entire file systems, XML documents, and Object-Oriented databases.

It is important to note that DBI, and the SQL standard that it is based on, is not the interface to Datasource. DBI is merely one of several mechanisms for implementing a Datasource. The reason for this is that DBI is relatively heavyweight unless you need the power of a database such as mySQL, Sybase, or Oracle. For handling file-based data, we wanted a simple, lightweight interface without worrying about SQL parsing. Furthermore, we wanted to support non-SQL oriented data sources such as XML. And finally, we didn't want to force developers to have to learn SQL in order to do the simple kinds of queries and updates that account for the vast majority of web application data access.

The following is a summary of the design goals of Datasource:

  • Provide a lightweight way of accessing simple files such as log files, database flat files, and even relatively unstructured data such as message files.
  • Provide an easy migration path to SQL databases so the core CGI applications can scale.
  • Provide flexibility in the types of data to be accessed and include distributed files as well as hierarchical data types in the mix.
  • Provide an interface to handle common tasks such as querying, updating, deleting, adding elements on a Datasource.
  • Provide an automatic mapping for certain common fields.
      For example, a record timestamp (modification time) is information that comes automatically when records are stored as files in a file-system. However, in the DBI Data source, a modification time field could (optionally) be automatically updated by the Datasource whenever a record gets added or changed in the database.

      This will allow a true plug-n-play system where objects that you would not normally think of as a Datasource can be mapped easily into a database. For example, Message files in a BBS forum subdirectory could easily become Message records inside of a Forum table in a relational database.

Inside an application, Datasources can be used for everything from logs, to user information, to shopping carts. Within other modules, Datasources can be used to store session data to preserve state, and any other data that is persistent.

The Mail Module  
We also see a need to use the interface design pattern to handle emailing on multiple platforms. In this new methodology we have created a simple Mail interface with a send() method. The send() method is implemented by several different driver modules to support various mail applications including Sendmail (UNIX), SMTP mail (Mac), Postie (Windows) and Blat (Windows).

The benefit of this architecture is that modules can quickly be written for any email package available to a site administrator, if they do not use one of the common ones we currently support. And once such a module is written, all applications that use the eXtropia mail interface can use this new email tool.
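
In use, such an interface might look something like the sketch below. The send() method and driver names come from the description above; the constructor and its parameter names are invented for illustration.

    # Ask the Mail interface for whichever driver this server supports
    my $mailer = createMailer (-TYPE => 'SendMail');   # or 'SMTP', 'Blat', 'Postie'

    # The application only ever calls the standard send() method
    $mailer->send (
        -TO      => 'customer@some.isp',
        -FROM    => 'orders@some.store',
        -SUBJECT => 'Your order',
        -BODY    => 'Thank you for your order!'
        );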

The Encrypt Module  
Data encryption is becoming increasingly important, especially for applications involving e-commerce. In the eXtropia framework, encryption will be handled as an interface with just two methods, encrypt() and compare(). The underlying algorithm-specific modules will initially include PGP, crypt and MD5. Additional mechanisms will be added in the future to include additional algorithms and to keep pace with the state of the art in encryption technologies.
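
Perl's built-in crypt() function gives a feel for what the crypt driver would do under the hood: encrypt() one-way encrypts a value, and compare() checks a candidate by encrypting it the same way and comparing the results. This is only an illustration of the idea, not the module's actual code.

    # Roughly what a crypt-based driver boils down to internally
    my $salt    = "ab";
    my $stored  = crypt ("opensesame", $salt);        # what encrypt() would store

    # compare(): re-encrypt the candidate using the stored value as the salt
    my $attempt = crypt ("opensesame", $stored);
    print "Password accepted.\n" if ($attempt eq $stored);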
The Authentication Module  
Authentication modules identify, and retain information about, each user, so that a more user-friendly, customized, and secure environment can be created. We are rewriting the current authentication libraries to take advantage of the Datasource, Encryption, and Session modules.
The Session Module  
One of the main challenges of developing applications in a web environment is that HTTP is a "stateless" protocol: the web server treats each connection as a separate transaction, and doesn't remember anything about a user from one click to the next. The Session module helps solve this problem by storing some of this important information, using a unique "session key" that is given back to the browser, and passed back to the server with the next request. The session key is like an ID card, which allows the web application to know who it is dealing with.

Initially, we will provide two kinds of Session objects, to use either hidden fields or cookies to maintain state.
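
The cookie-based flavor rests on the familiar CGI cookie mechanics sketched below. This shows the underlying idea rather than the Session module's own API, and the key generation here is deliberately naive.

    use CGI;

    my $query = new CGI;

    # Read the session key back on each request, or issue a new one
    my $session_key = $query->cookie ('session_key') || time () . $$;

    # Hand the key back to the browser along with the response
    my $cookie = $query->cookie (-name => 'session_key', -value => $session_key);
    print $query->header (-cookie => $cookie);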

The Search Module  
The capability of searching HTML and other data files is an important tool to allow users to find the information they need quickly. eXtropia search modules will allow these searches to be performed on live HTML files or other documents in entire branches of a directory tree, or using pre-built indexes to accelerate queries. In addition to site-wide keyword searches, other applications will take advantage of this capability. For example, a BBS might allow you to search previous messages, and the HTML-template-based Web Store might allow you to search the store's HTML for a product.
The DataHandler Module  
Basic validation of user input is a frequently requested service for many applications, so we will provide a DataHandler module for it (a few such checks are sketched after the list below). The module will include validation services for:

  • Dates (US and European)
  • Prices
  • Credit Card Numbers
  • Email Addresses
  • Censored words
  • Alphanumeric characters
  • Integers
  • Floating point numbers
  • Alphabetic characters only
  • Enhancing security by detecting unexpected or disallowed user responses
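
The checks themselves boil down to pattern tests of the following sort; these regular expressions are simplified examples, not the module's actual code.

    # Simplified examples of the kind of tests the DataHandler will perform
    sub is_valid_price
        {
        my ($price) = @_;
        return ($price =~ /^\d+\.\d{2}$/);             # e.g. 123.98, not 123.98ASDF-1
        }

    sub is_valid_email
        {
        my ($email) = @_;
        return ($email =~ /^[\w.-]+\@[\w.-]+\.\w+$/);
        }

    print "Invalid price!\n" unless is_valid_price ("123.98ASDF-1");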
The Application Modules  
Finally, we will provide application-level framework objects that will allow the current suite of web modules to be easily combined into a working application.

The first generation of WebWare applications were relatively standalone and monolithic, except for libraries that dealt with infrastructure such as authentication and mail libraries. This model was OK for standalone applications. But adding BBS functionality to a web store was difficult without cutting and pasting a lot of code. And in some cases, naming conflicts would arise when the cut and paste technique was used.

WebWare 2.0 removes many of these barriers. The modules are fully encapsulated, each with its own namespace. The applications are modular, and loosely coupled. In particular, the application logic is separated from the presentation or GUI code. The next generation of scripts, built with these loosely coupled application components, will be much more flexible.

The power of this approach is that a developer can integrate these objects together to create hybrid applications. For example, the Forum object could be placed into a Web Store in order to allow users to chat about the products that they are looking for. In fact, a virtual salesperson could be on call watching for messages to appear, and thus increase the interactive experience of shopping on the Net.

In addition, the loose coupling between GUI and module logic will allow us to create many different-looking applications based on the same components. Not just in terms of cosmetic look and feel, but deep changes in the program flow. For our suite of applications, this will mean that applications that are very similar will get merged. For example, the Chat script is really a specialization of the BBS. With the next generation of Application objects, a different mixture of objects will form a Chat script with minimal coding once the main BBS application is glued together. These two distinct applications can be considered two different Views, incorporating the same basic algorithms in a slightly different context.

After these core applications have been constructed as examples of how to use the components, the new concept of releasing web applications will be to release them as different Views of data. For example, a calendar script will be easily modified to show a single day, week, month, quarter, or year--all based on the same underlying application logic, encapsulated in the application modules. Web applications will simply consist of code to glue the core application logic modules together. This will make sharing customizations within the development community much easier, since significantly different functionality can be produced using the same basic building blocks.

Suggested Areas for Community Development  
Given this new model, there will be many opportunities for community participation starting in March. In particular, developers can contribute in the following areas:

  • Additional Email modules
  • Additional Datasource modules
  • Datasource, Authentication and DataHandler API Extensions
  • Additional Application-level modules
  • Building custom applications and thus testing the modules on many different platforms and in various environments.

The new modular framework will make it very easy for code to be shared among the community. The architecture was devised to facilitate sharing code and encapsulating your expertise.

New Security Enhancements and Development Template  
All of the new modules and applications will follow a new enhanced regimen of security features. Added to the usual list of security considerations will be:
  • The Security Template - We are writing a CGI template that can be used as the base of any new application. This template will automatically enforce some of the basic security considerations.
  • Support for -T and -w flags, and the "use strict;" pragma - All of the new modules and scripts will be taint-mode compliant and run clean under -w for extra robustness (see the example after this list). Special attention will be paid to making sure the new suite of applications runs well under environments that are meant to provide scalability improvements such as mod_perl on Apache or FastCGI.
  • We will be adding more documentation regarding how to protect secure files, including the use of .cgi extensions for admin files, file renaming, the incorporation of index.html files in the CGI tree, and more detailed explanations of how to move admin files out of the web server document tree. We will also redesign the architecture of applications so that it is very easy to move and rename the admin files.
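
Concretely, the taint-mode and strict requirements mean that every new script will begin with lines like these:

    #!/usr/bin/perl -Tw
    # -T turns on taint checking; -w turns on warnings

    use strict;        # require declared variables, no symbolic references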
Embracing Other Open Standard Technologies  
We plan to eventually provide support for

  • Integrating with Active Server Pages (ASP)
      Since the majority of the applications will reside in objects, they will be relatively easy to glue together inside of the ASPs. ASPs are basically GUI glue that Microsoft IIS and some other Windows NT Web Servers make use of. Since ActiveState provides a Perl scripting engine for ASPs, we can glue the application components together and make use of ASP technology.
  • Integrating with eXtensible Markup Language (XML)
      HTML works well enough for displaying data to users. But what about allowing other computers to process your Web Application data? What about being able to process one or more web applications' data in your program?

      This will be made much easier with XML. XML provides a loosely typed tool for marshalling data over the web. XML was designed to be lightweight and easy to parse, yet powerful enough to describe just about any set of data.

      XML will be supported primarily by the Datasource module in two ways. First, XML documents will be able to be decoded and queried by the Datasource object. Second, the Datasource object will be able to interface with an XML-based display object to convert Datasource records and fields in an XML-friendly manner.

We welcome your comments or questions about this document at