WebWare 2.0 White Paper

This document is the technical piece of a two-part set.
For the whole story, you may wish to start with "eXtropia: A Case Study in Open Source
Software."
Also, check out the code as it comes out!
It has been five years since Gunther Birznieks and
Selena Sol started designing open source web
applications for Selena Sol's Script Archive, and almost
five months since the founding of eXtropia.
In that time, we have learned a great deal about how to
design an extensible application architecture.
In this document, we will review the current architecture of
eXtropia's WebWare Suite as well as present a historical overview
of the evolution of this application development model.

What exactly are Web Applications?

Web-based applications are computer programs that execute in a
web server environment. An example of such an application
would be an online store accessed via Netscape Navigator or
Internet Explorer.
Amazon.com is a high profile
example of this. Amazon has
a proprietary "Web Store" application
that they use to sell books and compact discs online.
Built on the foundations of the World Wide Web, such
applications can be run anywhere in the world at any time and
are completely cross platform. Web applications provide
a rich interactive environment through which the user can
further define their unique online experience. Without web
applications to breathe life and provide user interaction, a
web page is limited to displaying static electronic text and
images.

The Generic Web Application

Regardless of the specific tasks they perform, all web
applications do the same things generically. Specifically,
all web applications must do the following:
- Get data from a user on the web - Traditionally,
getting the user data involves creating and serving
a user interface such as a Java GUI or an HTML/DHTML
form. The user interface submits user-supplied data
by sending it, via GET or POST requests, to the
web server that is serving the user interface. The
web server will then pass the data to a server-processing
agent (application) such as a CGI script, a Java Servlet, or
a server-integrated API script such as Cold Fusion or mod_perl.
Typically, the developer must code the user interface and
be able to parse the data coming from the user interface to the
server-processing agent.
- Validate the user's data - Once the data has been handed
off to the server-processing agent, the agent must check
the data submitted to make sure it is valid. Such validation
might include making sure a date is a valid date (e.g.
not Oct 34,-1000), making sure a price is valid (e.g.
not $123.98ASDF-1), or making sure the incoming data is
safe for processing (e.g. not exec `rm *.*`;). An agent
might also communicate with other processing agents such as
a credit card validation service.
Typically, the developer
must define the logic of validation and embed that into the
web application.
- Process that data - Once the data has been validated, the
agent must process it. Processing often involves 1) data
storage and retrieval and 2) inter-application communication.
- Data Storage and Retrieval - Often a web application must
access data from some data source like a Relational Database Management
System (RDBMS)
or a local file on the web server. Web applications
usually need to be able to read and write to these data
sources.
- Inter-Application Communication - Web applications also need
to be able to work with other resources such as email,
fax-gateways, file locking mechanisms, encryption applications and even other web
servers.
Typically, the programmer must spend most of her time defining
the logic of the work flow. This is the piece that is
most often thought of as the web application. For example
you might specify that a data source should be opened, a
specific data row should be selected based on a given keyword
and search description, and the row should be updated based on
the submission of a Structured Query Language (SQL) statement.
- Respond to the client - Finally, the agent must respond to the client who
submitted the request in the first place. Usually, the developer will
code the server agent to send a
response to the client based on the processing.
This might be as simple as a dynamically generated
thank you note HTML file or e-mail receipt, or as
complex as a formatted report with images generated
on-the-fly.
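As a rough illustration of the validation step described above, here is a minimal Perl sketch. The subroutine names (is_valid_date, is_valid_price, is_safe_text), the YYYY-MM-DD date format, and the character whitelist are assumptions made for this example; they are not taken from the eXtropia code.

```perl
use strict;
use warnings;

# Hypothetical helpers; names and rules are illustrative assumptions.

# Accept dates of the form YYYY-MM-DD with a plausible month and day.
sub is_valid_date {
    my ($date) = @_;
    return 0 unless defined $date && $date =~ /^(\d{4})-(\d{2})-(\d{2})\z/;
    my ($year, $month, $day) = ($1, $2, $3);
    return 0 if $month < 1 || $month > 12;
    my @days_in_month = (31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31);
    my $is_leap = ($year % 4 == 0 && $year % 100 != 0) || $year % 400 == 0;
    my $max_day = $days_in_month[$month - 1];
    $max_day = 29 if $month == 2 && $is_leap;
    return ($day >= 1 && $day <= $max_day) ? 1 : 0;
}

# Accept prices such as 123.98 or $123.98: digits plus optional cents.
sub is_valid_price {
    my ($price) = @_;
    return (defined $price && $price =~ /^\$?\d+(?:\.\d{2})?\z/) ? 1 : 0;
}

# Accept only a conservative whitelist of characters, so that shell
# metacharacters such as backticks and semicolons are rejected.
sub is_safe_text {
    my ($text) = @_;
    return (defined $text && $text =~ /^[\w\s.,\@-]*\z/) ? 1 : 0;
}

print is_valid_date("1999-10-31") ? "date ok\n" : "date rejected\n";
print is_valid_price('$123.98ASDF-1') ? "price ok\n" : "price rejected\n";
```

A whitelist is used for the safety check because it is usually easier to list the characters a field may contain than to anticipate every dangerous one.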
In performing these generic functions, a web application should be:
- Secure - Both the privacy of the data and the
access to supporting server resources must be secure.
- Scalable - The application must be able to serve
one client at a time or ten thousand clients
at the same time without a noticeable degradation of
service.
- Fast - The execution of the application must appear
rapid to the user even within the context of clogged
Internet bandwidth.
- User-Friendly - The application must be so simple to
use that a user on the web should need no instructions,
or only minimal instructions,
in order to perform the task they want to complete.
- Maintenance-Friendly - Because web application services
must change rapidly, the application must be built
so that it can be modified, fixed, or maintained with
minimal outlay of time and money.
- Reusable - The cost of reinventing agent processing
for each task is too great. Processing agent technology must
be reusable between projects if it is to be affordable.
As you can see, building a web application involves quite
a bit of design and coding.

WebWare 1.0: The Old Framework

In 1994, Selena Sol and Gunther Birznieks were doing just that - writing
lots and lots of code! By the time they had completed their third
proprietary web store, they realized that this web application
development model needed to be streamlined.
To do so, the two began to write modularized, generic web
application "cores" that could be more easily molded and remolded
from one project to another.
Rather than adopt proprietary languages/environments such as
Cold Fusion or NetObjects, Perl and Java were chosen as the
languages to be used. They were selected because of their ease
of use and wide-spread appeal. Ease of use was important because
they wanted the applications to be accessible to both beginning
and advanced developers. "Wide-spread appeal" was important
to allow the applications to benefit as wide an audience as possible.
NOTE: Now we want to be VERY clear that we have nothing
against proprietary solutions for web application
development. For instance, we think Cold Fusion is an
excellent product by itself.
And coding in a proprietary environment may often be the best
solution to a specific problem.
However, when we were designing our code, we were interested
in writing code that would be useful to the masses of
web users who might not have
access to these, often expensive, technologies.
Such clients might include "Joe Web" who has a standard
internet account hosted by ANY_ISP_USA. ANY_ISP_USA
probably offers Joe CGI
functionality but not a personal RDB or application
development environment like Cold Fusion.
But this made sense for Joe since not only
could he not afford such extras, but he would not
know what to do with them. Joe is in the
business of selling widgets on the web, not
programming or database administration. Joe
should have a solution that meets his needs.
We also wanted to develop an open standards protocol for
developing applications that could be used as a springboard
for other developers not associated with us. We hoped
to provide a foundation upon which others could build without
fear of us, or anyone else, pulling the code out from under them.
This process of generalization and modularization involved
three stages:
- Extracting user interface code into HTML configuration
files
- Extracting Programmatic Intelligence into application
setup files
- Documenting the code to death
Essentially, any code that sent GUI (HTML or Java) data to
the user was extracted from the application and placed in
a GUI configuration file.
The great benefit of this was that when the GUI code was
separated away from the main code,
a user unfamiliar with application programming,
but familiar with basic GUI programming, such as HTML, could
modify the user interface code without worrying about
breaking the program.
After all, GUI code tends to change more often between sites
than other types of code. Every web site tends to have its
own specific look-and-feel.
In addition, separating out the
GUI allowed users to apply bug fixes
without disturbing their customized GUI changes.
Since program logic was separated from GUI logic, a fix
to the program logic did not require the user to redo all their
GUI/HTML changes every time a new version came out.
Here is an excerpt from one such GUI configuration file:
sub required_fields_error_message
{
    print "Content-type: text/html\n\n";
    print qq~
<HTML>
<HEAD>
<TITLE>Error in Processing Form -
Required Fields</TITLE>
</HEAD>
<BODY BGCOLOR = "FFFFFF" TEXT = "000000">
<BLOCKQUOTE>
<H2>
Whoops, I'm sorry, the following fields
are required:
<UL>
<LI>Name
<LI>Email
<LI>Comments
</UL>
Please click the "back" button on your
browser or click <A HREF =
"$url_of_the_form">here</A> to go back and
make sure you fill out all
the required information.
</H2>
</BLOCKQUOTE>
</BODY>
</HTML>~;
}
You can see how HTML-like the code appears. We found that
by extracting out the code like that, users were much less
intimidated about making changes.
This separate GUI code was imported into the
main application code using Perl's require or use keywords.
Then, the GUI-specific routines could be called from there.

Extracting Programming Logic

Extracting implementation-specific programming logic was the next step
in the evolution of the eXtropia scripts.
We knew that we would have to provide a host of services for each application
that could be
turned on or off depending on what services
each installation would support. To do that we needed to
write the methods into the base code and then provide the
user with "switches" in an application setup file.
A user need only specify the general work flow of the application by
answering "questions". Below is an excerpt from such a file:
$should_i_email_orders = "yes";
$should_i_use_pgp = "no";
$should_i_append_a_database = "yes";
The actual code would check for each case and act accordingly
using "if tests" such as:
if ($should_i_email_orders eq "yes")
{
    # Go ahead and mail the order.
}
else
{
    # Don't mail diddly squat.
}
The trick was to predict all the myriad ways the script
might need to function, build in that generic code, and
finally provide "should_i" options in the configuration file to
activate the relevant code in the main application.
Finally, the code was documented extensively
so that a developer could easily make site-specific
modifications. We commented the code so well that a number
of people said they had learned to program in large part by studying
the comments and the accompanying code. When more seasoned
programmers began to grouse that the scripts were more comments
than code, we figured we had accomplished our goal.
Note that we did not want to be in the business of
installations and customizations; we wanted to provide
extensible source code so that others could do that.

The Result: Limits of the Model

We had a great deal of success with this streamlined web
application development model. Thousands of sites implemented
the code and found that they never needed to do anything
beyond editing the HTML in the GUI definition file.
However, within a couple
of years, chinks in the model became more apparent.
For one, it was hard to organize group programming projects
such that add-ons were easily transferred to the entire group.
The code needed to be far more "Lego-ized" for that to happen.
That is, there needed to be a way to easily disassemble
code and re-assemble it in different configurations with
very little effort. Similarly, new and more efficient
routines should be easily "popped" into existing applications
without breaking the old routines that depended on them.
We had modularized the GUI and implementation specific setup
information, but we still needed to modularize the internal
generic algorithms. Once these algorithms were modularized,
a more efficient, secure and robust algorithm could instantly
replace an older algorithm in the main code without breaking
any code that used that routine.
Likewise, services such as data access could be more
transparent, so that a user could easily move the
code back and forth between databases such as Sybase or Oracle.

Moving to Object Oriented Design with Perl 5 and Java Servlets

When Object Oriented Perl 5 became ubiquitous and
Server Side Java/Servlets became a reality, it was clear which
direction the development framework would have to go.
Object Oriented Design (OOD) was the answer!
So what exactly is OOD and how does it help
solve the problems we just discussed?
Well, OOD is based on the concept of objects.
You can think of an object as a little "black box" of
functionality that accepts some standardized input and
produces some standardized output or behavior.
"Black Box" is an engineering term used to describe
a thing that is encapsulated, or has the property of
being "plug and play". A black box
provides some service in such a way that the
system architect (such as a programmer) need not know
anything about the internals of the object to use it.
An object just "plugs-in" to an existing system.
What is an example of an object that we can sink our
teeth into?
Think of a telephone. Do you know how it works--the details
of the circuitry from end to end?
Probably not. However, whether or not you know how the phone or the
underlying telecommunications systems work, you can still
call your mom on Christmas.
This is because a phone is a black box. It accepts
a phone number as input and returns a phone connection as output.
All the work of creating phone connections in the international
telecommunication network is handled invisibly
by the phone object (and a bunch of other helper objects--networks,
switches, etc.--that it depends on).
All you need to be concerned with is the
protocol for picking up the receiver, dialing, and
what you are going to say.
So, in summary, objects have the following attributes:
- Objects provide "plug-and-play" functionality.
They "hide" the internal
machinery of how they do their job. Users need not
understand these internals in order to use the object.
In computer-science literature, this property goes by the fancier name
"encapsulation."
- Objects accept standardized input.
Objects have an API (Application Programming Interface),
and all objects of the same type share the same interface.
The computer-science jargon for this property is that these objects
"implement" the same interface.
- Objects produce some desired output or behavior.
Unlike the inputs and the interface, this doesn't have to be the
same for all objects of the same type. For example, a cellular
phone contacts the telephone network by transmitting radio waves,
rather than using copper wires, but the intended outcome--contacting
your mom--is the same. The computer-science term for this is
"polymorphism."
Essentially, the API defines a kind of contract: if you follow the rules,
and provide the object with data in the right form, it will
do what you ask.
In the case of the phone, you use the standard interface (dial or keypad)
to ask the object to do something (make a call), with some data you provide
(the phone number), and the object performs the action you requested.
Software objects work the same way. When you call one of the
routines in the interface, your data gets "magically" transformed
inside the object, and then the transformed data is returned to you,
or the action you requested is performed.
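To make the telephone analogy concrete, here is a small Perl sketch. The Phone and CellPhone classes and their dial() method are hypothetical and exist only to illustrate encapsulation, a shared interface, and polymorphism; they are not part of any eXtropia module.

```perl
use strict;
use warnings;

# Hypothetical classes illustrating the phone analogy.
package Phone;

sub new {
    my ($class) = @_;
    return bless {}, $class;
}

# The public interface: every phone can dial a number.
sub dial {
    my ($self, $number) = @_;
    return "Connected to $number via " . $self->_transport();
}

# Internal detail that callers never need to know about (encapsulation).
sub _transport { return "copper wire" }

package CellPhone;
our @ISA = ('Phone');    # inherits new() and dial() from Phone

# Same interface, different internals (polymorphism).
sub _transport { return "radio waves" }

package main;

# The calling code treats both phones identically.
for my $phone (Phone->new(), CellPhone->new()) {
    print $phone->dial("555-0100"), "\n";
}
```

The loop at the end never asks which kind of phone it holds; it relies only on the dial() contract.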

How OOD Solves Problems of the Old Model

There are three primary reasons that objects are
an excellent tool for large, complex
programming projects... particularly ones that must
be frequently changed, and where code reuse is important.
First, since you need not concern yourself with the
internals of objects in order to use them, you can
create complex programs built on a library of objects
without needing to be an expert in each area of the
program.
For example, if you want to incorporate
database access into a program, and you can use a
pre-written database connectivity object to do it,
you needn't worry about how database connectivity
actually works. You just let the object, designed by a
database connectivity specialist, do it for you.
Using objects allows you to focus on the work flow of a
program rather than the nitty gritty of particular algorithms.
Well chosen object and variable names actually allow the
programmer to program in terms of real world objects.
Not strings and arrays and hashes, but messages and shopping
carts and users. This makes it easier to write the code,
and even more importantly, easier to read it.
Second, an object-oriented framework makes it very easy to
divide development work among community members.
Objects can be developed independently and submitted
to the common pool of objects. Different objects can be
written to interface with various tools on various platforms.
And as long as all objects of the same type
conform to the same standard API, they can be plugged into other
people's work with little or no effort.
Finally, objects can be modified (made more efficient, secure,
and robust) without breaking all the code that uses them. Since
the internals are hidden anyway, the client code (the code using
an object) does not know or care that the implementation is changed.
So long as the API stays the same, the client code is happy.

The Existing Perl Modules

Soon after the release of Perl 5, the community was blessed
with a flood of excellent objects (Perl Modules) that
could be used in support of web-based applications. Among
these modules were the CGI, LWP and DBI modules.
The CGI
module takes care of organizing form name/value
pairs and HTML output, as well as other useful features such
as file uploading, error handling, and manipulation of the
environment variables.
The LWP module (libwww-perl, the World-Wide Web library for Perl) makes networking simple.
A standard API allows your Perl script or CGI
application to take advantage of all the network services available to
a web browser, including FTP, Gopher, HTTP, local files,
and HTTPS (Secure Sockets Layer) connections.
Finally, the DBI
module gives your CGI application access to
almost any commercial, shareware, or freeware
database that is used in support of
web applications. Without knowing
anything about database technology, a developer can use the
simple interface to access and manipulate the most complex
of databases. Further, since the DBI interface ensures that
all of the database drivers provide the same interface,
if the backend database suddenly was changed from one database to
another--for example, from Oracle to Sybase--
no modification would be necessary in the client code. You would
just need to plug in a different DBD (database driver) module.
We will discuss interface modules of this sort in depth a little later.

Proposed eXtropia Extensions

The CGI, LWP, and DBI modules are all fantastic tools for
web application development. However, they are not enough.
In fact, for the most part, these three modules are more useful
at a deep infrastructure level. Most real-world application code
deals with higher order issues (or should, at any rate).
With five years of experience designing generic, open source
applications behind us, we are in a unique position
to know what other black boxes must be developed for truly
efficient web application development. In addition,
the feedback and ideas we have gathered during that time from users
and developers in all types of businesses
help us create a framework
that is likely to extend to whatever type of business these
components are used in.
Coding of several modules began in November 1998.
The new modules include:
eXtropia Modules
  Datasource
    File
    DBI
    HTTP
    FTP
    FileSystem
    DBM
    XML
  Mail
    SendMail
    SMTP
    Blat
    Postie
  Encryption
    PGP
    Crypt
    MD5
  Authentication
    Server
    CGI
  Session
    Cookie
    HiddenField
  Search
    Dynamic
    Indexed
  DataHandler
    American
  Application
All of these modules take advantage of object-oriented concepts,
including encapsulation, interfaces, inheritance and polymorphism.
These concepts are explored in more depth in the next section. Let's
look at interfaces first.
It is especially important to note that not all these modules will
be written by us. The Comprehensive Perl
Archive Network (CPAN) contains a rich set of modules.
In the best object-oriented tradition, where appropriate,
we will use these modules rather
than building our own entirely from scratch.
An interface is a type of middleware. In the case of WebWare, it is used
to connect application code to helper applications in a way that
removes all references specific to a proprietary helper application
from the application code.
Interfaces are needed because at different stages in its lifetime,
a single application must speak to many other helper applications
(email, encryption, database1, database2, database3), all of which
may speak different languages or idioms.
Without an interface, an application developer MUST recode the
application every time the helper application changes (a pain in
the butt) or provide the application with the knowledge to speak
to every single proprietary helper application it will ever
come across (impossible)....
For example, suppose you write a CGI application which talks to an
Oracle database....
DB APP with ORACLE specific connection code -> ORACLE
Now, suppose that later you are hired to do the same project
for a client, but the new client uses SYBASE instead of ORACLE
or perhaps your company changes from ORACLE to SYBASE....what
would you have to do?
Well, given this application design, you would have to
recode the application
DB APP with SYBASE specific connection code -> SYBASE
Now suppose the next client wants to talk to a flat
file on a UNIX box over a network (perhaps using LWP to
a secure server). That might be a real pain in the butt
with lots of new code....
DB APP with new LWP and flat file access code ->
network available file
Not only would this take a lot of recoding, but the new
application would almost assuredly look totally different
from the original.
Further, the progression from project one through project
three would have introduced all sorts of little spaghetti
code eddies into your code. As such, it will be difficult
to maintain or document the code and the code itself
will become progressively harder to transfer from you
to another application developer who has similar, but not quite the same,
needs or for you to transfer the code between your own
projects.
The idea of distributed, reusable programming is shot down at
this point. We can all share a few algorithms and the basic
set of ideas, but we cannot easily share complex solutions.
This problem is solved by using an interface.
An interface sits between an application and its
helpers (like ORACLE and SYBASE in the example above). Its
job is to provide a single API to the app developer and
to handle the translation to the myriad of helper apps.
Thus, for example, the application developer need not worry what
helpers the application uses. The application developer simply
speaks to the interface in the standard API defined syntax and
the interface takes care of all the back-end proprietary mumbo
jumbo.
Thus, continuing our example from above, you write a single
database access script and it suddenly works with absolutely no
changes on ORACLE, SYBASE, or across a network to a file!
DB APP with standard Interface API calls -->
Oracle Datasource driver translates calls to ORACLE-speak -->
ORACLE
same DB APP with standard Interface API calls -->
Sybase Datasource driver translates calls to SYBASE-speak -->
SYBASE
same DB APP with standard Interface API calls -->
FTP Datasource driver translates call and manages network
connection -->
File over a network

How does an Interface work?

So how does the interface perform its magic?
Well, first, the interface defines an API (Application Programming
Interface) such as
createDatasource()
search()
getAllRecords()
closeDatasource()
Your code, the application code, simply calls these methods
to perform the required actions. Except in the single line
of code where the Datasource is created, there is no need to specify what
type of Datasource is being used, because the Datasource is a black box.
The application
developer need only speak in the language defined by the API.
The interface "abstracts away," or hides, the nitty gritty.
Now that the interface can rely on a standardized set of
inputs, it must figure out how to deal with all of the
helper applications.
Extracting proprietary code from the application and dumping
it into the interface would help, but eventually, you would
still have spaghetti code. What is needed is a way to
abstract the process even further so that all proprietary code
is isolated and independent, and can therefore be plugged in and out.
To do this, the interface relies on a horde
of "drivers". These drivers provide the connection between the
interface and the helper applications. This is where all the
proprietary connection code is located.
Each driver knows how to speak two languages: the language
the interface speaks to it, and the language it speaks to a single
helper application.
An ORACLE driver, for example, would never be responsible for
talking to SYBASE. Yet both the ORACLE and SYBASE
drivers can speak to the interface.
And that is it...
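The relationship between application, interface, and drivers can be sketched in a few lines of Perl. This is an illustration only: the Sketch::Datasource package names, the create() factory, the -TYPE and -RECORDS parameters, and the in-memory records are all assumptions made for the example, and the real eXtropia Datasource API may differ.

```perl
use strict;
use warnings;

# A sketch only: package names, create(), -TYPE and -RECORDS are
# illustrative assumptions, not the eXtropia implementation.
package Sketch::Datasource;

# Factory: the one place where the application names a driver.
sub create {
    my ($class, %args) = @_;
    my $type   = defined $args{'-TYPE'} ? $args{'-TYPE'} : '';
    my $driver = "Sketch::Datasource::$type";
    die "Unknown Datasource type: $type\n" unless $driver->can('search');
    return bless { records => $args{'-RECORDS'} || [] }, $driver;
}

# Shared behavior that every driver inherits.
sub getAllRecords {
    my ($self) = @_;
    return @{ $self->{records} };
}

package Sketch::Datasource::File;
our @ISA = ('Sketch::Datasource');

# A real driver would read a flat file; this one scans records in memory.
sub search {
    my ($self, $keyword) = @_;
    return grep { index($_, $keyword) >= 0 } @{ $self->{records} };
}

package Sketch::Datasource::DBM;
our @ISA = ('Sketch::Datasource');

# A real driver would query a DBM file; the caller cannot tell the difference.
sub search {
    my ($self, $keyword) = @_;
    return grep { /\Q$keyword\E/ } @{ $self->{records} };
}

package main;

my @inventory = ("red widget", "blue widget", "green gadget");

# Switching back ends means changing only the -TYPE value on this line.
my $datasource = Sketch::Datasource->create(
    '-TYPE'    => 'File',
    '-RECORDS' => \@inventory,
);

print "$_\n" for $datasource->search("widget");
```

Everything after the create() call is written against the interface alone, which is why swapping 'File' for 'DBM' requires no other change in the application code.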
So where are we after all this theory? We now have four (or more)
pieces where we once had two.
Originally we had 1) an application and 2) helper application(s)
Now we have 1) an application, 2) an interface, 3) driver(s) and
4) helper application(s).
How is this better? It seems more complex!
Well, the first reason it is better is that the proprietary code
is removed from everything but the drivers, so we never
mix code and create spaghetti. Proprietary stuff is isolated from
everyone else so it cannot cause harm.
Also if the application developer wishes to use a different
helper application, she need not worry about coding new things
in the application...all she needs to do is specify a different
driver.
Switching from an ORACLE backend to a DBM backend requires
changing one line of code!
Switching from the DBM application to one based on distributed
inventory flat files on multiple servers across a network
requires changing that same single line of code!
This will increase your productivity between projects immensely.
What's more, suppose you have never worked with INFORMIX even though
you know ORACLE and MYSQL. Nevertheless, your boss wants an
INFORMIX-backend database manager just like the one you did
for ORACLE on his desktop by end of the business day.
No problem. If you have already written the interface driven
application code you are probably already there. You are not
responsible for writing the Informix code...That driver can be
downloaded from a library at eXtropia and can be plugged into
your app with one line of code!
In the case of this database example, since the eXtropia
Datasource Interface talks to DBI for
handling RDBMS systems, you can easily switch back and forth
between a vast number of databases on the market as freeware,
shareware or payware.
One of the first interfaces we realized that we needed to create
was the Datasource.
But why create a Datasource module? Why not just use DBI, which
provides exactly the kind of database-independent interface that we
described above? The eXtropia Datasource
interface is a simpler, higher-level interface, and
includes some features specifically tailored to CGI applications.
It also provides access to other sources of information that are not
commonly considered databases, such as file systems, FTP sites,
web pages, etc. It is lightweight, only loading the code necessary
for the task at hand.
Specifically, we thought, data should be able to be drawn from many
sources besides standard databases. In the real-world, many
web sites do not have access to databases. These sites, hosted on
ISPs, have only a CGI-flatfile capability at best. Their ISPs
will not provide them database services. And even if they did,
such services may well be more costly than the typical web site owner
can afford.
Further, the administration of a database may be more work than
the typical website owner wants to deal with.
Finally, for many
data storage needs, a full-fledged database may be overkill.
For typical mailing lists, for example, a site may get 20 or 30
signups per week. Why set up a heavy-duty database engine
like Oracle for a mere 20 or 30 additions per week?
In an increasingly connected world, data should be accessible across
the network as easily as local data is. Data
should be accessible via FTP, HTTP or even via another CGI script
on a different server. Where the data is located should be irrelevant
and transparent to the application developer.
Thus, the catalog for an "outlet" web store might contain products
from 10 or 12 other web stores across the net. The outlet
store must be able to grab product information from a dozen networked
locations and present it in a single interface.
For this reason, we devised the Datasource module
for accessing data in a manner that is suitable for web applications.
The Datasource module wraps around and incorporates the basic
functionality of DBI but adds on the ability to think of
a data source not as a database but any data structure such as
a local flat file, a CGI script, a URL-accessed flatfile,
or even hierarchical/object-based systems such as entire
file systems, XML documents, and Object-Oriented databases.
It is important to note that DBI, and the SQL standard that it
is based on, are not the interface to Datasource.
DBI is merely one of several mechanisms for implementing
a Datasource.
The reason for this is that DBI is relatively heavyweight unless
you need the power of a database such as MySQL, Sybase, or Oracle.
For handling file-based data, we wanted a simple,
lightweight interface without worrying about SQL parsing. Furthermore,
we wanted to support non-SQL oriented data sources such as XML.
And finally, we didn't want to force developers to have to learn SQL
in order to do the simple kinds of queries and updates that account for the
vast majority of web application data access.
The following is a summary of the design goals of Datasource:
- Provide a lightweight way of accessing simple files such
as log files, database flat files, and even relatively
unstructured data such as message files.
- Provide an easy migration path to SQL databases so the
core CGI applications can scale.
- Provide flexibility in the types of data to be accessed
and include distributed files as well as hierarchical
data types in the mix.
- Provide an interface to handle common tasks such as querying,
updating, deleting, adding elements on a Datasource.
- Provide an automatic mapping for certain common fields.
For example, a record timestamp (modification time) is information that
comes automatically when records are stored as files in a file-system.
However, in the DBI Datasource, a modification time field
could (optionally) be automatically updated by the
Datasource whenever a record gets added or changed in the database.
This will allow a true plug-n-play system where objects
that you would not normally think of as a Datasource
can be mapped easily into a database. For example,
Message files in a BBS forum subdirectory could easily become
Message records inside of a Forum table in a relational
database.
Inside an application, Datasources can be used for everything from logs,
to user information, to shopping carts. Within other modules, Datasources
can be used to store session data to preserve state,
and any other data that is persistent.
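The design goals above can be illustrated with a minimal sketch. This is not the eXtropia Datasource API: the package names (Sketch::Datasource, Sketch::Datasource::Memory), the method names, and the mtime field are all assumptions made for illustration. The sketch shows a factory that hides the driver class, the four common operations, and a modification time that the Datasource fills in automatically.

```perl
use strict;
use warnings;

# Hypothetical sketch: none of these names come from the eXtropia code.
{
    package Sketch::Datasource;

    # Factory: callers name a type, never a driver class.
    sub create {
        my ($class, %args) = @_;
        my $driver = 'Sketch::Datasource::' . $args{type};
        return $driver->new(%args);
    }
}

{
    package Sketch::Datasource::Memory;    # stand-in for a file or DBI driver

    sub new {
        my ($class, %args) = @_;
        my $self = {
            fields => $args{fields} || [],
            rows   => [],
            clock  => $args{clock} || sub { time() },
        };
        return bless $self, $class;
    }

    # True when every criterion equals the corresponding field of the row.
    sub _matches {
        my ($row, $where) = @_;
        for my $field (keys %$where) {
            return 0 unless defined $row->{$field}
                && $row->{$field} eq $where->{$field};
        }
        return 1;
    }

    # Add a record; the modification time is set automatically.
    sub add {
        my ($self, %record) = @_;
        my %row = map { $_ => $record{$_} } @{ $self->{fields} };
        $row{mtime} = $self->{clock}->();
        push @{ $self->{rows} }, \%row;
        return scalar @{ $self->{rows} };
    }

    # Return copies of all matching records.
    sub search {
        my ($self, %where) = @_;
        return map { +{ %$_ } } grep { _matches($_, \%where) } @{ $self->{rows} };
    }

    # Apply changes to matching records and refresh their timestamps.
    sub update {
        my ($self, $where, $changes) = @_;
        my $count = 0;
        for my $row (grep { _matches($_, $where) } @{ $self->{rows} }) {
            $row->{$_} = $changes->{$_} for keys %$changes;
            $row->{mtime} = $self->{clock}->();
            $count++;
        }
        return $count;
    }

    # Remove matching records and return how many were removed.
    sub delete {
        my ($self, %where) = @_;
        my $before = scalar @{ $self->{rows} };
        $self->{rows} = [ grep { !_matches($_, \%where) } @{ $self->{rows} } ];
        return $before - scalar @{ $self->{rows} };
    }
}

my $forum = Sketch::Datasource->create(type => 'Memory', fields => [qw(author subject)]);
$forum->add(author => 'selena',  subject => 'Hello');
$forum->add(author => 'gunther', subject => 'Re: Hello');
my @hits = $forum->search(author => 'gunther');
print "$hits[0]{subject}\n";
```

Because the application only ever calls create(), add(), search(), update(), and delete(), swapping the in-memory driver for a flat-file or DBI driver would not, under these assumptions, require changes to the application code.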
|
|
|
|
We also use the interface design pattern to handle
email on multiple platforms. Under this approach,
we have created a simple Mail interface with a send() method.
The send() method is implemented by several different driver
modules to support various mail applications including
Sendmail (UNIX), SMTP mail (Mac),
Postie (Windows) and Blat (Windows).
The benefit of this
architecture is that a site administrator who does not use one of
the packages we currently support can quickly write a module
for whatever email package is available.
And once such a module is written, all applications that use
the eXtropia mail interface can use this new email tool.
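As a rough illustration of the driver idea, the sketch below defines one send() method implemented by two drivers. The package names (Sketch::Mail and its drivers), the argument names, and the default sendmail path are assumptions, not the eXtropia interface. The Outbox driver only records messages in memory, so the example runs without a mail system installed.

```perl
use strict;
use warnings;

# Hypothetical sketch: names and arguments are assumptions.
{
    package Sketch::Mail;

    # Factory: choose a driver by type.
    sub create {
        my ($class, %args) = @_;
        my $driver = 'Sketch::Mail::' . $args{type};
        return $driver->new(%args);
    }

    # Shared helper: a plain-text message with minimal headers.
    sub format_message {
        my ($self, %msg) = @_;
        return "From: $msg{from}\nTo: $msg{to}\nSubject: $msg{subject}\n\n$msg{body}\n";
    }
}

{
    package Sketch::Mail::Sendmail;
    our @ISA = ('Sketch::Mail');

    sub new {
        my ($class, %args) = @_;
        return bless { path => $args{path} || '/usr/sbin/sendmail' }, $class;
    }

    sub send {
        my ($self, %msg) = @_;
        # The -t flag tells sendmail to read recipients from the headers.
        open(my $pipe, '|-', $self->{path}, '-t') or return 0;
        print {$pipe} $self->format_message(%msg);
        return close($pipe) ? 1 : 0;
    }
}

{
    package Sketch::Mail::Outbox;    # in-memory driver for testing
    our @ISA = ('Sketch::Mail');

    sub new { my ($class) = @_; return bless { sent => [] }, $class }

    sub send {
        my ($self, %msg) = @_;
        push @{ $self->{sent} }, $self->format_message(%msg);
        return 1;
    }
}

my $mailer = Sketch::Mail->create(type => 'Outbox');
$mailer->send(
    from    => 'store@example.com',
    to      => 'user@example.com',
    subject => 'Order received',
    body    => 'Thank you for your order.',
);
print $mailer->{sent}[0];
```

Changing the type argument to 'Sendmail' is the only change the calling code would need in this sketch.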
|
|
|
|
Data encryption is becoming increasingly important, especially for
applications involving e-commerce.
In the eXtropia framework, encryption will be handled as an interface
with just two methods, encrypt() and compare(). The underlying
algorithm-specific modules will initially
include PGP, crypt and MD5. Additional algorithm modules will be added in the
future to keep pace with the state
of the art in encryption technologies.
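A two-method interface of this kind might look like the following sketch for the MD5 case. The package name Sketch::Encrypt::MD5 and the argument order are assumptions. Digest::MD5 is a standard Perl module; MD5 appears here only because the text names it, and it is no longer considered safe for protecting passwords.

```perl
use strict;
use warnings;
use Digest::MD5 ();

# Hypothetical sketch of the encrypt()/compare() interface.
{
    package Sketch::Encrypt::MD5;

    sub new { my ($class) = @_; return bless {}, $class }

    # One-way transformation of the content into a hex digest.
    sub encrypt {
        my ($self, $content) = @_;
        return Digest::MD5::md5_hex($content);
    }

    # True if the plain content matches a previously encrypted value.
    sub compare {
        my ($self, $content, $encrypted) = @_;
        return $self->encrypt($content) eq $encrypted ? 1 : 0;
    }
}

my $enc    = Sketch::Encrypt::MD5->new;
my $stored = $enc->encrypt('secret');
print $enc->compare('secret', $stored) ? "match\n" : "no match\n";
```

Because callers use only encrypt() and compare(), a crypt or PGP driver with the same two methods could be substituted without touching the application.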
|
|
 |
The Authentication Module
|
 |
|
|
|
|
Authentication modules identify, and retain information
about, each user, so that a more user-friendly, customized,
and secure environment can be created.
We are rewriting the current authentication libraries to take
advantage of the Datasource, Encryption, and Session modules.
|
|
|
|
One of the main challenges of developing applications in a
web environment is that HTTP is a "stateless" protocol:
the web server treats each connection as a separate transaction,
and doesn't remember anything about a user from one click
to the next. The Session module helps solve this problem
by storing some of this important information, using a unique
"session key" that is given back to the browser, and passed
back to the server with the next request. The session key is
like an ID card, which allows the web application to know who it
is dealing with.
Initially, we will provide two kinds of Session objects, to
use either hidden fields or cookies to maintain state.
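The session key mechanism can be sketched as follows. Everything here is an assumption for illustration: the package name Sketch::Session, the in-memory store, the field and cookie name "session", and the key generator, which is not cryptographically strong and would need replacing in real use.

```perl
use strict;
use warnings;
use Digest::MD5 ();

# Hypothetical sketch: an in-memory session store keyed by a session key.
{
    package Sketch::Session;

    my %STORE;    # stand-in for persistent storage such as a Datasource

    sub new {
        my ($class, %args) = @_;
        my $key = $args{key};
        # Missing or unknown keys get a fresh session.
        unless (defined $key && exists $STORE{$key}) {
            $key = Digest::MD5::md5_hex(join '|', time(), $$, rand(), scalar keys %STORE);
            $STORE{$key} = {};
        }
        return bless { key => $key, data => $STORE{$key} }, $class;
    }

    sub key { my ($self) = @_; return $self->{key} }
    sub get { my ($self, $name) = @_; return $self->{data}{$name} }
    sub set { my ($self, $name, $value) = @_; return $self->{data}{$name} = $value }

    # Two ways to hand the key back to the browser.
    sub as_hidden_field {
        my ($self) = @_;
        return '<input type="hidden" name="session" value="' . $self->{key} . '">';
    }

    sub as_cookie_header {
        my ($self) = @_;
        return 'Set-Cookie: session=' . $self->{key} . '; path=/';
    }
}

# First request: no key yet, so a session is created.
my $first = Sketch::Session->new;
$first->set(cart => 'widget');

# Next request: the browser passes the key back and the state is restored.
my $next = Sketch::Session->new(key => $first->key);
print $next->get('cart'), "\n";
```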
|
|
|
|
The capability of searching HTML and other data files is an
important tool to allow users to find the information they need quickly.
eXtropia search modules will allow these searches to be performed on
live HTML files or other documents in entire branches of a directory
tree, or using pre-built indexes to accelerate queries.
In addition to site-wide keyword searches, other applications will
take advantage of this capability. For example,
a BBS might allow you to search previous BBS messages and the HTML
template based web store allows you to search the HTML
in the Web Store for a product.
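A live (unindexed) search over a directory tree could be sketched roughly as below. The function name search_tree, its arguments, and the crude tag stripping are assumptions; File::Find and File::Temp are standard Perl modules. The demonstration builds a throwaway directory so the example is self-contained.

```perl
use strict;
use warnings;
use File::Find ();
use File::Temp ();

# Hypothetical sketch: return the files under $root whose visible text
# contains $keyword (case-insensitive). $pattern selects the file names.
sub search_tree {
    my ($root, $keyword, $pattern) = @_;
    $pattern ||= qr/\.html?\z/i;
    my @hits;
    File::Find::find({
        no_chdir => 1,
        wanted   => sub {
            my $path = $File::Find::name;
            return unless -f $path && $path =~ $pattern;
            open(my $fh, '<', $path) or return;
            my $text = do { local $/; <$fh> };    # slurp the whole file
            close $fh;
            $text = '' unless defined $text;
            $text =~ s/<[^>]*>//g;                # crude tag stripping
            push @hits, $path if index(lc($text), lc($keyword)) >= 0;
        },
    }, $root);
    @hits = sort @hits;
    return @hits;
}

# Demonstration with two small pages in a temporary directory.
my $dir   = File::Temp::tempdir(CLEANUP => 1);
my %pages = (
    'modem.html' => '<h1>Fast <b>modem</b></h1>',
    'cable.html' => '<p>Serial cable</p>',
);
for my $name (keys %pages) {
    open(my $out, '>', "$dir/$name") or die "Cannot write $name: $!";
    print {$out} $pages{$name};
    close $out;
}
my @found = search_tree($dir, 'Modem');
print "$_\n" for @found;
```

A pre-built index would replace the per-request file reads with a lookup, at the cost of rebuilding the index whenever pages change.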
|
|
|
|
We will provide a module for basic validation of user
input, a service that many applications
frequently request. The module will include validation
services for:
- Dates (US and European)
- Prices
- Credit Card Numbers
- Email Addresses
- Censored words
- Alphanumeric characters
- Integers
- Floating point numbers
- Alphabetic characters only
- Unexpected or disallowed user responses, to enhance security
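A few of the checks listed above can be sketched as follows. The package name Sketch::Validate and the function names are assumptions, and the patterns are deliberately simple. The credit card check uses the Luhn checksum, which only confirms that a number is well formed, not that the card exists.

```perl
use strict;
use warnings;

# Hypothetical sketch: each function returns 1 for valid input, 0 otherwise.
{
    package Sketch::Validate;

    sub is_integer { my ($v) = @_; return (defined($v) && $v =~ /\A[+-]?\d+\z/) ? 1 : 0 }
    sub is_float   { my ($v) = @_; return (defined($v) && $v =~ /\A[+-]?(?:\d+\.?\d*|\.\d+)\z/) ? 1 : 0 }
    sub is_price   { my ($v) = @_; return (defined($v) && $v =~ /\A\d+(?:\.\d\d)?\z/) ? 1 : 0 }
    sub is_email   { my ($v) = @_; return (defined($v) && $v =~ /\A[^\s\@]+\@[^\s\@]+\.[^\s\@]+\z/) ? 1 : 0 }

    # Dates as mm/dd/yyyy (style 'US', the default) or dd/mm/yyyy (style 'EU').
    sub is_date {
        my ($v, $style) = @_;
        return 0 unless defined($v) && $v =~ m{\A(\d\d?)/(\d\d?)/(\d\d\d\d)\z};
        my ($month, $day, $year) = ($style || 'US') eq 'EU' ? ($2, $1, $3) : ($1, $2, $3);
        return 0 if $month < 1 || $month > 12 || $day < 1;
        my @days = (31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31);
        $days[1] = 29 if ($year % 4 == 0 && $year % 100 != 0) || $year % 400 == 0;
        return $day <= $days[$month - 1] ? 1 : 0;
    }

    # Luhn checksum: double every second digit from the right.
    sub is_credit_card {
        my ($v) = @_;
        return 0 unless defined $v;
        (my $digits = $v) =~ s/[\s-]//g;
        return 0 unless $digits =~ /\A\d{13,19}\z/;
        my ($sum, $double) = (0, 0);
        for my $d (reverse split //, $digits) {
            my $n = $double ? $d * 2 : $d;
            $n -= 9 if $n > 9;
            $sum += $n;
            $double = !$double;
        }
        return $sum % 10 == 0 ? 1 : 0;
    }
}

for my $input ('4111 1111 1111 1111', '4111 1111 1111 1112') {
    my $verdict = Sketch::Validate::is_credit_card($input) ? 'well formed' : 'rejected';
    print "$input: $verdict\n";
}
```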
|
|
|
|
Finally, we will provide application-level framework objects that
will allow the current suite of web modules to be
easily combined into a working application.
The first generation of WebWare applications was relatively
standalone and monolithic, except
for libraries that dealt with infrastructure such as authentication
and mail. This model was adequate for standalone applications.
But adding BBS functionality to a
web store was difficult without cutting and pasting a
lot of code. And in some cases, naming conflicts would arise when
the cut and paste technique was used.
WebWare 2.0 removes many of these barriers. The modules are fully
encapsulated, each with its own namespace. The applications are
modular, and loosely coupled. In particular, the application logic
is separated from
the presentation or GUI code. The next generation of scripts,
built with these loosely coupled application components,
will be much more flexible.
The power of this approach is that a developer can integrate these
objects together to create hybrid applications. For example,
the Forum object could be placed into a Web Store in order
to allow users to
chat about the products that they are looking for. In
fact, a virtual salesperson could be on call watching
for messages to appear, and thus increase the interactive
experience of shopping on the Net.
In addition, the loose coupling between GUI and module logic
will allow us to create many different-looking applications
based on the same components, differing not just in cosmetic
look and feel but also in the program flow itself. For
our suite of applications, this will mean that applications that
are very similar will get merged. For example, the Chat
script is really a specialization of the BBS. With the
next generation of Application objects, a different
mixture of objects will form a Chat script with minimal
coding once the main BBS application is glued together.
These two distinct applications can be considered two different
Views, incorporating the same basic algorithms in a slightly
different context.
After these core applications have been constructed as
examples of how to use the components, new web applications
will be released as different
Views of data. For example, a calendar script will be easily
modified to show a single day, week, month, quarter, or
year--all based on the same underlying application logic,
encapsulated in the application modules.
Web applications will simply consist of code to glue the core application
logic modules together. This will make sharing customizations
within the development community much
easier, since significantly different functionality can be produced
using the same basic building blocks.
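The idea of several Views over one set of application logic can be sketched as below. The Sketch::Forum package and the bbs_view and chat_view functions are hypothetical names, not eXtropia modules. The logic object knows nothing about presentation; each View is a small piece of glue code.

```perl
use strict;
use warnings;

# Hypothetical sketch: application logic with no presentation code.
{
    package Sketch::Forum;

    sub new { my ($class) = @_; return bless { messages => [] }, $class }

    sub post {
        my ($self, %msg) = @_;
        push @{ $self->{messages} }, +{ %msg };
        return scalar @{ $self->{messages} };
    }

    sub messages { my ($self) = @_; return @{ $self->{messages} } }
}

# Escape text for safe inclusion in HTML.
sub esc {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

# BBS View: an HTML index of subjects.
sub bbs_view {
    my ($forum) = @_;
    my $html = "<ul>\n";
    for my $msg ($forum->messages) {
        $html .= '<li>' . esc($msg->{subject}) . ' (' . esc($msg->{author}) . ")</li>\n";
    }
    return $html . "</ul>\n";
}

# Chat View: only the most recent lines, as plain text.
sub chat_view {
    my ($forum, $limit) = @_;
    my @all   = $forum->messages;
    my $start = @all > $limit ? @all - $limit : 0;
    my $text  = '';
    for my $msg (@all[$start .. $#all]) {
        $text .= "$msg->{author}: $msg->{body}\n";
    }
    return $text;
}

my $forum = Sketch::Forum->new;
$forum->post(author => 'selena',  subject => 'Modems',     body => 'Any advice?');
$forum->post(author => 'gunther', subject => 'Re: Modems', body => 'Try the 56k model.');
print bbs_view($forum);
print chat_view($forum, 1);
```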
|
|
 |
Suggested Areas for Community Development
|
 |
|
|
|
|
Given this new model, there will be many opportunities for community
participation starting in March. In particular, developers
can contribute in the following areas:
- Additional Email modules
- Additional Datasource modules
- Datasource, Authentication and DataHandler API Extensions
- Additional Application-level modules
- Building custom applications and thus testing the modules
on many different platforms and in various environments.
The new modular framework will make it very easy for code to be shared
among the community. The architecture was devised to facilitate sharing
code and encapsulating your expertise.
|
|
 |
New Security Enhancements and Development Template
|
 |
|
|
|
All of the new modules and applications will follow a new enhanced
regimen of security features. Added to the usual list
of security considerations will be:
- The Security Template - We are writing a CGI template that
can be used as the base of any new application. This template
will automatically enforce some of the basic security
considerations.
- Support for the -T and -w flags and the "use strict;" pragma -
All of the new modules and scripts will be taint mode compliant and
run clean under -w for extra robustness. Special attention will
be paid to making sure the new suite of applications runs well under
environments that are meant to provide scalability improvements, such
as mod_perl on Apache or FastCGI.
- We will be adding more documentation regarding how to
protect secure files including the use of .cgi extensions
for admin files, file renaming, the incorporation of
index.html files in the CGI tree, and more detailed
explanations on how to move admin files out of
the web server document tree. We will also redesign the
architecture of applications so that it is very easy to move
and rename the admin files.
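The opening lines of such a template might resemble the following sketch. It is an assumption about what a template could contain, not the eXtropia Security Template itself, and untaint_filename is a hypothetical helper. A deployed script would enable taint mode and warnings on its first line with the -wT flags; that line is shown only as a comment here so the example can be run directly.

```perl
# A deployed script would begin with: #!/usr/bin/perl -wT
use strict;
use warnings;

# Start from a known environment, as the perlsec documentation
# recommends for scripts running in taint mode.
$ENV{PATH} = '/bin:/usr/bin';
delete @ENV{qw(IFS CDPATH ENV BASH_ENV)};

# Untaint a user-supplied file name by capturing only what a strict
# pattern allows. Returns undef when the value is not acceptable.
sub untaint_filename {
    my ($value) = @_;
    return unless defined $value;
    return if $value =~ /\.\./;    # no parent-directory references
    return $value =~ /\A([A-Za-z0-9_][A-Za-z0-9_.-]*)\z/ ? $1 : undef;
}

for my $input ('guestbook.dat', '../../etc/passwd', 'log;rm -rf') {
    my $safe = untaint_filename($input);
    print defined($safe) ? "accept $safe\n" : "reject\n";
}
```

Under taint mode, only the captured value from the pattern match is considered clean, which is why the function returns $1 rather than the original string.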
|
|
 |
Embracing Other Open Standard Technologies
|
 |
|
|
|
|
We plan to eventually provide support for:
- Integrating with Active Server Pages (ASP)
Since the majority of the applications will reside in objects, they
will be relatively easy to glue together inside of the ASPs. ASPs are
basically GUI glue that Microsoft IIS and some other Windows NT Web Servers
make use of. Since ActiveState provides a Perl scripting engine for ASPs,
we can glue the application components together and make use of ASP
technology.
- Integrating with eXtensible Markup Language (XML)
HTML works well enough for displaying data to users. But what about allowing
other computers to process your Web Application data? What about being able
to process one or more web applications' data in your program?
This will be made much easier with XML. XML provides a loosely typed tool
for marshalling data over the web. XML was designed to be lightweight and
easy to parse, yet powerful enough to describe just about any set of
data.
XML will be supported primarily by the Datasource module in two ways.
First, XML documents will be able to be decoded and queried by the
Datasource object. Second, the Datasource object will be able to
interface with an XML-based display object to render Datasource
records and fields in an XML-friendly format.
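The second direction, turning records into XML, can be sketched without any XML library. The function names records_to_xml and xml_escape and the element layout are assumptions, and the sketch presumes that table and field names are already valid XML element names.

```perl
use strict;
use warnings;

# Escape the characters that are special in XML text.
sub xml_escape {
    my ($text) = @_;
    my %entity = ('&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;');
    $text =~ s/([&<>"])/$entity{$1}/g;
    return $text;
}

# Hypothetical sketch: render a list of records (hash references) as XML,
# one element per field, in the order given by $fields.
sub records_to_xml {
    my ($table, $fields, @records) = @_;
    my $xml = "<$table>\n";
    for my $record (@records) {
        $xml .= "  <record>\n";
        for my $field (@$fields) {
            my $value = defined $record->{$field} ? $record->{$field} : '';
            $xml .= "    <$field>" . xml_escape($value) . "</$field>\n";
        }
        $xml .= "  </record>\n";
    }
    return $xml . "</$table>\n";
}

print records_to_xml(
    'forum',
    [qw(author subject)],
    { author => 'selena', subject => 'Modems & cables <new>' },
);
```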
We welcome your comments or questions about this document at
|
|