eXtropia: the open web technology company

Sherlock Holmes and the case of the broken CGI Script

This month, I'd like to address the general question that I get about 5 or 6 times every day: "I downloaded a script and it won't work." However, I will answer the question by telling you a story. I am going to tell you a story about me. A mystery.

It is a story about how I debug scripts on the zillions of different, intractable, curmudgeonly systems that exist on the web when I have already tried all the common debugging practices.

It is a story about how I find the culprit bug when I am not exactly sure about how the operating system works, which web browsers are trying to run the script, what funky directives the local system administrator has applied to the server, or any number of other big, hairy, ugly question marks that stand between me and a programming-free weekend.

It is a mystery which, as most mysteries do, begins with Sir Arthur Conan Doyle.

Let's see what Doyle has to say about debugging.

    "By a man's finger-nails, by his coat-sleeve, by his boots, by his trouser-knees, by the callosities of his forefinger and thumb, by his expression, by his shirt-cuffs -- by each of these things a man's calling is plainly revealed. That all united should fail to enlighten the competent inquirer in any case is almost inconceivable."
    - From A Study In Scarlet

Well, this may not seem like a discussion of software debugging, but it really is. What Doyle is trying to say is that all software and hardware bugs WANT to be caught. In fact, they want to be caught so badly that they carefully lay clues for you as to their whereabouts. Perhaps Doyle meant to say something like the following:

    "By a script's error message on the command line, its output to STDOUT, by the HTTP message it sends to the web browser window, by its entry in the error log, by the interaction of its algorithms, by the libraries it calls and the responses sent by them -- by each of these things a script's failures are plainly revealed. That all united should fail to enlighten the competent hacker in any case is almost inconceivable."
    - From A Study In CGI

As a debugger, it's your job to listen to those clues, put them together into a theory which can be tested, and test the theory against the software package. In every case, you will bat yourself on the brow and say to yourself, "Doh! Of course, how simple!". Because, when all is said and done, computers are pretty simple creatures and when they break there are pretty simple reasons.

The Virtue of Nothingness  
Benjamin Hoff once revealed this interesting little story about Taoism and I suppose I might pass it along to you.

    "I am learning," Yen Hui said.
    "How?" the Master asked.
    "I forgot the rules of Righteousness and the levels of Benevolence," he replied.
    "Good, but could be better," the Master said.
    A few days later, Yen Hui remarked, "I am making progress."
    "How?" the Master asked.
    "I forgot the Rituals and the Music," he answered.
    "Better, but not perfect," the Master said.
    Some time later, Yen Hui told the Master, "Now I sit down and forget everything."
    The Master looked up, startled. "What do you mean, you forget everything?" he quickly asked.
    "I forget my body and senses, and leave all appearance and information behind," answered Yen Hui. "In the middle of Nothing, I join the source of All Things."
    The Master bowed. "You have transcended the limitations of time and knowledge. I am far behind you. You have found the Way!"

Benjamin added, "An empty sort of mind is valuable for finding Perls and Tails and things because it can see what's in front of it. An Overstuffed mind is unable to." (Well, he actually spelled it "Pearls", but we know what he meant!)

What does this have to do with CGI debugging, you ask? Well, it has everything to do with CGI debugging. CGI debugging is not a skill. It is not a thing you learn in school. It is not something that is particularly aided by FAQs, or books, or system administrators, or discussion boards.

CGI debugging is a state of mind.

If I have spent more than an hour on a problem, I stop. Very few problems take more than an hour to solve, so if I've been sitting there that long, the problem is most likely not the bug, but me.

At this point, I turn off the monitor, light a candle and some incense, (which I always have a store of in my middle desk drawer) turn on some music, and lie on the floor and try to isolate instruments in the songs into their separate tracks.

Note: For debugging I recommend "Technotronic: The Best of Trance", anything by the Cocteau Twins or Enya, "Wish You Were Here" from Pink Floyd or "Kiss" from the Cure.

I might even go out and walk around the block if it is warm and sunny...there happens to be a good climbing tree outside my work.

About 20 minutes later I should be ready to get back to work having achieved several crucial things.

  • I am not one script closer to a heart attack at 35.
  • I am not angry or frustrated.
  • I have cleared my mind of all my preconceived ideas about what I think the bug is saying and am prepared to "listen" to the bug to find out what it has to say.
  • I am not intimidated by the script. Programming is like riding horses. The minute the script thinks that it is in charge is the minute it throws you off. (Well, most horses are not that mean, but you know the expression.)

Newtonian Methodology and the Nitty Gritty of Debugging
Upon returning from the void, the first thing I do is to set aside the program and start by coding something really, really small.

You see, debugging is a Newtonian exercise. In a Newtonian universe, the best thing you can do is break everything up into the smallest pieces you can, because the whole is the sum of its parts: when you find the faulty part, you find the problem. (1)

Starting with Hello World  
So, you should start by creating the most minimal CGI program you can so that you can determine what special traits your local executing environment has that might cause a more complex program to fall apart.

Try this little script which you might call hello_world.cgi....

#!/usr/local/bin/perl
print "Content-type: text/html\n\n";
print "Hello World";

Okay, now set the permission for this little script so that it is readable and executable by the web server. Typically, you will use the following command on a UNIX server...

chmod 755 hello_world.cgi (2)

Next, run the "hello world" script from your browser. You will probably need to access it with a URL something like the following:


Does it work? If not...

  • The first line (#!/usr/local/bin/perl) might be wrong or you may have accidentally put a blank line before it so that it is not "really" the first line.

    Why would this cause an error?

    Well, because the first line of any cgi script will define the local location of the "Perl Interpreter" which is a program that the server uses to execute your CGI script written in Perl. In this example, you have told the server that the Perl Interpreter (which is a program called "perl") is located in the "/usr/local/bin" directory.

    The script needs to know the location of the Perl Interpreter if it is to use it to execute the program and it is possible that on your server, the Perl Interpreter is not located in the /usr/local/bin directory.

    There are many ways to find out where the Perl Interpreter can be found on your server (several are discussed in Instant Web Scripts with CGI/Perl in chapters freely available from /books/instant_web_scripts/index.html) but the best way is to just ask your sysadmin. That is one of the few easy questions you can ask her. (3)

  • You mis-typed the HTTP header. In order for your browser and server to communicate, you must correctly follow the HTTP protocol. This protocol specifies that an HTML-based response be preceded by "Content-type: text/html" followed by two newline characters.

  • You did not set the permissions correctly, so the web server has not been given permission to execute the script. Check again.

  • You are not allowed to execute CGI scripts from the directory that you have created the hello_world.cgi in and you either got a 500 error or you received the text of the script in your web browser. The system administrator has restricted you because CGI scripts can be dangerous and she wants to protect her system from your incompetence. Most likely, the system admin has either created a special directory like CGI-BIN for you to put CGI scripts or has allowed you to create special "Access files" which tell the server that in this special case, it is okay to run a CGI script. Either way, you should check with your system admin and ask her how she has determined to deal with CGI scripts and in which directories it is okay for you to run them.

At this level, you can be pretty sure that if the "hello world" script did not run, it was because of one of the four reasons above. After all, there is not much that can be wrong with three lines of code! That is the reason that we are starting so small. We can get our teeth around this!
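If you would rather have Perl check the first two suspects for you, something like the following might help. It is only a sketch: the subroutine name check_script is made up for this example, and it assumes a UNIX server where the web server runs as a different user than you do, so the "world" read and execute bits are the ones that matter.

```perl
#!/usr/bin/perl
# Sketch only: sanity-check a CGI script before blaming the server.
# check_script() is a hypothetical helper, not part of any library.
use strict;
use warnings;

# Return a list of human-readable problems found with $script.
sub check_script {
    my ($script) = @_;
    my @problems;

    open(my $fh, '<', $script) or return ("cannot open $script: $!");
    my $first_line = <$fh>;
    close($fh);
    $first_line = '' unless defined $first_line;
    $first_line =~ s/\r?\n\z//;          # strip the line ending

    if ($first_line =~ /^#!\s*(\S+)/) {
        # The #! line names the interpreter; make sure it really exists.
        my $interpreter = $1;
        push @problems, "interpreter $interpreter not found or not executable"
            unless -x $interpreter;
    } else {
        push @problems, "first line is not a #! line";
    }

    # Permission bits, e.g. 0755. Assumption: the web server is "world".
    my $mode = (stat $script)[2] & 07777;
    push @problems, sprintf("mode is %04o; world read/execute bits are not set", $mode)
        unless ($mode & 0005) == 0005;

    return @problems;
}

# Usage: perl check_script.pl hello_world.cgi
if (@ARGV) {
    my @problems = check_script($ARGV[0]);
    print @problems ? join("\n", @problems) . "\n" : "No obvious problems found\n";
}
```

Run it from the command line against hello_world.cgi. It cannot tell you whether your server allows CGI in that directory, so the last suspect on the list is still a question for your sysadmin.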

So I will assume that you've gotten this far and we can go on.

Figuring out Where you are  
The next thing I do is to try to get my little script to talk to library files, since most likely I will be using cgi-lib.pl to interpret incoming form data. One way or another, we will need our CGI script to be able to talk to other files. To do that, I will need to grab cgi-lib.pl [typically for Perl 4--ed.] from the web and place it in a sub-directory called Library. The subdirectory should be readable and executable by the web server and cgi-lib.pl should be readable by the web server.

So now I have something like this (open your browser window wide so this does not wrap)

cgi-bin Directory (readable and executable by the web server)
..|___hello_world.cgi (readable and executable by the web server)
..|___Library Directory (readable and executable by the web server)
........|___cgi-lib.pl (readable by the web server)

Now, let's use the "require" command to pull cgi-lib.pl into our "hello_world.cgi" program.

#!/usr/local/bin/perl
print "Content-type: text/html\n\n";
print "Hello World";
require ("./Library/cgi-lib.pl");

Getting pretty complex here pretty quickly eh? That is okay because we know for sure that cgi-lib.pl does not have any bugs in it since it is being used everywhere and has been for years now. So that means all we really did was:

  • Add a subdirectory called Library
  • Transfer cgi-lib.pl from somewhere to the Library subdirectory
  • Add one line of code to hello_world.cgi.

So what could go wrong with that?

  • Permissions, permissions, permissions! Check Library and cgi-lib.pl!

  • If you got an "EOF error at line xxx of cgi-lib.pl" error, it is a good bet that you did not transfer cgi-lib.pl to your server correctly. Specifically, you must make sure that you do not introduce bad characters into the text of the program as it is sent over the internet from my web server to yours.

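    Typically, you must make sure that you set your ftp client to transfer in BINARY mode when you transfer scripts from one UNIX server to another, so that the file arrives byte-for-byte unchanged. (If you are uploading from a Windows or Mac machine, use ASCII mode instead so that the line endings are converted for UNIX.)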

    Also, you must make sure that you do not edit any scripts using a text editor which will introduce bad characters (especially line breaks and newline characters). I use PICO or EMACS for UNIX, Simple Text for Mac and Notepad for Windows.

  • It may also be that your hello_world.cgi program was unable to find cgi-lib.pl. Note that when you said "./Library/cgi-lib.pl", you were telling your script to grab cgi-lib.pl in the Library directory which is a sub-directory of the directory hello_world.cgi is in (./).

    Well, the problem with this is that on some ISPs the system admins have decided that all accounts are based on virtual servers. This means that every account sees itself as the root server, when in actuality, there is one root server which has aliases to each account.

    Virtual Servers are more secure for the ISPs, so they prefer them. The use of Virtual Servers also allows you to have your own domain name instead of the domain name of the ISP so they are also nice for you. However, they can cause lots of problems when trying to install scripts which need to talk to other files on the file system (like hello_world.cgi needs to talk to cgi-lib.pl). Specifically, virtual servers get kinda screwy when it comes to what path is the "real" path.

    Typically, the path that you see from the command line may be totally different from what the web server sees when it runs. Thus, when you tell it to require something like "./Library/cgi-lib.pl", the web server may resolve "./" against its own view of the file system and look in a different directory from the one you expect.

    Of course, it won't find anything in this case. The solution is to ask your sys admin what path you should use when loading files into your cgi script. Another way to find out what path the web server is seeing if you are using a UNIX server is to use the "pwd" command. You can try adding these lines to your script.

    print "Content-type: text/html\n\n";
    print "Hello World";
    $pwd = `pwd`;
    print $pwd;

    This should echo back the present working directory as seen by your web server. This path will help you determine what you need to type in order to get your script to access a supporting file like cgi-lib.pl. But remember that you can always just work this out with your sys admin, that is what she is there for!!! That is what you pay her to do!
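If you would rather not shell out to pwd, the Cwd module that ships with Perl 5 can report the same thing. The following is a minimal sketch, assuming Perl 5 and the Library/cgi-lib.pl layout used above; the subroutine name describe_library is made up for this example.

```perl
#!/usr/bin/perl
# Sketch only: report where the script thinks it is and whether the
# library can be seen from there. "Library/cgi-lib.pl" follows the layout
# used in this article; adjust it for your own server.
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Spec;

# Turn a relative path into an absolute one (based on the current
# directory) and describe whether the running process can read it.
sub describe_library {
    my ($relative_path) = @_;
    my $absolute = File::Spec->rel2abs($relative_path, getcwd());
    my $status = -r $absolute ? "readable"
               : -e $absolute ? "present but NOT readable"
               :                "NOT found";
    return "$absolute is $status";
}

print "Content-type: text/html\n\n";
print "Working directory: ", getcwd(), "<BR>\n";
print describe_library("Library/cgi-lib.pl"), "<BR>\n";
```

Run it once from the command line and once from the browser. If the two working directories differ, you have found your path problem; if the path is right but the file is "present but NOT readable", go back to permissions.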

What the Script Sees  
Once I have successfully loaded cgi-lib.pl, I use it to make sure that my CGI script is actually getting the information that the browser is supposed to send it.

    Since the usage of cgi-lib.pl is covered in depth in a chapter of Instant Web Scripts with CGI/Perl, which is one of the free chapters available from the Scripts Archive, I am not going to explain its usage. If you are not sure what &ReadParse is, just do some reading. It is a quick chapter and pretty straightforward.

So I will add the usual lines to my little CGI script.

#!/usr/local/bin/perl
print "Content-type: text/html\n\n";
print "Hello World<P>";

require ("./Library/cgi-lib.pl");
&ReadParse(*form_data);   # fills %form_data with the incoming form data

foreach $incoming_form_variable (keys(%form_data))
  {
  print "$incoming_form_variable = $form_data{$incoming_form_variable}\n<BR>";
  }

So what did I add to my script? I simply added a small foreach loop which goes through each of the incoming form variables stored in the %form_data associative array (created by the ReadParse subroutine of cgi-lib.pl) and prints out the name and the value of each form variable.

    The usage of foreach, associative arrays, and the keys function is covered in great detail in my Perl FAQ.

However, there is one piece missing. I need to actually send some form data to my script. Of course, I don't actually have a form frontend to my script, so I will pass the script form data via a URL encoded string like:


When I do so, the result should look something like the following:

This little foreach loop is an invaluable tool when you want to check to see what the script thinks its variables are. While debugging, you can always dump in this foreach loop to zip through the current variables and check to see what they are. It may be that 1) you have accidentally overwritten a variable, 2) the script has lost some values for variables you thought it had, or 3) the script never received variables that it needs.

    Often I forget to pass state information from page to page via hidden variables. If you forget to add state info to every HTML page, it is easy to lose it along the way. Most of the time, that state info is crucial. So anytime you have a CGI which utilizes several screens of info, you need to print out your variables when debugging to make sure they are all getting passed back to the script.
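To get a feel for what the script actually receives, it can help to decode a URL encoded string by hand. The following is a rough sketch of the idea only. It is not the cgi-lib.pl source, the subroutine name parse_query_string is made up, and the query string is a hypothetical example rather than anything from a real form.

```perl
#!/usr/bin/perl
# Sketch only: decode a URL-encoded query string into name/value pairs.
# This is NOT the cgi-lib.pl source, just an illustration of the idea.
use strict;
use warnings;

sub parse_query_string {
    my ($query) = @_;
    my %form_data;
    foreach my $pair (split /[&;]/, $query) {
        next unless length $pair;
        my ($name, $value) = split /=/, $pair, 2;
        $value = '' unless defined $value;
        foreach ($name, $value) {
            tr/+/ /;                               # "+" encodes a space
            s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;   # %XX encodes one byte
        }
        $form_data{$name} = $value;
    }
    return %form_data;
}

# Hypothetical query string; in a real GET request the web server
# places it in $ENV{QUERY_STRING}.
my %form_data = parse_query_string("name=Sherlock+Holmes&topic=CGI%20debugging");

foreach my $incoming_form_variable (sort keys %form_data) {
    print "$incoming_form_variable = $form_data{$incoming_form_variable}\n";
}
```

Because it needs no web server, you can run this from the command line and compare what it prints with what your real script reports for the same string.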

Oh, and one more thing, you can also get a listing of the current ENVIRONMENT variables by adding the following foreach loop:

foreach $environment_variable (keys %ENV)
  {
  print "$environment_variable = $ENV{$environment_variable}<BR>";
  }

Advanced Error Hunting
So what happens if I introduce logical errors to the script while I am debugging? Worse yet, what if there are 1000 lines of code and I am not sure where the error is because I was coding poorly and jumping back and forth through sections without constantly checking myself to see what I did?

Well, this is actually pretty common and there are quite a few ways to go about finding the error depending on your taste.

Command Line Tactics
The first and most common way to check to see where a script is failing is to run it from the command line because the command line will give you much more information than the web browser when you are trying to debug.

Perl makes it very easy for you to check the syntax of your CGI script by offering you a special syntax-checking mode. In order to check the syntax of your CGI script, simply type the following from the command line:

perl -c scriptname.cgi

The -c switch checks the syntax of your CGI script without actually running the main body of the code. A listing of all of Perl's command-line switches can be displayed by typing

perl -h

Of course, if executing the code has no effects other than producing output, you can also just try running the script itself without the syntax check using the following command:

perl scriptname.cgi

Perl will attempt to execute your CGI script and will output errors if there are any. A typical error message that you might see looks like:

Here is another one you'll see a lot:

As you can see, Perl sends back a good deal of useful information about your problem. Typically, it will do its best to analyze what the problem was as well as give you a line number so that you can look into the problem yourself.

Well, as you may have guessed, there are quite a few commonly made syntax errors that will plague your command line executions. In fact, in "Teach Yourself CGI Programming with Perl", Eric Herrmann lists the following common suspects:

Symbol | Name | Description
; | semicolon | Each command in your Perl program must end with a semicolon. Unfortunately, the error message you get may not give you any hints. You'll usually get something to the effect of, "syntax error at 1.pl line 6, near print". It is up to you to track down the error.
{} | braces | Braces are used to delimit sections of the program (such as if, while or for loops). The most common problem is leaving off a closing brace to correspond with an opening brace.
() | parentheses | Every now and then, you will forget a parenthesis in an if statement; just beware.
"" | quotation marks | Perl allows quoted strings to include multiple lines. This means that if you leave off a closing quote, the rest of your entire program might be considered part of the quoted string. Also, beware of having quotes inside of quotes such as print "She said "hello""; How can Perl know which quote is meant to be printed and which quote is meant to end the string to be printed?
@ | at sign | The @ character is used to name list arrays in Perl. Thus, if you are going to print an @ character, such as when you print an email address, you must make sure to "escape" it using a backslash, such as print "selena\@eff.org".
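The last two suspects are the easiest to get around. Here is a small sketch of the usual ways out, assuming Perl 5; the variable names are made up for the example.

```perl
#!/usr/bin/perl
# Sketch only: ways around the quote and @ pitfalls in the table above.
use strict;
use warnings;

my $escaped  = "She said \"hello\"";   # backslash-escape the inner quotes
my $qq_style = qq{She said "hello"};   # or pick another delimiter with qq{}
my $address  = "selena\@eff.org";      # escape @ inside double quotes
my $literal  = 'selena@eff.org';       # single quotes do not interpolate

print "$escaped\n$qq_style\n$address\n$literal\n";
```

The first two strings are identical, and so are the last two. Which style you pick is a matter of taste; the point is that Perl never has to guess where the string ends or whether you meant an array.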

Log File Analysis
Assuming that your system administrator has given you access to this file, another useful debugging tool is the error log of the web server you are using. This text file lists all of the errors which have occurred while the web server has been processing requests from the web. Each time your CGI script produces an error, the web server adds a log entry.

If your sys admin does not allow access to the log file, you may ask her to email you a version of the log file with only errors related to your work. She can create such a version by using the GREP command and it should not be too difficult.

On the other hand, if you do have access to the log file, it can usually be found in the "logs" directory under the main web server root.

For example, on NCSA servers, it can be found at


Dressing up as a Web Browser
In "Teach Yourself CGI Programming with Perl", Eric Herrmann outlines a method which you can use to test your CGI scripts using TELNET. I recommend reading the section if you have the chance. In the meantime, here is a quick explanation...

If you are able to use the TELNET program to contact your web server, you can view the output of your CGI script by pretending to be a web browser. This makes it easy to see "exactly" what is being sent to the web browser.

The first step is to contact the web server using the telnet command:

telnet www.yourdomain.com 80

Typically, web servers listen on port 80 of your server hardware. Thus, for most of you, you need only contact port 80 on the server.

Once you have established a connection with the HTTP server, you formulate a GET request using the following syntax:

GET /cgi-bin/testscript.cgi HTTP/1.0

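This command tells the server to send you the output of the requested document, which in this case is a CGI script. Press Enter twice after typing the request; the blank line tells the server that the request is complete.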

After your GET request, the web server will execute your CGI script and send back the results which will look something like the following:

eff.org:~$telnet www.mydomain.com 80
trying ...
Connected to www.mydomain.com.
Escape character is '^]'.
GET /cgi-bin/test.cgi HTTP/1.0
<HTML>Hello World</HTML>

Connection closed by foreign host.
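If you find yourself doing this a lot, you can let Perl do the typing. The following is a minimal sketch, assuming Perl 5 with the standard IO::Socket::INET module; the subroutine names build_request and fetch_raw are made up, and the host and path are whatever you pass on the command line. Note that it sends exactly the bare HTTP/1.0 request shown above, so a server that hosts many domains on one address may also expect a Host header.

```perl
#!/usr/bin/perl
# Sketch only: do by program what the telnet session does by hand.
# Host and path come from the command line; nothing runs without them.
use strict;
use warnings;
use IO::Socket::INET;

# Build a minimal HTTP/1.0 request. The blank line (the second CRLF)
# tells the server that the request is finished.
sub build_request {
    my ($path) = @_;
    return "GET $path HTTP/1.0\r\n\r\n";
}

# Connect to port 80, send the request, and return everything the
# server sends back: status line, headers, and body.
sub fetch_raw {
    my ($host, $path) = @_;
    my $socket = IO::Socket::INET->new(
        PeerAddr => $host,
        PeerPort => 80,
        Proto    => 'tcp',
    ) or die "cannot connect to $host: $!\n";
    print {$socket} build_request($path);
    local $/;                      # slurp mode: read until the server closes
    my $response = <$socket>;
    close($socket);
    return $response;
}

# Usage: perl fetch_raw.pl www.yourdomain.com /cgi-bin/testscript.cgi
print fetch_raw(@ARGV) if @ARGV == 2;
```

What comes back is the raw response, so you can see for yourself whether the Content-type line really is the first thing your script sends.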

Using print "Content-type: text/html\n\ntest";exit;
However, I will note one more method that you can use to find out where a logical error is when it is not a "syntax error" but an HTTP error. An HTTP error causes the favorite "document contains no data" error (or, on many servers, a "500 Internal Server Error"), which the command line and error logs won't necessarily help with. The script will run fine from the command line, but it won't run from the web.

Look at the hello world script with a couple of minor changes:

#!/usr/local/bin/perl
require ("./Library/cgi-lib.pl");
&ReadParse(*form_data);

foreach $incoming_form_variable (keys(%form_data))
  {
  print "$incoming_form_variable = $form_data{$incoming_form_variable}\n<BR>";
  }

print "Content-type: text/html\n\n";
print "Hello World<P>";

When you run this script, you will get a "document contains no data" error because you have sent text to the browser (the variable names and values) BEFORE you have sent the magic HTTP header line "Content-type: text/html\n\n". But how would you find out that this is a problem?

The solution is to use the "print "Content-type: text/html\n\ntest";exit;" line to walk through your routine one step at a time to discover at which point the problem begins. Let's try it.

#!/usr/local/bin/perl
print "Content-type: text/html\n\ntest";exit;
require ("./Library/cgi-lib.pl");
&ReadParse(*form_data);

foreach $incoming_form_variable (keys(%form_data))
  {
  print "$incoming_form_variable = $form_data{$incoming_form_variable}\n<BR>";
  }

print "Content-type: text/html\n\n";
print "Hello World<P>";

That is going to work just fine. The web browser will display "test" and we will know that the error is not being caused by the first line of the script. (Notice that because we use the "exit" function, Perl will stop executing the script so we will not get any of the other info.)

Next, let's move the testing line down...

#!/usr/local/bin/perl
require ("./Library/cgi-lib.pl");
&ReadParse(*form_data);
print "Content-type: text/html\n\ntest";exit;

foreach $incoming_form_variable (keys(%form_data))
  {
  print "$incoming_form_variable = $form_data{$incoming_form_variable}\n<BR>";
  }

print "Content-type: text/html\n\n";
print "Hello World<P>";

That is going to work just fine too! I'm getting bold there jumping two lines at a time, but when you actually use this method, you can feel free to jump entire routines if you are sure they are not the cause of the bug. Just don't jump too many at once. Okay, now let's dump the line into the foreach loop.

#!/usr/local/bin/perl
require ("./Library/cgi-lib.pl");
&ReadParse(*form_data);

foreach $incoming_form_variable (keys(%form_data))
  {
  print "Content-type: text/html\n\ntest";exit;
  print "$incoming_form_variable = $form_data{$incoming_form_variable}\n<BR>";
  }

print "Content-type: text/html\n\n";
print "Hello World<P>";

Okay, that works too (remember to pass some variables as URL encoded data as shown above).

Finally, we move the line to the end of the foreach loop and we see that we get the "document contains no data" problem!

#!/usr/local/bin/perl
require ("./Library/cgi-lib.pl");
&ReadParse(*form_data);

foreach $incoming_form_variable (keys(%form_data))
  {
  print "$incoming_form_variable = $form_data{$incoming_form_variable}\n<BR>";
  print "Content-type: text/html\n\ntest";exit;
  }

print "Content-type: text/html\n\n";
print "Hello World<P>";

That is it, we just discovered where the bug was. We can bonk ourselves on the head and say, "Of course, the HTTP header MUST be the first thing printed to the browser!"
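If you get tired of pasting that long line around, you might wrap it in a little subroutine. This is just a sketch: the names checkpoint and checkpoint_text are made up for this example and are not part of cgi-lib.pl or any other library.

```perl
#!/usr/bin/perl
# Sketch only: the "print and exit" test line wrapped in a subroutine.
# checkpoint() and checkpoint_text() are hypothetical names.
use strict;
use warnings;

# Build the text a checkpoint sends: a complete HTTP header, then a label
# saying how far the script got.
sub checkpoint_text {
    my ($label) = @_;
    return "Content-type: text/html\n\n" . "Reached: $label";
}

# Print the checkpoint and stop, exactly like the one-line version.
sub checkpoint {
    my ($label) = @_;
    print checkpoint_text($label);
    exit;
}

# Move a call like the one below down the script one step at a time.
# If the browser shows "Reached: ...", nothing above the call has sent
# output ahead of the header.
# checkpoint("after require");
```

The label saves you from wondering which copy of "test" you are looking at when you have moved the line around a few times.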

In Conclusion  
Well, that's all folks. If you are comfortable with the debugging tools outlined here and you are ready to get your mindset in gear, then you should have no worries. Think of CGI debugging as fun. In fact, to get practice, try going to a CGI discussion forum and helping people solve their problems. You will not only hone your own skills, but make the CGI community a happier group to be a part of. Good luck.

  (1) By the way, the Newtonian perspective is horrible for the software "creation" process, which is a complex system, not a Newtonian process. If you are familiar with the concepts of complex systems, emergent properties, and extropianism, forget them in the debugging process; they won't help you here as much as the stodgy old Newtonian paradigm. If you are not familiar with these paradigms, I recommend you read through "Out of Control: The New Biology of Machines, Social Systems and the Economic World" by Kevin Kelly. It will assuredly make you a better software designer.

  (2) If you have trouble understanding "chmod", for goodness sakes, buy a UNIX book! Permissions can often be a real drag. Many systems have funky automatic settings that you need to override if you are to allow the world (web browsers) to utilize your scripts.

    Permissions are UNIX's way of allowing or disallowing various people access to the files on a system. (If you are using an NT or Mac web server, you probably don't need to worry about this.) Permissions are set by using the chmod, chown and chgrp commands. For the most part, though, you will only need to worry about the chmod command.

    To find out what permissions your files are set to, type "ls -l -g" at the UNIX command line.

    You'll probably receive something like (if not, type man ls and figure out how your setup works)...

        eff.org:~/Fortune$ ls -l -g
        total 237
        -rw-rw-r--  1 selena   doc           607 Oct  9 17:14 README.txt
        drwxrwxr-x  2 selena   doc           512 Apr  6 10:05 Test/
        -rw-rw-r--  1 selena   doc          2065 Jan 29 07:37 eff_quotes
        -rw-rw-r--  1 selena   doc        104434 Oct  9 17:14 fortunes
        -rw-rw-r--  1 selena   doc        106720 Oct  9 17:14 fortunes.dat

    The permissions can be seen in the part of the line which looks like -rw-rw-r--. That part is divided into four sections: directory or file, user permissions, group permissions and world permissions. In this case you can see that Test is a directory ("d"); it can be read, written to and executed (for a directory, "execute" means that we can enter it and reach the files within it) by user and group ("rwxrwx") and can be read and executed by world ("r-x"). fortunes.dat is a file ("-") and can be read and written to by user and group ("rw-rw-") and can only be read by world ("r--").

    My general rule for web permissions is to begin by setting "all" files and directories to chmod 777. This is the most insecure, but if your scripts work at this level, but do not work otherwise, you have isolated the problem to a permissions problem. Then, one by one, increase the security. I usually jump to chmods 775, then 755, etc. for directories and scripts and 666, then 664, then 644, etc. for files. The idea is to narrow things down until you have the most secure setting. But, make sure that you do not leave things at 777 unless absolutely necessary (temp, user carts, and session directories, for example, will probably need to be left as 777 because the web server must write to them, but your main application filename.cgi should probably not be left as 777...actually, you may want to make it 755)!!!!!!!

    Also, in Teach Yourself CGI Programming with Perl, Eric Herrmann writes "Most servers run CGI Scripts as user NOBODY. A file that "you" can access is not necessarily accessible by that user." Very true...beware. Something that runs okay from the command line may not work from the web. To avoid this, you might try telnetting to port 80 and doing GET, becoming your own browser and getting more info. Anyway, enough of that, I recommend looking at a UNIX manual for a real explanation. It is not all that hard actually.

    Also, Matt Wright found this great list of links to the UNIX MAN pages on the web at Yahoo. I recommend that you take some time to read these pages, as permissions are the number one reason that scripts don't work on your server right out of the box.

    Finally, in some cases, you must also change the permissions of the root directory itself because the main script must write to the same directory in which it is located. Make sure you go up a directory and chmod the root directory of the script you are trying to run.

  (3) If you are confused as to what "Perl", "CGI", "server", or "Perl interpreter" all mean, why not re-read last month's article, Web Programming 101?