CGI troubleshooting FAQ
Q: What is Unix? And what is a Unix/System Prompt? (by Matt Wright)
Short Answer: Unix is an operating system that is one of the most popular
among net users. As more and more people get online, you will see more
and more Macintosh and Windows/DOS environments. Unix comes under many names,
among them HP-UX, SunOS, OSF/1, Linux and BSD.
Long Answer: All of these scripts were written and tested on machines running
OSF/1 Unix and SunOS along with HP-UX. They have been successfully installed
by me on all of those platforms with both the Apache and NCSA servers.
When I refer to Unix I am talking about an operating system, just like
DOS, Macintosh (and arguably Windows). Unix is one of the most complex
operating systems, and used to be what kept the newbies clueless the most.
When I refer to a system or Unix prompt, I am talking about your Korn Shell
or C Shell, or one of the many other shells out there. From there you can
execute such commands as chmod, mv, cp, vi, and many other useful tools.
These scripts have also been set up on other platforms and operating systems.
More Information:
- Unix Usenet FAQ - Everything you ever wanted to know about Unix and More!
These are the frequently asked questions on the Unix Usenet Newsgroups.
- The Unix Reference Desk - FAQs, Applications, Programming, Humor and more.
This site has lots of information about Unix.
- Unix Resources - This site also houses a lot of information about Unix,
including tutorials, reference and more!
- Yahoo: Unix - Over 250 Links and Resources to Unix Information! If you're
still looking for more information, check out this site.
Q: I know my HTML down pat, but this CGI stuff is
mighty confusing. What is CGI and how can I get it to work on my pages?
Firstly, if you don't know anything about CGI, I suggest you not start
with an application like the Database Manager or the Electronic Outlet.
Start with an easy one like Form Processor or Guestbook and work your way
up.
I'm always amazed at the number of people who want to start with the
hardest possible project without doing any research first. They ask me
"I know nothing about CGI but I want to install the BBS. How do I
do it?" The problem with this question is that it is impossible to
answer without writing a book. The question is far too vague. As with all
technical fields, the catch-22 is that you need to understand what you
are doing before you are able to ask questions which can be answered.
CGI takes research and time to master. Once you have gotten the basics
down, however, CGI programming, especially in Perl, is very easy. The
learning curve is incredibly short, but it is also very, very steep. That
means you should be prepared to spend a solid month studying CGI before you
will be ready to do real CGI work or ask answerable questions. After that
month, which can sometimes seem like a year, you should be very comfortable.
But you MUST dedicate the month of hair-pulling study.
In the "Research Library" link on my scripts archive I list
several books which I recommend you buy. If you are going to be doing Perl
CGI, I recommend that your first book be "Learning Perl" from O'Reilly.
"Teach Yourself CGI Programming with Perl in a Week" from SAMS Publishing
is also a good introductory work, as is "Introduction to CGI/Perl"
from M&T Press. Either of these should be your "second" purchase.
You can also spend some time reading through the scripts which I have
made available. I have taken great pains to comment them so that a beginner
could pick them up and understand what is going on. In fact the original
goal of this site was not to distribute code, but rather, to provide a
place for others to learn how to program CGI applications by example. I
am NOT a guru or a professional programmer. I am learning too. But hopefully
the documentation of my learning process will be of use to you.
And then of course, there are the many CGI sites around the net with
much valuable information that I have linked from my "Offsite Resources"
page.
Here are a few thoughts of mine to help you on your way. CGI (Common
Gateway Interface) is sort of like a translator. It translates the needs
(questions) of a person (whom we will call a "client") into a
language that the computer (which we will call a "server") can
understand. Then, it translates the server's answer into a language that
the client can understand.
Thus, CGI is not a programming language as much as it is an activity.
Actually, you can use most if not all programming languages to create CGI
applications. The most common programming languages used for CGI these
days are Perl, C and Visual Basic. I use Perl because it is 1) easy to
understand, 2) quick and painless to modify and customize, 3) highly portable
from operating system to operating system, 4) easy to find people to hack
scripts in if you are a business on the web, 5) excellent at handling strings,
which is very important in CGI, and 6) fast and powerful enough for most
CGI needs...C is overkill most of the time.
The problem with HTML is that it is "static". That is, an
HTML page is designed once and is displayed that way every time it is accessed.
That is nice if you just want to shove predefined information down the
client's throat, but what if you want the client to interact in a more
active way than just clicking from one predesigned link to another?
What if the information displayed depends on a complex interaction,
like searching a database or ordering products from an online warehouse?
The answer, of course, is CGI. CGI is able to listen to the client for
instructions. Then, it is intelligent enough to make some decisions about
what to do with those instructions, follow through with its decision by
utilizing the resources of the server, and respond to the client (hopefully
with the information requested). This output could be as simple as the
current time or as complex as a database row in a database of 24,000 entries
with 19 fields.
Whatever the case, CGI sits in between the two worlds and must mediate
between them constantly. To do this job it must 1) be able to accept information
from a client, 2) decide what to do with the information, 3) use the resources
of the local server to do what it decides must be done, 4) accept whatever
information the server sends back and 5) present that information back
to the client who asked in the first place.
Step One is done for the most part by using HTML forms or URL encoded
strings. Both of these methods were created by the standards developers of
the web to pass information to a CGI. I am not going to go into
this stuff because the use of forms and URL encoded strings is covered
in many, many places available to you for free and as published works.
Here are two online sources of information:
Step Two involves writing a program with enough logic to figure out
what to do with the information submitted. This is a CGI program. As I
said above, a CGI Program can be written in any language and can be used
for any operating system that has a web server.
The first thing the CGI program must do is "parse" the incoming
form or URL encoded data. Parsing means that the program must split incoming
information into name/value pairs. That is, the CGI program creates an
"associative array" which associates an administratively defined
variable name with its client defined value. So if a form had two input
fields,
<INPUT TYPE="text" NAME="name">
<INPUT TYPE="text" NAME="email">
and I filled in Selena Sol and selena@eff.org and hit the submit button,
the CGI would get name = Selena Sol and email = selena@eff.org. The CGI could
then use those variables and values to do things that it was designed to
do.
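The parsing step can be sketched in a few lines of Perl. This is only an illustration, not the routine used in any of the scripts on this site: the sub name parse_form_data is made up for the example, and it assumes the browser sent the two fields above as one URL encoded string (spaces as "+", other special characters as "%" plus two hex digits).

```perl
use strict;
use warnings;

# Hypothetical helper: split a URL encoded string into name/value pairs
# and return them as a hash (an "associative array").
sub parse_form_data {
    my ($raw) = @_;
    my %form;
    foreach my $pair (split /&/, $raw) {
        my ($name, $value) = split /=/, $pair, 2;
        next unless defined $name && length $name;
        $value = '' unless defined $value;
        foreach ($name, $value) {
            tr/+/ /;                                # "+" encodes a space
            s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;    # "%40" becomes "@"
        }
        $form{$name} = $value;
    }
    return %form;
}

# The string a browser might send for the two-field form above.
my %form = parse_form_data('name=Selena+Sol&email=selena%40eff.org');
print "$_ = $form{$_}\n" foreach sort keys %form;
```

A real script would read the raw string from the QUERY_STRING environment variable (for GET requests) or from standard input (for POST requests) rather than from a literal.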
Step Three involves using the server's resources to do something with
the client defined information. Such a resource might be email. The CGI
script could take the submitted name and email and then email those values
to someone who is maintaining a guest log for example.
Thus, the CGI program needs to be able to talk to the server it is on
and ask the server for resources that the server provides. Other examples
might be encryption, data handling (keyword searching), string manipulation
(substituting one value for another), or reading or writing to files.
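As one possible illustration of "using the server's resources", here is a minimal sketch that appends a submitted name and email to a guest log file. The sub name append_guest_entry, the tab-separated format and the demo file name are all assumptions made for this example.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: append one guest entry to a log file.
# The web server user must have permission to write to $log_file.
sub append_guest_entry {
    my ($log_file, $name, $email) = @_;
    open(my $log, '>>', $log_file) or die "Cannot open $log_file: $!";
    print {$log} join("\t", scalar(localtime), $name, $email), "\n";
    close($log) or die "Cannot close $log_file: $!";
    return 1;
}

# Demo: write to a scratch file in the system's temporary directory.
my $demo_log = File::Spec->catfile(File::Spec->tmpdir, 'guest_log_demo.txt');
append_guest_entry($demo_log, 'Selena Sol', 'selena@eff.org');
print "Appended one entry to $demo_log\n";
```

If this works from the command line but fails from the web, the usual suspect is that the web server does not have write permission on the log file.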
Step Four is similar to Step Three. The program needs to be able to
collect the output that the server produces for whatever task the program
assigned to it.
Step Five involves taking the server supplied information and returning
to the client a dynamically generated HTML page with the information supplied
by the server.
In returning a dynamically generated HTML page, the CGI is basically
doing what the server would do when it is asked for an HTML page...it simply
returns to the client's web browser a document containing HTML using the
standards that the server would use to do the same. Instead of sending back
a predesigned HTML page, however, the CGI writes its own HTML page and sends
that back.
Thus, the CGI must know 1) how to write in HTML and 2) what the
correct protocols are for sending an HTML document back to a web
browser (like Netscape 2.0). Number One is done by hardcoding HTML into
the script's logic and writing other logic which can be used to generate
specific HTML depending on what information has been supplied by the server.
Thus, you might design logic into a database searching application which
says: take a keyword submitted by the client from the web and search a
local database for a database row containing that keyword. When you find
it, output that database row as an HTML table, putting each database row
field into a table cell. Thus, though we cannot tell ahead of time what
the information in the table is going to be, we can make the program smart
enough to just fill in the HTML table with whatever information it retrieves
from the server.
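The keyword search just described might look something like the following sketch. It assumes, purely for illustration, that the "database" is a list of pipe-delimited text lines; the sub names escape_html and row_as_table and the second sample row are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: escape characters that have meaning in HTML.
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Hypothetical helper: return the first row containing the keyword
# (case-insensitive) as an HTML table, one field per table cell.
sub row_as_table {
    my ($keyword, @rows) = @_;
    foreach my $row (@rows) {
        next unless index(lc($row), lc($keyword)) >= 0;
        my @cells = map { '<TD>' . escape_html($_) . '</TD>' } split /\|/, $row;
        return '<TABLE BORDER="1"><TR>' . join('', @cells) . '</TR></TABLE>';
    }
    return '<P>No matching row found.</P>';
}

my @database = (
    'Jane Doe|jane@example.com|Form Processor',
    'Selena Sol|selena@eff.org|Guestbook',
);
print row_as_table('guestbook', @database), "\n";
```

The program never knows in advance what the cells will contain; it only knows how to wrap whatever it finds in table markup.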
Number Two is done by following basic HTTP protocols: the first text
seen by the browser must be Content-type: text/html followed by two
newlines. Again, a discussion of HTTP protocols is more than I want to cover
here and can be found elsewhere. For the most part, you just have to remember
that the first line output by your CGI script must be:
Content-type: text/html
(Notice that the header must be followed by a blank line; that is, it ends
with two newline characters.)
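Putting the header rule into practice, a complete (if tiny) CGI script could be sketched as follows. The sub name html_response is made up for this example, and the path on the first line is only the one used elsewhere in this FAQ; yours may differ.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;

# Hypothetical helper: build a complete CGI response. The header comes
# first and ends with two newlines, which leaves one blank line before
# the HTML itself.
sub html_response {
    my ($title, $body) = @_;
    return "Content-type: text/html\n\n"
         . "<HTML><HEAD><TITLE>$title</TITLE></HEAD>\n"
         . "<BODY>$body</BODY></HTML>\n";
}

print html_response('Test', 'This is a successful test');
```

Run from the command line, this prints the header, a blank line and then the page, which is exactly what the web server expects to receive from the script.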
This is about all I can think of as far as introductory stuff. As I
said before, do your research. CGI is not an insurmountable skill to achieve.
You just need to put in the time.
Q: What is a cgi-bin? (by Matt Wright)
Short Answer: A cgi-bin is a special directory designated in the config files
of a web server as a place from which CGI scripts may be executed.
Long Answer: The cgi-bin is used to help keep systems secure. Keeping CGI
scripts limited to trusted users is necessary since CGI opens up a lot
of security risks. Therefore, most web platforms allow only files inside
of a cgi-bin to be executed...
Selena adds...
In most cases, you cannot put images or HTML files within a cgi-bin directory
tree because any references to those documents will be considered as directives
to execute the files as scripts. Thus, if you have scripts which take advantage
of HTML and images, you will need to move those HTML files and images to an
HTML area in order to load them with your CGI programs.
Q: Do I need a cgi-bin to run [CGI] scripts? (by Matt Wright)
Short Answer: Yes
Long Answer: Yes and No. A cgi-bin is needed to run these scripts unless
your system administrator has turned on ExecCGI, which enables .cgi extensions
to be used in any directory. To find out if you can use CGI scripts, your
best bet is to ask your system administrator. He/She may opt
to give you a cgi-bin in your directory, turn on ExecCGI, or check your
script and then place it in the server cgi-bin. Any of these options [are
okay for cgi] scripts purposes. Just make sure you set all variables to
reflect any changes you have to make to the location of files and scripts.
Q: How do I download one of your scripts?
A: The first thing you should do is find the archived/compressed version
of the application you are interested in. In most cases, I will have
a link called "Download the scripts as one tar file", which when
chosen will download the archived scripts directly. In other cases, I will
have a link like "Read through the Scripts". Usually, one of the
files under that link will be something like filename.tar or filename.tar.gz.
This is the link you should choose. More on this in a bit.
If the file you download ends in .gz, that means that I have compressed
the file using gzip (GNU zip). The reason that one compresses a file is to
make it more compact for download. For large files, to save time, we use
compression to make the file smaller using complex mathematical algorithms.
Gzip is the one most UNIX CGIers use. "Unzipping" this file is the
first thing you will have to do and the process is extraordinarily easy.
From the command line type the following:
gunzip filename.gz
For example, here is what it might look like from the command line:
kronos(selena) 10>ls
banner.tar.gz
kronos(selena) 11>gunzip banner.tar.gz
kronos(selena) 12>ls
banner.tar
Notice that when you are done unzipping, the .gz extension will be gone.
If you do not use UNIX, there are many applications to unzip gzip
files available at Shareware.com.
Once you have unzipped the file, you will still probably see a .tar
extension as in the example above. Similarly, you may not have seen a .gz
file in the first place and only a .tar file.
Most of the applications on my site consist of many separate files/subdirectories
which work together and must be placed in specific locations relative to
each other (in the "directory hierarchy"). A file's position
relative to the base of the hierarchy is essential. For example, a shopping cart
script will be written to expect to see a User_cart subdirectory with some
user_cart files inside it...if the user_carts are not being created there,
because you have moved that subdirectory or have not even created it, the
scripts will fail.
Such relative placement is assured by creating a "tar" file.
Tar is a utility that gathers many files and subdirectories into one big
file which has internal references so that when you "untar" it,
it will "expand" into the original file hierarchy.
Thus, I create one file which preserves the directory hierarchy necessary
for the scripts to run.
To untar a tar file, you type the following at the command line:
tar xvfp filename.tar
Here is an example of what you should see from the command line:
kronos(selena) 12>ls
banner.tar
kronos(selena) 13>tar xvfp banner.tar
x Banner, 0 bytes, 0 tape blocks
x Banner/nph-banner.pl, 334 bytes, 1 tape blocks
x Banner/Images, 0 bytes, 0 tape blocks
x Banner/Images/2.gif, 6916 bytes, 14 tape blocks
x Banner/Images/3.gif, 2279 bytes, 5 tape blocks
x Banner/Images/4.gif, 8028 bytes, 16 tape blocks
x Banner/Images/5.gif, 2070 bytes, 5 tape blocks
x Banner/Images/1.gif, 20798 bytes, 41 tape blocks
x Banner/frontpage.pl, 213 bytes, 1 tape blocks
kronos(selena) 14>
Now you will have a subdirectory called Banner, with 2 files and one
subdirectory called Images with 5 files.
If you are not working on a UNIX system, there are equivalent programs
which you can use to untar an archive available at Shareware.com.
Q: What are .tar and .gz and how do I download
your scripts?
A: .gz means that the files were compressed using gzip, and .tar means
they were archived using tar. If you don't have these programs, you can
get them at Shareware.com.
If you are on a UNIX system, the command to unzip is:
gunzip filename.gz
And the command to untar is:
tar xvfp filename.tar
Q: Once I unarchive and uncompress, then what?
Once you have the application expanded onto your server, you will need
to configure the scripts to run on your local system. Many of my applications
have a .setup file which centralizes your server specific variables and
options into one place.
For the most part, you need only change these setup variables to the
specifics of your local server. If there is no setup file, it should be
the case that all such options and server specific variables are defined
in the first lines of the main script. Check there and make appropriate
changes.
This is the first difficult thing for you to overcome by yourself, because
there is no way for me to tell you what your own environment is like. I
recommend that you try to get the distribution scripts working as they
are before you start changing options and moving subdirectories or files.
You will also have to make sure that your perl interpreter is referenced
correctly on the first line. If you don't know what I am talking about,
read the next section on trouble shooting. There is a long section on this
topic.
Finally, you will need to make sure that the permissions are set correctly
for the web server to run any applications, read any data or setup files
and write to any supporting files it needs to. Read the section on permissions
below if you have any trouble figuring out how to do that.
Q: I can't get one of your scripts to work on
my server! Can you help?
A: Selena Sol has a full time job, full time school, develops freelance
CGI applications constantly, volunteers 20 hrs/wk at the Electronic Frontier
Foundation and has a life to boot. Please be understanding when you ask
for help installing, customizing and troubleshooting scripts. Being understanding
means:
- Provide me with as much information as possible and take some time
to figure out how to state the problem as clearly as possible.
- Read through this FAQ and try to troubleshoot it on your own first.
- Be patient.
That said, please feel free to ask when you can't solve the problem
on your own. I will do my utmost to help you figure out the problem as
quickly as possible.
If you email me, however, there are a few things you need to know.
- It is hard to troubleshoot scripts.
- It is very hard to troubleshoot scripts virtually (that is, without having
access to the server to try things out myself). Like auto maintenance, scripts
generally require one to get their hands inside of the machine and fiddle
with things. Every server has its own special quirks that can only be discovered
with intuition and hands-on twiddling.
- It is near impossible to troubleshoot scripts virtually without receiving...
  - The script in question and the associated URL.
  - The HTML which calls the script, if any.
  - The "exact" error message you receive using a web browser.
  - The "exact" error message you receive when running the scripts
  from the UNIX command line.
Q: How do you troubleshoot?
A: Troubleshooting scripts is a lot like fixing a motorcycle. It is
all about inductively discovering the problem from a motley collection of
"clues" which will change from project to project. As a CGI developer,
you must realize that every client, browser, server software, server hardware,
operating system, web server configuration, security configuration, CGI application,
and web administrator is different. Thus, very few generic Public Domain
(PD) scripts will work right out of the box on your system.
That is, the fact that you get a 500 Server Error after installing a
PD script is not necessarily the fault of the programmer and is not necessarily
something to get flustered over. Most good PD scripts, in fact, will be
designed generically, so that they will work in as many environments as
possible with configuration, but on no environments without (this is especially
the case with complex cgi applications like a BBS or a database manager).
Thus, you will need to play with all the scripts you download to configure
them for the specificities of your own environment.
For example, consider the first line of any Perl script:
#!/usr/local/bin/perl
The above line will be mandatory for every Perl CGI but, unfortunately,
can vary widely from system to system. For example, though I distribute
my code with the first line as shown above, I have had to change that line
when I am freelancing to such values as:
#!/usr/bin/perl
#!/bin/perl
#!/opt/bin/perl
The reason that this line may be different from system to system is
that the systems administrator may have installed the "Perl Interpreter"
in any number of directories. Thus, the first thing you must do in customizing
your scripts is to find out where your local Perl interpreter is located
so that you can reference it correctly on the first line.
To find the location of the Perl interpreter, you can use commands such
as which, whereis, whence or find from the UNIX prompt, but a simple email
to your sysadmin will probably suffice. Ask him or her "would you
please tell me the absolute path of the perl interpreter" and she/he
will respond with the path.
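If some perl can already be run from your prompt, you can also let Perl do the looking. The following is a rough sketch, roughly what the which command does; the sub name find_in_path is made up for this example. The built-in variable $^X holds the name of the interpreter that is running the script, which is usually (though not always) a full path.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: list every executable file with the given name
# found in the given directories.
sub find_in_path {
    my ($program, @dirs) = @_;
    my @found;
    foreach my $dir (@dirs) {
        next unless length $dir;
        my $candidate = File::Spec->catfile($dir, $program);
        push @found, $candidate if -f $candidate && -x _;
    }
    return @found;
}

print "This script is being run by: ", $^X, "\n";
# File::Spec->path returns the directories listed in your PATH.
print "Found: $_\n" foreach find_in_path('perl', File::Spec->path);
```

Keep in mind that the web server may run with a different PATH than your login shell, so the absolute path is what belongs on the first line of the script.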
So what does the first line actually mean? Well, the #! tells the server
that the file is an interpreted executable and that in order to run it,
it must pass the contents of the file to the Perl interpreter (perl),
which is located in the directory /usr/local/bin.
The Perl interpreter will then translate your CGI script, which is written
in English (sort of), into low-level instructions which are legible only to
your computer (well, some people can read machine code).
Thus, if your perl is located in a different directory, this first line
will be wrong, the server will not be able to find the Perl interpreter,
and you will get a 500 server error from the web.
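A quick way to check this for a whole directory of scripts is sketched below. It is an illustration only; the sub name interpreter_from_first_line is hypothetical. Save it under any name you like and pass it the scripts to check as command line arguments (you will need to run it with a perl that does work, for example perl check.pl *.cgi).

```perl
use strict;
use warnings;

# Hypothetical helper: return the interpreter path named on a script's
# first line, or undef if the line does not start with "#!".
sub interpreter_from_first_line {
    my ($first_line) = @_;
    return $first_line =~ /^#!\s*(\S+)/ ? $1 : undef;
}

# Check each script named on the command line.
foreach my $script (@ARGV) {
    open(my $fh, '<', $script) or die "Cannot open $script: $!";
    my $first_line = <$fh>;
    close($fh);
    $first_line = '' unless defined $first_line;
    my $interpreter = interpreter_from_first_line($first_line);
    if (!defined $interpreter) {
        print "$script: first line is not a #! line\n";
    } elsif (-x $interpreter) {
        print "$script: $interpreter exists and is executable\n";
    } else {
        print "$script: $interpreter was not found; fix the first line\n";
    }
}
```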
So then what? What if you change the first line to represent the local
path of perl and you still get an error? My general strategy is to fiddle
with things, see what happens, and let the solution...just...emerge. I'm
sure that that does not help you very much right now with your boss breathing
down your neck, incessantly tapping his or her Timex. But it is the truth and you
should feel comfortable telling him/her "I don't know, let me play
with the scripts." As I said, customizing scripts is very much an
intuitive process...so time limits are very hard to define, especially
when customization involves more than just getting it to run, but involves
changing its GUI or data structures.
Step one, of course, is always to sit back and take three deep breaths.
Take the advice of Douglas Adams, "Don't Panic".
Once you are ready to begin, there are some standard customization tools
you can use though. It isn't all magic...
First of all, do not trouble shoot from the web! Web error messages
are completely useless for the most part. At best, you'll get some frustrating
message like:
500 Server Error
The server encountered an internal error or misconfiguration and was
unable to complete your request.
Error: HTTPd: malformed header from script /cgi-bin/web_store.cgi
As you can see, this tells you nothing about what the problem actually
is.
What you need to do is to telnet in to your server and run the scripts
from the command line (If your ISP only gave you ftp access, I'm very sorry...I
recommend another ISP; CGI is going to be a really big headache). From
the command line, I usually type something like the following:
perl blah_blah.cgi
or better yet, just
blah_blah.cgi
Which one of the above commands works will depend on your UNIX environment.
If neither works, ask your sysadmin to "please put the perl interpreter
in my PATH in my shell configuration setup file".
If the command ran fine, hopefully, you will not receive an error message.
That is, hopefully you will see whatever it was that you wanted the script
to display. For example, look at what I did here.
%perl test.cgi
Content-type: text/html
This is a successful test
%
However, if there is a syntax error in the code, you may get something
like the following...
%perl test.cgi
String found where operator expected at test.cgi line 3, at end of line
(Missing operator before "; ?) Can't find string terminator '"'
anywhere before EOF at test.cgi line 3.
The error messages you receive from the command line will be much, much
more useful than what you receive from the web. For example, perl will
tell you which line it thinks the mistake was on. It will also do its best
to diagnose the problem. In this example, I simply removed the quote mark
that was on the line print "This is a test"; thus, screwing up
the syntax.
Note: in "Teach Yourself CGI Programming with Perl
in a Week", Hermann has an excellent section on the most common syntax
errors that you will find. It is on pages 420-422 and is an easy read. To
summarize, he notes the top three bugs as 1) punctuation problems, 2) confusion
of assignment and equality operators, and 3) confusion of string and numeric
equality operators. I recommend you read this section if you can, but for most
of you, who will not be designing, but will be customizing, the main problem
I see is that when you edit the HTML output, you will invariably use the @ sign
in an email address or a " in a URL. You must remember that "
and @ have meaning to perl, so if you use them, you need to "escape"
them with a backslash. Thus, <A HREF = "mailto:selena@eff.org">selena@eff.org</A>
becomes <A HREF = \"mailto:selena\@eff.org\">selena\@eff.org</A>.
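To make the escaping rule concrete, here is a short sketch showing three ways of writing the same link in Perl. Which one you use is a matter of taste; the variable names are only for illustration.

```perl
use strict;
use warnings;

# 1) Double quotes: escape every embedded " and @ with a backslash.
my $escaped = "<A HREF=\"mailto:selena\@eff.org\">selena\@eff.org</A>";

# 2) qq{}: braces are the delimiter, so only @ still needs escaping.
my $with_qq = qq{<A HREF="mailto:selena\@eff.org">selena\@eff.org</A>};

# 3) Single quotes: nothing is interpolated, so no escaping is needed
#    (but variables will not be filled in either).
my $single = '<A HREF="mailto:selena@eff.org">selena@eff.org</A>';

print "$escaped\n";
print "All three strings are identical\n"
    if $escaped eq $with_qq && $with_qq eq $single;
```

With an unescaped @eff inside double quotes, perl would try to interpolate an array named @eff, which is where the confusing error messages come from.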
Note: Sometimes, you need to be aware that the web server will be
running with a different "environment" than you. Thus, for example,
it may not have permission to edit a data file that you do. If this is
the case, what worked from the command line will still not work from the
web. However, you will have discovered a valuable clue. Savor these clues,
and use deduction to weed out possible errors until only one explanation
is left.
Alternatively, you may need to change the CGI extension to .cgi instead
of .pl or .sh. Every sysadmin has set a different level of security as
far as how CGIs can be executed. Some systems only allow you to run scripts
from a special directory (often called cgi-bin).
Others allow you to run CGIs from HTML area directories with the .cgi extension
(you may also have to create a .htaccess file or the equivalent for a
non-NCSA-based server). It is crucial that you find out the specifics of
how your sysadmin has configured the CGI executable options for your web
server because what may work from the command line may still fail from
the web. But hey...at least you have isolated the problem further.
Now you know that it is not a syntax bug.
In order to run scripts from the command line, you may very well have
to "hardcode" incoming form variables since the script will be
run from the command line rather than from being called by some HTML form.
Often, I will actually insert something like...
$MYDATA{'phone_number'} = "xxx";
so that I can comment out the routine which parses the form data having
faked the script into thinking it was actually getting incoming form data.
Why? Reduce the routines to nil. Deal with one routine at a time, make
it work right, then build. Kevin Kelly has a great discussion of emergent
complexity based on simple routines compounded in Out of Control and it
would be worthwhile to read that as a study of troubleshooting.
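One way to avoid commenting the parsing routine in and out by hand is to let the script decide for itself whether it is being run from the web. The sketch below is an assumption-laden illustration: the sub name form_data_or_fake is hypothetical, and it relies on the fact that web servers set the REQUEST_METHOD environment variable for CGI scripts while a login shell normally does not.

```perl
use strict;
use warnings;

# Hypothetical helper: return hardcoded test data when the script is run
# from the command line, otherwise the data from the real parser.
sub form_data_or_fake {
    my ($env, $parsed, $fake) = @_;
    return defined($env->{REQUEST_METHOD}) ? %$parsed : %$fake;
}

my %parsed = ();    # stands in for the output of your real parsing routine
my %fake   = (phone_number => 'xxx');
my %MYDATA = form_data_or_fake(\%ENV, \%parsed, \%fake);

print "Content-type: text/html\n\n";
print "$_ = $MYDATA{$_}\n" foreach sort keys %MYDATA;
```

Remember to remove the fake data once the script works, so that a misconfigured server cannot fall back to it silently.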
Also, force the script to tell you what it is doing at every point.
I have it do so, by adding print lines everywhere.
print "$MYDATA{'phone_number'} = FORMDATA PHONE NUMBER";
Thus, you can make sure that the right values are getting assigned to
the right variables. A way to automate this, for example, if you are using
cgi-lib.pl to parse data with a call of
&ReadParse(*form_data)
is to use the following code...
# Print every name/value pair that cgi-lib.pl placed in %form_data.
@incoming_form_variables = keys (%form_data);
foreach $incoming_form_variable (@incoming_form_variables)
{
print "$incoming_form_variable = $form_data{$incoming_form_variable}\n";
}
Thus, you will get a print out of all the variables sent in from a form
as well as the values of all those variables.
Another very useful tactic is to type the line
print "test";exit;
directly after the line
print "Content-type: text/html\n\n";
Hopefully the Content-type line should be one of the very first lines.
Thus, when you run the script from the web, you should receive a lone "test"
in the browser window and the script will exit. Now you know that the lines
up to the print "Content-type... line are all okay. So, cut the print
"test";exit; line and paste it a few lines further down. Then
try again from the web browser. Repeat this process until you finally find
the line which causes the problem. Then sit down and think about it awhile.
What are the symptoms and what are my clues?
Note: Thus, it is always best to have the Content-type... line at
the very beginning, not just because you want the browser to not time out,
but in order to debug when necessary. Not to mention that the "Content-type..."
line must be printed out BEFORE ANY other text is sent to the browser or
else you will get a 500 server error.
Note: beware of "if tests". You may not hit the print "test";exit;
line if it is inside an if test that returns false.
Finally, if I simply cannot isolate a problem, I'll comment out every damn
line in the script and uncomment them one by one until I hit a wall. Of
course as Nick Bicanic wrote me, "however here's one for you that
I'm sure others have encountered too. Something doesn't work. No amount
of fixing fixes it. So you comment everything out - then uncomment one by
one....and now everything works - however nothing has been changed. Fuck
that shit pisses me off."
In response, Greg Greene writes,
"An answer to your FAQ about the irritating problem of scripts that
don't work till you commented every line out then uncommented them. The
problem is pesky control characters caused by hitting the control key instead
of the shift key that puts an unprintable (and unspeakable) hidden control
character in the script. Commenting it then erasing the comment also erases
the control character and voila ! the damn thing works. Sometimes an editor
that can show ALL 255 ascii characters on demand can save you time !! I
ran into this often back in the early Benton Harbour Basic days when everything
was a port from some mainframe somewhere."
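If your editor cannot show hidden characters, a few lines of Perl can hunt for them. This is a minimal sketch, not a polished tool; the sub name control_characters is made up, and tabs, line feeds and carriage returns are deliberately ignored as ordinary whitespace. Pass the scripts to scan as command line arguments.

```perl
use strict;
use warnings;

# Hypothetical helper: describe each control character in a line of text.
# Tab, line feed and carriage return are skipped.
sub control_characters {
    my ($line) = @_;
    my @found;
    while ($line =~ /([\x00-\x08\x0B\x0C\x0E-\x1F\x7F])/g) {
        # pos() is the offset just past the match, i.e. its 1-based column.
        push @found, sprintf('column %d: character code %d', pos($line), ord($1));
    }
    return @found;
}

# Scan the files named on the command line.
foreach my $file (@ARGV) {
    open(my $fh, '<', $file) or die "Cannot open $file: $!";
    while (my $line = <$fh>) {
        print "$file line $.: $_\n" foreach control_characters($line);
    }
    close($fh);
}
```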
When debugging scripts, especially those with associated files (libraries,
data files, setup files...) you want to be very careful about setting permissions.
Permissions errors are THE MOST COMMON error I see. Make sure to read the
discussion of permissions below.
Yet another method for debugging is to have perl check your syntax (this
is not the interactive debugger, which is started with perl -d). This is
a very simple process and the information that perl will give you
will be pretty good. Simply type the following at the command line...
perl -c programname.cgi
This will check the syntax of the program without actually executing
the code...this is sometimes good when you are worried about affecting
data files during debugging.
You can also test the scripts using telnet by "pretending"
that you are a web browser. This is covered in "Teach Yourself CGI
Programming with Perl in a Week" by Hermann on Page 425 and 428 or
in CGI Programming on the WWW by Gundavaram on page 373-374.
You can also use the server's error log which will have yet more information
about problems. This is also covered in "Teach Yourself CGI Programming
with Perl in a Week" by Hermann on Page 431-432.
You can also use a debugging tool like CGI Lint as discussed in CGI
Programming on the WWW by Gundavaram on pages 375-380.
I recommend that before you do anything else, you send the following
letter to your sysadmin. The answers to these questions will be essential
for you in customizing and invaluable to those you ask for help. In fact,
it is pretty frustrating for me to get help requests from people who do
not provide me with this basic information:
"I would like to run CGI scripts and I have a few questions to
ask you.
- What is the absolute path of the perl interpreter which I should use?
Please include both the location of Perl 4.x and 5.x if you have both.
- What is the absolute path of sendmail...or do you have a sendmail interface
that I can use?
- What web server software do you use?
- What Operating System do you use?
- What are the specific security barriers you have set up regarding CGI
usage?
- Which directories may I execute CGI applications from? May I use the
central cgi-bin directory, or must I create my own cgi directory within
my own html area?
- Must I use the .cgi extension or another such extension for my CGIs?
- Are there any special access files like .htaccess that I need to create
in my local cgi-bin area which will allow me to run CGIs?
- Do you use a CGIWRAPPER? If so, what is the URL that I should use to
reference a cgi script and what is the "real", "absolute"
path that I may use when referring to supporting files within a CGI application,
since the web server will not be inheriting my personal environment?
- Is there any other information you can think of that might be crucial
for me in trying to customize public domain CGIs for my web site?
Thank you very much for your time. I know you are extremely busy, but
if you answer these questions, perhaps it will save you time in the long
run."
Q: I'm trying to run a script of yours
on my server and I get a "403 Forbidden" Error.
A: Permissions can often be a real drag. Many systems have funky automatic
settings that you need to override if you are to allow the world (web browsers)
to utilize your scripts. Permissions are UNIX's way of allowing or disallowing
various people access to files and directories. Permissions are set by using the
chmod, chown and chgrp commands.
To find out what permissions your files are set to, type "ls -l
-g" at the UNIX command line.
You'll probably receive something like (if not, type man ls and figure
out how your setup works)...
eff.org:~/Fortune$ ls -l -g
total 237
-rw-rw-r-- 1 selena doc 607 Oct 9 17:14 README.txt
drwxrwxr-x 2 selena doc 512 Apr 6 10:05 Test/
-rw-rw-r-- 1 selena doc 2065 Jan 29 07:37 eff_quotes
-rw-rw-r-- 1 selena doc 104434 Oct 9 17:14 fortunes
-rw-rw-r-- 1 selena doc 106720 Oct 9 17:14 fortunes.dat
eff.org:~/Fortune$
The permissions can be seen in the part of the line which looks like
-rw-rw-r--. That part is divided into four sections: the file type (directory
or file), user permissions, group permissions and world permissions. In this case
you can see that Test is a directory ("d"). It can be read, written
to and executed (for a directory, "execute" means we can enter it and reach
the files within it) by user and group ("rwxrwx") and can be read and executed
by world ("r-x"). fortunes.dat is a file ("-") and can be read and written to by
user and group ("rw-rw-") and can only be read by world ("r--").
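To see how a numeric chmod value maps onto the string that ls prints, here is a minimal shell sketch. The scratch directory is created just for the demonstration, the file names are borrowed from the listing above, and it assumes mktemp and a POSIX-style ls and cut are available on your system.

```shell
# Scratch area so nothing real is touched.
demo=$(mktemp -d)
: > "$demo/fortunes.dat"
mkdir "$demo/Test"

chmod 664 "$demo/fortunes.dat"   # rw- for user, rw- for group, r-- for world
chmod 775 "$demo/Test"           # rwx for user, rwx for group, r-x for world

# The first ten characters of an ls -l line are the type and permission bits.
file_mode=$(ls -ld "$demo/fortunes.dat" | cut -c1-10)
dir_mode=$(ls -ld "$demo/Test" | cut -c1-10)

echo "fortunes.dat: $file_mode"
echo "Test:         $dir_mode"
```

Each octal digit is the sum of read (4), write (2) and execute (1), so 6 is "rw-", 7 is "rwx" and 5 is "r-x".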
My general rule for web permissions is to begin by setting all
files and directories to chmod 777. This is the most insecure setting, but if your
scripts work at this level and do not work otherwise, you have isolated
the problem to permissions. Then, one step at a time, increase the security.
I usually go to chmod 775, then 755, etc. for directories and scripts
and 666, then 664, then 644, etc. for files. The idea is to narrow things
down until you have the most secure setting that still works. But make sure
that you do not leave things at 777 unless absolutely necessary. Temp, user cart
and session directories, for example, will probably need to be left as
777 because the web server must write to them, but your main application
filename.cgi should not be left as 777; you will probably want
to make it 755.
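As a rough sketch of where that narrowing usually ends up, the following shell lines set up a throwaway application directory and apply the tighter modes. The names Temp and data.file are placeholders, not names from any particular script, and which directories really need 777 depends on your application.

```shell
app=$(mktemp -d)                 # stands in for the application's root directory
mkdir "$app/Temp"
printf '#!/bin/sh\necho ok\n' > "$app/filename.cgi"
: > "$app/data.file"

chmod 755 "$app/filename.cgi"    # script: world may read and execute, only owner writes
chmod 644 "$app/data.file"       # read-only data: world may read, only owner writes
chmod 777 "$app/Temp"            # scratch directory the web server must write to

script_mode=$(ls -ld "$app/filename.cgi" | cut -c1-10)
data_mode=$(ls -ld "$app/data.file" | cut -c1-10)
temp_mode=$(ls -ld "$app/Temp" | cut -c1-10)
printf '%s %s\n' "$script_mode" filename.cgi "$data_mode" data.file "$temp_mode" Temp
```

If the script stops working after one of these steps, the last change you made tells you which file or directory the web server needed more access to.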
Also, in Teach Yourself CGI Programming with Perl, Eric Herrmann writes
"Most servers run CGI scripts as user NOBODY. A file that 'you'
can access is not necessarily accessible by that user." Very true...beware.
Something that runs okay from the command line may not work from the web.
To catch this, you might try telnetting to port 80 and doing a GET, becoming
your own browser and getting more info. Anyway, enough of that. I recommend
looking at a UNIX manual for a real explanation. It is not all that hard
actually.
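For the "be your own browser" trick, it helps to know exactly what to type once telnet connects to port 80. The sketch below only builds the text of a minimal HTTP/1.0 request; the path /cgi-bin/test.cgi is a made-up example, and build_request is a helper name invented here.

```shell
# Print the exact bytes of a minimal HTTP/1.0 GET request for the given path.
# Each line ends in CR LF, and an empty line ends the request.
build_request() {
    printf 'GET %s HTTP/1.0\r\n\r\n' "$1"
}

# Show the request with control characters made visible (cat -v prints CR as ^M).
build_request /cgi-bin/test.cgi | cat -v
```

In a telnet session you would type the GET line yourself and then press Enter twice. If a tool such as nc happens to be installed, piping the output of build_request into it (with your own host name and port 80) should have the same effect, but that depends on your system.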
Also, Matt Wright found this great list of links to the UNIX MAN pages
on the web at Yahoo.
I recommend that you take some time to read these pages, as permissions
are the number one reason that scripts don't work on your server right out
of the box.
Finally, in some cases, you must also change the permissions of the
script's root directory itself because the main script must write to the same
directory in which it is located. Make sure you go up a directory and chmod the root
directory of the script you are trying to run.
Q: I keep getting the
error "500 Server Error. The server encountered an internal error
or misconfiguration and was unable to complete your request."
A: You need to check the sucker out from the command line. There is
probably a syntax error somewhere, because the script is not executing.
Possible errors include...
- Your Perl path on the first line of the script
is wrong.
- You have syntax errors. Have you included
illegal characters in a print statement? Common ones are " and @.
If you've changed some HTML output to be personalized for your site, I'd
bet that is what happened. BTW, in many of my scripts, I use the
print <<"end_of_html"; method (a here-document) to avoid the need to escape
double quotes...but that is only good for long blocks of HTML really, and in
Perl 5 an @ inside such a block still has to be escaped as \@.
You still need to know about escaping bad characters.
- Are you sure that you gave the server the right script name in your
form tag?
- Does the script have the right permissions?
Here is an installation report from Scott
Barkey which details errors which may occur when transferring scripts
via ftp... "This is the description of the problem I was
having while trying to install the db_manager.cgi script. I hope this can
help another.
First of all I downloaded the db_manager_scripts from your site, then
I made the changes in the db_manager.setup file and ftp'd the files to
my server from my computer. I then ran telnet and ran chmod +x db_manager.cgi
and then ran the file in my browser. After getting a very general 500 server
error from Netscape, I ran the script from telnet using perl db_manager.cgi
and got the error...
% perl db_manager.cgi
Can't find string terminator " end_of_html" anywhere before
EOF at db_manager.cgi line 132.
Well, the moral of the story is that after many hours of torment and
a couple of emails to Selena trying to figure out what I did wrong, I discovered
that when I ftp'd the files to my server the ftp program was in binary
transfer mode instead of ASCII mode. After changing this everything was
well with the world."
Here is another one from Chris
Edmunds:
I have run into this problem before (when printing a text file and having
it bomb at the printer). You can use something like (cat -v file.txt
>newfile.txt) to display non-printable characters - and then do a (diff
file.txt newfile.txt) to see if that was the problem.
You can also search for control characters in an editor - for example,
using vi in command mode to remove all control C's you would type:
:g/^C/s///g (in order to get ^C you would have to type control v, then
control C).
Here is another one from
Jeff Wilkinson:
BTW, you might mention in your FAQ that a common problem
is using WinZip or a similar program to unzip and
having the "Tar file Smart CR/LF Conversion" ON. This
converts to PC-style CR/LF's. If you then upload that file
and try to execute it on a unix box, it will not work.
Took me a while to find/fix that one.
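All three reports come down to stray carriage returns (the CR half of a PC-style CR/LF) ending up in the script. A quick way to check for them and strip them from the UNIX command line might look like the sketch below. The helper names count_cr_lines and strip_cr are invented here, and the sample file is generated on the spot rather than taken from any real script.

```shell
cr=$(printf '\r')

# Count the lines in a file that end with a carriage return.
count_cr_lines() {
    grep -c "${cr}\$" "$1" || true   # grep exits 1 when the count is 0
}

# Write a copy of a file with every carriage return removed.
strip_cr() {
    tr -d '\r' < "$1" > "$2"
}

# Demonstration on a throwaway file with PC-style line endings.
broken=$(mktemp)
fixed=$(mktemp)
printf '#!/usr/local/bin/perl\r\nprint "hello";\r\n' > "$broken"

before=$(count_cr_lines "$broken")
strip_cr "$broken" "$fixed"
after=$(count_cr_lines "$fixed")
echo "lines ending in CR: before=$before after=$after"
```

A carriage return at the end of the first line is especially nasty, because the system then looks for an interpreter whose name ends in an invisible character.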
Q: I keep getting a "Document Contains No
Data" Error.
A: My guess is that somewhere along the way you have misnamed a variable
and the typo is causing the script not to find the file it was asked to
find. Maybe it is a file you are opening or one you are requiring. It could
also be a permissions problem. The main thing to realize is that
this is not a syntax problem. The script is simply not outputting anything
to the web, so somewhere along the way your print statements are coming
up empty. What you should do is go into the script, isolate the output
lines and figure out why they have nothing to output.
Drea Leed writes, "The most common reason that I get the no data
box is because the directory I'm dealing with isn't writeable by the httpd
daemon. Three different scripts I had this problem with; the program writes
to a tmp file, the tmp file is changed to the working file, and if the
tmp file can't be created then the whole thing goes spla."
This is great advice. I would say that 60% of the "Document Contains No Data"
help requests I get are due to this. Not only do you have to worry about
permissions on the files and subdirectories of the application you are trying to
install, but you also need to worry about the permissions of the application's
root directory itself.
The other 40% of these occurrences are probably due to the fact that you are
using a virtual server which has created an alias for your cgi-bin with
some program like cgiwrap in order to maintain greater security. Thus,
the "real" present working directory that you see is not the
same as the present working directory that the web server sees. Say
you tell a script that its data file is located in "./Data"
and your environment is set to read as /selena/cgi-bin. When you run the
script from the command line, it will see the correct path of "/selena/cgi-bin/Data/"
and all will be fine.
However, the web server may be running with a different environment (probably
one at a more "root" level). What it sees could be something like
"/usr/local/www/home/selena/". Thus, when it sees a reference to
"./Data", it may look for "/usr/local/www/home/selena/Data"
and not "/usr/local/www/home/selena/cgi-bin/Data/".
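The effect is easy to reproduce from the command line. The sketch below builds a throwaway directory tree standing in for the home directory above and looks up the same file from two different working directories. The lookup helper and the file name Private.list are placeholders chosen for the demonstration.

```shell
# Throwaway tree standing in for /usr/local/www/home/selena.
home=$(mktemp -d)
mkdir -p "$home/cgi-bin/Data"
: > "$home/cgi-bin/Data/Private.list"

# Report whether a path exists when looked up from a given working directory.
lookup() {
    ( cd "$1" && if [ -f "$2" ]; then echo found; else echo missing; fi )
}

from_cgi_bin=$(lookup "$home/cgi-bin" ./Data/Private.list)
from_home=$(lookup "$home" ./Data/Private.list)
absolute=$(lookup "$home" "$home/cgi-bin/Data/Private.list")

echo "relative path, run from cgi-bin: $from_cgi_bin"
echo "relative path, run from home:    $from_home"
echo "absolute path, run from home:    $absolute"
```

The relative path only works when the working directory happens to be cgi-bin, while the absolute path works from anywhere, which is why absolute paths are the safer choice in setup files.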
What you need to do is find out what the web server is seeing as its
present working directory. That way, you can reference all files absolutely.
I do not yet know how to do this internally to Perl, so I can only give
you a UNIX solution (although some service providers might disable this
option too).
Try putting the following script in the same directory as the file you
are trying to customize:
#!/usr/local/bin/perl
print "Content-type: text/html\n\n";
$present_working_directory = `pwd`;
print "$present_working_directory";
exit;
Then run that from the web. Hopefully, you will see the directory that
your web server thinks it is in. Then you can modify all of your server-specific
variables in the setup file or wherever they are located and go
on your way.
This is from Advanced Business Systems:
Document contains no data:
There may be several causes of this error.
- This is the MAJOR reason. The server has run out of threads, and
simply terminates the connection with no warning message. The client
then sees this as no data returned. Because sendmail is called
so many times (over and over again) the server simply runs out of
threads. Providers are limiting the total threads one can have as a
way to get around the "unlimited hits/download" thing.
- The client has cached the document, and the cached copy is empty. It therefore
sees the cached document, and the server does not get asked for the
document again. You need to remove the empty document from the cache
(this step is client dependent).
- The client is going through a proxy/gateway server, which is
refusing the requests, or malfunctioning. This causes the client to
see an empty document. Use a different proxy server, or don't use a
proxy server if you have direct access to the server.
Something for you to try:
I have found it was my provider. They were not allowing enough
threads. Try this from telnet. Remember to use a phony *.list. I have
set this to Private.list. I found I can run this from telnet with no
problem. I think it is because access is different that allows me to
do it this way. If I run the same script with my browser, I will get
the "document contains no data" error message.
Lucky...
#!/usr/local/bin/perl
#######################################################################
# where's sendmail located?
#######################################################################
$sendmail = "/usr/lib/sendmail -t";
$Email_From = "you\@domain.com";
#######################################################################
# Now set up the database you wish to send from
#######################################################################
$database = "./Databases/Private.list";
#######################################################################
# Let's try it out
#######################################################################
open (DATABASE, "$database") || &Exit ("I am sorry, but I was not
able to open the data file in the Do a Mass Mailing routine. The
value I have is $database. Would you check the path and
permissions.\n");
@lines = <DATABASE>;
chop (@lines);
close (DATABASE);
@emails = @lines;
$counting = 1;
foreach (@emails) {
($recipient_email, $recipient_name) = split (/\|/, $_);
if ($recipient_name ne "") {
$to = "Hello $recipient_name";
} else {
$to = "Hello <$recipient_email>";
}
open(MAIL,"|$sendmail") || &Exit("Could not execute \"$sendmail\"\n");
print MAIL <<"TAG";
From: $Email_From
To: $recipient_email
Subject: Testing
Errors-To: $Email_From

$to,
This is just a test!
------------------- End of Document -------------------
TAG
print "$counting) Sent To: $recipient_name $recipient_email\n";
close(MAIL);
$counting ++;
}
exit;
sub Exit {
local($errorheader) = shift(@_);
# Report the specific message passed in by the caller
print "It looks like the mailer died...\n$errorheader";
exit(2);
}
Q: When I try running the script from the web, all
I receive is the text of the script
A: Many servers are configured to execute perl scripts only from the
root cgi-bin directory. Usually, however, they will allow you to run
a script if it has a .cgi suffix and if you have some access file set up, like
.htaccess for NCSA. You'll need to check with your sysadmin about how your
particular system is set up, but you can at least try changing the script
from .pl to .cgi. You might also make sure that you have set the script
to be executable...chmod 775. Here are a couple of sample .htaccess lines
to give you an example of the kind of things you might be looking for.
AddType application/x-httpd-cgi .cgi
Options ExecCGI
The .htaccess file is defined in the access.conf file in the conf directory
on NCSA and will vary from server to server.
Q: The scripts work fine on my site, they just won't mail
A: The most likely reason for this is that you are not calling the mail program
correctly. If you are using mail-lib.pl, the most probable reason is that
you did not change the variable which defines the location of the sendmail
program for your local server. On line 42 is the following command:
$mail_program = "/usr/lib/sendmail -t -n";
You need to change the path to the correct value for your local server.
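If you have shell access, you can check which path is right before editing the variable. The sketch below tries a short list of candidate locations and reports the first one that is an executable file. /usr/lib/sendmail is the default from mail-lib.pl, /usr/sbin/sendmail is only a common alternative and a guess for your system, and first_executable is a helper name invented here.

```shell
# Print the first candidate that exists and is an executable file; fail if none is.
first_executable() {
    for candidate in "$@"; do
        if [ -x "$candidate" ] && [ ! -d "$candidate" ]; then
            echo "$candidate"
            return 0
        fi
    done
    return 1
}

if mailer=$(first_executable /usr/lib/sendmail /usr/sbin/sendmail); then
    echo "sendmail found at: $mailer"
else
    echo "sendmail not found in the usual places; ask your sysadmin"
fi
```

Whatever path this reports is the one to put in front of the -t -n switches in $mail_program. If nothing is found, your provider may offer a different mail interface, which is one of the questions in the letter to your sysadmin.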
Q: Miscellaneous things people wrote me that I thought were useful
Regarding the Groupware Calendar 2.0, Jonathan K. Cohen wrote,
"I had to replace all your instances of die; at the end of routines (as opposed
to filetests) with exit(0);. die; caused an instantaneous and inelegant
abort in all cases."
Selena notes:
I have since stopped using die, but this was a nice thing to learn.
Jonathan also wrote, "Paths should be absolute where possible. Relative paths
(./) led to problems."
Q: How do I port your scripts to NT? (By Netscape Tech Support)
A: If you are having trouble getting Perl scripts to work as CGI on Windows, get
a simple CGI batch file working on your system first. It is wise to be
sure that a simple batch file works before you go through the extra
steps required to set up a Perl script.
Also be sure to download the NT version of the application you want!
And read about tar files and Windows/Macintosh.
To run Perl scripts on Windows NT, you will need to download and install a
Perl interpreter program, since Perl itself does not come with either the
Windows NT operating system or the Netscape server software. You can find
an Intel version of a Windows NT Perl interpreter at "ftp://ftp.intergraph.com/pub/win32/perl",
and an Alpha version of it at "ftp://ftp.garply.com/pub/pc/nt/alpha/ntperla.zip".
(Try using a web search utility to see if there are newer versions of these
programs available.)
To run a Perl script as CGI, set up a batch file which calls it. In other
words, if "MYCGI.PL" is your Perl CGI script, then create "MYCGI.BAT"
which contains only these two lines (or the equivalent):
@ECHO OFF
C:\PERL\PERL.EXE C:\NETSCAPE\NS-HOME\DOCS\CGI-BIN\MYCGI.PL
and then call it from your page like this:
<A HREF = "/cgi-bin/mycgi.bat">My Perl Script</A>
If you want to run this as an NPH script (non-parsed headers), then name your
batch file "NPH-MYCGI.BAT" so that the web server knows it's
calling an NPH script.
[...] and stdin properly, so you can use it as a form handler for a POSTed form,
but arguments from the URL will not be passed to it, so you cannot use
it as a form handler whose METHOD is "GET". It is very dangerous
to allow URLs to pass arguments directly (or indirectly) to arbitrary
commands on your system.
[...] anywhere else where it could be accessed and run from a URL, because that would
let people pass arbitrary parameters to it which could wreak havoc with
your system.
Be warned that due to the way the Netscape web server passes batch files to
the command shell on Windows NT, there is a security hole whereby a knowledgeable
user will be able to use it to execute arbitrary commands on your system.
(This security hole also occurs with other web servers, such as O'Reilly's
"Website" server.) Because of this, you should NOT use batch
scripts as CGI on your web server unless you're testing something temporarily
(such as getting Perl to work in the first place, as above), or you're
certain that your web server will be safe from malicious users (this is
helped by being behind a firewall or on a small, restricted network). To
get around the security hole entirely, you should make all of your CGI
programs compiled EXE programs instead, since EXE files are not subject
to this security hole (EXE files do not need to be passed to a command shell
in order to be run). You can have a very simple EXE program call your Perl
script safely.
Q: On my NT, the img src = "script" tag only works for small images
A: Marlin sent the following help. Recently I ran into a problem with an NT/Perl
script that others might encounter. The goal was to have Perl place a graphic
on an HTML page by responding to an <img> tag.
The tag for the Perl call looks like this:
<img src="...cgi-bin/placepic.pl" border=2 >
The code (Perl) that grabs an image and sends it to the browser looks like this
on a Unix machine:
print "Content-type: image/gif";
print "\n\n";
$adpic="D:/path/pics/graphic.gif";
open (IMAGE, $adpic) || print "can't open image file";
read (IMAGE, $buffer, -s $adpic);
print $buffer;
close (IMAGE);
exit;
When this is used on the NT nothing gets returned that the browser can recognize;
you see a standard Netscape "missing picture." The missing ingredient
for NT is "binmode." Both the read in and the send out must be
specified "binary." Do this by adding the lines shown below:
print "Content-type: image/gif";
print "\n\n";
$adpic="D:/path/pics/graphic.gif";
open (IMAGE, $adpic) || print "can't open image file";
binmode (IMAGE); #ADDED FOR NT
read (IMAGE, $buffer, -s $adpic);
binmode (STDOUT); #ADDED FOR NT
print $buffer;
close (IMAGE);
exit;
Q: Some browsers break when they are trying to run your scripts and others
work just fine. What can I do?
A: As it turns out, the latest version of Internet Explorer has some serious
bugs when it comes to handling CGI. Apparently, for example, it does not
deal well with the ampersand sign (&) which is an important CGI tool
for passing URL encoded data. I have heard that Microsoft is aware of the
problem and is going to fix the bug in the next version of IE. In the meantime,
there is not much CGI programmers can do. The following is information
provided by Jeff Gordon
about Netscape 3.0 beta and WebExplorer...maybe this will help you...
I've just run around on the Net, with -both- OS/2's WebExplorer and the new
beta OS/2 Netscape browsers. The two exhibit -very- different behaviors
that might be important to your scripts.
First: trying to write a reply to a previously posted BBS message, I can crash
the Netscape beta, every time without fail. It reports it's unable to load
in text from the earlier message (though WebExplorer has no trouble).
Second, though: WebExplorer has -failed- to recognize a link at one location, that
Netscape sees. At the front page of the Webstore/Outlet that's installed
at www.heartnsoul.com, there's a simple link to take folks back to the
home page of the site. Netscape sees it and can activate it; WebExplorer
doesn't see that link as anything other than a call to the web_store.cgi
script, and can't activate the link, though the rest of the page seems
to work correctly.
Third: WebExplorer doesn't wrap the elaborate shopping cart items display, so
that the items appear in columns under their headings; everything stays
on one line, extending to the right past the edge of the viewing window.
This has the secondary effect of causing the "change quantity"
entry fields to appear tiny, only one character wide.
Fourth and finally: WebExplorer is drawing much larger entry fields than Netscape
is doing. The field for Logon name in the BBS, for example, extends essentially
all the way across the screen in WebEx, but is maybe one-sixth of the screen
in Netscape.
Q: You are a pretty pitiful FAQ-maker, are
there any better FAQs out there?
A: Yes, please try
Submit FAQ questions and answers.
Q: When using mail-lib.pl the From field does not
work right
A: From Joshua Gerth
...In particular I have gone through and replaced most of my calls to
blat.exe (sendmail for NT) with calls to your mail-lib.pl sendmail
libraries. However, my only difficulty was that (if I remember) the
real_send_mail from your script did not produce a "To:" or "From:" in the
body of the sendmail text.
I did a pretty basic hack to your script which looks for any
comments inside the to and from:
/<(.*)>/;
/"(.*)"/;
Strips out the comment for the actual
RCPT To:
and
Mail From:
Then, just after "DATA" is printed and just before the body of the
message is printed prints:
print S "DATA\n";
$buf = read_sock(S, 6);
print S "From: $fromname\n";
print S "To: $touser\n";
print S $messagebody . "\n";
This way when the message is received it has a friendly "from" and
"to". If you are interested in seeing the hack I would be more than happy
to send it to you, otherwise you can completely ignore this message.