You are free to edit your Python scripts using your favorite text editor, such as Notepad, IDLE, EditPad, pico, PythonWin, or vi. Of course, we can't forget about Emacs, which has one of the best Python editing modes available.
Remember to upload your scripts as text files to your Web server. In order to execute them, you need to make sure that they are in an "executable" directory, and that they have the right permissions. As I said before, most often CGI scripts live in the server's special cgi-bin directory. You should also verify that the files your script needs to read or write are actually readable or writable, respectively, by other users. In UNIX, the command to set the permissions is chmod. For example, you might run chmod with a mode argument of 755 on your script.
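The same permission change can also be made from Python. The following is a minimal sketch, assuming Python 3 on a UNIX-like system; the helper name make_executable and the throwaway temporary file are illustrative only and stand in for a real CGI script.

```python
import os
import stat
import tempfile


def make_executable(path):
    """Apply mode 755 and return the resulting symbolic mode string."""
    os.chmod(path, 0o755)  # owner: rwx, group and others: r-x
    return stat.filemode(os.stat(path).st_mode)


# Demonstrate on a temporary file rather than on a real script.
with tempfile.NamedTemporaryFile(suffix=".py", delete=False) as handle:
    demo_path = handle.name
symbolic_mode = make_executable(demo_path)
os.remove(demo_path)
```

The symbolic string returned by stat.filemode() is the same notation that ls -l prints, which makes it easy to compare against the numeric mode.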
The mode argument 755 defines that the file's owner can read, write, and execute the file, whereas the other users can only read and execute it. The common UNIX mode values and their respective symbolic arguments are
Tip: Keep in mind that commands and filenames are all case sensitive if the Web server is on an OS with case-sensitive filenames.
For security reasons, the HTTP server executes your script as user "nobody", without any special privileges. Therefore, it can only read (write, execute) files that everybody can read (write, execute). The current directory at execution time is usually the server's /cgi-bin directory, and the set of environment variables is different from what you get at login. In other words, don't count on the shell's search path variable for executables ($PATH) or the Python module search path variable ($PYTHONPATH) to be set to anything useful.
If you need to load modules from a directory that is not part of Python's default module search path, you can change the path variable in your script before trying to import them. In the following example, we add three more directory entries to the search path. Note that the last directory inserted, "/usr/python/testdict", is searched first.
    import sys
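A minimal sketch of this technique follows. Only "/usr/python/testdict" comes from the text; the other two directory names are hypothetical placeholders.

```python
import sys

# The first two directories are hypothetical; the last one is named in the text.
extra_dirs = [
    "/usr/python/lib_a",
    "/usr/python/lib_b",
    "/usr/python/testdict",
]

for directory in extra_dirs:
    # Inserting at position 0 puts each new entry ahead of the earlier ones,
    # so the directory inserted last ends up being searched first.
    sys.path.insert(0, directory)
```

Using sys.path.append() instead would place the new directories after the default ones, which matters when a module name exists in more than one location.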
Instead of using "from cgi import *", you should use only "import cgi" because the cgi module defines many other names for backward compatibility that can interfere with your code. It also might be useful for you to redirect the standard error (sys.stderr) to the standard output (sys.stdout). This will display all the error messages in the browser.
Sending Information to Python Scripts
Every time you use a URL to carry information to a CGI script, the data is transported as name/value pairs. The pairs are separated from each other by ampersands (&), and the name and value within each pair are separated by an equal sign (=). Whitespace between words is usually converted to the plus symbol (+).
Special characters are encoded in hexadecimal format (%HH), preceded by the percent sign. Therefore, the string "Parrot sketch" is passed to the script as "Parrot%20sketch". As you can see, the previous example is implicitly using the GET method to pass the values to the CGI script. If you decide that the POST method is more suitable for your needs, you will need to use the urllib module in order to send the information. The following example demonstrates its use.
    import urllib
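The encoding rules above can be sketched as follows, assuming Python 3, where these helpers live in urllib.parse. The field names title and actor are hypothetical.

```python
from urllib.parse import parse_qs, quote, urlencode

# Hypothetical form fields, used only for illustration.
fields = {"title": "Parrot sketch", "actor": "Cleese & Palin"}

# urlencode() joins name=value pairs with "&" and turns spaces into "+".
query = urlencode(fields)

# quote() uses the %HH form for the space instead.
percent_form = quote("Parrot sketch")

# parse_qs() reverses the encoding; every value comes back as a list.
decoded = parse_qs(query)

# For a POST request, the same string is sent as the request body, as bytes.
post_body = query.encode("ascii")
```

With the GET method the encoded string is appended to the URL after a question mark; with POST it travels in the request body, which is why it does not show up in the address bar.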
Encoded Strings Used to Represent Special Characters When Dealing with URLs
Working with Form Fields and Parsing the Information
The first thing that most beginners in the Web development area want to know is how to get information out of HTML forms and do something with it. The following HTML code results in a Web page that queries the user about his login information. Note that we use the POST method in the form. Thus, the field values will not be displayed as part of the URL.
Login Form that calls a CGI script.
Also, pay attention to the way data fields are referenced in HTML forms. Each input element carries a name attribute that uniquely identifies the element within a form. For instance, the tag
    <input type="text" name="login">
Every CGI script must send a header (the Content-type tag) describing the contents of the document. The common values for this tag are text/html, text/plain, image/gif, and image/jpeg. A blank line is used to indicate the end of this header.
Tip: The Content-type tag is used by the client browser and does not appear in the generated page. As you can see, a script is really executed, and not just displayed in the browser. Everything printed to sys.stdout by the script is sent to the client browser, whereas error messages go to an error log (/usr/local/etc/httpd/logs/error_log in Apache).
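The header rule can be sketched as a small helper, assuming Python 3. The function name write_page is hypothetical; the point is only the order of the output: header line, blank line, then the document.

```python
import io
import sys


def write_page(out, body, content_type="text/html"):
    """Write a CGI response: header, blank line, then the body."""
    out.write("Content-type: %s\n" % content_type)
    out.write("\n")  # the blank line marks the end of the header
    out.write(body)


# In a real script the target would be sys.stdout; a buffer is used here
# so that the result can be inspected.
buffer = io.StringIO()
write_page(buffer, "<h1>Hello</h1>")
response = buffer.getvalue()
```

If the blank line is missing, the server cannot tell where the header ends, and most servers respond with an internal server error.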
The following script is the CGI program called by the HTML form from the previous code.
    #!/usr/local/bin/python
This example first verifies if the form is a valid form (line 15). If it isn't, a blank screen is displayed. If the fields have a valid format, the form performs an action and processes the results (lines 17–25). The last case is when the validation rule is not followed, and an error message must be displayed. A full implementation should repeat the form, and point out the error to the user.
Next, we have a simple check list to use while developing CGI scripts. It shows the basic structure of CGI script creation.
The following example is a small variation of the previous script. This one lists the values of all form fields.
    #!/usr/local/bin/python
The next example demonstrates that if you try to access a field that doesn't exist (line 15), an exception is generated. If you don't catch the exception with a try/except statement, this will stop your script, and the user will see a message like "Internal Server Error". Also, note that the cgi dictionary of attribute/value pairs does not support the values() method (line 14).
    #!/usr/local/bin/python
You have to watch out when passing fields to the shell. Never pass any string received from the client directly to a shell command. Take a look at the following statement:
    os.popen("dir %s" % form["filename"].value)
In order to solve problems like this, you have a few different approaches. We will look at some of them.
First, you can choose to quote the variable:
    filename = pipes.quote(form["filename"].value)
A second solution is to get rid of every character that is not part of the acceptable domain of values.
    filename = re.sub(r"\W", "", form["filename"].value)
Note: You should test for acceptable input, rather than for unacceptable input. You don't want to get caught by surprise when someone thinks of some input string you didn't think of, or exploits a bug you don't know about.
The third, and most radical, solution is to test the form, and return an error message in case a valid condition is not established. For example,
    if not re.match(r"^\w+$", filename):
        raise ValueError("Invalid file name.")
If you invoke an external program (for example, via the os.system() or os.popen() functions), make very sure that you don't pass arbitrary strings received from the client to the shell. It is a bad idea to use form data provided by random people on the Web without validating it, especially if you're going to use that data to execute a system command or to act on a database. Naively written CGI scripts, in any language, are favorite targets for malicious system crackers. This is a well-known security hole whereby clever hackers anywhere on the Web can exploit a naive CGI script to invoke arbitrary shell commands. Even parts of the URL or field names cannot be trusted because the request doesn't have to come from your form.
To be on the safe side, if you must pass a string that you have gotten from a form to a shell command, you should make sure that the string contains only alphanumeric characters, dashes, underscores, and periods.
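The three approaches can be sketched side by side, assuming Python 3, where shlex.quote() plays the role of the older pipes.quote(). The function names are hypothetical, and the character set is the one recommended above: alphanumerics, dashes, underscores, and periods.

```python
import re
import shlex

# Whitelist of acceptable characters for a file name.
SAFE_NAME = re.compile(r"[A-Za-z0-9._-]+")


def quote_for_shell(value):
    """Approach 1: quote the value so the shell sees a single argument."""
    return shlex.quote(value)


def strip_unsafe(value):
    """Approach 2: drop every character outside the whitelist."""
    return re.sub(r"[^A-Za-z0-9._-]", "", value)


def require_safe(value):
    """Approach 3: refuse the input unless every character is acceptable."""
    if not SAFE_NAME.fullmatch(value):
        raise ValueError("Invalid file name.")
    return value
```

Note that the whitelist still admits names such as ".." or names that begin with a dash, so a real script would probably want additional checks before using the value as a path or a command argument.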
If you need to correlate requests from the same user, you must generate and assign a session key on the first contact of the user, and incorporate this session key in the next forms, or in the URLs.
If you implement the first solution, you need to use a hidden input field.
    <input type="hidden" name="session" value="74ght2o5">
If you decide that the second option will work better for you, you need to add the information after the script's name (separated by a slash). The information is passed to the CGI script through the environment variables, as you can see next.
    os.environ["PATH_INFO"] = "74ght2o5"
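Both ways of carrying a session key can be sketched as follows, assuming Python 3. All function names are hypothetical, and the sketch assumes that the server delivers PATH_INFO with its leading slash, which is stripped before use.

```python
import secrets
from urllib.parse import quote


def new_session_key():
    """Generate a hard-to-guess key on the user's first contact."""
    return secrets.token_hex(8)  # 16 hexadecimal characters


def hidden_field(key):
    """Option 1: embed the key in the next form as a hidden input."""
    return '<input type="hidden" name="session" value="%s">' % key


def url_with_session(script_url, key):
    """Option 2: append the key after the script's name, after a slash."""
    return "%s/%s" % (script_url, quote(key, safe=""))


def key_from_path_info(environ):
    """Recover the key from the PATH_INFO environment variable."""
    return environ.get("PATH_INFO", "").lstrip("/")
```

Whichever option is used, the key comes back from the client and therefore needs the same validation as any other form value.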
The information manipulated by CGI scripts can come from any kind of data storage structure. The important thing to keep in mind is that your data must be capable of being managed and updated.
You have a number of options to use here. Plain files are the simplest way. Shelves can be used too; they store whole Python objects, which avoids the parsing/unparsing of values. If you decide to go through dbm (or gdbm) files, you will find better performance as they use strings for key/value manipulations. If you really want to think about scalability or speed, you should consider choosing a real database.
If you don't have a real database on hand, don't worry. A number of sites only use plain file databases, and they don't have anything to complain about.
Whenever you are not working with a real database system, locking problems can drive you nuts because you have to worry about every single detail. For example, shelves and dbm (or gdbm) database files have no protection against concurrent updates.
In order to implement a good and efficient locking solution in Python, the best approach is to write a routine that locks only when writing to the file. Python handles multiple readers well, and when it comes to a single writer, Python can support it too.
To study a complex implementation of a locking algorithm, take a look at the Mailman source code (specifically, the LockFile.py file). Although this routine does not run on Windows systems, it works well on UNIX machines, and it also supports NFS.
We all know how hard it is to implement a good locking solution. Occasionally your process dies, and you lose the pointer to the locked file; other times you see your program hanging because the process took longer than expected.
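As a much simpler illustration of locking only while writing, here is a minimal sketch that assumes Python 3 on a UNIX system and uses the standard fcntl module. It is not the Mailman algorithm, it relies on advisory locks, and the function names are hypothetical.

```python
import fcntl
import os
import tempfile


def append_locked(path, line):
    """Append one line while holding an exclusive advisory lock."""
    with open(path, "a") as handle:
        fcntl.flock(handle, fcntl.LOCK_EX)  # blocks until the lock is free
        try:
            handle.write(line + "\n")
            handle.flush()
        finally:
            fcntl.flock(handle, fcntl.LOCK_UN)


def read_all(path):
    """Readers in this sketch do not take the lock."""
    with open(path) as handle:
        return handle.read().splitlines()


# Demonstrate on a temporary file standing in for a plain-file database.
fd, data_path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
append_locked(data_path, "first entry")
append_locked(data_path, "second entry")
entries = read_all(data_path)
os.remove(data_path)
```

Because the operating system drops an flock() lock when the file is closed or the process exits, a crashed writer does not leave a stale lock behind. Advisory locks only protect against other programs that also ask for the lock.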
Caution: If you have something to hide, it becomes very important to store the information in the cookies in a secure format. You cannot prevent the user from opening the cookies.txt file, which stores all the cookie information on the client machine, and changing anything in it. To guard against that, you should consider storing the cookies using an encryption algorithm. Another important warning is that you shouldn't blindly trust the value of a cookie, just as you shouldn't trust form variables.
The Cookie.py Module
Python's Cookie.py module handles nearly everything that you need to worry about when working with cookies.
This class enables the creation of a cookie object.
    >>> import Cookie
A cookie object generated by the Cookie.py module has a dictionary-like behavior. It exposes the following properties and methods, supporting all cookie attributes defined by RFC 2109.
    mycookie['username'] = "Andre Lessa"  # Assign a value to a cookie
Note that the print statement must be executed before the content-type header.
This method outputs the contents of a cookie. You can also change the printable representation if you want.
    mycookie.output()
This method is used to extract cookies from a given string. You won't have a problem using escaped quotation marks and nested semicolons in the string.
    mycookie.load("userid=alessa;")
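The assignment, output(), and load() operations can be tried together in a short sketch. It assumes Python 3, where the Cookie module is named http.cookies and SimpleCookie is the usual class; the values are the ones used in the text.

```python
from http.cookies import SimpleCookie

# Build a cookie to send to the browser.
mycookie = SimpleCookie()
mycookie["username"] = "Andre Lessa"
mycookie["username"]["path"] = "/"  # a cookie attribute defined by RFC 2109

# output() renders the Set-Cookie header line, which has to be sent
# together with the other headers, before the blank line.
header = mycookie.output()

# Parse a cookie string such as the one a browser sends back.
incoming = SimpleCookie()
incoming.load("userid=alessa;")
userid = incoming["userid"].value
```

Each entry in the cookie object is a Morsel, so the stored text is read through its value attribute rather than from the entry itself.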
Cookie.net_setfunc() and Cookie.user_setfunc()
These two functions are defined in the Cookie module to help you encode and decode the contents of your cookies. Cookie.net_setfunc() takes in an encoded string and returns a value. On the other hand, Cookie.user_setfunc() takes in a value and returns the original encoded string.
Note that you are not obliged to use their implementations. You can override them at any time, just by subclassing the Cookie() class and redefining these methods.
Creating Output for a Browser
You already know that straightforward print statements do a good job of sending information to the user's browser. Now, what about redirecting people from one page to another? In the next example, as soon as a browser sees the Location: header, it will stop and try to retrieve the new page.
    new_location = 'http://www.python.org/'
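A minimal sketch of such a redirect, assuming Python 3 and a hypothetical helper named redirect_headers, could look like this. The script sends only the Location: header and the blank line that ends the headers; no document body is needed.

```python
def redirect_headers(new_location):
    """Return the complete output of a CGI script that only redirects."""
    return "Location: %s\n\n" % new_location


new_location = 'http://www.python.org/'
output = redirect_headers(new_location)
# A real script would now write `output` to sys.stdout.
```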
Maybe you are tired of just sending text to the user. What about sending images?
The next example demonstrates how you can output graphics, such as GIF files, using CGI scripts. As you can see, you just need to specify the correct MIME type in order to tell the browser that you are sending an image.
    import sys
Caution: Note that you cannot use print image because it would append a newline to the data (or a blank, if you use print image, with the comma at the end), and the browser would not understand it. The previous simple example takes an existing GIF image file and processes it. Keep in mind that it is also possible to produce dynamic graphics images through Python code, using the Python Imaging Library.
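In Python 3 the same concern applies: image data has to be written as raw bytes, with nothing appended. The sketch below assumes Python 3 and a hypothetical helper named send_gif; it writes to sys.stdout.buffer, the binary layer underneath sys.stdout, unless another binary stream is supplied.

```python
import io
import sys


def send_gif(image_bytes, out=None):
    """Write an image/gif response to a binary stream, byte for byte."""
    if out is None:
        out = sys.stdout.buffer  # binary stream; nothing is added to the data
    out.write(b"Content-type: image/gif\n\n")
    out.write(image_bytes)


# A placeholder payload stands in for the contents of a real GIF file.
fake_image = b"GIF89a" + bytes(10)
stream = io.BytesIO()
send_gif(fake_image, stream)
raw_response = stream.getvalue()
```

In a real script the bytes would come from reading the image file opened in binary mode ("rb").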
CGI programs usually contain many blocks of HTML code embedded within the scripts. This is a problem for many teams of HTML designers and developers. Imagine the case in which both kinds of professionals need to make changes in the same file, at the same time. This kind of situation can generate many accidental errors in the code.
The most common solution for this kind of trouble is to separate the Python code from the HTML code by using template files. At a later stage, the HTML template can be combined with the output of the Python code using either formatting substitution or Python's regular expressions.
The basic idea is that, after you have finished reading the template file, you replace all special placeholders, such as
    <!-- # INSERT HERE # -->
The listing below defines a simple template that is going to be used by our Python script. Of course, real production templates are more complex than this one.
Listing file: template1.html
Note the customized tag <!-- # INSERT HERE # -->. If you just open this template file, nothing will show up. However, after you run the script, the program will search for this tag and replace it with our new content before displaying it to the users.
Next, you have the CGI script that makes everything possible. This script reads the entire template file, storing it in memory. Then, after applying a regular expression substitution, it swaps our special tag with the new content.
    import re
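The substitution step can be sketched as follows, assuming Python 3. The inline template string is a hypothetical stand-in for the contents of a template file, and the function name fill_template is illustrative only.

```python
import re

# Matches the placeholder tag, tolerating variations in spacing.
PLACEHOLDER = re.compile(r"<!--\s*#\s*INSERT HERE\s*#\s*-->")


def fill_template(template_text, content):
    """Swap the placeholder tag for the generated content."""
    # A function is used as the replacement so that backslashes in
    # `content` are inserted literally instead of being read as escapes.
    return PLACEHOLDER.sub(lambda match: content, template_text)


# Hypothetical template; a real script would read it from a file.
template = "<html><body>\n<!-- # INSERT HERE # -->\n</body></html>"
page = fill_template(template, "<p>Hello, Web!</p>")
```

Because the placeholder is an ordinary HTML comment, designers can open and edit the template in any HTML tool without seeing or disturbing it.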
As I told you before, another possibility is to use formatting substitution. In this new scenario, we have to write the template file as shown in the listing below.
Listing file: template2.html
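A minimal sketch of formatting substitution follows, assuming a template that marks each slot with a %(name)s placeholder. The template text and the names title and body are hypothetical and are not the contents of template2.html.

```python
# Hypothetical template using named %-style placeholders.
template = (
    "<html><head><title>%(title)s</title></head>"
    "<body>%(body)s</body></html>"
)

# The dictionary keys must match the placeholder names in the template.
values = {"title": "Report", "body": "<p>Done.</p>"}
page = template % values
```

One thing to watch with this approach is that any literal percent sign in the template has to be written as %% so that it is not mistaken for a placeholder.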
Sometimes, it is necessary to receive files from users through the Web. This next example shows how to send a file across an HTTP connection using an HTML page, and how to later interpret it.
When a certain form field represents an uploaded file, the value attribute of that field reads the entire file in memory as a string. Sometimes, this might not be what you really want. Another way to get the information is to test for an uploaded file by checking either the filename attribute or the file attribute. You can then read the data, at your convenience, from the file attribute.
Note: The enctype="multipart/form-data" part is very important because without it, only the filename is transmitted.
The next example is a slight variation of the previous example. This one assumes that you have a form with a field called filename that will transport a user file to the CGI script, and then it reads the uploaded file, line by line.
    import cgi
The cgiupload module is a simple attempt to upload files via HTTP. Although the mechanism is not as efficient as other protocols (for example, FTP), there are circumstances where using the HTTP protocol has advantages, such as when a user login/password is not required, or when using firewalls, because most firewalls allow the HTTP protocol to pass through. Note that HTTP file upload is about as efficient as email attachments.
Environment variables are one of the methods that Web servers use to pass information to a CGI script. They are created and assigned appropriate values within the environment that the server produces for the CGI script.
The next code generates a list of all environment variables that you have available at the moment, in your browser.
    import os
The following list is the output collected from my environment. Of course, yours might be different.
    HTTP_ACCEPT_ENCODING => gzip, deflate
As an example, when checking the user environment variables, os.environ['HTTP_USER_AGENT'] gives you the user's browser, and os.environ['REMOTE_ADDR'] gives you the remote IP address. Note that the user might be running a browser that doesn't send a User-Agent HTTP header, so you might not be able to count on os.environ['HTTP_USER_AGENT'].
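Listing the variables and reading one defensively can be sketched as follows, assuming Python 3. The helper names are hypothetical; in a CGI script the mapping passed in would be os.environ.

```python
import os


def format_environment(environ):
    """Return one 'NAME => value' line per variable, sorted by name."""
    return ["%s => %s" % (name, environ[name]) for name in sorted(environ)]


def user_agent(environ):
    """Read HTTP_USER_AGENT without failing when the header was not sent."""
    return environ.get("HTTP_USER_AGENT", "unknown")


# Hypothetical environment, standing in for os.environ under a Web server.
sample = {"REMOTE_ADDR": "127.0.0.1", "HTTP_ACCEPT_ENCODING": "gzip, deflate"}
lines = format_environment(sample)
agent = user_agent(sample)
```

Using get() with a default avoids the KeyError that os.environ['HTTP_USER_AGENT'] raises when the browser omits the header.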
The following is a list of environment variables used by Web Servers:
Typed URLs and bookmarks usually result in this variable being left blank. In many cases, a script might need to behave differently depending on the referrer. For example, you might want to restrict your counter script to operate only if it is called from one of your own pages. This will prevent someone from using it from another Web page without your permission. Or, the referrer might be the actual data that the script needs to process. By expanding on the previous example, you might also want to install your counter on many pages, and have the script figure out from the referrer which page generated the call and increment the appropriate count, keeping a separate count for each individual URL. Some proxies or Web browsers might strip off the HTTP_Referer header for privacy reasons.
HTTP_USER_AGENT— This is the name/version pair of the client browser issuing the request to the script. As with referrers, one might need to implement behaviors that vary with the client software used to call the script. A redirection script could make use of this information to point the client to a page optimized for a specific browser. Or, you might want it to block requests from specific clients, such as robots or clients that will not support appropriate features used by the normal script output.
PATH_INFO— The extra path information following the script's path in the URL. This is appended to the URL and marked by a leading slash. The server puts this information in the PATH_INFO variable, which can be used as a method to pass arguments to the script. The extra path information is given by the client. In other words, scripts can be accessed by their virtual pathname, followed by extra information at the end of this path. The extra information is sent as PATH_INFO. This information should be decoded by the server if it comes from a URL before it is passed to the CGI script.
PATH_TRANSLATED— Translated version of PATH_INFO, which maps it onto DOCUMENT_ROOT. Usually PATH_INFO is used to pass a path argument to the script. For example, a counter might be passed the path to the file where counts should be stored. The server also makes a mapping of the PATH_INFO variable onto the document root path and stores it in PATH_TRANSLATED, which can be used directly as an absolute path/file. You should use PATH_TRANSLATED rather than concatenating DOCUMENT_ROOT and PATH_INFO because the documents on the Web server might be spread over more than just one directory (for instance, user directories under their home directories).
QUERY_STRING— QUERY_STRING is the equivalent of content passed through STDIN in POST, but for scripts called with the GET method. Query arguments are written in this variable in their URL-encoded form, just as they appear on the calling URL. You can process this string to extract useful parameters for the script. The information following the ? in the URL that references a script is exactly what we call query information. It should not be decoded in any fashion. This variable should always be set when there is query information, regardless of command line decoding.
REMOTE_ADDR— This is the IP address from which the client is issuing the request. This can be useful either for logging accesses to the script (for example, a voting script might want to log voters in a file by their IP in order to prevent them from voting more than once) or to block or behave differently for particular IP addresses. This might be a requirement in a script that has to be restricted to your local network, and maybe perform different tasks for each known host.
REMOTE_HOST— This variable contains the hostname from which the client is issuing the request (if the information is available via reverse lookup).
REMOTE_IDENT— If the HTTP server supports RFC 931 identification, this variable will be set to the remote username retrieved from the server. Otherwise, this variable should be left blank.
REMOTE_USER— If the server supports user authentication, and the script is protected, this is the username they have authenticated as.
REQUEST_METHOD— This is the method with which the request was made (usually GET, POST, or HEAD). It is wise to have your script check this variable before doing anything. You can determine where the input will be (STDIN for POST, QUERY_STRING for GET) or choose to permit operation only under one of the two methods. It is also useful to identify when the script is called from the command line because, in that case, this variable will remain undefined. When using the cgi module, all this is taken care of for you.
SCRIPT_NAME— A virtual path to the script being executed, used for self-referencing URLs. This is very useful if your script will output HTML code that contains calls to itself. Having the script determine its virtual path, (and hence, along with DOCUMENT_ROOT, its full URL) is more portable than hard coding it in a configuration variable. Also, if you prefer to keep a log of all script accesses in some file and want to have each script report its name along with the calling parameters or time, it is very portable to use SCRIPT_NAME to print the path of the script.
SERVER_NAME— The Web server's hostname, DNS alias, or IP address. This information can provide the capability to have different behaviors depending on the server that's calling the script.
SERVER_PORT— The Web server's listening port number to which the request was sent. This information complements SERVER_NAME, making your script portable. Keep in mind that not all servers run on the default port and thus need an explicit port reference in the server address part of the URL.
SERVER_PROTOCOL— The name and revision of the Server information protocol that the request came in with. It comes in the format: protocol/revision.
SERVER_SOFTWARE— This variable contains the name and version of the information server software answering the request. The format used by this variable is name/version.
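Several of these variables can be put to work in a short sketch, assuming Python 3 and form data in the usual URL-encoded format. The function names are hypothetical. CONTENT_LENGTH is not described in the list above; it is the standard CGI variable holding the size of the POST body. The sketch also assumes plain http URLs and single-byte (ASCII) form data.

```python
import io
from urllib.parse import parse_qs


def read_form(environ, stdin):
    """Pick the input source according to REQUEST_METHOD and parse it."""
    method = environ.get("REQUEST_METHOD")
    if method == "GET":
        raw = environ.get("QUERY_STRING", "")
    elif method == "POST":
        length = int(environ.get("CONTENT_LENGTH") or 0)
        raw = stdin.read(length)
    else:
        # Undefined when the script is run from the command line.
        raise ValueError("unsupported or missing REQUEST_METHOD: %r" % method)
    return parse_qs(raw)


def self_url(environ):
    """Build a self-referencing URL from the server variables."""
    host = environ["SERVER_NAME"]
    port = environ.get("SERVER_PORT", "80")
    if port != "80":
        host = "%s:%s" % (host, port)  # non-default ports must be explicit
    return "http://%s%s" % (host, environ.get("SCRIPT_NAME", ""))
```

In a real script, the arguments would be os.environ and sys.stdin; passing them in explicitly makes the functions easy to exercise from the command line with made-up values.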
Debugging and Testing Your Script
Before putting your CGI scripts online, you need to be sure that they are working fine. You have to test them carefully, especially in near-bounds and out-of-bounds conditions. A script that crashes in the middle of its job can cause large problems, such as data inconsistency in a database application. This is why you would use a transaction when updating a database from a CGI script (if it was deemed important enough).
You should eliminate most of the problems by running your script from the command line. Only after performing this check should you test it from your HTTP daemon. You have to remember that Python is an interpreted language, which means that many errors (such as a misspelled name) will only be discovered at runtime. You must be sure that your script has been tested in every segment of the control flow.
Python is good for debugging because if things go wrong, you get a helpful traceback message. By default, tracebacks usually go to the server's error_log file. Printing a traceback message to the standard output is complicated because the error could occur before the Content-type header is printed, in the middle of an HTML markup tag, or even worse: the error message could contain markup elements itself.
You also need to be sure that incorrect input does not lead to an incorrect behavior of your script. Don't expect that all parameters received by your script will be meaningful. They can be corrupted during communication, or some hacker could try to obtain more data than normally allowed.
The following code suggests a simple way to debug Python CGI scripts.
    import cgi
Line 4: Calls the function that implements your application.
Line 2: We are using a content type of text/plain so that you can see all the output of the script.
Line 7: Calls a CGI function that safely prints a traceback message.
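The same structure can be sketched in Python 3 without the cgi module, using the standard traceback module in its place. The names run_safely and application are hypothetical, and this is a stand-in for the approach described above, not the original listing.

```python
import io
import sys
import traceback


def run_safely(application, out=None):
    """Run the application and print any traceback as plain text."""
    if out is None:
        out = sys.stdout
    # text/plain lets you see all of the script's output, unrendered.
    out.write("Content-type: text/plain\n\n")
    try:
        application(out)
    except Exception:
        out.write("\nAn error occurred:\n")
        out.write(traceback.format_exc())


def broken_application(out):
    """A deliberately failing application, used only to show the effect."""
    out.write("starting\n")
    raise KeyError("login")


report = io.StringIO()
run_safely(broken_application, report)
```

Because the header is written before the application starts, the traceback always arrives after a valid Content-type line, whatever goes wrong later.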
When creating a debugging framework, it is desirable that the user should never see a server error. Instead, you must provide a fancy page that tells him what has happened, along with helper information.
As a suggestion, your framework could interpret every traceback message and email it to the support team. This is a very useful solution for warning about problems in a live Web site, and besides, logging errors can help the tracking of application problems. If you are in the stage of doing quality-assurance testing procedures on your Web application, you should try to test it outside the live site first.
Let's see how you can do it. Check the script for syntax errors by doing something similar to python script.py. If you execute your script in this way, you are able to test the integrity and syntax of your code. If you have your script written as a module, adding the following two lines to its end enables you to execute your library module from the command prompt.
    if __name__ == "__main__":
A CGI script usually does not work from the command line. However, you should at least call it from the command line because if the Python interpreter finds a syntax error, a message will pop up on your screen. That's cool! At least you know if the syntax is all right. Otherwise, if you wait until you call your code through the Web, the HTTP server could send a very problematic error message to you.
Assuming that your script has no syntax errors, yet it does not work, you have no choice but to fake a form call to it.
If you are using the UNIX csh or tcsh shells, and your script uses the cgi.FieldStorage class for form input, you can set the environment variables REQUEST_METHOD and QUERY_STRING.
    setenv REQUEST_METHOD "GET"
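The same trick can be driven from Python, whatever shell you use. The sketch below assumes Python 3.7 or later; the helper name fake_form_call is hypothetical, and the tiny inline program passed with -c stands in for a real CGI script file.

```python
import os
import subprocess
import sys


def fake_form_call(python_args, query_string):
    """Run a script with the variables a GET request would provide."""
    env = dict(os.environ, REQUEST_METHOD="GET", QUERY_STRING=query_string)
    result = subprocess.run(
        [sys.executable] + python_args,
        env=env,
        capture_output=True,
        text=True,
    )
    return result.stdout


# Stand-in for a CGI script: it just echoes what it received.
echo_program = "import os; print(os.environ['QUERY_STRING'])"
captured = fake_form_call(["-c", echo_program], "login=alessa")
```

To test a real script, the argument list would contain the script's filename instead of the -c option.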
Check whether your script is located in an executable directory, and if so, try sending a URL request directly through the browser to the script. In other words, open your browser and call your script, without forgetting to send the attribute/value pairs.
If, for example, you receive an error number 404, it means that your server could not find the script in that directory. As you can see, this might help you test and debug your script through the Web. Next, I list some considerations that you need to have in mind while debugging a Python CGI application. They are as follows:
The following example exposes all the previous considerations:
    import sys
Note that the assignment to sys.stdout is necessary because the traceback object prints to the standard error output (stderr). The print "<PRE>" statement is being used to disable the word wrapping in HTML.
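When the traceback is shown inside an HTML page, it also helps to escape it, because the message itself can contain markup characters. A minimal sketch, assuming Python 3 and a hypothetical helper named traceback_as_html:

```python
import html
import traceback


def traceback_as_html():
    """Format the exception being handled as an escaped <PRE> block."""
    # <PRE> keeps the line breaks; html.escape() neutralizes < > and &.
    return "<PRE>\n%s</PRE>" % html.escape(traceback.format_exc())


try:
    {}["<login>"]  # a failing lookup whose message contains markup
except KeyError:
    error_page = traceback_as_html()
```

The function has to be called from inside the except block, because traceback.format_exc() reports the exception that is currently being handled.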
If your script calls external programs, make sure that the $PATH environment variable is set to the right directories, because inside a CGI environment this variable does not carry useful values.
All rights reserved © 2018 Wisdom IT Services India Pvt. Ltd