Other Topics: File Transfers
This Session discusses how your web application can handle uploading or downloading binary files. Most of the discussion relies on the HTTP File Transfer Library. A copy of the HTTP File Transfer Classes is in the Tutorial's source folder.
HTTPFileTransfer.cc contains two classes. The first is CGIUpload which is used for uploading files from a web browser to a Web server via a CGI application. The second class is CGIDownload. This class is used for streaming a file from a CGI application to a Web browser. The remainder of this Session will discuss these classes.
In the early days of the World Wide Web the only way to upload a file to a web site was to use an FTP server. This could be rather cumbersome because user accounts needed to be set up and separate software needed to be used. Moreover, it was not possible to integrate binary attachments with a web form. Thus, HTML message boards and Web based email systems could not handle attachments.
The CGIUpload class makes it possible to transfer files from web users without using an FTP server. Moreover, the files can be stored outside of web space and the developer can control filenames. The specification for HTML file uploads is found in RFC 1867 (Request For Comments: 1867), which is titled “Form-based File Upload in HTML”.
There are two key elements to implementing uploads from an HTML form. First, the TYPE attribute of the HTML INPUT tag must be set to the FILE option. This places an object on an HTML form that lets the user supply a file as input. When the form is submitted, the content of the specified file is sent to the server as the value portion of this object's name/value pair. When this object is used, the Web browser displays a “Browse” button next to the file input field that lets the user select a file from their system. The following is an example of this HTML form element:
<INPUT NAME="userfile1" TYPE="file">
The second element necessary for implementing HTML uploads is to declare multipart/form-data as the MIME type of the HTML form. Your form tag will then look similar to this:
<FORM ENCTYPE="multipart/form-data" ACTION="upload.exe" METHOD=POST>
In a standard HTML form, the ENCTYPE attribute of the FORM tag is set to application/x-www-form-urlencoded. If your form does not define this attribute, application/x-www-form-urlencoded is assumed. However, in order to use HTML file uploads, the MIME type must be changed to multipart/form-data.
Putting these two elements together, the HTML form that we must use to upload a file is similar to the following:
<FORM ENCTYPE="multipart/form-data" ACTION="upload.exe" METHOD=POST>
File to process: <INPUT NAME="userfile1" TYPE="file">
<INPUT TYPE="submit" VALUE="Send File">
</FORM>
With these modifications to a standard HTML form, the CGI data is sent to the Web server as individual lines separated with boundary markers rather than as a string of name/value pairs separated with ampersands. A boundary marker is selected by the Web browser and is sent as part of the header. The multipart form data sent by a Web browser looks similar to the following:
Content-type: multipart/form-data, boundary=-------16740297823514

-------16740297823514
content-disposition: form-data; name="field1"

Joe Blow
-------16740297823514
content-disposition: form-data; name="upload_file1"; filename="file1.gif"
Content-Type: image/gif

... contents of file1.gif ...
-------16740297823514--
The primary task of the dBL CGIUpload class is to parse this multipart data stream.
CGIUpload is subclassed from WebClass.cc; however, the connect() method is overridden and the functionality in loadArrayFromCGI is replaced with the ReadMultipart() method. This is an important point because it means that when you call connect(), the oCGI array is not loaded with name/value pairs.
To load the oCGI array, you must call ReadMultipart(). This method begins by reading the entire multipart data stream into a temporary file on the Web server. Then the method starts parsing that file line by line. If the parsing finds an uploaded file, it extracts the data and saves it as a file with the submitted filename. That filename is then added to the oCGI associative array. ReadMultipart can, if needed, handle more than one uploaded file in a single session.
For the data depicted above, the name would be "upload_file1" and the value would be "file1.gif". If a second INPUT tag is used in the HTML form so that a second file is uploaded, that name might be "upload_file2" and its value might be "file2.gif". ReadMultipart also loads standard name/value pairs into the oCGI associative array. Access to the name/value pairs then works the same as with a standard CGI application.
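The internals of ReadMultipart are not shown in this Session, but the idea of pulling the name and the filename out of a content-disposition line can be sketched in a few lines of dBL. GetAttribute below is a hypothetical helper written for illustration only. It is not part of HTTPFileTransfer.cc, and it assumes that attribute values are enclosed in double quotes, as in the sample data stream above.

```dbl
// Sample header line taken from the multipart data shown earlier
cLine = [content-disposition: form-data; name="upload_file1"; filename="file1.gif"]

cName = GetAttribute(cLine, "name")       // "upload_file1"
cFile = GetAttribute(cLine, "filename")   // "file1.gif"

// Hypothetical helper (not part of the library): returns the quoted value
// of an attribute such as name="..." or filename="...", or "" if absent.
function GetAttribute(cLine, cAttr)
   local cKey, nStart, cRest, nEnd, cValue
   cValue = ""
   // The leading space keeps "name" from matching inside "filename"
   cKey = " " + lower(cAttr) + '="'
   nStart = at(cKey, lower(cLine))
   if nStart > 0
      // Everything after the opening quote
      cRest = substr(cLine, nStart + len(cKey))
      // The value ends at the next double quote
      nEnd = at('"', cRest)
      if nEnd > 0
         cValue = left(cRest, nEnd - 1)
      endif
   endif
return cValue
```

The name becomes the key in the associative array and the filename becomes its value, which matches the "upload_file1" / "file1.gif" pair described above.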
After the temporary file is parsed by ReadMultipart, it is deleted. By default the temporary file and the uploaded file(s) are saved in the same folder as the application. If you would like to change this location, use the TempPath and SavePath properties of the CGIUpload Class.
These paths can be hard coded into your program file, or they can be stored in the application's INI file and read with INI.cc. (For examples, see UploadSample1.prg and UploadSample2.prg found in the "Examples" folder.) If the desired location for saving the uploaded file is determined by information contained in the HTML form data, then you will need to let ReadMultipart save the file on the Web server and then copy it to the desired location. This is because the oCGI associative array is loaded at the same time as the uploaded file is saved to disk. We therefore cannot know the save location dictated by the web form prior to saving the uploaded file.
Note
Here is a case where you may wish to use the extra path information environment variable. You may recall from Session Three that a URL contains a virtual path which points to the CGI program, and that this can be followed by extra path information. You need to start the path info with a slash (/) to let the web server know where the script name ends. The web server will then add this information to the PATH_INFO environment variable.
When using this variable, your form's ACTION attribute would look something like this:
<FORM ACTION="/dlearn/myApp.exe/extra/path/info" ...>
In your program file you can get this path like this:
cPath = getEnv("PATH_INFO")
See also the PATH_TRANSLATED environment variable.
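Because PATH_INFO is available before ReadMultipart runs, it can be used to choose the save folder up front. The following is a minimal sketch under several assumptions: the form's ACTION ends in a single folder name such as /project1, the base folder d:\cmpt301\projects\ is only an illustration, and SavePath must be assigned before ReadMultipart is called and must end with a backslash.

```dbl
set procedure to HTTPFileTransfer.cc additive
oCGI = new CGIUpload()
oCGI.connect()

// PATH_INFO arrives with a leading slash, e.g. "/project1"
cFolder = substr(getEnv("PATH_INFO"), 2)

// Accept a single folder name only: reject an empty value, nested
// paths, "..", and drive letters.
if len(cFolder) == 0 or at("/", cFolder) > 0 or at("\", cFolder) > 0 ;
      or at("..", cFolder) > 0 or at(":", cFolder) > 0
   oCGI.SorryPage("The upload location is not valid")
else
   // Assumed base folder; adjust to your own layout
   oCGI.SavePath = "d:\cmpt301\projects\" + cFolder + "\"
   if oCGI.ReadMultipart()
      // The uploaded file was saved directly in the chosen folder
   endif
endif
```

Checking the value before using it matters here, since anything typed after the script name in the URL ends up in PATH_INFO.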
Consider the following example form. The user selects a project in the HTML SELECT list and enters the name of the file to upload in the HTML INPUT element. The name of the former element is p_name and the name of the latter is FileToUpload. On the server, after ReadMultipart is finished, these parameters are used to copy the uploaded file from the default location to the appropriate project folder (see UploadSample1.prg for details):
copyFrom = oCGI.SavePath + oCgi['FileToUpload']
copyTo = "d:\cmpt301\projects\" + oCGI['p_name'] + "\" ;
         + oCgi['FileToUpload']
[Example form: "Computing 101", with a "Submit Project" button]
There are a few issues regarding CGIUpload that the developer should consider. First, the CGIUpload class does not limit the size of the upload. There is nothing that prevents a user from uploading 100 megabytes or more in a single session. To control this, you may wish to test the size of the upload before calling ReadMultipart. Something like the following would be appropriate.
#define MAXIMUM_UPLOAD 1000000
if MAXIMUM_UPLOAD > 0 and val(getEnv("CONTENT_LENGTH")) > MAXIMUM_UPLOAD
   oCGI.SorryPage('The file uploaded is beyond the maximum limits')
endif
When MAXIMUM_UPLOAD is 0 (zero), there is no limit on the size of the upload; otherwise, any request larger than MAXIMUM_UPLOAD bytes is rejected.
Another issue the developer needs to consider is that CGIUpload has no facility for checking whether an uploaded file already exists on the server. If a file is uploaded with the same name as an existing file, the existing file will be overwritten. In some circumstances, a developer will want to obtain confirmation from the user before overwriting the existing file.
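One way to avoid overwriting a file in its final location is to treat SavePath as a staging area and pick an unused name before copying. The sketch below assumes that ReadMultipart has already succeeded, that the form field is named userfile1, and that C:\Uploads\ is the final folder. UniqueName is a hypothetical helper, not part of the library. Note that this does not protect files in the staging folder itself, since ReadMultipart has already written there.

```dbl
// Assumes oCGI is a CGIUpload object and ReadMultipart() returned true
cFinal = "C:\Uploads\"                 // assumed destination folder
cFile  = oCgi['userfile1']             // assumed form field name
cSafe  = UniqueName(cFinal, cFile)

f = new File()
f.copy(oCGI.SavePath + cFile, cFinal + cSafe)
f.delete(oCGI.SavePath + cFile)

// Hypothetical helper: returns cFileName if it is unused in cFolder,
// otherwise appends _1, _2, ... before the extension until it is unused.
function UniqueName(cFolder, cFileName)
   local f, nDot, cBase, cExt, cCandidate, n
   f = new File()
   nDot = rat(".", cFileName)
   if nDot > 0
      cBase = left(cFileName, nDot - 1)
      cExt  = substr(cFileName, nDot)    // includes the dot
   else
      cBase = cFileName
      cExt  = ""
   endif
   cCandidate = cFileName
   n = 1
   do while f.exists(cFolder + cCandidate)
      cCandidate = cBase + "_" + ltrim(str(n)) + cExt
      n = n + 1
   enddo
return cCandidate
```

If confirmation from the user is preferred over renaming, the same exists() test can instead drive a response page that asks before the copy is made.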
For this Exercise we will use the CGIUpload class, and write a simple program that uploads a file to the "Data" folder.
The CGIUpload class in HTTPFileTransfer.cc is subclassed from WebClass.cc and contains the methods used for file uploads. To use the CGIUpload class, create a new CGI object:
set procedure to 'HTTPFileTransfer.cc' additive
oCGI = new CGIUpload()
Following this you need to call the connect() method, and then you can call readMultipart(). The following is the remainder of the program code.
oCGI.connect()
if oCGI.ReadMultipart()
   // 'userfile1' is the NAME of the file input on the form
   cFrom = oCGI.SavePath + oCgi['userfile1']
   cTo = "C:\Uploads\" + oCgi['userfile1']
   // Copy uploaded file to desired location
   f = new file()
   f.copy( cFrom, cTo )
   nSize = f.size(cFrom)
   f.delete( cFrom )
   // stream response page
   oCGI.streamHeader('Upload')
   with (oCGI.fOut)
      puts([ <BODY bgcolor='white'>])
      puts([<b><font FACE='Arial,Helvetica' size=+2>Upload Finished</font></b>])
      puts([<p><FONT FACE='Arial,Helvetica' SIZE=-1>])
      puts(oCgi['userfile1'] + " has been uploaded")
      puts([<P>] + str(nSize) + " Bytes uploaded. ")
   endwith
   oCGI.streamFooter()
else
   // ReadMultipart failed: report the problem to the user
   oCGI.streamHeader('Upload')
   oCGI.streamBody("Not uploaded. <BR>There was an error reading your " ;
                   + "file. Please try again.")
   oCGI.streamFooter()
endif
The web form used to send form data and binary files to the web server requires two important modifications from the standard web form: First, the form tag has a different "enctype" and second the input tag uses type=file.
<FORM ENCTYPE="multipart/form-data" ACTION="/app/Exercise0901.exe" METHOD=POST>
File to process: <INPUT NAME="userfile1" TYPE="file">
<INPUT TYPE="submit" VALUE="Send File">
</FORM>
The complete program file is available in "Exercise0901.prg." Exercise0901.html is a "multipart" html form that calls Exercise0901.exe. Load this page in your web browser, enter values into the form, and submit it. The uploaded file(s) will be in the directory set by oCGI["TempDir"] and the form values will be in the associative array.
It is easy to create a static web page with a hyperlink that points to a binary file, like a spreadsheet or a compressed archive. The remote client needs only to click the link and the file begins downloading to the hard drive. At times, however, files posted on static web pages can be a problem. First, in order for the remote client to fetch a file via a hyperlink, that file must be stored in a folder within the web server's document root. A web server does not serve files stored outside this directory structure. And second, the name and the content of the file to be downloaded are static.
The CGIDownload class makes it possible to overcome these limitations. Downloadable files can be stored outside of web space, which can give you better control over who has access to those resources. Additionally, the content of a file can be customized before it is downloaded. For example, your CGI application may create a temporary table that contains a subset of data from your database (copy to temp for x = y). This temporary table can then be streamed to the user and deleted from the server.
The CGIDownload class is actually quite simple. The code creates an instance of the dBASE file object, reads the binary file to be downloaded, and writes that data to the StdOut port where the web server is waiting. What makes this technique work is the CGI header that is sent to the web server.
In a standard CGI application the CGI Header sent to the web server is:
Content-type: text/html
This “MIME” type tells the web server that the data which follows is HTML and it is then served to the Web browser as such. In the case of our CGIDownload class, the CGI Header is this:
Content-Type: application/x-unknown
Content-Disposition: attachment; filename=myFile.zip
This tells the Web server to handle the subsequent data as an attachment. When the Web browser gets this message, it pops up a Save As dialog box so that the user can store the file on their system.
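The source of CGIDownload is not reproduced in this Session, so the following is only a sketch of the sequence just described and not the actual code of the class. StreamAttachment is a hypothetical function name, the 4096-byte chunk size is an arbitrary choice, and the file is opened with "R" access in the same way as the other examples in this Session.

```dbl
// Hypothetical stand-alone version of the download sequence:
// send the attachment header, a blank line, then the file contents.
function StreamAttachment(cFullPath, cSaveAs)
   local fIn, fOut, nLeft, nChunk
   fIn  = new File()
   fOut = new File()
   fOut.open("StdOut", "RA")

   // CGI header that asks the browser to save rather than display
   fOut.puts("Content-Type: application/x-unknown")
   fOut.puts("Content-Disposition: attachment; filename=" + cSaveAs)
   fOut.puts("")                       // blank line ends the header

   // Copy the file to StdOut in chunks instead of one large read
   nLeft = fIn.size(cFullPath)
   fIn.open(cFullPath, "R")
   do while nLeft > 0
      nChunk = min(nLeft, 4096)
      fOut.write(fIn.read(nChunk))
      nLeft = nLeft - nChunk
   enddo

   fIn.close()
   fOut.close()
return
```

Reading in chunks keeps memory use flat for large files; reading the whole file in one call would also work for small ones.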
Using the CGIDownload class is quite simple; the following code is all that is required:
set procedure to HTTPFileTransfer.cc additive
oCgi = new CGIDownload()
oCgi.connect()
oCgi.download("c:\data\someFile.dbf","someFile.dbf")
quit
The main work is done by the download() method. This method accepts two arguments. The first is the full path for the target file, including the file name, as it resides on the web server. As noted above, that file can be located anywhere on the web server or even in a network location. (You must be sure that the web server has sufficient permissions to read from the storage location.)
The second parameter is the name you want to assign to the downloaded file. In many cases the file name will not change when it is downloaded, but in those cases where you do wish to rename the file, this is where it is done. This second parameter is the file name that will appear in the client's Save File As dialog form.
When you use this technique to download files, note that there is no response page.
CGIDownload is subclassed from webclass.cc. This means that the developer has access to all the methods contained in webclass.cc. Connect() is the method that connects your CGI application to the web server and decodes the data sent from the web browser. This means that we can use information passed by the user to process the download file.
Consider, for example, the following form.
[Example form: "Computing 101", with a "Retrieve a Project" button]
The user must select the project that they wish to download and enter their username. These two parameters are sent to my CGI application and added to the oCGI associative array. I can use the information to get the file identified by the value of p_name and rename it, on the fly, with the username value. (For details see DownloadSample.prg in the "Examples" folder.)
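DownloadSample.prg is the reference for this form. The lines below are only a rough sketch of the idea, with several assumptions: the username field is named username, each project is stored as a zip file named after the project under d:\cmpt301\projects\, and SorryPage is available because CGIDownload inherits from webclass.cc.

```dbl
set procedure to HTTPFileTransfer.cc additive
oCgi = new CGIDownload()
oCgi.connect()

cProject = oCgi["p_name"]
cUser    = oCgi["username"]            // assumed field name

// Assumed storage layout, outside the web server's document root
cSource = "d:\cmpt301\projects\" + cProject + "\" + cProject + ".zip"

f = new File()
if at("..", cProject) > 0 or not f.exists(cSource)
   // Refuse odd project names and missing files
   oCgi.SorryPage("The requested project could not be found")
else
   // The second argument renames the file on the fly for this user
   oCgi.download(cSource, cUser + "_" + cProject + ".zip")
endif
quit
```

Because the stored file name and the delivered file name are independent, the same archive can be handed to each user under a different name.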
The above discussion focused on downloading files as attachments. Another type of download is where your CGI program creates a file, say an image file, and you want that file to be rendered in the response page. Consider, for example, a web page hit counter, or a custom chart, or a customized map. These images might be created based on information passed to your CGI program, and as such the images are dynamic. Typically an HTML document with an Image tag is used to display graphics. Hence the problem becomes how to get a dynamically generated graphic to display in a web page.
One way to deal with this type of problem is to stream the data for the image directly into the image tag on the web page. Here's how it works. Normally an image file (gif or jpg) is stored on a web server and an HTML page includes an image tag that points to the file. The image tag's SRC attribute contains the location of the image.
<IMG SRC="/images/SomeImage.gif">
To render the image dynamically, we can replace the URL pointing to the image with a URL pointing to a dBL CGI applet. Something like the following:
<IMG SRC="/app/GetImages.exe?TheImageID">
The GetImages applet can do just about anything a normal CGI applet can do. The key difference is that rather than returning an HTML response page, it returns an image. A simple applet might send an image that was just created by the main CGI applet, and then delete the image file from the server. Or it may send an image that, for security reasons, is stored outside of the web server's document root directory structure. A more complicated applet might create the image and then stream it to the browser.
The dBL code to make this work is actually quite simple. First, the image file must be read into a memory variable. The following code opens a file named "MyImage.jpg" and reads it into a variable named "sImage."
cFilePath = "MyImage.jpg"
oFile = new file()
nBytes = oFile.size(cFilePath)
oFile.open(cFilePath, "R")
sImage = oFile.read( nBytes )
oFile.close()
Next the contents of the memory variable must be streamed to the browser. The key here is the CGI header.
with (oCGI.fOut)
   puts( "Content-type: image/jpeg" )
   puts( "Content-Length: " + len( sImage ) )
   puts( "" )
   write( sImage )
endwith
As you can see, the MIME type is an image. If you are streaming a GIF file instead of a JPG file, the MIME type should be "image/gif".
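If one applet serves more than one kind of image, the MIME type can be chosen from the file extension. MimeTypeFor below is a hypothetical helper, not something supplied with the Tutorial, and the list of extensions is only a starting point.

```dbl
// Usage: pick the header value from the file being streamed
cFilePath = "MyImage.jpg"
cHeader = "Content-type: " + MimeTypeFor(cFilePath)   // "Content-type: image/jpeg"

// Hypothetical helper: maps a file extension to a MIME type
function MimeTypeFor(cFileName)
   local nDot, cExt, cType
   cType = "application/octet-stream"   // fallback for unknown types
   nDot = rat(".", cFileName)           // position of the last dot
   if nDot > 0
      cExt = lower(substr(cFileName, nDot + 1))
      do case
         case cExt == "gif"
            cType = "image/gif"
         case cExt == "jpg" or cExt == "jpeg"
            cType = "image/jpeg"
         case cExt == "png"
            cType = "image/png"
      endcase
   endif
return cType
```

The resulting string can be passed to puts() in place of the hard-coded "Content-type: image/jpeg" line.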
Note
If you develop a system like this, and your GetImages program does not need to access a database table, you can speed up the processing by running the applet with the database engine turned off.
Next we will consider a variation of the foregoing. Rather than streaming the image directly into the image tag, we will stream some programming code that will define the URL to be retrieved.
For this technique you would use a CGI applet to write information into a static web page. Say you want to put your own hit counter on your home page, or you want a random image displayed on your page, or you want to customize a page with some unique information.
One way to do this is to use JavaScript to write the html in your web page and to use a CGI program to write the JavaScript. So inside the body tag, at the point in your page where you want the image or text to appear, insert the following tag.
<SCRIPT LANGUAGE="JavaScript" SRC="/app/image.exe"> </SCRIPT>
There is no need to write any JavaScript inside this tag block -- dBASE will do that for you.
You will note that the JavaScript's Src attribute is an executable file, just like the Image tag's Src attribute in the above example. This executable should be your dBASE application. The dBASE Plus program file doesn't need to be more than this:
1 fOut = new file()
2 fOut.Open("StdOut", "RA")
3 fOut.Puts("Content-type: text/plain")
4 fOut.Puts("")
5 fOut.Puts([document.writeln('<IMG SRC=/Images/YourPic.gif>')])
6 Quit
Look at line 5. On this line, dBASE writes a JavaScript statement to the Standard Out port. So, inside the SCRIPT tag of your web page, the following line will be generated:
document.writeln('<IMG SRC=/Images/YourPic.gif>')
And then, the browser's JavaScript interpreter will render this to:
<IMG SRC=/Images/YourPic.gif>
You can, of course, precede line 5 above with any manipulations and you can use variables. For example:
cMyPic = "/Images/YourPic.gif"
fOut.Puts([document.writeln('<IMG SRC=] + cMyPic + [>')])
Also notice that in line 3 above, the CGI header is "text/plain". Your dBASE application is not sending an entire html page to the web server. It is merely sending a snippet of plain text.
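As a slightly larger illustration of the same technique, the sketch below rotates between several images by using the seconds portion of the system clock. The image paths are made up for the example, and the code assumes that time() returns a string that begins with HH:MM:SS.

```dbl
// Hypothetical image paths; replace with your own
aPics = new Array()
aPics.add("/Images/Pic1.gif")
aPics.add("/Images/Pic2.gif")
aPics.add("/Images/Pic3.gif")

// Characters 7-8 of "HH:MM:SS" are the seconds (0-59);
// mod() maps them onto the 1-based array positions 1..size
nSeconds = val(substr(time(), 7, 2))
nPick = mod(nSeconds, aPics.size) + 1

fOut = new file()
fOut.Open("StdOut", "RA")
fOut.Puts("Content-type: text/plain")
fOut.Puts("")
// The browser's JavaScript interpreter turns this into an IMG tag
fOut.Puts([document.writeln('<IMG SRC=] + aPics[nPick] + [>')])
Quit
```

The web page itself does not change: the SCRIPT tag still points at the executable, and each request may write a different IMG tag into the page.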