Rick Strahl's Weblog
Rick Strahl's FoxPro and Web Connection Weblog

Web Connection and Content Compression and Encoding


February 06, 2007 •

I spent a bit of time over the last couple of days adding a number of significant infrastructure features to Web Connection. One feature that is very useful on the Web is GZip compression, and Web Connection now supports it both as a generic tool and integrated directly into the Web Connection Response mechanism. Note that GZip is not the same as a Zip archive. It uses the same compression algorithm but has no file system, which means you can only compress or decompress a single string or file. For the needs of Web Connection that's all that's needed.

 

First there are new generic GZipCompressString and GZipUncompressString functions in the wwAPI library. They are plain functions (i.e. not methods of a class). These functions use the open source ZLib (www.zlib.net) library to compress and decompress data in string format. The library actually uses stream compression, so the simplified code I use goes through files, but it's plenty fast even with the intermediate file output. This is nothing new. In fact I peeked at code from John Burrows, who built a more sophisticated class, but I ended up creating just two compact functions that are specific to what's needed in Web Connection:

 

 

************************************************************************
* wwAPI ::  GZipCompressString
****************************************
***  Function: Compresses a string using GZip
***    Assume: Requires ZLIB1.DLL
***      Pass: lcString           - String to compress
***            lnCompressionLevel - 1 (fastest) to 9 (smallest),
***                                 omit for the ZLib default (-1)
***    Return: GZip encoded string or "" on failure
************************************************************************
FUNCTION GZipCompressString(lcString,lnCompressionLevel)
LOCAL lcOutput, lcOutFile, lnHandle

*** 1 - 9, or -1 for the ZLib default
IF EMPTY(lnCompressionLevel)
   lnCompressionLevel = -1  && Default
ENDIF

*** ZLib writes its output to a file
lcOutFile = ADDBS(SYS(2023)) + SYS(2015) + ".gz"

IF VARTYPE(_GZipLoaded) # "L"
   GzipLibraries()
ENDIF

*** gzopen returns 0 (a NULL handle) on failure
*** Note: RETURN is not allowed inside a TRY block
lnHandle = 0
TRY
   lnHandle = gzopen(lcOutFile,"wb")
   IF lnHandle # 0
      *** Set the compression level
      gzsetparams(lnHandle,lnCompressionLevel,0)
      gzwrite(lnHandle,lcString,LEN(lcString))
   ENDIF
CATCH
   * Nothing - the handle is closed below
ENDTRY

IF lnHandle = 0
   RETURN ""
ENDIF
gzclose(lnHandle)

lcOutput = FILETOSTR(lcOutFile)
ERASE (lcOutFile)

RETURN lcOutput

 

 

************************************************************************
* wwAPI ::  GZipUncompressString
****************************************
***  Function: Uncompresses a GZip string
***    Assume: INCORRECT IMPLEMENTATION - ZLib Format
***      Pass: lcCompressed - GZip encoded string, or a file name
***            llIsFile     - .T. if lcCompressed is a file name
***    Return: Uncompressed string or "" on failure
************************************************************************
FUNCTION GZipUncompressString(lcCompressed,llIsFile)
LOCAL lcOutput, lcInFile, lcBuffer, lnHandle, lnResult

IF llIsFile
   lcInFile = lcCompressed
ELSE
   *** ZLib reads from a file, so write the string to a temp file first
   lcInFile = ADDBS(SYS(2023)) + SYS(2015) + ".gz"
   FILE2VAR(lcInFile,lcCompressed)
ENDIF

IF VARTYPE(_GZipLoaded) # "L"
   GzipLibraries()
ENDIF

lcOutput = ""
lnHandle = 0
TRY
   *** gzopen returns 0 (a NULL handle) on failure
   lnHandle = gzopen(lcInFile,"rb")
   DO WHILE lnHandle # 0
      lcBuffer = SPACE(65535)
      lnResult = gzread(lnHandle,@lcBuffer,LEN(lcBuffer))
      IF lnResult < 1
         EXIT
      ENDIF
      lcOutput = lcOutput + LEFT(lcBuffer,lnResult)
   ENDDO
CATCH
   * Nothing - return whatever was read so far
FINALLY
   IF lnHandle # 0
      gzclose(lnHandle)
   ENDIF
ENDTRY

*** Clean up the temp file if we created one
IF !llIsFile
   ERASE (lcInFile)
ENDIF

RETURN lcOutput
* eof GZipUncompressString

 

************************************************************************
* wwApi ::  GzipLibraries
****************************************
***  Function: Declares the ZLib functions used by the GZip routines
***    Assume: Requires ZLIB1.DLL
***      Pass:
***    Return:
************************************************************************
FUNCTION GzipLibraries()

PUBLIC _GZipLoaded
_GZipLoaded=.T.

* Opens a file for reading or writing
DECLARE LONG gzopen IN "zlib1.dll" AS "gzopen" ;
   STRING @ zFile ,;
   STRING @ zMode

* Writes data to a compressed file - gzip
DECLARE LONG gzwrite IN "zlib1.dll" AS "gzwrite" ;
   LONG FILE ,;
   STRING @ uncompr,;
   LONG uncomprLen

*** Set options on the compression
DECLARE LONG gzsetparams IN "zlib1.DLL" AS "gzsetparams" ;
   LONG  gzFile,;
   INTEGER LEVEL,;
   INTEGER strategy

* Reads and uncompresses data from a compressed file
DECLARE LONG gzread IN "zlib1.dll" AS "gzread" ;
   LONG gzFile,;
   STRING @ buf,;
   LONG LEN

* Closes the file
DECLARE LONG gzclose IN "zlib1.dll" AS "gzclose" ;
   LONG FILE

RETURN

 

This code is pretty self-explanatory. Nothing really new here.
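
As a quick sanity check, a round trip through the two functions might look something like the following. This is only a sketch: it assumes the functions above are loaded (for example via SET PROCEDURE TO wwAPI ADDITIVE) and that zlib1.dll can be found on the path. The sample text is made up.

```foxpro
*** Hypothetical test data - long, repetitive text compresses well
lcText = REPLICATE("Web Connection and GZip compression. ",500)

lcZipped = GZipCompressString(lcText,9)

*** A GZip stream always starts with the bytes 0x1F 0x8B
? LEFT(lcZipped,2) == CHR(31) + CHR(139)

*** Compare the sizes before and after
? LEN(lcText), LEN(lcZipped)

*** Uncompressing should give back the original string
lcRestored = GZipUncompressString(lcZipped)
? lcRestored == lcText
```

If the compression fails for any reason, GZipCompressString returns an empty string, so it's worth checking for EMPTY(lcZipped) before using the result.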

 

What's new, however, is that GZip is now optionally integrated into Web Connection via a new wwPageResponse::GZipCompression property. When set to .T., the content of the current request is automatically GZip encoded before being sent back to the client.

 

FUNCTION SomeProcessMethod()

lcXml = this.BusinessObj.GetXml() && some large Xml string

Response.GZipCompression = .T.
Response.ContentType = "text/xml"
Response.Write(lcXml)

ENDFUNC

 

This works pretty much everywhere. For example you can do the same as part of a Web Connection Web Control Framework page:

 

FUNCTION OnLoad()

*** Force all Page output to be GZipped
Response.GZipCompression = .T.

IF this.chkPreserveProperties.Checked
   *** Preserve properties that aren't posted back explicitly
   this.lblMessage.PreserveProperty("Text")
   this.lblMessage.PreserveProperty("ForeColor")
   this.btnColor.PreserveProperty("ForeColor")
ENDIF

ENDFUNC

 

The GZipCompression property is also smart enough to detect clients that don't support GZip and sends uncompressed output back in that case. It also doesn't compress content smaller than a certain size (10kb by default, set via GZIP_MIN_COMPRESSION_SIZE in wconnect.h), since there's no sense in compressing small content.
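
To illustrate the kind of check involved, here's a rough sketch of that decision as a standalone function. This is not the actual Web Connection code: ShouldGZip and its parameters are hypothetical, and the 10240 byte threshold is just my stand-in for the 10kb default.

```foxpro
*** Client supports gzip and the content is large enough
? ShouldGZip("gzip, deflate",40000,10240)     && .T.

*** Client sends no Accept-Encoding header
? ShouldGZip("",40000,10240)                  && .F.

*** Content is too small to bother
? ShouldGZip("gzip, deflate",2000,10240)      && .F.

*** Hypothetical helper - not part of Web Connection
*** lcAcceptEncoding: value of the client's Accept-Encoding header
*** lnSize:           size of the response content in bytes
*** lnMinSize:        smallest size worth compressing
FUNCTION ShouldGZip(lcAcceptEncoding,lnSize,lnMinSize)

IF VARTYPE(lcAcceptEncoding) # "C"
   RETURN .F.
ENDIF

*** Client has to announce gzip support
IF ATC("gzip",lcAcceptEncoding) = 0
   RETURN .F.
ENDIF

*** Small content isn't worth the overhead
RETURN lnSize >= lnMinSize
ENDFUNC
```

Note that this simple check just looks for the word gzip in the header. It doesn't handle quality values such as gzip;q=0, which a stricter implementation would need to parse.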

 

Getting GZip output from the server is now trivial. This also means that GZip compression can be applied to the Web Connection AJAX script library (wwScriptLibrary.js), which has slowly grown to 39k and now compresses down to 9k!

 

Speaking of the script library: I recently added support for WebResources, which are files served directly by Web Connection so that they don't need to exist as external files that can get out of date, a common concern especially with script files. So rather than manually including wwScriptLibrary.js and updating it with each release of Web Connection, files like this can be served by Web Connection from a compiled file.

 

If the file is served as a WebResource in this manner, it can also be compressed on the fly. And the beauty is that this is pretty trivial to do:

 

************************************************************************
* wwProcess ::  WebResource
****************************************
***  Function: Serves the resource identified by the Resource
***            query string key
***    Assume:
***      Pass:
***    Return:
************************************************************************
FUNCTION WebResource()
LOCAL lcKey, loContent, lcContent, lcContentType

lcKey = LOWER(Request.QueryString("Resource"))

DO CASE
   *** Fixed Resources
   CASE lcKey == "wwscriptlibrary"
      lcContentType = "text/javascript"

      *** Allow GZip compression if the client allows
      Response.GZipCompression = .T.

      lcContent = GetwwScriptLibrary_JavaScript(Response.GZipCompression)

      *** Assumes the property reads .F. if the client can't handle GZip,
      *** so only flag the content as GZip encoded if it was compressed
      IF Response.GZipCompression
         Response.AppendHeader("Content-Encoding","gzip")
      ENDIF

   OTHERWISE
      *** Retrieve globally stored Resource
      loContent = Server.oResources.Item(lcKey)
      IF ISNULL(loContent)
         Response.Status="404 Not Found"
         RETURN
      ENDIF

      lcContentType = loContent.ContentType
      lcContent = loContent.Content
ENDCASE

Response.ContentType=lcContentType
Response.AddCacheHeader(1200)  && Cache this output
Response.Write(lcContent)

ENDFUNC
*  wwProcess ::  WebResource

 

WebResources are easy to use from anywhere as well. There's a new wwProcess::GetWebResourceUrl() method that returns the URL for a resource. For example, if you need the URL for the script library above you'd do:

 

lcUrl = Process.GetWebResourceUrl("wwscriptlibrary")

 

which creates a url like this:

 

/wconnect/weblog/WebResources.ext?Resource=wwscriptlibrary

 

That URL routes to the WebResource handler. Any scriptmapped extension works with this, so the .ext part will vary depending on the scriptmap you are currently using.
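
As a small usage sketch, the URL could be dropped into a script tag from within a process method. It assumes the usual Process and Response objects are in scope, and the markup itself is just an illustration:

```foxpro
*** Sketch only: emit a script reference to the compiled-in library
lcUrl = Process.GetWebResourceUrl("wwscriptlibrary")

Response.Write([<script type="text/javascript" src="] + lcUrl + ;
               ["></script>])
```

Because the URL is generated at runtime, the page always picks up the script version that matches the running Web Connection build.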

 

 

Content Encoding – UTF8 and Unicode

Along the same lines as the GZip encoding there’s now also content encoding available on the wwPageResponse object. If you do this from anywhere in your code:

 

Response.Encoding = "UTF8"

 

The output from the current request will automatically be UTF-8 encoded (prior to GZip encoding <s>). The encoding is applied at the very end of the request processing, and you can easily apply it globally in a central place like wwProcess::OnProcessInit. The encoding routines are smart enough to encode only text content types, and they add the appropriate charset to the content type.
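
Applying the setting globally might look roughly like this. It's a sketch under a couple of assumptions: MyProcess is a hypothetical wwProcess subclass, and I'm assuming OnProcessInit returns .T. to let request processing continue.

```foxpro
*** Sketch: turn on encoding for every request of one process class
*** MyProcess is a hypothetical class name
DEFINE CLASS MyProcess AS wwProcess

FUNCTION OnProcessInit()

*** Encode all text output of this process as UTF-8
Response.Encoding = "UTF8"

*** Optionally compress the output as well
Response.GZipCompression = .T.

*** Assumption: .T. tells Web Connection to continue processing
RETURN .T.
ENDFUNC

ENDDEFINE
```

Individual methods can still override either property for a specific request if needed.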

 

Now UTF-8 encoding is a good idea so that content can properly display extended characters and handle multiple locales simultaneously, and one might be very tempted to use UTF-8 on everything. As I mentioned above, Web Connection checks that the content is text before encoding it, but there are other scenarios where this full UTF-8 encoding might cause problems.

 

For example, you might use Scripts/Templates or the new Web Connection Web Control Framework with page templates stored on disk. If these templates are already UTF-8 encoded by whatever editor you're using, you don't want to UTF-8 encode the entire content again. This is a sticky issue either way you slice it, because FoxPro of course doesn't do Unicode natively, and when you read a UTF-8 file into FoxPro it reads it as an ANSI string. That means you get the raw UTF-8 byte sequences, which then get re-encoded and produce garbage. The best way to avoid this is to keep pages stored on disk in an ANSI encoding (Windows 1252, for example). This guarantees the data is properly formatted, and you can then simply merge content into these templates directly and UTF-8 encode the result wholesale.
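
The double encoding problem is easy to reproduce with FoxPro's STRCONV() function. The following is a minimal sketch that assumes a Windows 1252 code page. The sample string is made up, and the byte counts in the comments are what I'd expect on such a system.

```foxpro
*** Assumes a Windows 1252 code page, where CHR(233) is an e with acute
lcAnsi = "Caf" + CHR(233)

*** Encoded once: the accented character becomes 0xC3 0xA9
lcUtf8 = STRCONV(lcAnsi,9)           && 9 = ANSI -> UTF-8
? LEN(lcUtf8)                        && 5

*** Encoded again: each of those two bytes is treated as an
*** ANSI character and expanded once more, producing garbage
lcTwice = STRCONV(lcUtf8,9)
? LEN(lcTwice)                       && 7

*** Decoding once only undoes one level of encoding
? STRCONV(lcTwice,11) == lcUtf8      && 11 = UTF-8 -> ANSI
```

This is exactly what happens when a UTF-8 template is read as an ANSI string and the merged output is then UTF-8 encoded as a whole.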

 

If the content is already UTF-8 encoded, there's currently no good way to fix it up. In that case the only way to ensure there's no conflict is to make sure that all expressions merged into it are UTF-8 encoded as well.

 

I've been thinking about this, but I can't think of a good way to deal with it. It would be possible to UTF-8 decode templates as they are loaded. The problem is that there's no easy way to tell whether a file is UTF-8 encoded (possibly via a Byte Order Mark, but it isn't consistently present). This is something to think about for the future I suppose, something that could possibly be handled by the File2Var function, which could detect a BOM and automatically UTF-8 decode a file from disk. <shrug>. If anybody has any ideas on that I'd be interested in hearing about them.
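
For what it's worth, the BOM check itself would be simple. Here's a rough sketch of the idea as a standalone function. ReadTemplate is a hypothetical name and not part of Web Connection, and files saved without a BOM simply pass through unchanged.

```foxpro
*** Hypothetical helper - not part of Web Connection
*** Reads a file and converts it to ANSI if it starts with a UTF-8 BOM
FUNCTION ReadTemplate(lcFile)
LOCAL lcContent, lcBom

lcContent = FILETOSTR(lcFile)

*** UTF-8 Byte Order Mark: 0xEF 0xBB 0xBF
lcBom = CHR(239) + CHR(187) + CHR(191)

IF LEFT(lcContent,3) == lcBom
   *** Strip the BOM and convert UTF-8 -> ANSI
   lcContent = STRCONV(SUBSTR(lcContent,4),11)
ENDIF

RETURN lcContent
ENDFUNC
```

The obvious limitation is that any characters that don't exist in the current ANSI code page are lost in the conversion, and many editors write UTF-8 files without a BOM at all.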

 

A little sidetracked there <s>. Regardless of this issue, which is not all that common since most of us with Fox tools use ANSI text anyway, the UTF-8 encoding provides a very useful mechanism for properly feeding content to the client.

 

All of these encoding features are optional of course and they are only available on the new wwPageResponse object.

 

wwPageResponse makes this possible

This is where the new wwPageResponse object is really paying off. I know there are a few small incompatibilities with older versions of Web Connection, and there's been some grumbling about them. But these small changes were well worth it, and they are now paying off going forward because features can in many cases be added transparently to the Response object without affecting existing functionality. If you are still using the old wwResponse object with Web Connection 5.0, it might be a good idea to take another look at updating to the new wwPageResponse object.

 

This article refers to the as yet unreleased version 5.20 of Web Connection.
