
Aventail Web Translation

Web Developer Guide

May 2007

© Aventail Corporation 2007. All rights reserved.

Aventail, Aventail.Net, AventailExtraNetCenter, Aventail ExtraWeb, Aventail ExtraNet, Aventail Connect, and their respective logos are trademarks, service marks or registered trademarks of Aventail Corporation.

Other product and company names mentioned in this publication are the trademarks of their respective owners.

Table of Contents

Overview

Introduction

Version Compliance

Who is this document for?

How does the Aventail Web Translation server work?

Content-Type of Web Pages

Recommendations

Character Encoding

Recommendations

Cookie translation

Recommendations

URLs

Recommendations

HTML translation

Recommendations

CSS translation

JavaScript translation

Translation rules

Recommendations

VBScript translation

Java applet, ActiveX and Flash translation

Recommendations

XML translation

Recommendations

Web aliases

Recommendations

Miscellaneous

Referrer lookup

Overview

Introduction

A truly clientless VPN appliance requires a robust web-content translation engine. The reason is simple: all network references within the web content must be changed to point to the VPN appliance instead of internal hosts. With full-client VPNs or pseudo-clientless VPN appliances that use web-deployed ActiveX or Java clients, this host mapping can be done on the client. For VPN use on the broadest possible browser base, however, web content translation is indispensable.

A simple example illustrates the translation of web content. Imagine an HTML page with the following anchor tag that links to an internal resource:

<a href=“ Web Access</a>

Within the corporate network, such a link works perfectly. When the user clicks on the link in the browser, the latter asks the internal DNS server what the IP address of “owa.in.aventail.com” is and retrieves the desired page.

Outside the corporate network, however, say at an employee’s home, this link does not work. The browser asks the DNS server of the local ISP what IP address corresponds to “owa.in.aventail.com” and is told that the name does not exist. Even if the link pointed to a routable IP address within the corporate network, the corporate firewall would probably prevent the browser from accessing the desired resource.

Web content translation is the process of changing (translating) the link above into something like:

<a href=“ Web Access</a>

The hostname is changed from the internal hostname to the DNS-resolvable hostname of the VPN appliance. However, the appliance doesn’t hold the desired resource; therefore that end resource must be encoded in some way within the URL. In our example, it is encoded within the path portion of the URL.
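The rewrite described above can be sketched roughly as follows. This is only an illustration: the appliance hostname `vpn.example.com`, the alias `owa`, the path `/exchange/`, and the function `toApplianceUrl` are assumptions, and the appliance's real encoding may differ.

```javascript
// Hypothetical table: internal hostname -> alias (assumed values).
const internalHostAliases = new Map([["owa.in.aventail.com", "owa"]]);

// Rewrite an internal URL so that it points at the appliance instead.
// The internal resource is encoded in the path by way of its alias.
function toApplianceUrl(internalUrl, applianceHost) {
  const url = new URL(internalUrl);
  const alias = internalHostAliases.get(url.hostname);
  if (alias === undefined) {
    return internalUrl; // unknown host: left untouched in this sketch
  }
  return "https://" + applianceHost + "/" + alias + url.pathname + url.search + url.hash;
}

console.log(toApplianceUrl("http://owa.in.aventail.com/exchange/", "vpn.example.com"));
// prints "https://vpn.example.com/owa/exchange/"
```

The browser can resolve the appliance hostname from anywhere, and the alias in the path tells the appliance which internal server the request is meant for.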

If the only translation needed were for HTML links such as the one above, things would be easy. Unfortunately, this is not so. There are numerous ways to reference network resources in HTML alone. JavaScript, the now-ubiquitous web scripting language, enlarges the problem tremendously; in fact, it makes complete translation intractable. JavaScript executes code in the browser and lets the user supply input that is unknown at the time the server-side translation is done. For example, a script can prompt the user for a URL and then instruct the browser to go to that URL.

Version Compliance

This document is updated for each ASAP version that Aventail releases. Check that the ASAP version running on your Aventail appliance is covered by this document.

This document is valid for releases ASAP 8.6 through ASAP 8.8.

Who is this document for?

This document is for Web Application Developers who wish to make their software easy to translate by the Aventail translation engine. It provides a set of guidelines to achieve this goal and gives a brief overview of certain aspects of the translation engine.

How does the Aventail Web Translation server work?

The Aventail Web Translation server is part of the Aventail VPN appliance which sits at the network perimeter. It isolates and protects private Web-based resources from unauthorized external access.

A user first logs in to the Aventail appliance and is presented with the Workplace page. The user then follows a link on that page to request a resource from the internal network, or enters a URL on the Workplace page. All URLs point to the Aventail appliance.

The Aventail Web Translation server translates an incoming URL using an "alias" contained in the URL. Aliases are used to obscure the URLs that point to resources on your internal (or “downstream”) servers. Because all requests are directed to the Aventail appliance, the user only sees the incoming URL that contains the alias. The Aventail Web Translation server matches the alias to a list it stores in memory and translates the URL.

Once it determines that the URL submitted by the user is valid and points to a resource on the network, the Aventail appliance checks its access control and authentication rules to make sure the user is authorized to access the requested resource.
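The alias lookup described above might look something like the sketch below. It assumes the alias is the first path segment of the incoming URL (as in the “morty” example under Cookie translation); the table contents, the hostname `morty.in.aventail.com`, and the function name are hypothetical, and the access control check that follows the lookup is not modeled.

```javascript
// Hypothetical in-memory alias list: alias -> internal base URL (assumed values).
const aliasTable = new Map([
  ["morty", "http://morty.in.aventail.com"],
]);

// Split "/alias/rest/of/path" and map it back to the internal server.
// Returns null when the first path segment is not a known alias.
function resolveIncomingPath(path) {
  const match = /^\/([^\/]+)(\/.*)?$/.exec(path);
  if (match === null) return null;
  const base = aliasTable.get(match[1]);
  if (base === undefined) return null;
  return base + (match[2] || "/");
}

console.log(resolveIncomingPath("/morty/dir1/file.html"));
// prints "http://morty.in.aventail.com/dir1/file.html"
```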

Content-Type of Web Pages

Although the Aventail translation engine possesses heuristics to guess the type of content in an HTTP response from the backend web server, it is best to avoid relying on this and to instead specify the type explicitly.

Recommendations

The single most important thing you can do to ensure proper translation is to make sure that all pages are served up with the correct “Content-Type” header. In particular, it is imperative that:

  1. HTML content is served up with the “text/html” Content-Type.
  2. JavaScript content is served up with the “application/x-javascript” Content-Type.
  3. XML content is served up with the “text/xml” Content-Type.
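As a small illustration, a backend application could keep the three recommended types in one lookup and use it when setting the “Content-Type” header. The mapping from file extensions and the function name `contentTypeFor` are assumptions made for this sketch.

```javascript
// Content types recommended in this guide, keyed by file extension.
// The choice of extensions is an assumption for illustration.
const recommendedTypes = new Map([
  [".html", "text/html"],
  [".htm", "text/html"],
  [".js", "application/x-javascript"],
  [".xml", "text/xml"],
]);

// Return the recommended Content-Type for a file name, or undefined
// when this guide makes no recommendation for it.
function contentTypeFor(filename) {
  const dot = filename.lastIndexOf(".");
  if (dot === -1) return undefined;
  return recommendedTypes.get(filename.slice(dot).toLowerCase());
}

console.log(contentTypeFor("menu.js")); // prints "application/x-javascript"
```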

Character Encoding

As an internationalized network device, the Aventail appliance uses UTF-8 exclusively for its internal work.

Recommendations

  1. Use UTF-8 exclusively for all your Web content. Do not use the Microsoft code pages. This is particularly important when POSTing form data.
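For form data built in script, the standard `URLSearchParams` class percent-encodes values as UTF-8, which matches the recommendation above. The field name and value below are made up for illustration.

```javascript
// URLSearchParams serializes non-ASCII characters as percent-encoded UTF-8.
// "\u00fc" is the letter u with a diaeresis; its UTF-8 bytes are C3 BC.
const formBody = new URLSearchParams({ city: "Z\u00fcrich" }).toString();

console.log(formBody); // prints "city=Z%C3%BCrich"
```

A body built this way would typically be sent with the header `Content-Type: application/x-www-form-urlencoded; charset=UTF-8`.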

Cookie translation

The path portion of a “Set-Cookie” header is translated. The domain portion of this header is discarded. For example, if the backend web server sends the header:

Set-cookie: x=y; path=/; domain=.in.aventail.com

and the alias associated with the web resource is “morty”, then this header is translated to:

Set-Cookie: x=y; path=/morty/

This forces the web browser to send this cookie back only to the alias (and therefore the web server) that set the cookie.
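The header rewrite above can be approximated with the sketch below. It reproduces the documented example; how the appliance treats paths other than “/” is not stated in this guide, so prefixing them with the alias is an assumption, as is the function name.

```javascript
// Rewrite the value of a Set-Cookie header: drop the domain attribute
// and prefix the path attribute with the alias.
function translateSetCookieValue(value, alias) {
  const parts = value.split(";").map((p) => p.trim()).filter((p) => p !== "");
  const out = [parts[0]]; // the name=value pair is kept unchanged
  for (const attr of parts.slice(1)) {
    const lower = attr.toLowerCase();
    if (lower.startsWith("domain=")) continue; // domain is discarded
    if (lower.startsWith("path=")) {
      const path = attr.slice("path=".length);
      // Assumption: non-root paths are simply nested under the alias.
      out.push("path=/" + alias + (path.startsWith("/") ? path : "/" + path));
    } else {
      out.push(attr);
    }
  }
  return out.join("; ");
}

console.log(translateSetCookieValue("x=y; path=/; domain=.in.aventail.com", "morty"));
// prints "x=y; path=/morty/"
```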

Recommendations

  1. Avoid sophisticated client-side cookie manipulations using JavaScript.
  2. Avoid using URLs in cookies. Although an attempt is made to translate those URLs, there is some risk that a URL passes through untranslated.

URLs

The Aventail translation engine can handle URLs in any form:

  1. Fully-qualified URLs (e.g. “
  2. Absolute paths (e.g. “/dir1/dir2/file.html”)
  3. Relative paths (e.g. “../dir2/file.html”)
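To see how the three forms relate, the sketch below resolves each one against the URL of the page that contains it, using the standard `URL` class. The hostname `intranet.example.com` and the page location are hypothetical.

```javascript
// URL of the page that contains the references (hypothetical).
const pageUrl = "http://intranet.example.com/dir1/dir3/page.html";

const referenceForms = [
  "http://intranet.example.com/dir1/dir2/file.html", // fully-qualified URL
  "/dir1/dir2/file.html",                            // absolute path
  "../dir2/file.html",                               // relative path
];

// Each form resolves to the same resource when the page lives in /dir1/dir3/.
const resolvedForms = referenceForms.map((ref) => new URL(ref, pageUrl).href);

console.log(resolvedForms);
```

All three entries resolve to `http://intranet.example.com/dir1/dir2/file.html`. Only the relative form keeps working unchanged if the application moves to another host or directory.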

Recommendations

  1. It is best to use relative paths exclusively in your web application. This of course also has the advantage of making your web application more portable (e.g. to another web server and directory).

HTML translation

HTML translation is handled very reliably by the Aventail appliance.

Recommendations

  1. Make sure your HTML is formatted according to the standard, especially the quotes around attribute values in tags. Ideally, use XHTML formatting. HTML attributes containing a value (for example, src="path") may not be translated if they contain any of the following errors:

     a. Spaces before or after the equal sign:

src ="path" or src= "path"

     b. Leading or trailing spaces within the value:

src=" path" or src="path "

     c. A missing opening or closing quotation mark:

src="path or src=path"

  2. Avoid base tags in your HTML code, such as:

<base href=" />

  3. The “meta” tag is commonly used to redirect users to another page. For example:

<meta http-equiv="refresh" content="5;url=redirectURL.html" />

The meta tag’s content attribute must be formatted carefully; do not include line breaks or spaces.
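The attribute errors listed above can be caught before deployment with a simple check. The following is a rough heuristic, not part of the Aventail product: it only considers double-quoted attributes, and an equal sign with spaces inside an attribute value would be reported as a false positive. The function name is hypothetical.

```javascript
// Report the attribute formatting problems described above for one tag.
function findAttributeProblems(tag) {
  const problems = [];
  // Spaces before or after the equal sign.
  if (/\s=|=\s/.test(tag)) problems.push("space around equal sign");
  // An odd number of double quotes means one of them is missing.
  const quoteCount = (tag.match(/"/g) || []).length;
  if (quoteCount % 2 !== 0) {
    problems.push("missing quotation mark");
  } else {
    // Leading or trailing spaces inside a quoted value.
    for (const m of tag.matchAll(/"([^"]*)"/g)) {
      if (m[1] !== m[1].trim()) {
        problems.push("space inside value");
        break;
      }
    }
  }
  return problems;
}

console.log(findAttributeProblems('<img src="path">'));  // prints []
console.log(findAttributeProblems('<img src ="path">')); // prints [ 'space around equal sign' ]
```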

CSS translation

CSS content should be handled without difficulty.

JavaScript translation

JavaScript translation is complex and there are certain coding practices that you can use to make sure your JavaScript code translates correctly.

Translation rules

The current Aventail JavaScript translation engine is a parse-tree based engine that can handle complex syntax. It is a rule-based translator that makes use of Aventail’s client-side JavaScript library. The rules are stored in:

/usr/local/extranet/etc/jstrans.cfg

The translation rules are divided into four categories:

  1. Assignment statements (type ASSIGNMENT)
  2. Function calls (type CALL)
  3. Substitution of one language token with another (type SUBSTITUTION)
  4. Special kind of substitution in a function call (type SUBARGS)

You should not need to write any new rules. It is however useful to be aware of the rules as you follow the recommendations below.
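To make the four categories concrete, the sketch below reads a single rule line into an object. It assumes each rule is one line of whitespace-separated fields and that lines starting with “#” are comments; the field names (`lhs`, `wrapper`, `mode`, and so on) and the function `parseRule` are invented for this illustration and are not part of the product.

```javascript
// Parse one translation rule line into a plain object.
// Returns null for blank lines, comments, and unknown rule types.
function parseRule(line) {
  const text = line.trim();
  if (text === "" || text.startsWith("#")) return null;
  const fields = text.split(/\s+/);
  const type = fields[0];
  switch (type) {
    case "ASSIGNMENT":
      // Left-hand side, then the function that wraps the right-hand side.
      return { type, lhs: fields[1], wrapper: fields[2] };
    case "CALL":
      // Function name, 1-based parameter position, wrapper for that parameter.
      return { type, name: fields[1], param: Number(fields[2]), wrapper: fields[3] };
    case "SUBSTITUTION":
    case "SUBARGS":
      // Token, lvalue/rvalue mode, replacement.
      return { type, token: fields[1], mode: Number(fields[2]), replacement: fields[3] };
    default:
      return null;
  }
}

console.log(parseRule("CALL .insertAdjacentHTML 2 aventail.postText"));
```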

Here are the majority of the JavaScript rules as of September 2006:

# Javascript Translation
# Assignment Statement Translation
#
# Type        Left Hand Side (LHS)    Encapsulate RHS with
#
ASSIGNMENT    location                aventail.translate_url
ASSIGNMENT    .location               aventail.translate_url
ASSIGNMENT    .href                   aventail.translate_url
ASSIGNMENT    .src                    aventail.translate_url
ASSIGNMENT    .action                 aventail.translate_url
ASSIGNMENT    document.domain         aventail.setDomain
ASSIGNMENT    document.cookie         aventail.setCookie
ASSIGNMENT    .innerHTML              aventail.postText
ASSIGNMENT    .url                    aventail.translate_url

# Function Call Translation
#
# Type    Function Name          Param    Encapsulate param with
#
CALL      .addBehavior           1        aventail.translate_url
CALL      .showModalDialog       1        aventail.translate_url
CALL      .showModelessDialog    1        aventail.translate_url
CALL      .insertAdjacentHTML    2        aventail.postText
CALL      location.replace       1        aventail.translate_url
CALL      location.assign        1        aventail.translate_url
CALL      eval                   1        aventail.post

# Substitution of one token with another
#
# lvalue/rvalue: 0: substitute always
#                1: substitute only if token is an rvalue (read from)
#                2: substitute only if token is an lvalue (written to)
#
# Type          Token                 lval/    Replacement
#                                     rval
SUBSTITUTION    location.pathname     0        aventail.location.pathname
SUBSTITUTION    .location.pathname    0        .aventail.location.pathname
SUBSTITUTION    document.domain       1        document.aventail.getDomain()
SUBSTITUTION    document.domain       2        aventail.junk
SUBSTITUTION    .execCommand          0        .aventail.execCommand
SUBSTITUTION    location.pathname     0        aventail.location.pathname
SUBSTITUTION    .location.pathname    0        .aventail.location.pathname
SUBSTITUTION    location.host         0        aventail.location.host
SUBSTITUTION    .location.host        0        .aventail.location.host
SUBSTITUTION    location.hostname     0        aventail.location.hostname
SUBSTITUTION    .location.hostname    0        .aventail.location.hostname
SUBSTITUTION    location.port         0        aventail.location.port
SUBSTITUTION    .location.port        0        .aventail.location.port
SUBSTITUTION    location.protocol     0        aventail.location.protocol
SUBSTITUTION    .location.protocol    0        .aventail.location.protocol
SUBSTITUTION    location.href         1        aventail.location.href
SUBSTITUTION    .location.href        1        .aventail.location.href
SUBSTITUTION    location.search       1        aventail.location.search
SUBSTITUTION    .location.search      1        .aventail.location.search
SUBSTITUTION    location              1        aventail.location
SUBSTITUTION    .scripts              1        .aventail.getScripts()

# Substitution of one token with another, with a twist:
# Take the "stem" of the call and make it the first argument in the new function.
# For example:
# If we have the token "foo.bar" and the replacement "aventail.ourFoo":
# We will replace the construction "anObject.foo.bar(arg1, arg2)" with:
# aventail.ourFoo(anObject, arg1, arg2)
# This allows us to verify the type of the anObject object prior to operating on it
#
# lvalue/rvalue: 0: substitute always
#                1: substitute only if token is an rvalue (read from)
#                2: substitute only if token is an lvalue (written to)
#                3: special case, turn a flat lvalue into a function call
#
# The "3" case above is used in cases such as "foo.location" to allow us to ensure
# that "foo" is an object such as a document, window, or frame, and not some user-defined
# object that just happens to have a "location" member.
#
# Type     Token               lval/    Replacement
#                              rval
SUBARGS    document.close      0        aventail.docClose
SUBARGS    document.write      0        aventail.docWrite
SUBARGS    document.writeln    0        aventail.docWrite
SUBARGS    .open               0        aventail.objOpen
SUBARGS    .Open               0        aventail.objOpen
SUBARGS    .location           3        aventail.objLocation
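To show what these rules do to application code, the sketch below writes out by hand the result of applying one ASSIGNMENT rule and one SUBARGS rule. The `aventail` object here is a stand-in defined only for this example; the behavior of the real client-side library is not documented in this guide, and the “/alias” prefix and the other object names are invented.

```javascript
// Stand-in for Aventail's client-side library (hypothetical behavior).
const aventail = {
  // Pretend translation: prefix the URL with a made-up alias path.
  translate_url(url) {
    return "/alias" + url;
  },
  // Receives the stem object as first argument, then the original arguments.
  objOpen(target, ...args) {
    return target.open(...args);
  },
};

// Plain objects standing in for a DOM link and a window.
const anchorLink = {};
const popupWindow = { open(url) { return "opened " + url; } };

// ASSIGNMENT rule ".href": the right-hand side is wrapped.
// Original:    anchorLink.href = "/dir1/file.html";
anchorLink.href = aventail.translate_url("/dir1/file.html");

// SUBARGS rule ".open": the stem "popupWindow" becomes the first argument.
// Original:    const openResult = popupWindow.open("/dir1/popup.html");
const openResult = aventail.objOpen(popupWindow, "/dir1/popup.html");

console.log(anchorLink.href); // prints "/alias/dir1/file.html"
console.log(openResult);      // prints "opened /dir1/popup.html"
```

Because the stem is passed in as an argument, the library function gets a chance to inspect the object before acting on it, as the comments in the rules file explain.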

Recommendations

  1. Do not use DOM references as variable names. For example, do not call any of your variables “location”. See the translation rules above to know what to avoid.
  2. Avoid the “with” construct: with(object) {statements}.
  3. Avoid passing DOM objects as parameters to functions. For example, avoid writing functions of the form:

function test(mywin) { mywin.location = “ }

Instead, make sure that the network-sensitive JavaScript appears verbatim, e.g.

window.location = “

In other words, do not hide the names of the underlying DOM objects.

  4. Do not set a base tag using JavaScript. This invalidates all the translated URLs on the page.
  5. Do not use conditional compilation for Internet Explorer (e.g. “@if …”).
  6. Do not use Microsoft Script Encoding (e.g. language “JScript.Encode”).

VBScript translation

VBScript translation is no longer supported.

Java applet, ActiveX and Flash translation

No explicit translation of Java applets, ActiveX or Flash objects is performed. If possible, avoid using them entirely.

Recommendations

  1. If it is not possible to avoid using these objects entirely, consider constructing the network references they need from the URL of the page they are on. Perform this construction dynamically at run time.
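The idea of deriving network references from the page URL can be sketched as follows. The function name, the appliance hostname `vpn.example.com`, the alias `morty`, and the file names are assumptions; in an applet or Flash object the page URL would come from the embedding environment rather than from a string literal.

```javascript
// Build the URL of a resource from the URL of the page that hosts the object.
// Because the page URL already points at the appliance and carries the alias,
// the derived URL does too.
function resourceUrlFromPage(pageHref, relativeRef) {
  return new URL(relativeRef, pageHref).href;
}

console.log(resourceUrlFromPage("https://vpn.example.com/morty/app/index.html", "data/feed.xml"));
// prints "https://vpn.example.com/morty/app/data/feed.xml"
```

The same code works unchanged inside the corporate network, where the page URL points directly at the internal server.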

XML translation

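Because XML element and attribute names are defined by each application, the translation engine cannot tell on its own which parts of an XML document contain URLs. You need to identify the portions of the XML content that require translation. This is done in the file: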

/usr/local/extranet/etc/custom-xmltrans.cfg

The format of the rules to add to this file is:

ELEMENT ATTR1 ATTR2 ... ATTRn

This instructs the translation engine to look for element “ELEMENT” in the XML and to translate its attributes “ATTR1”, “ATTR2”, ..., “ATTRn”. These attributes are expected to contain URLs.
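As an illustration of the rule format, the sketch below reads rule lines into a lookup from element name to attribute names. The element and attribute names (`link`, `image`, `href`, `src`, `thumbnail`) are made up, and the parsing shown is only an assumption about how such a file could be read, not a description of the appliance's own parser.

```javascript
// Read "ELEMENT ATTR1 ATTR2 ... ATTRn" lines into a Map: element -> attributes.
// Lines with fewer than two fields are skipped.
function parseXmlRules(text) {
  const rules = new Map();
  for (const line of text.split("\n")) {
    const fields = line.trim().split(/\s+/).filter((f) => f !== "");
    if (fields.length < 2) continue;
    rules.set(fields[0], fields.slice(1));
  }
  return rules;
}

// Tell whether a given attribute of a given element is marked for translation.
function isMarkedForTranslation(rules, element, attribute) {
  const attrs = rules.get(element);
  return attrs !== undefined && attrs.includes(attribute);
}

const sampleXmlRules = parseXmlRules("link href\nimage src thumbnail\n");
console.log(isMarkedForTranslation(sampleXmlRules, "image", "thumbnail")); // prints true
console.log(isMarkedForTranslation(sampleXmlRules, "image", "alt"));       // prints false
```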

Recommendations

  1. Add your XML translation rules to custom-xmltrans.cfg

Web aliases

Web aliases are declared when you configure a resource. They are used to hide the hostname of the internal server.

Recommendations

  1. Avoid using the same name for the alias as for the top-level directory of your application. For example, if the top-level directory of your web application is “coolapp”, do not use “coolapp” as the alias for the Aventail resource.

Miscellaneous

Referrer lookup

When a request comes in for an absolute or relative URL that has no matching alias, the Aventail Web Translation server looks at the “Referer” HTTP header or the referrer cookie that it sets. This header or cookie is used to assemble the correct destination URL. This is a best-effort attempt and should not be relied upon.
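One plausible reading of this fallback is sketched below: the alias is recovered from the first path segment of the “Referer” URL and prepended to the request path. This is an assumption about the mechanism, not a description of the actual implementation; the function names, the hostname `vpn.example.com`, and the alias `morty` are hypothetical, and the referrer cookie is not modeled.

```javascript
// Extract a known alias from the first path segment of the Referer URL.
function aliasFromReferer(refererHeader, knownAliases) {
  if (!refererHeader) return null;
  let pathname;
  try {
    pathname = new URL(refererHeader).pathname;
  } catch (e) {
    return null; // not a parseable absolute URL
  }
  const first = pathname.split("/")[1];
  return knownAliases.has(first) ? first : null;
}

// Prepend the recovered alias to a request path that arrived without one.
function assembleFromReferer(requestPath, refererHeader, knownAliases) {
  const alias = aliasFromReferer(refererHeader, knownAliases);
  return alias === null ? null : "/" + alias + requestPath;
}

console.log(assembleFromReferer(
  "/images/logo.gif",
  "https://vpn.example.com/morty/index.html",
  new Set(["morty"])
));
// prints "/morty/images/logo.gif"
```

The sketch returns null when the header is missing or names no known alias, which is one reason an application should not depend on this fallback.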