<-
Apache > HTTP Server > Documentation > Version 2.4 > Modules

Apache Module mod_proxy_html

Available Languages:  en  |  fr 

Description:Rewrite HTML links in to ensure they are addressable from Clients' networks in a proxy context.
Status:Base
Module Identifier:proxy_html_module
Source File:mod_proxy_html.c
Compatibility:Version 2.4 and later. Available as a third-party module for earlier 2.x versions

Summary

This module provides an output filter to rewrite HTML links in a proxy situation, to ensure that links work for users outside the proxy. It serves the same purpose as Apache's ProxyPassReverse directive does for HTTP headers, and is an essential component of a reverse proxy.

For example, if a company has an application server at appserver.example.com that is only visible from within the company's internal network, and a public webserver www.example.com, they may wish to provide a gateway to the application server at http://www.example.com/appserver/. When the application server links to itself, those links need to be rewritten to work through the gateway. mod_proxy_html serves to rewrite <a href="http://appserver.example.com/foo/bar.html">foobar</a> to <a href="http://www.example.com/appserver/foo/bar.html">foobar</a> making it accessible from outside.

mod_proxy_html was originally developed at WebÞing, whose extensive documentation may be useful to users.

Support Apache!

Directives

Bugfix checklist

See also

top

ProxyHTMLBufSize Directive

Description:Sets the buffer size increment for buffering inline scripts and stylesheets.
Syntax:ProxyHTMLBufSize bytes
Default:ProxyHTMLBufSize 8192
Context:server config, virtual host, directory
Status:Base
Module:mod_proxy_html
Compatibility:Version 2.4 and later; available as a third-party for earlier 2.x versions

In order to parse non-HTML content (stylesheets and scripts) embedded in HTML documents, mod_proxy_html has to read the entire script or stylesheet into a buffer. This buffer will be expanded as necessary to hold the largest script or stylesheet in a page, in increments of bytes as set by this directive.

The default is 8192, and will work well for almost all pages. However, if you know you're proxying pages containing stylesheets and/or scripts bigger than 8K (that is, for a single script or stylesheet, NOT in total), it will be more efficient to set a larger buffer size and avoid the need to resize the buffer dynamically during a request.

top

ProxyHTMLCharsetOut Directive

Description:Specify a charset for mod_proxy_html output.
Syntax:ProxyHTMLCharsetOut Charset | *
Context:server config, virtual host, directory
Status:Base
Module:mod_proxy_html
Compatibility:Version 2.4 and later; available as a third-party for earlier 2.x versions

This selects an encoding for mod_proxy_html output. It should not normally be used, as any change from the default UTF-8 (Unicode - as used internally by libxml2) will impose an additional processing overhead. The special token ProxyHTMLCharsetOut * will generate output using the same encoding as the input.

Note that this relies on mod_xml2enc being loaded.

top

ProxyHTMLDocType Directive

Description:Sets an HTML or XHTML document type declaration.
Syntax:ProxyHTMLDocType HTML|XHTML [Legacy]
OR
ProxyHTMLDocType fpi [SGML|XML]
Context:server config, virtual host, directory
Status:Base
Module:mod_proxy_html
Compatibility:Version 2.4 and later; available as a third-party for earlier 2.x versions

In the first form, documents will be declared as HTML 4.01 or XHTML 1.0 according to the option selected. This option also determines whether HTML or XHTML syntax is used for output. Note that the format of the documents coming from the backend server is immaterial: the parser will deal with it automatically. If the optional second argument is set to Legacy, documents will be declared "Transitional", an option that may be necessary if you are proxying pre-1998 content or working with defective authoring/publishing tools.

In the second form, it will insert your own FPI. The optional second argument determines whether SGML/HTML or XML/XHTML syntax will be used.

The default is changed to omitting any FPI, on the grounds that no FPI is better than a bogus one. If your backend generates decent HTML or XHTML, set it accordingly.

If the first form is used, mod_proxy_html will also clean up the HTML to the specified standard. It cannot fix every error, but it will strip out bogus elements and attributes. It will also optionally log other errors at