<-
Apache > HTTP Server > Documentation > Version 2.4 > Modules

Apache Module mod_mime

Available Languages:  en  |  fr  |  ja 

Description:Associates the requested filename's extensions with the file's behavior (handlers and filters) and content (mime-type, language, character set and encoding)
Status:Base
Module Identifier:mime_module
Source File:mod_mime.c

Summary

This module is used to assign content metadata to the content selected for an HTTP response by mapping patterns in the URI or filenames to the metadata values. For example, the filename extensions of content files often define the content's Internet media type, language, character set, and content-encoding. This information is sent in HTTP messages containing that content and used in content negotiation when selecting alternatives, such that the user's preferences are respected when choosing one of several possible contents to serve. See mod_negotiation for more information about content negotiation.

The directives AddCharset, AddEncoding, AddLanguage and AddType are all used to map file extensions onto the metadata for that file. Respectively they set the character set, content-encoding, content-language, and media-type (content-type) of documents. The directive TypesConfig is used to specify a file which also maps extensions onto media types.

In addition, mod_mime may define the handler and filters that originate and process content. The directives AddHandler, AddOutputFilter, and AddInputFilter control the modules or scripts that serve the document. The MultiviewsMatch directive allows mod_negotiation to consider these file extensions to be included when testing Multiviews matches.

While mod_mime associates metadata with filename extensions, the core server provides directives that are used to associate all the files in a given container (e.g., <Location>, <Directory>, or <Files>) with particular metadata. These directives include ForceType, SetHandler, SetInputFilter, and SetOutputFilter. The core directives override any filename extension mappings defined in mod_mime.

Note that changing the metadata for a file does not change the value of the Last-Modified header. Thus, previously cached copies may still be used by a client or proxy, with the previous headers. If you change the metadata (language, content type, character set or encoding) you may need to 'touch' affected files (updating their last modified date) to ensure that all visitors are receive the corrected content headers.

Support Apache!

Topics

Directives

Bugfix checklist

See also

top

Files with Multiple Extensions

Files can have more than one extension; the order of the extensions is normally irrelevant. For example, if the file welcome.html.fr maps onto content type text/html and language French then the file welcome.fr.html will map onto exactly the same information. If more than one extension is given that maps onto the same type of metadata, then the one to the right will be used, except for languages and content encodings. For example, if .gif maps to the media-type image/gif and .html maps to the media-type text/html, then the file welcome.gif.html will be associated with the media-type text/html.

Languages and content encodings are treated accumulative, because one can assign more than one language or encoding to a particular resource. For example, the file welcome.html.en.de will be delivered with Content-Language: en, de and Content-Type: text/html.

Care should be taken when a file with multiple extensions gets associated with both a media-type and a handler. This will usually result in the request being handled by the module associated with the handler. For example, if the .imap extension is mapped to the handler imap-file (from mod_imagemap) and the .html extension is mapped to the media-type text/html, then the file world.imap.html will be associated with both the imap-file handler and text/html media-type. When it is processed, the imap-file handler will be used, and so it will be treated as a mod_imagemap imagemap file.

If you would prefer only the last dot-separated part of the filename to be mapped to a particular piece of meta-data, then do not use the Add* directives. For example, if you wish to have the file foo.html.cgi processed as a CGI script, but not the file bar.cgi.html, then instead of using AddHandler cgi-script .cgi, use

Configure handler based on final extension only

<FilesMatch "[^.]+\.cgi$">
  SetHandler cgi-script
</FilesMatch>
top

Content encoding

A file of a particular media-type can additionally be encoded a particular way to simplify transmission over the Internet. While this usually will refer to compression, such as gzip, it can also refer to encryption, such a pgp or to an encoding such as UUencoding, which is designed for transmitting a binary file in an ASCII (text) format.

The HTTP/1.1 RFC, section 14.11 puts it this way:

The Content-Encoding entity-header field is used as a modifier to the media-type. When present, its value indicates what additional content codings have been applied to the entity-body, and thus what decoding mechanisms must be applied in order to obtain the media-type referenced by the Content-Type header field. Content-Encoding is primarily used to allow a document to be compressed without losing the identity of its underlying media type.

By using more than one file extension (see section above about multiple file extensions), you can indicate that a file is of a particular type, and also has a particular encoding.

For example, you may have a file which is a Microsoft Word document, which is pkzipped to reduce its size. If the .doc extension is associated with the Microsoft Word file type, and the .zip extension is associated with the pkzip file encoding, then the file Resume.doc.zip would be known to be a pkzip'ed Word document.

Apache sends a Content-encoding header with the resource, in order to tell the client browser about the encoding method.

Content-encoding: pkzip
top

Character sets and languages

In addition to file type and the file encoding, another important piece of information is what language a particular document is in, and in what character set the file should be displayed. For example, the document might be written in the Vietnamese alphabet, or in Cyrillic, and should be displayed as such. This information, also, is transmitted in HTTP headers.

The character set, language, encoding and mime type are all used in the process of content negotiation (See mod_negotiation) to determine which document to give to the client, when there are alternative documents in more than one character set, language, encoding or mime type. All filename extensions associations created with AddCharset, AddEncoding, AddLanguage and AddType directives (and extensions listed in the MimeMagicFile) participate in this select process. Filename extensions that are only associated using the AddHandler, AddInputFilter or AddOutputFilter directives may be included or excluded from matching by using the MultiviewsMatch directive.

Charset

To convey this further information, Apache optionally sends a Content-Language header, to specify the language that the document is in, and can append additional information onto the Content-Type header to indicate the particular character set that should be used to correctly render the information.

Content-Language: en, fr Content-Type: text/plain; charset=ISO-8859-1

The language specification is the two-letter abbreviation for the language. The charset is the name of the particular character set which should be used.

top

AddCharset Directive

Description:Maps the given filename extensions to the specified content charset
Syntax:AddCharset charset extension [extension] ...
Context:server config, virtual host, directory, .htaccess
Override:FileInfo
Status:Base
Module:mod_mime

The AddCharset directive maps the given filename extensions to the specified content charset (the Internet registered name for a given character encoding). charset is the media type's charset parameter for resources with filenames containing extension. This mapping is added to any already in force, overriding any mappings that already exist for the same extension.

Example

AddLanguage ja .ja
AddCharset EUC-JP .euc
AddCharset ISO-2022-JP .jis
AddCharset SHIFT_JIS .sjis

Then the document xxxx.ja.jis will be treated as being a Japanese document whose charset is ISO-2022-JP (as will the document xxxx.jis.ja). The AddCharset directive is useful for both to inform the client about the character encoding of the document so that the document can be interpreted and displayed appropriately, and for content negotiation, where the server returns one from several documents based on the client's charset preference.

The extension argument is case-insensitive and can be specified with or without a leading dot. Filenames may have multiple extensions and the extension argument will be compared against each of them.

See also

top