trac.util.html

escape(str, quotes=True)

Create a Markup instance from a string and escape special characters it may contain (<, >, & and ").

>>> escape('"1 < 2"')
Markup(u'&#34;1 &lt; 2&#34;')

>>> escape(['"1 < 2"'])
Markup(u"['&#34;1 &lt; 2&#34;']")

If the quotes parameter is set to False, the " character is left as is. Escaping quotes is generally only required for strings that are to be used in attribute values.

>>> escape('"1 < 2"', quotes=False)
Markup(u'"1 &lt; 2"')

>>> escape(['"1 < 2"'], quotes=False)
Markup(u'[\'"1 &lt; 2"\']')
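The escaping rules shown above can be approximated in a few lines of plain Python 3. This is only an illustrative sketch, not Trac's implementation: `escape_sketch` is a hypothetical name, and it returns an ordinary `str` rather than a Markup instance.

```python
def escape_sketch(text, quotes=True):
    """Escape &, <, > and (optionally) " in the string form of `text`."""
    # Non-string input is converted to a string first.
    text = str(text)
    # '&' is replaced first so that the '&' of the entities added
    # afterwards is not escaped a second time.
    text = text.replace('&', '&amp;')
    text = text.replace('<', '&lt;').replace('>', '&gt;')
    if quotes:
        # Only needed when the result ends up inside an attribute value.
        text = text.replace('"', '&#34;')
    return text


print(escape_sketch('"1 < 2"'))
print(escape_sketch('"1 < 2"', quotes=False))
```

Replacing `&` before the other characters is the one ordering constraint; any other order would turn `&lt;` into `&amp;lt;`.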

However, escape behaves slightly differently with Markup and Fragment instances, as they are passed through unmodified.

>>> escape(Markup('"1 < 2 &#39;"'))
Markup(u'"1 < 2 &#39;"')

>>> escape(Markup('"1 < 2 &#39;"'), quotes=False)
Markup(u'"1 < 2 &#39;"')

>>> escape(tag.b('"1 < 2"'))
Markup(u'<b>"1 &lt; 2"</b>')

>>> escape(tag.b('"1 < 2"'), quotes=False)
Markup(u'<b>"1 &lt; 2"</b>')
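The pass-through behaviour can be pictured with a marker type for text that is already safe. The sketch below assumes Python 3; `SafeText` is a hypothetical stand-in for Markup and `escape_passthrough_sketch` is an invented name. It does not model Fragment rendering.

```python
import html


class SafeText(str):
    """Hypothetical stand-in for Markup: text already safe to output."""


def escape_passthrough_sketch(text, quotes=True):
    # Text that is already marked as safe is returned as is, which
    # prevents double escaping.
    if isinstance(text, SafeText):
        return text
    escaped = html.escape(str(text), quote=False)  # handles &, < and >
    if quotes:
        escaped = escaped.replace('"', '&#34;')
    return SafeText(escaped)


already_safe = SafeText('"1 < 2 &#39;"')
print(escape_passthrough_sketch(already_safe))
print(escape_passthrough_sketch('"1 < 2"'))
```

Because the result is itself marked as safe, applying the function twice gives the same output as applying it once.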

Parameters:

text - the string to escape; if not a string, it is assumed that the input can be converted to a string
quotes - if True, double quote characters are escaped in addition to the other special characters

Returns: Markup

the escaped Markup string

unescape(text)

Reverse-escapes &, <, >, and " and returns a unicode object.

>>> unescape(Markup('1 &lt; 2'))
u'1 < 2'

If the provided text object is not a Markup instance, it is returned unchanged.

>>> unescape('1 &lt; 2')
'1 &lt; 2'
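The type check described above might look roughly like this in Python 3. Again, `SafeText` is a hypothetical stand-in for Markup and `unescape_sketch` is an invented name; this illustrates the rule and is not the module's code.

```python
class SafeText(str):
    """Hypothetical stand-in for Markup: text that has been escaped."""


def unescape_sketch(text):
    # Only text known to be escaped is reversed; anything else is
    # returned unchanged.
    if not isinstance(text, SafeText):
        return text
    # '&amp;' is handled last so that '&amp;lt;' becomes '&lt;'
    # rather than '<'.
    return (str(text)
            .replace('&#34;', '"')
            .replace('&gt;', '>')
            .replace('&lt;', '<')
            .replace('&amp;', '&'))


print(unescape_sketch(SafeText('1 &lt; 2')))
print(unescape_sketch('1 &lt; 2'))
```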

Parameters:

text - the text to unescape

Returns: unicode

the unescaped string

stripentities(text, keepxmlentities=False)

Return a copy of the given text with any character or numeric entities replaced by the equivalent Unicode characters.

>>> stripentities('1 &lt; 2')
u'1 < 2'
>>> stripentities('more &hellip;')
u'more \u2026'
>>> stripentities('&#8230;')
u'\u2026'
>>> stripentities('&#x2026;')
u'\u2026'
>>> stripentities(Markup(u'\u2026'))
u'\u2026'

If the keepxmlentities parameter is true, the core XML entities (&amp;, &apos;, &gt;, &lt; and &quot;) are left intact.

>>> stripentities('1 &lt; 2 &hellip;', keepxmlentities=True)
u'1 &lt; 2 \u2026'
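One way to picture the replacement is a single regular expression that matches both numeric and named references. The sketch below is an assumption-laden illustration for Python 3 built on the standard library's `html.entities` table; `strip_entities_sketch` is a hypothetical name, and leaving unknown entity names untouched is a choice made here, not documented behaviour.

```python
import re
from html.entities import name2codepoint

# Group 1: numeric reference (decimal or hex); group 2: named entity.
_ENTITY_RE = re.compile(r'&(?:#(\d+|[xX][0-9a-fA-F]+);?|(\w+);)')
_XML_ENTITIES = ('amp', 'apos', 'lt', 'gt', 'quot')


def strip_entities_sketch(text, keepxmlentities=False):
    def replace(match):
        numeric, name = match.group(1), match.group(2)
        if numeric:
            if numeric[0] in 'xX':
                return chr(int(numeric[1:], 16))
            return chr(int(numeric, 10))
        if keepxmlentities and name in _XML_ENTITIES:
            return match.group(0)  # core XML entity: keep as written
        if name in name2codepoint:
            return chr(name2codepoint[name])
        return match.group(0)      # unknown name: left untouched here
    return _ENTITY_RE.sub(replace, text)


print(strip_entities_sketch('1 &lt; 2 &hellip;'))
print(strip_entities_sketch('1 &lt; 2 &hellip;', keepxmlentities=True))
```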

Returns: unicode

a unicode instance with entities removed

striptags(text)

Return a copy of the text with any XML/HTML tags removed.

>>> striptags('<span>Foo</span> bar')
u'Foo bar'
>>> striptags('<span class="bar">Foo</span>')
u'Foo'
>>> striptags('Foo<br />')
u'Foo'

HTML/XML comments are stripped, too:

>>> striptags('<!-- <blub>hehe</blah> -->test')
u'test'
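Tag and comment removal of this kind is commonly done with one regular expression. The following is a minimal Python 3 sketch under that assumption; `strip_tags_sketch` is a hypothetical name and the pattern is not guaranteed to match the module's own.

```python
import re

# The comment alternative comes first so that tags inside a comment
# are removed together with the comment itself.
_TAG_RE = re.compile(r'<!--.*?-->|<[^>]*>', re.DOTALL)


def strip_tags_sketch(text):
    return _TAG_RE.sub('', text)


print(strip_tags_sketch('<span>Foo</span> bar'))
print(strip_tags_sketch('<!-- <blub>hehe</blah> -->test'))
```

A regular expression like this only removes markup; it does not validate it, so it should not be treated as a sanitizer for untrusted HTML.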

Parameters:

text - the string to remove tags from

Returns: unicode

a unicode instance with all tags removed

plaintext(text, keeplinebreaks=True)

Extract the text elements from (X)HTML content.

>>> plaintext('<b>1 &lt; 2</b>')
u'1 < 2'

>>> plaintext(tag('1 ', tag.b('<'), ' 2'))
u'1 < 2'

>>> plaintext('''<b>1
... &lt;
... 2</b>''', keeplinebreaks=False)
u'1 < 2'

Parameters:

text - unicode or Fragment
keeplinebreaks - if True (the default), line breaks in the text are kept; if False, they are replaced by spaces
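
For string input, the examples above are consistent with removing tags, decoding entities, and then optionally replacing line breaks. The Python 3 sketch below assumes exactly that sequence and uses the standard library's `html.unescape`; `plaintext_sketch` is a hypothetical name, and Fragment input is not handled.

```python
import html
import re

_TAG_RE = re.compile(r'<!--.*?-->|<[^>]*>', re.DOTALL)


def plaintext_sketch(text, keeplinebreaks=True):
    # Tags are removed before entities are decoded, so a decoded '<'
    # is never mistaken for the start of a tag.
    text = _TAG_RE.sub('', text)
    text = html.unescape(text)
    if not keeplinebreaks:
        text = text.replace('\n', ' ')
    return text


print(plaintext_sketch('<b>1 &lt; 2</b>'))
print(plaintext_sketch('<b>1\n&lt;\n2</b>', keeplinebreaks=False))
```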

Classes
	TracHTMLSanitizer Sanitize HTML constructions in user-supplied HTML that are potential vectors for phishing or XSS attacks.
	Deuglifier Helper base class used for cleaning up HTML riddled with `<FONT COLOR=...>` tags, replacing them with appropriate `<span class="...">` elements.
	FormTokenInjector Identify and protect forms from CSRF attacks.

Variables
	html = `ElementFactory()`
	tag = `ElementFactory()`
