Formatting Text

Twitter · Blog · Discussions

In the last chapter, our browser created a graphical window and drew a grid of characters to it. That’s OK for Chinese, but English text features characters of different widths grouped into words that you can’t break across lines.There are lots of languages in the world, and lots of typographic conventions. A real web browser supports every language from Arabic to Zulu, but this book focuses on English. Text is near-infinitely complex, but this book cannot be infinitely long! In this chapter, we’ll add those capabilities. You’ll even be able to read this chapter in your browser!

What is a Font?

So far, we’ve called create_text with a character and two coordinates to write text to the screen. But we never specified its font, size, or style. To talk about those things, we need to create and use font objects.

What is a font, exactly? Well, in the olden days, printers arranged little metal slugs on rails, covered them with ink, and pressed them to a sheet of paper, creating a printed page (see Figure 1). The metal shapes came in boxes, one per letter, so you’d have a (large) box of e’s, a (small) box of x’s, and so on. The boxes came in cases (see Figure 2), one for upper-case and one for lower-case letters. The set of cases was called a font.The word is related to foundry, which would create the little metal shapes. Naturally, if you wanted to print larger text, you needed different (bigger) shapes, so those were a different font; a collection of fonts was called a type, which is why we call it typing. Variations—like bold or italic letters—were called that type’s “faces”.

Figure 1: A drawing of printing press workers. (By Daniel Nikolaus Chodowiecki. Wikipedia, public domain.)
Figure 2: Metal types in letter cases and a composing stick. (By Willi Heidelbach. Wikipedia, CC BY 2.5.)

This nomenclature reflects the world of the printing press: metal shapes in boxes in cases from different foundries. Our modern world instead has dropdown menus, and the old words no longer match it. “Font” can now mean font, typeface, or type,Let alone “font family”, which can refer to larger or smaller collections of types. and we say a font contains several different weights (like “bold” and “normal”),But sometimes other weights as well, like “light”, “semibold”, “black”, and “condensed”. Good fonts tend to come in many weights. several different styles (like “italic” and “roman”, which is what not-italic is called),Sometimes there are other options as well, like maybe there’s a small-caps version; these are sometimes called options as well. And don’t get me started on automatic versus manual italics. and arbitrary sizes.A font looks especially good at certain sizes where hints tell the computer how best to align it to the pixel grid. Welcome to the world of magic ink.This term comes from an essay by Bret Victor that discusses how the graphical possibilities of computers can make for better and easier-to-use applications.

Yet Tk’s font objects correspond to the older meaning of font: a type at a fixed size, style, and weight. For example:You can only create Font objects, or any other kinds of Tk objects, after calling tkinter.Tk(), and you need to import tkinter.font separately.

import tkinter.font
window = tkinter.Tk()
bi_times = tkinter.font.Font(
    family="Times",
    size=16,
    weight="bold",
    slant="italic",
)

Your computer might not have “Times” installed; you can list the available fonts with tkinter.font.families() and pick something else.

Font objects can be passed to create_text’s font argument:

canvas.create_text(200, 100, text="Hi!", font=bi_times)

In the olden times, American typesetters kept their boxes of metal shapes arranged in a California job case, which combined lower- and upper-case letters side by side in one case, making typesetting easier. The upper-/lower-case nomenclature dates from centuries earlier.

Measuring Text

Text takes up space vertically and horizontally, and the font object’s metrics and measure methods measure that space:On your computer, you might get different numbers. That’s right—text rendering is OS-dependent, because it is complex enough that everyone uses one of a few libraries to do it, usually libraries that ship with the OS. That’s why macOS fonts tend to be “blurrier” than the same font on Windows: different libraries make different trade-offs.

>>> bi_times.metrics()
{'ascent': 15, 'descent': 4, 'linespace': 19, 'fixed': 0}
>>> bi_times.measure("Hi!")
24

The metrics call yields information about the vertical dimensions of the text (see Figure 3): the linespace is how tall the text is, which includes an ascent which goes “above the line” and a descent that goes “below the line”.The fixed parameter is actually a boolean and tells you whether all letters are the same width, so it doesn’t really fit here. The ascent and descent matter when words in different sizes sit on the same line: they ought to line up “along the line”, not along their tops or bottoms.

Figure 3: The various vertical metrics of a font. All glyphs in a font share the same ascent, x-height, and descent, and are laid out on a shared baseline. However, the measure (or advance) of glyphs can differ.

Let’s dig deeper. Remember that bi_times is size-16 Times: why does font.metrics report that it is actually 19 pixels tall? Well, first of all, a size of 16 means 16 points, which are defined as 72nds of an inch, not 16 pixels,Actually, the definition of a “point” is a total mess, with many different length units all called “point” around the world. The Wikipedia page has the details, but a traditional American/British point is actually slightly less than 1/72 of an inch. The 1/72 standard comes from PostScript, but some systems predate it; TeX , for example, hews closer to the traditional point, approximating it as 1/72.27 of an inch. which your monitor probably has around 100 of per inch.Tk doesn’t use points anywhere else in its API. It’s supposed to use pixels if you pass it a negative number, but that doesn’t appear to work. Those 16 points measure not the individual letters but the metal blocks the letters were once carved from, so the letters themselves must be less than 16 points. In fact, different size-16 fonts have letters of varying heights:You might even notice that Times has different metrics in this code block than in the earlier one where we specified a bold, italic Times font. The bold, italic Times font is taller, at least on my current macOS system!

>>> tkinter.font.Font(family="Courier", size=16).metrics()
{'fixed': 1, 'ascent': 13, 'descent': 4, 'linespace': 17}
>>> tkinter.font.Font(family="Times", size=16).metrics()
{'fixed': 0, 'ascent': 14, 'descent': 4, 'linespace': 18}
>>> tkinter.font.Font(family="Helvetica", size=16).metrics()
{'fixed': 0, 'ascent': 15, 'descent': 4, 'linespace': 19}

The measure() method is more direct: it tells you how much horizontal space text takes up, in pixels. This depends on the text, of course, since different letters have different widths:Note that the sum of the individual letters’ lengths is not the length of the word. Tk uses fractional pixels internally, but rounds up to return whole pixels in the measure call. Plus, some fonts use something called kerning to shift letters a little bit when particular pairs of letters are next to one another, or even shaping to make two letters look one glyph.

>>> bi_times.measure("Hi!")
24
>>> bi_times.measure("H")
13
>>> bi_times.measure("i")
5
>>> bi_times.measure("!")
7
>>> 13 + 5 + 7
25

You can use this information to lay text out on the page. For example, suppose you want to draw the text “Hello, world!” in two pieces, so that “world!” is italic. Let’s use two fonts:

font1 = tkinter.font.Font(family="Times", size=16)
font2 = tkinter.font.Font(family="Times", size=16, slant='italic')

We can now lay out the text, starting at (200, 200):

x, y = 200, 200
canvas.create_text(x, y, text="Hello, ", font=font1)
x += font1.measure("Hello, ")
canvas.create_text(x, y, text="world!", font=font2)

You should see “Hello,” and “world!”, correctly aligned and with the second word italicized.

Unfortunately, this code has a bug, though one masked by the choice of example text: replace “world!” with “overlapping!” and the two words will overlap. That’s because the coordinates x and y that you pass to create_text tell Tk where to put the center of the text. It only worked for “Hello, world!” because “Hello,” and “world!” are the same length!

Luckily, the meaning of the coordinate you pass in is configurable. We can instruct Tk to treat the coordinate we gave as the top-left corner of the text by setting the anchor argument to "nw", meaning the “northwest” corner of the text: