PDF and eBooks: Linking Form and Content

By Pat Coyne

Paper is giving way to digital. Every newspaper with half an eye on its market now has a Web site. Three years ago, only 10-15 percent of scientific journals were online. Now that figure is more than 70 percent and soon, many predict, it will be virtually 100 percent. Encyclopaedia publishers don't bother with those huge volumes any more; they put their product out on CD-ROM and on the Internet. Dictionary publishers are starting to do the same. It's only a matter of time before all books are published electronically and the forests of the world can grow undisturbed.

That, if you believe the techno-prophets, is the future. Like many such visions, it contains enough truth to be plausible; but in its mixture of arrogance and ignorance, it borders on the risible.

The eBooks are Coming!

Electronic books will come. Indeed, they are here already. My company, among others, already publishes a whole range of titles, from classic literature to scientific treatises. So far, eBook publishers are mainly small, but the idea has begun to attract some larger players. Adobe Systems has announced its PDF Merchant and Web Buy products, aimed at what it describes as the "burgeoning eBooks Market," which will enable publishers to sell eBooks on the Internet while maintaining their intellectual property rights. As ever, not slow when it sniffs a buck to be made, Microsoft has announced its ClearType technology and its own Reader, which will be used in conjunction with the XML-based Open eBook standard, which numbers Microsoft among its originators.

Now eBooks may be here -- but will they be read? Even more importantly, will they be bought? I (and every other eBook publisher) certainly hope so; but in my view, if reading books on screen is to be more than a hard-on-the-eye novelty, then publishers and the computer industry alike will have to change and improve radically the way they deliver text.

Why? Because of the competition. Paper is the best medium ever invented. It is light, cheap, capable of very high resolution (3000 dpi or more, 30 times better than the best screens currently available), renewable and needs no expensive device to read it. You can read it anywhere, flip through hundreds of pages in seconds, fold down the edge of pages for bookmarks, write all over it and even drop it from a great height without damage. And, if all else fails, you can keep yourself warm with it.

Compared to paper, computers barely get to first base. The problem is compounded by the prevalent screen design philosophy that emphasises discrete elements, aimed at grabbing attention, rather than allowing the eye to read smoothly and sequentially through the text. The few advantages computers do have -- mainly searchability and speed of delivery -- are not much use if what is being delivered is a low-resolution, ill-designed mess.

In my view, the first step toward making the electronic book a practical -- let alone a desirable -- reality is a certain humility. The last five centuries of paper book publishing have not been a mere prelude to the arrival of the digital version. On the contrary, publishers of electronic books ignore the lessons of paper at their peril.

Appearance is No Accident

The first lesson: Form and content are inextricably linked. There is no such thing -- at least as far as a thinking, breathing human being is concerned -- as pure "information", independent of the form in which it is presented. There is, in short, a fundamental distinction between the "book" and the "text". The text is that element which most closely corresponds to pure information and can be a digital signal, Morse code, ASCII characters or whatever, but has no concrete meaning until it can be perceived. The book is the method by which the text is conveyed to the reader. It follows that the appearance of a book is no accident. The way it looks is an integral part of the information being imparted. Screenloads of undifferentiated ASCII will not be read with the same ease and enjoyment as a crisply printed page. Good layout improves comprehension; bad layout hinders it -- sometimes even prevents it.


My particular hero (in this area, at least) is William Morris, who produced some of the most beautiful books of the 19th century. He designed his own typeface, constructed and operated the printing press, had the paper and inks made to his own specification and hand-crafted the bindings. He not only specified the layout of text on the page, but also the size and proportions of the margins. Even white space, according to Morris, had its role.

A Different Mindset

The second lesson is that books are read sequentially -- you start somewhere and you read through to the point where you finish. The typography must reflect that; it must guide the reader's eye through the text. That means not only type that is clear and uncluttered, but also a proper hierarchy of headings, intros, subheads and, yes, margins -- so that the reader can gauge, almost unconsciously, the relative importance of various elements of the text and take in its meaning with the minimum effort. Contrast that with the average Web page with its riot of colour, animations and text all over the place. You need a different mindset to design a good book page. You also need much higher resolution. You can get away with low resolution for small bits of text, but not for large blocks. Try reading more than a screen or two of text at standard VGA (640 by 480) and you will see what I mean.

So is the way to make the ideal electronic book simply to take the paper version and create an exact digital facsimile? Not quite. There are, in my view, important differences between screen and paper that need to be taken into account:

1. Typography:

There is a general feeling that sans-serif fonts are more readable on screen (the opposite is true for print). I'm not convinced, and in any event some texts, particularly novels, just don't look right in sans. In our experience, both serif and sans faces can be used, but generally the serif faces need to be rather weightier (i.e. thicker) than the typical Times of print. We tend to use a variant of Century (Century 731). Conversely, we find that the most used sans-serif fonts (Helvetica, Arial) are rather too weighty for large-scale screen text. Verdana is much better, but who wants to ape Microsoft? We prefer something lighter, like a News Gothic.

2. Leading:

The space between lines needs to be larger than for print. The minimum proportions should be something like 10/15 pt or 11/16 pt, and even more leading can help readability. On the other hand, too much white space slows down reading and can give either an over-designed or even slightly childish appearance to the text.

3. Line and page length:

This is a tricky issue. Purely subjectively, we find that a line with slightly fewer words than the average printed line is more readable on-screen. We prefer line lengths of around 70 characters, rather than the 80-plus of a typical book. As for page lengths, the obvious option might be to make the page the same shape as the screen; but, ignoring the fact that differing screen resolutions have differing ratios of height to width, there is the undoubted fact that book pages do not look the way they do by accident. A typical printed page has a ratio of height to width of between 3:2 and 4:3. Pages with that sort of ratio are simply easier to read than those that are wider than they are high -- i.e. the shape of a screen. It may simply be acculturation, but I think there is more to it than that -- something to do with the Greek idea of the Golden Mean, perhaps. In our own publications we use normal book page ratios.
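
The proportions in points 2 and 3 are simple arithmetic, and a small Python sketch may make them concrete. The function names here are purely illustrative and do not come from any publishing tool. Counting characters is also only a rough stand-in for measuring a line set in a proportional typeface.

```python
import textwrap


def leading_ratio(type_size_pt, leading_pt):
    """Line spacing expressed as a multiple of the type size (10/15 pt -> 1.5)."""
    return leading_pt / type_size_pt


def page_height(width, ratio=(3, 2)):
    """Height of a page of the given width for a height:width ratio."""
    height_part, width_part = ratio
    return width * height_part / width_part


def wrap_for_screen(text, measure=70):
    """Break text into lines of at most `measure` characters.

    Assumption: character count approximates line length; real typesetting
    would measure glyph widths instead.
    """
    return textwrap.wrap(text, width=measure)


if __name__ == "__main__":
    print(leading_ratio(10, 15))        # 10 pt type on 15 pt leading
    print(page_height(6, (3, 2)))       # a 6-inch-wide page at 3:2
    print(page_height(6, (4, 3)))       # the same width at 4:3
    sample = "The space between lines needs to be larger than for print. " * 4
    for line in wrap_for_screen(sample):
        print(line)
```

On these figures, 10/15 pt is leading of one and a half times the type size, and a page 6 inches wide comes out between 8 and 9 inches high.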

Screen Versus Paper

Most of these differences are the result of the difference in resolution between screen and paper. (I suspect, although without any evidence, that there may also be effects from the fact that paper is a reflective medium while screens transmit.) The best current screen resolutions are little more than 100 dpi (one-third that of the cheapest printers) and, while much higher resolutions will undoubtedly come, it is likely to be several years before they are in widespread use. That means for ease of reading the text has to be displayed significantly larger than for paper, the equivalent of at least 12 pt compared to 9 pt for a typical book. Personally, I think the realistic minimum system for reading large amounts of text on screen is 1280 by 1024 on a 17-inch screen. Those specifications are rapidly becoming standard in desktop machines and, at that size and resolution, an entire page is easily readable. The ability to see a whole book-shaped page on screen, together with properly weighted typography -- headlines, intros, running headers, footers and margins -- improves the readability of the text by an order of magnitude.
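
For readers who want to check the resolution figures, here is a rough Python sketch of the sums involved. It assumes the standard desktop-publishing point of 1/72 inch and treats the whole 17-inch diagonal as visible, which flatters a real monitor slightly, since the viewable area is usually smaller and the true pixel density therefore a little higher. The function names are illustrative only.

```python
import math

POINTS_PER_INCH = 72  # desktop-publishing point


def pixels_per_inch(width_px, height_px, diagonal_in):
    """Approximate pixel density from a resolution and a diagonal size.

    Assumption: the full diagonal is visible.
    """
    return math.hypot(width_px, height_px) / diagonal_in


def type_size_in_pixels(point_size, ppi):
    """Number of device pixels (or printer dots) spanned by a type size."""
    return point_size * ppi / POINTS_PER_INCH


if __name__ == "__main__":
    density = pixels_per_inch(1280, 1024, 17)
    print(round(density, 1))                       # density of the 17-inch screen
    print(type_size_in_pixels(9, 100))             # 9 pt book type at 100 dpi
    print(round(type_size_in_pixels(12, 100), 1))  # 12 pt screen type at 100 dpi
    print(type_size_in_pixels(9, 300))             # 9 pt on a 300 dpi printer
```

The point of the comparison is that 9 pt type at 100 dpi has only about a dozen pixels of height to draw each letter, against several dozen dots on even a modest printer. Enlarging screen type to 12 pt recovers some, though by no means all, of that detail.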
