E-Readers Connect the Data

Say a university press publishes a runaway bestseller. Say that while said bestseller is a well-written and insightful scholarly book, it's also the latest on a long-established list of equally brilliant but more moderately selling titles. What is the press to make of the singular phenomenon? How does it interpret and understand the sales numbers, and translate them into future success?

One strategy is the modern one: more data. As retailers are slowly letting on, sales are no longer the only big numbers available. Thanks to the advent of the e-reader, the book industry can also see patterns in e-reading habits: completion rates, browsing times, common searches, and more.

The most recent AAUP Digital Book Publishing Survey ascribes about 8-9% of university press sales to e-books. Most of these are sold on popular retailer platforms: the Kindle, the Nook, through Google Play, on Kobo. Trace these back to Amazon, Google, et cetera (the land of data-mining giants) and it comes as small surprise that each of these platforms has reader tracking capabilities: not only what is purchased and downloaded, but also how much of each item is read, and for how long, in how many sittings.

The Electronic Frontier Foundation has been following the trend since 2010; the Wall Street Journal wrote about it in 2012 and the New York Times in 2013. Is this new undercover marketing research a technical bauble, a threat to reader privacy and publisher integrity, or a useful tool?

The EFF provides a chart detailing the limitations of e-reader tracking on the most common devices. In the case of the individual reader, policies mostly align with common practices for booksellers and librarians: individual information is only shared with law enforcement or civil litigants. For now, the main value of tracking data appears to come from the aggregation of trends for e-platforms and their content providers.

An equally pressing question is: should publishers allow aggregate reader data any influence in-house? Should big data be competing with the voices of experienced editorial, production, and marketing staff? To answer that question, it helps to consider what measurements are currently included in such data, and thus gauge its practical limitations.

So to get back to basics: tablets allow e-book providers to track reader movements within the text, time (and times) spent, and reader interaction, and to extrapolate real-world behavior from that. Time, for example: how much time does a reader spend with a book? Text place: how much of the book is read? Time x place: in how many sittings? And is more time spent in certain parts of the book? Interaction: are there common terms that readers search for? Passages they highlight?
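
To make those measurements concrete, here is a minimal sketch of how time spent, sittings, and completion might be derived from a log of reading sessions. Everything in it is an assumption made for illustration: the Session record, its field names, and the sample numbers are hypothetical and do not reflect any retailer's actual data format.

```python
from dataclasses import dataclass


@dataclass
class Session:
    """One hypothetical sitting; the field names are illustrative only."""
    start_min: int      # minutes since the reader first opened the book
    end_min: int        # minute at which this sitting ended
    furthest_page: int  # furthest page reached during this sitting


def summarize(sessions, total_pages):
    """Reduce one reader's sittings to the three basic measurements."""
    minutes_spent = sum(s.end_min - s.start_min for s in sessions)
    furthest = max((s.furthest_page for s in sessions), default=0)
    return {
        "minutes_spent": minutes_spent,        # time
        "sittings": len(sessions),             # time x place
        "completion": furthest / total_pages,  # text place
    }


# Invented sample: three sittings on three consecutive days, 240-page book.
sessions = [
    Session(start_min=0, end_min=30, furthest_page=40),
    Session(start_min=1440, end_min=1485, furthest_page=110),
    Session(start_min=2880, end_min=2895, furthest_page=120),
]
summary = summarize(sessions, total_pages=240)
print(summary)
```

For this invented reader the summary comes to 90 minutes across three sittings, with half the book read. A single figure like that says little on its own; it only becomes informative once it can be compared against many other readers and titles.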

A lot of this data would currently mean little to publishers, for the simple reason that they have no baseline. (Presumably, e-retailers have a lead there.) We don't yet know if or how the average amount of time spent in a book, for example, correlates to sales numbers. And depending on the title, lack of completion may not signal lack of purchasing interest, particularly for a scholarly book or anthology. Presumably, any strong editorial influence would have to be built on a few years of establishing those baselines: finding out which data is predictive of what, and how precisely.

One application touted by tablet retailers is the series. Say a fiction author writes a bestselling opening book, and millions buy the sequel but don't finish it: it's a safe bet that a third book would sell in more moderate numbers. Thus, completion rates could help a publisher decide whether to continue investing in a series, and to what degree. But it's unclear that the series application would extend well to nonfiction scholarly books, where "series" are more loosely defined, including a wider range of subject matter and usually several different authors.

Another application that could be more relevant is segmentation. Early data has borne out that readers are more likely to complete books broken into shorter chapters, and that readers are more likely to digest nonfiction in multiple shorter sittings. Taken together, these findings point to a strategy that many university presses already experiment with: chunking, as in e-book shorts or single-chapter sales.

Ultimately, new data may simply confirm educated hunches, whether about content, formatting, or funding. But scholarly publishers are familiar with the principles of well-conducted research and may be best placed to responsibly and creatively explore the possibilities of e-reading platform data.


Regan Colestock
Communications Strategist, AAUP