Toolverse

How Word Counting Works: Reading Time, Character Limits, and Edge Cases

7 min read

Every platform has a length constraint. Twitter gives you 280 characters. Google truncates meta descriptions at roughly 155. Academic journals specify word counts in their submission guidelines. Knowing exactly how long your text is — and how long it takes to read — is a fundamental writing skill that most people outsource to their word processor. But what actually happens when software counts your words?

How Word Counting Actually Works

At its core, word counting splits text on whitespace boundaries — spaces, tabs, and line breaks. The resulting tokens are filtered to remove empty strings, and the remaining count is your word total. But edge cases make this surprisingly tricky.

Hyphenated compounds like "well-known" contain no whitespace, so they count as one word, which is also how Microsoft Word and Google Docs count them. Contractions ("don't", "it's") count as one word. Numbers ("42", "3.14") each count as one word. URLs and email addresses are usually one word regardless of length.
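The split-and-filter approach described above can be sketched in a few lines of Python. This is a minimal illustration, not the algorithm any particular word processor uses; the function name `count_words` and the sample text are made up for the example.

```python
import re

def count_words(text: str) -> int:
    """Count words by splitting on whitespace runs and dropping empty tokens."""
    tokens = re.split(r"\s+", text)        # spaces, tabs, and line breaks
    return len([t for t in tokens if t])   # leading/trailing whitespace yields ""

# Contraction, hyphenated compound, decimal number, and URL each stay one token.
sample = "It's a well-known fact: pi is 3.14, see https://example.com/pi-day\n"
print(count_words(sample))  # 9
```

Punctuation stays attached to its neighbouring token ("fact:", "3.14,"), which does not change the count but matters if you later reuse the tokens.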

The Unicode standard (UAX #29) defines default word boundary rules. Those defaults work for scripts that separate words with spaces, but scripts without spaces (Chinese, Japanese, Thai) need tailored, typically dictionary-based, segmentation on top of them. For Latin-script text, the whitespace approach works well for practical purposes.
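To see why whitespace splitting is not enough for scripts written without spaces, compare the same short sentence in English and Chinese. This sketch only demonstrates the limitation; Python's standard library does not ship a UAX #29 word segmenter, so real segmentation would need an external library. The function name is illustrative.

```python
def whitespace_word_count(text: str) -> int:
    # str.split() with no arguments splits on any run of Unicode whitespace
    # and never returns empty strings.
    return len(text.split())

english = "I like reading books"
chinese = "我喜欢读书"  # roughly the same sentence, written without spaces

print(whitespace_word_count(english))  # 4
print(whitespace_word_count(chinese))  # 1, although it contains several words
```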

Characters: With or Without Spaces?

Character count has two common variants. Characters with spaces counts every character in the text, including whitespace; this is the basis for most social platform limits, although Twitter applies its own weighting (for example, every URL counts as 23 characters). Characters without spaces strips all whitespace first and is a common billing unit in translation and localization. A "standard page" of 1,500 characters without spaces is often attributed to the European translation services standard EN 15038 (superseded by ISO 17100 in 2015), though page definitions vary by country and agency.
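Both variants are easy to compute. The sketch below is a plain code-point count, which is an assumption worth stating: Python's `len` counts Unicode code points, so an emoji built from several code points counts as more than one character, and platform-specific weighting is not modelled. The function name `char_counts` is illustrative.

```python
def char_counts(text: str) -> tuple[int, int]:
    """Return (characters with spaces, characters without spaces)."""
    with_spaces = len(text)
    # str.isspace() covers spaces, tabs, line breaks, and other Unicode whitespace.
    without_spaces = sum(1 for ch in text if not ch.isspace())
    return with_spaces, without_spaces

print(char_counts("Hello, world!\n"))  # (14, 12)
```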

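For SMS messaging, the GSM 03.38 7-bit character set allows 160 characters in a single message. Using any character outside this set (emoji, some accented characters, non-Latin scripts) switches the whole message to UCS-2 encoding, dropping the limit to 70 characters. Longer messages are split into segments of 153 (GSM 7-bit) or 67 (UCS-2) characters, because each segment carries a concatenation header.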
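The SMS rule can be turned into a rough segment estimator. Treat this as a sketch under stated assumptions: the character tables are a hand transcription of the GSM 03.38 default alphabet and its extension table, the per-segment sizes for multi-part messages (153 and 67) are not from the article, and the rule that an escape pair or surrogate pair may not straddle two segments is ignored. The function name `sms_segments` is hypothetical.

```python
import math

# GSM 03.38 default alphabet (hand-transcribed; verify against the spec before relying on it).
GSM7_BASIC = set(
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ"
    " !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§"
    "¿abcdefghijklmnopqrstuvwxyzäöñüà"
)
# Extension-table characters are sent as ESC + character, so each costs two slots.
GSM7_EXTENDED = set("^{}\\[~]|€\f")

def sms_segments(text: str) -> tuple[str, int]:
    """Return (encoding, estimated number of segments) for an SMS body."""
    if all(ch in GSM7_BASIC or ch in GSM7_EXTENDED for ch in text):
        encoding = "GSM-7"
        units = sum(2 if ch in GSM7_EXTENDED else 1 for ch in text)
        single, multi = 160, 153
    else:
        encoding = "UCS-2"
        # Count 16-bit units; characters outside the BMP (most emoji) take two.
        units = len(text.encode("utf-16-be")) // 2
        single, multi = 70, 67
    if units <= single:
        return encoding, 1
    return encoding, math.ceil(units / multi)

print(sms_segments("a" * 160))   # ('GSM-7', 1)
print(sms_segments("a" * 161))   # ('GSM-7', 2)
print(sms_segments("hi 😀"))      # ('UCS-2', 1)
```

A single emoji in an otherwise plain message is enough to move the whole message to the 70-character limit, which is the practical point of checking before sending.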

Sentence Detection Is Harder Than You Think

Splitting on periods seems straightforward until you encounter abbreviations ("Dr.", "U.S.A."), decimal numbers ("3.14"), URLs ("example.com"), and ellipses ("..."). A robust sentence tokenizer needs to distinguish sentence-ending periods from abbreviation periods.
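A small heuristic splitter shows how these cases can be handled one by one. It is a sketch, not a robust tokenizer: the abbreviation list is a tiny hand-picked sample, an ellipsis is assumed never to end a sentence, and an abbreviation that really does end a sentence will be missed. All names here are illustrative.

```python
# Tiny, hand-picked sample; a real tokenizer would need a much larger list.
ABBREVIATIONS = {"dr.", "mr.", "mrs.", "ms.", "prof.", "e.g.", "i.e.", "etc.", "vs.", "u.s.a."}
CLOSERS = "\"')]"  # closing quotes/brackets that may follow a terminator

def split_sentences(text: str) -> list[str]:
    sentences, current = [], []
    for token in text.split():
        current.append(token)
        core = token.rstrip(CLOSERS)
        # Decimals ("3.14") and URLs ("example.com") have periods only inside the token.
        if not core.endswith((".", "!", "?")):
            continue
        # Heuristic: ellipses and known abbreviations do not end a sentence.
        if core.endswith("...") or core.lower() in ABBREVIATIONS:
            continue
        sentences.append(" ".join(current))
        current = []
    if current:  # trailing text without a terminator
        sentences.append(" ".join(current))
    return sentences

text = ("Dr. Smith measured 3.14 cm. See example.com for details... "
        "or ask the U.S.A. office! Done?")
for sentence in split_sentences(text):
    print(sentence)
# Dr. Smith measured 3.14 cm.
# See example.com for details... or ask the U.S.A. office!
# Done?
```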

The average English sentence has shortened over time. Readability researcher Rudolf Flesch reported that the average sentence in published writing dropped from 29 words in the 1900s to about 20 words by the 1970s. Modern web content averages 15-20 words per sentence, which matches the range readability guidelines recommend for general audiences.

Reading Time: The Math Behind the Estimate

Reading time estimates divide word count by a words-per-minute (WPM) rate. The question is which rate to use. A 2019 meta-analysis by Marc Brysbaert, published in the Journal of Memory and Language, analyzed 190 studies and established 238 WPM as the average silent reading rate for English non-fiction text in adults.

This rate varies significantly by context:

  • Technical documentation: 200-220 WPM — readers slow down for code examples and unfamiliar terminology.
  • News articles: 250-280 WPM — familiar vocabulary and short paragraphs speed reading.
  • Legal or academic text: 150-180 WPM — dense terminology and complex sentence structures require re-reading.
  • Mobile screens: 10-15% slower than desktop due to smaller viewport and scrolling friction (Nielsen Norman Group, 2016).
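The estimate itself is a single division. The sketch below uses 238 WPM as the default and the midpoints of the ranges listed above for the other contexts; picking midpoints, rounding up, and reporting at least one minute are assumptions of this example, not rules from any platform.

```python
import math

# Words per minute by context; non-default values are midpoints of the ranges above.
WPM_BY_CONTEXT = {
    "general": 238,         # Brysbaert (2019) average for adult non-fiction
    "technical": 210,       # midpoint of 200-220
    "news": 265,            # midpoint of 250-280
    "legal_academic": 165,  # midpoint of 150-180
}

def reading_time_minutes(word_count: int, context: str = "general") -> int:
    """Estimate reading time in whole minutes, rounded up, minimum 1."""
    wpm = WPM_BY_CONTEXT[context]
    return max(1, math.ceil(word_count / wpm))

print(reading_time_minutes(1000))                    # 5
print(reading_time_minutes(1000, "legal_academic"))  # 7
```

Rounding up keeps the label from under-promising; some tools round to the nearest minute instead, which gives 4 rather than 5 for a 1,000-word general article.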

Medium popularized reading time estimates on articles, using 265 WPM as their baseline. Most blogging platforms now include reading time because it sets reader expectations and increases engagement — a study by Marketing Experiments found that showing reading time increased content engagement by 13%.

Word Counts That Matter by Platform

Different contexts impose different length constraints:

  • Google meta title: 50-60 characters (pixels matter more than characters — Google truncates at ~580px on desktop).
  • Google meta description: 150-155 characters.
  • Blog posts for SEO: 1,500-2,500 words correlate with higher rankings according to multiple Backlinko and Ahrefs studies, though quality and search intent matter more than raw word count.
  • LinkedIn posts: 3,000 character limit (raised from 1,300 in 2021); posts between 1,000-1,300 characters tend to get the highest engagement.
  • Academic abstracts: typically 150-300 words depending on the journal.

Key Takeaways

  • Word counting splits on whitespace — hyphenated words and contractions count as single words.
  • Character count "with spaces" is the social media standard; "without spaces" is the translation industry standard.
  • The scientifically established average reading speed is 238 WPM for adult English readers.
  • Displaying a reading time estimate alongside an article sets reader expectations and may increase engagement.
  • Optimal text length varies by platform — always check the specific constraints for your target channel.

Need a quick count? Use our Word Counter to get words, characters, sentences, paragraphs, and reading time from any text — no signup required.

Frequently Asked Questions

What is the average reading speed used for reading time estimates?

The scientifically established average is 238 words per minute for adult English readers, based on a 2019 meta-analysis of 190 studies by Marc Brysbaert published in the Journal of Memory and Language. Medium uses 265 WPM as their baseline.

Do hyphenated words count as one word or two?

Most word processors, including Microsoft Word and Google Docs, count hyphenated compounds like 'well-known' as one word. The hyphen is not whitespace, so the compound stays a single token.

What is the ideal word count for blog posts?

Studies by Backlinko and Ahrefs show that posts between 1,500-2,500 words tend to rank higher in Google search results. However, search intent and content quality matter more than raw word count: a 500-word answer to a simple question can outrank a 3,000-word article.