What is the difference between extractive and abstractive summarization?

Extractive summarization selects the most important sentences from the source and returns them unchanged. Abstractive summarization, the kind a language model does, writes new sentences that paraphrase the source. Extractive is faithful and fast; abstractive is more fluent but can drift from the original.

How does a frequency-based summarizer work?

It splits the text into sentences, sets aside stopwords, and counts how often each remaining word appears. Each sentence is scored by the combined frequency of its meaningful words, usually with a length adjustment, and the top-scoring sentences are kept in their original order.

When should I use extractive instead of a language model?

Use extractive when fidelity, speed, or privacy matters: summarizing contracts, medical notes, or research where every word must trace back to the source, or any document you do not want to upload. A browser-based extractive tool runs instantly and keeps the text on your device.

How long should a summary be?

Match the length to the source and your goal. Two or three sentences capture a short article; five to seven, or a fixed percentage of the original, suit a long report. Start short and lengthen until nothing important is missing. If it reads like the whole document, the length is set too high.

Why does my extracted summary read a little choppy?

Because it stitches together real sentences from different parts of the text, transitions between them can feel abrupt, and a sentence that starts with "this" or "it" may lose its context. The fix is to read the extract and edit it lightly, or to add back the sentence that precedes a dangling one.

Is it safe to summarize confidential text online?

It is when the tool runs entirely in your browser. A client-side extractive summarizer never uploads your text, so contracts and private notes stay on your machine. You can confirm this by opening your browser's developer tools and checking that the Network tab shows no new requests while you summarize.

Text Summarizer: The Complete Guide (2026)

Summarizing is no longer one thing. There is the kind that picks the best sentences out of your text, and the kind that writes something new. Knowing which you need, and when, is the difference between a faithful summary and a fluent guess.

On this page

Two kinds of summary
How frequency-based summarization works
When extractive is the right tool
When abstractive is worth it
Choosing the right summary length
Where extractive summarizers struggle
Using a summarizer in a real workflow

Two kinds of summary

Summarization splits into two families. Extractive summarization selects the most important sentences from the source and returns them unchanged, so the summary is built entirely from the author's own words. Abstractive summarization, the approach a large language model uses, generates new sentences that paraphrase the source. The distinction is not academic. It decides whether your summary can be trusted to match the original word for word, how fast it runs, and whether your text has to leave your device. Most people only know the abstractive kind because that is what ChatGPT does, but the extractive kind is older, faster, and in many situations the better choice.

How frequency-based summarization works

The most common extractive method scores sentences by word frequency, an idea that goes back to a 1958 paper by Hans Peter Luhn at IBM. The logic is simple and surprisingly effective. First, split the text into sentences and words and set aside the stopwords, the high-frequency function words like "the", "of", and "and" that appear everywhere and signal nothing about the topic. Then count how often each remaining word occurs. Words that recur across a document tend to name its themes. Each sentence is scored by the combined frequency of the meaningful words it contains, usually with a length adjustment so that long sentences do not win automatically. The top-scoring sentences are kept, then put back in their original order so the summary still flows.
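
The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code behind any particular tool: the stopword list is a tiny sample, the sentence splitter is a naive regular expression, the length adjustment is assumed to be the average frequency per meaningful word, and the function names are hypothetical.

```python
import re
from collections import Counter

# Assumption: a tiny illustrative stopword list; a real tool would use a far larger one.
STOPWORDS = {
    "the", "of", "and", "a", "an", "to", "in", "is", "it", "that",
    "this", "for", "on", "with", "as", "are", "was", "be", "by", "or",
}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' when whitespace follows."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p.strip() for p in parts if p.strip()]


def tokenize(sentence):
    """Lowercase the sentence and keep runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", sentence.lower())


def summarize(text, max_sentences=3):
    """Return the top-scoring sentences, unchanged, in their original order."""
    sentences = split_sentences(text)
    # Meaningful words per sentence, with stopwords set aside.
    content = [[w for w in tokenize(s) if w not in STOPWORDS] for s in sentences]
    # Word frequencies across the whole document.
    freq = Counter(w for words in content for w in words)

    scored = []
    for index, words in enumerate(content):
        # Assumed length adjustment: average frequency per meaningful word,
        # so that long sentences do not win automatically.
        score = sum(freq[w] for w in words) / len(words) if words else 0.0
        scored.append((score, index))

    # Highest score first; ties go to the earlier sentence.
    top = sorted(scored, key=lambda pair: (-pair[0], pair[1]))[:max_sentences]
    # Put the selected sentences back in document order.
    return [sentences[i] for i in sorted(i for _, i in top)]
```

Calling `summarize(article_text, max_sentences=3)` returns a list of sentences copied verbatim from the input. Dividing by the number of meaningful words is only one possible length adjustment; other normalizations would fit the same description.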

When extractive is the right tool

Extractive summarization wins whenever fidelity, speed, or privacy matters. Because it never rewrites, it cannot misquote, invent a statistic, or reword an argument into something the author did not say, which is exactly what you want when summarizing a contract, a medical note, a legal filing, or a research finding. It is instant, since it runs as a bit of arithmetic rather than a model inference. And a browser-based extractive summarizer needs no server, so a confidential document never leaves your machine. The trade-off is style: because it stitches together real sentences from different parts of the text, the result can read a little choppy.

When abstractive is worth it

Abstractive summarization, from a language model, earns its place when you need a polished summary that reads as if a person wrote it, or when the source is so loosely structured that no single set of sentences captures it. A model can combine ideas spread across paragraphs, smooth the transitions, and adapt the tone. The cost is real, though: it can hallucinate details that were never in the source, a hosted model sends your text to a third-party server, and it is slower and, on paid APIs, charged by the token. For public, low-stakes content where polish matters more than exactness, it is the better tool. For anything sensitive or where every word must trace back to the source, it is the wrong one.

Choosing the right summary length

A summary that is too short drops the point; one that is too long is not a summary. The right length depends on the source and your goal. Two or three sentences capture the gist of a short article. A long report needs five to seven, or a percentage of the original that scales with its size. A useful habit is to start short and lengthen until nothing important is missing, rather than starting long and trimming. If the summary reads like the whole document, you have set the length too high. The point of a summary is to let you skip the rest, so it should be the smallest version that still tells you what you need.
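
As a starting point for the length setting, the rule of thumb above can be written as a small helper. This is a sketch under assumptions: the 10 percent default is an illustrative choice, not a figure from this guide, the floor of two and ceiling of seven sentences come from the ranges mentioned above, and the function name is hypothetical.

```python
def target_sentence_count(total_sentences, percent=10, minimum=2, maximum=7):
    """Suggest how many sentences to keep for a source of the given size.

    Assumptions: `percent` defaults to an illustrative 10, and the result
    is clamped between `minimum` and `maximum` sentences.
    """
    if total_sentences <= 0:
        return 0
    # Ceiling division in integers, so the percentage never rounds down to zero.
    wanted = -(-total_sentences * percent // 100)
    clamped = max(minimum, min(maximum, wanted))
    # Never ask for more sentences than the source contains.
    return min(total_sentences, clamped)
```

The number it returns is only where to begin. The habit described above still applies: lengthen from there until nothing important is missing.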

Where extractive summarizers struggle

Frequency-based extraction has clear failure modes worth knowing. It assumes the important ideas are concentrated in a few sentences; when meaning is spread thinly and evenly, no sentence scores high and the summary feels arbitrary. It struggles with dialogue, where short exchanges carry meaning that single sentences do not. It can be fooled by repetition that is stylistic rather than substantive. And it cannot resolve a pronoun, so an extracted sentence that begins "This is why it failed" may land in the summary without the context that "this" referred to. The fix is human: read the extract, and if a sentence dangles, add the one before it or edit lightly.
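
Part of that human fix can be approximated mechanically. The sketch below is a heuristic under assumptions, not real pronoun resolution: it treats a kept sentence as dangling if its first word is in a small, illustrative list of openers, and pulls in the sentence immediately before it. The names `DANGLING_OPENERS` and `add_context` are hypothetical.

```python
import re

# Assumption: an illustrative list of openers that often point back to earlier text.
DANGLING_OPENERS = {"this", "that", "these", "those", "it", "they", "he", "she"}


def add_context(sentences, kept_indices):
    """Return kept indices plus the preceding index for each dangling sentence.

    A single pass only: a sentence added for context is not itself checked.
    """
    result = set(kept_indices)
    for i in kept_indices:
        first_word = re.match(r"[A-Za-z']+", sentences[i].lstrip())
        if i > 0 and first_word and first_word.group().lower() in DANGLING_OPENERS:
            result.add(i - 1)
    return sorted(result)
```

For example, with the sentences "The pump overheated.", "This is why it failed.", and "Repairs took a week.", keeping only the second one would also bring back the first. The heuristic will sometimes add a sentence that is not needed, so reading the extract remains the real check.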

Using a summarizer in a real workflow

The best results often come from combining both kinds. Use an extractive tool first to pull the key sentences from a long or sensitive document quickly and privately. Then, if you need polish, hand that short extract to a language model to rewrite. This keeps the model's input small and its chance of drifting low. For studying, the extract itself is usually enough. For research triage, run the extract to decide whether the full piece is worth reading. Pair the summarizer with a readability checker to confirm the summary reads cleanly, and a word counter to hit a target length. Extraction handles the fast, faithful first pass; you or a model handle the polish.
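
The two-stage workflow can be expressed as a small pipeline. This sketch assumes both stages are supplied by the caller: `extract` stands for any local extractive summarizer that returns a list of sentences, and `polish` stands for whatever rewriting step you choose, such as a call to a language model. Both parameter names and the function name are hypothetical, and no particular model API is implied.

```python
from typing import Callable, List, Optional


def summarize_then_polish(
    text: str,
    extract: Callable[[str], List[str]],
    polish: Optional[Callable[[str], str]] = None,
) -> str:
    """Run extraction on the full text, then optionally polish the short extract."""
    # Stage 1: extraction sees the whole document and runs locally.
    key_sentences = extract(text)
    draft = " ".join(key_sentences)
    if polish is None:
        # For studying or triage, the extract itself is returned as-is.
        return draft
    # Stage 2: only the short extract, never the full text, is passed on.
    return polish(draft)
```

The design point is that the full document is only ever handed to `extract`; the `polish` step receives just the extract, which keeps its input small.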

Written by SAVI. We build the tools we write about. Try the Text Summarizer used in this post.