How to Shuffle Lines (and Why Fisher-Yates Matters)

Shuffle is sort's wilder cousin. The right algorithm is Fisher-Yates — every permutation equally likely, runs in linear time, easy to implement in any language. The wrong algorithms are surprisingly common, and they bias outcomes in ways that can break statistical work.

What Fisher-Yates does

The algorithm is short. Walk the list from end to start. At each position i, pick a random index j in the range 0 to i. Swap the items at i and j. Move to the previous position. Stop at index 1.

That's it. After the walk, the list is uniformly randomly permuted — every one of the n! possible orderings is equally likely.

JavaScript implementation:

function shuffle(arr) {
  for (let i = arr.length - 1; i > 0; i--) {
    const j = Math.floor(Math.random() * (i + 1));
    [arr[i], arr[j]] = [arr[j], arr[i]];
  }
  return arr;
}

Python:

import random
def shuffle(arr):
    for i in range(len(arr) - 1, 0, -1):
        j = random.randint(0, i)
        arr[i], arr[j] = arr[j], arr[i]
    return arr

Six lines. Linear time. Uniform distribution. The standard library functions in most languages (random.shuffle in Python, shuffle in Ruby) all implement Fisher-Yates internally.

The wrong shuffle — sort by random key

An intuitive but broken shuffle:

// DON'T DO THIS
arr.sort(() => Math.random() - 0.5);

The idea: sort the array by a comparator that returns random ±0.5. The implementation looks clever; the result is biased.

The reason: comparison sorts (introsort, mergesort, timsort) call the comparator in a structured pattern. With a random comparator, some orderings are more likely than others. The classic experiment is to shuffle a 5-element list 100,000 times this way and count the resulting orderings — the distribution is visibly non-uniform.

For a quiz with 10 questions, the bias is invisible. For randomized treatment assignment in a clinical trial, it's a research-integrity issue.

The other wrong shuffle — repeated random swaps

// ALSO DON'T DO THIS
for (let k = 0; k < arr.length * 10; k++) {
  const i = Math.floor(Math.random() * arr.length);
  const j = Math.floor(Math.random() * arr.length);
  [arr[i], arr[j]] = [arr[j], arr[i]];
}

"Just swap random pairs a bunch of times." Convergence to uniform distribution is slow and never guaranteed. Some orderings remain biased even after many iterations. Fisher-Yates does in n swaps what this approach approximates in 10n swaps and still doesn't quite get right.

Cryptographic vs non-cryptographic randomness

The shuffle algorithm is correct, but the quality of the random number source matters too. Two sources in modern environments:

  • Math.random() in JS, random.random() in Python. Fast, deterministic given a seed, not cryptographically secure. Fine for games, quizzes, sample randomization.
  • crypto.getRandomValues() in JS, secrets.randbelow() in Python. Slower, but pulls from the OS's cryptographic random source. Required for security-sensitive shuffles (lottery, A/B test treatment assignment, anything where bias would be exploitable).

The TextKit Shuffle Lines tool uses crypto.getRandomValues(), so the output is cryptographically uniform. Most production code in non-security contexts uses the standard library function (Array.prototype.sort() alternatives, random.shuffle), which is also fine for non-security use.

When shuffle quality actually matters

Most casual uses (randomizing a quiz, shuffling a playlist for the next session) don't notice biased shuffles. Three contexts where it matters:

  1. Statistical sampling. If you're shuffling to draw a random sample, biased shuffles produce biased samples. The downstream statistics inherit the bias.
  2. Cross-validation. Splitting a dataset into k folds requires uniform random assignment to fold. Biased shuffle → fold imbalance → misleading model performance estimates.
  3. Treatment assignment in experiments. Randomized controlled trials require true random assignment. Biased shuffle introduces a confound that's hard to detect later.

For these contexts, use a known-good Fisher-Yates implementation backed by cryptographic randomness. Don't write your own; don't trust shuffles you can't audit.

Browser, command line, and Excel

Browser: the Shuffle Lines tool runs Fisher-Yates with cryptographic randomness on the input. One paste, copy the result.

Linux/macOS command line: shuf input.txt — a Fisher-Yates implementation in coreutils. sort -R input.txt also shuffles, with one quirk: it groups equal lines (a deliberate design choice for stability). For "every line independently random," shuf is the better choice.

Excel / Google Sheets: add a helper column with RAND(), sort by that column. This is technically the broken "sort by random key" approach, but Excel's sort is stable and deterministic enough that the bias is small for typical use.

Cryptographically uniform shuffle. The Shuffle Lines tool uses Fisher-Yates with crypto.getRandomValues(). Paste your list, copy the result.

The two-line summary

Use Fisher-Yates. If the tool you're using doesn't say what algorithm it uses, assume it's wrong and use a tool that does.

For the sort companion to this post, see How to Sort Lines Alphabetically. For the broader reference on list operations, see List Operations: The Complete Guide.

Frequently asked questions

What's wrong with sort-by-random-key shuffles?

They produce a non-uniform distribution. Some orderings are more likely than others, depending on the comparison sort's behavior with equal random keys. For most casual use this is invisible; for statistical sampling or cryptographic contexts, it's a real bug.

Is browser-based shuffle cryptographically random?

It depends on the source. Math.random() is fast but not cryptographically secure. crypto.getRandomValues() is. The TextKit Shuffle Lines tool uses crypto.getRandomValues() for the per-step random index, so the shuffle is cryptographically uniform.

How do I shuffle a really large list (1M+ lines)?

For very large lists, use the command-line shuf (Linux) or sort -R. Both are streaming-capable and don't require loading the entire list into memory.

Can I shuffle the same list twice and get the same order?

Only if you seed the random number generator. Browser tools use a fresh source each time and don't expose a seed. For reproducible shuffles (test fixtures, scientific work), shuffle once and save the result, or use a tool that supports a seed.

How is shuffle different from random sort?

Functionally identical — both produce a randomly ordered list. Implementation usually differs: shuffle uses Fisher-Yates directly; random sort calls a comparison sort with a random comparator. The Fisher-Yates implementation is faster and produces uniform distribution; the comparison-sort approach can be biased.

Keep reading

Written by the TextKit team. We build the tools we write about — try the Shuffle Lines tool used in this post.