random-variable

Sat Apr 04 2026

The Problem

Three random variables: $M$ (message), $K$ (key), $C = M \oplus K$ (ciphertext). I know the distributions of $M$ and $K$ individually. How do I find the distribution of $C$?

More generally: when a new random variable is defined as a function of others ($Z = f(X, Y)$), how do I compute its PMF?

Where I Got Stuck

I could reason about $P(\text{event A})$ using Venn diagrams and set operations. But random variables are different. When someone writes $C = K \oplus M$ and asks for $P(C = 1)$, the answer depends on how $K$ and $M$ relate to each other. I didn't have a systematic way to go from "I know $P(K = k)$ and $P(M = m)$" to "here's $P(C = c)$."

Event notation doesn't give you a way to do arithmetic on outcomes. You can intersect events, union them, complement them. But you can't XOR two events or add them. Random variables let you do that because they're functions with numeric values, not sets.

Random Variables Are Functions

A random variable $X$ is a deterministic function from the sample space to real numbers:

$$X: \Omega \to \mathbb{R}$$

$P(X = k)$ is shorthand for the probability of the event $\{\omega \in \Omega \mid X(\omega) = k\}$, the inverse image of $k$ under $X$.

Because RVs are functions, we can combine them pointwise. $C(\omega) = K(\omega) \oplus M(\omega)$ for every outcome $\omega$. That's what "$C = K \oplus M$" means. The new variable $C$ is just another function on the same sample space.
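A minimal sketch of this view in Python, assuming a sample space of two fair coin flips. The names `omega`, `prob`, and `pmf` are illustrative choices, not part of any library:

```python
from fractions import Fraction
from itertools import product

# Assumed sample space: two fair coin flips, every outcome equally likely.
omega = list(product([0, 1], repeat=2))
prob = {w: Fraction(1, 4) for w in omega}

# Random variables are plain functions on the sample space.
def K(w):
    return w[0]

def M(w):
    return w[1]

def C(w):
    # Defined pointwise: C(w) = K(w) XOR M(w)
    return K(w) ^ M(w)

def pmf(X, value):
    """P(X = value): total probability of the inverse image of `value` under X."""
    return sum(prob[w] for w in omega if X(w) == value)

print(pmf(C, 1))  # 1/2
```

Nothing about `C` is special here: it is evaluated outcome by outcome, exactly like `K` and `M`.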

The Joint Table Is the Workspace

With two RVs $K$ and $M$, stop listing sample space outcomes. Build the joint PMF:

$$P_{K,M}(k, m) = P(K = k \text{ and } M = m)$$

For fair independent coins ($K, M \in \{0, 1\}$):

| $K \setminus M$ | 0 | 1 |
|---|---|---|
| 0 | 0.25 | 0.25 |
| 1 | 0.25 | 0.25 |

If $K \perp M$: each cell is $P(K{=}k) \cdot P(M{=}m)$.

This table is the object you work with. Everything else follows from it.

Marginalization: $P(K = k)$ = sum of row $k$. You project the 2D table onto one axis.

Conditioning: $P(M = m \mid K = k)$ = take row $k$, divide each cell by the row sum. Same "shrink the universe" idea from conditional-probability, applied to the table instead of Venn diagrams.
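The two table operations can be sketched in a few lines. This assumes the joint PMF is stored as a dict keyed by `(k, m)`, and uses a biased message bit (an assumption for illustration) so that the conditional is visibly different from the fair-coin case:

```python
from fractions import Fraction

# Assumed example: K is a fair bit, M is biased (P(M=1) = 1/10), K and M independent.
p_k = {0: Fraction(1, 2), 1: Fraction(1, 2)}
p_m = {0: Fraction(9, 10), 1: Fraction(1, 10)}
joint = {(k, m): p_k[k] * p_m[m] for k in p_k for m in p_m}

def marginal_k(joint, k):
    """P(K = k): sum of row k of the joint table."""
    return sum(p for (kk, _), p in joint.items() if kk == k)

def conditional_m_given_k(joint, m, k):
    """P(M = m | K = k): the cell divided by its row sum."""
    return joint[(k, m)] / marginal_k(joint, k)

print(marginal_k(joint, 0))                # 1/2
print(conditional_m_given_k(joint, 1, 0))  # 1/10
```

Because the table was built from independent marginals, conditioning on a row just returns the marginal of $M$.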

Deriving a New Distribution

To find $P(C = c)$ where $C = K \oplus M$: look at every cell $(k, m)$ in the joint table, check whether $k \oplus m = c$, and sum those probabilities.

$$P(C = c) = \sum_{\substack{k, m \\ k \oplus m = c}} P(K{=}k, M{=}m)$$

For our fair coins: $P(C = 1)$ comes from cells where $k \neq m$, which are $(0, 1)$ and $(1, 0)$. So $P(C = 1) = 0.25 + 0.25 = 0.5$.
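The cell-by-cell sum translates directly into code. The helper name `pmf_of_function` is hypothetical; it takes any two-argument function, so XOR is just one choice:

```python
from collections import defaultdict
from fractions import Fraction

def pmf_of_function(joint, f):
    """Distribution of f(K, M), given a joint PMF stored as {(k, m): probability}."""
    out = defaultdict(Fraction)  # missing keys start at Fraction(0)
    for (k, m), p in joint.items():
        out[f(k, m)] += p  # every cell adds its mass to the output it produces
    return dict(out)

# Fair independent coins: every cell of the joint table is 1/4.
joint = {(k, m): Fraction(1, 4) for k in (0, 1) for m in (0, 1)}
c_pmf = pmf_of_function(joint, lambda k, m: k ^ m)
print(c_pmf[1])  # 1/2
```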

The same technique works for any operation. For addition ($Z = X + Y$) with independent RVs, fixing $X = k$ forces $Y = z - k$, giving the convolution formula:

$$P(Z = z) = \sum_{k} P(X = k) \cdot P(Y = z - k)$$

If $X$ and $Y$ are dependent, replace $P(Y = z - k)$ with $P(Y = z - k \mid X = k)$.
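As a sketch of the independent case, here is the convolution sum applied to two fair six-sided dice. The dice are an assumed example and `convolve` is an illustrative name:

```python
from fractions import Fraction

def convolve(p_x, p_y):
    """PMF of Z = X + Y for independent X and Y, each given as {value: probability}."""
    support = sorted({x + y for x in p_x for y in p_y})
    # For each target z: fix X = k, which forces Y = z - k.
    return {z: sum(px * p_y.get(z - k, 0) for k, px in p_x.items()) for z in support}

# Assumed example: two fair six-sided dice.
die = {face: Fraction(1, 6) for face in range(1, 7)}
two_dice = convolve(die, die)

print(two_dice[2])  # 1/36
print(two_dice[7])  # 1/6
```

The `p_y.get(z - k, 0)` call handles values of `z - k` that fall outside the support of `Y`, which contribute nothing to the sum.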

That's the general technique: enumerate all input pairs that produce the target output, sum their joint probabilities.

XOR vs Addition: Why the Outputs Look Different

The operation you apply determines the shape of the resulting distribution.

XOR is a permutation. Fix any row $k$ in the joint table. As $m$ ranges over $\{0, 1\}$, the output $k \oplus m$ hits every value exactly once. Every row is a permutation of the output space, and the same holds for every column. So if $K$ is uniform and independent of $M$, each output value gets equal total weight.

Addition piles up in the middle. Multiple input pairs can produce the same sum. Middle values have more combinations ($4 = 1{+}3 = 2{+}2 = 3{+}1$) while edge values have fewer ($2 = 1{+}1$). Mass accumulates in the center.

| | XOR ($\oplus$) | Addition ($+$) |
|---|---|---|
| Per-row structure | Permutation (each output once) | Multiple pairs per output |
| Uniform $\oplus/+$ Uniform | Uniform | Triangle (peaked) |
| Anything $\oplus/+$ Uniform | Uniform | Still peaked |

Adding more independent uniform variables keeps smoothing the peak toward a bell curve (Central Limit Theorem).
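To see the two shapes side by side, the sketch below combines uniform variables under both operations. It assumes 2-bit values $\{0, 1, 2, 3\}$, and `combine` is a hypothetical helper:

```python
from fractions import Fraction

def combine(p_x, p_y, op):
    """PMF of op(X, Y) for independent X and Y, each given as {value: probability}."""
    out = {}
    for x, px in p_x.items():
        for y, py in p_y.items():
            z = op(x, y)
            out[z] = out.get(z, 0) + px * py
    return out

# Assumed example: uniform 2-bit values {0, 1, 2, 3}.
uniform = {v: Fraction(1, 4) for v in range(4)}

xor_pmf = combine(uniform, uniform, lambda a, b: a ^ b)  # flat: each value gets 1/4
add_pmf = combine(uniform, uniform, lambda a, b: a + b)  # triangle peaking at 3

# Adding a third uniform variable rounds the triangle off further.
add3_pmf = combine(add_pmf, uniform, lambda a, b: a + b)

for z in sorted(add_pmf):
    print(z, add_pmf[z])
```

With these inputs the sum of two variables has weights proportional to 1, 2, 3, 4, 3, 2, 1, and the sum of three has weights proportional to 1, 3, 6, 10, 12, 12, 10, 6, 3, 1.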

The Row-Universe Analysis

This is the part that connects to the one-time pad. Split the joint table by rows of $K$. Each row is a separate universe conditioned on a specific key value.

[Figure: xor-row-universe]

  • Row $K = 0$: $C = 0 \oplus M = M$. The ciphertext copies the message.
  • Row $K = 1$: $C = 1 \oplus M = \overline{M}$. The ciphertext flips the message.

For $C$ to be independent of $K$, the distribution of $C$ must look the same in every row:

$$P(C = 1 \mid K = 0) = P(C = 1 \mid K = 1)$$

The left side is $P(M = 1)$. The right side is $P(M = 0)$. These are equal only when $P(M = 1) = P(M = 0) = 1/2$, so $M$ must be uniform.
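The row comparison can be checked numerically. This sketch assumes a uniform key bit independent of the message and compares a biased message against a fair one; the function name `p_c1_given_k` is illustrative:

```python
from fractions import Fraction

def p_c1_given_k(p_m, k):
    """P(C = 1 | K = k) for C = K XOR M, assuming a uniform K independent of M."""
    p_k = {0: Fraction(1, 2), 1: Fraction(1, 2)}
    joint = {(kk, m): p_k[kk] * p_m[m] for kk in p_k for m in p_m}
    row_sum = sum(p for (kk, _), p in joint.items() if kk == k)
    hits = sum(p for (kk, m), p in joint.items() if kk == k and kk ^ m == 1)
    return hits / row_sum

biased_m = {0: Fraction(9, 10), 1: Fraction(1, 10)}
fair_m = {0: Fraction(1, 2), 1: Fraction(1, 2)}

print(p_c1_given_k(biased_m, 0), p_c1_given_k(biased_m, 1))  # 1/10 9/10
print(p_c1_given_k(fair_m, 0), p_c1_given_k(fair_m, 1))      # 1/2 1/2
```

With the biased message the two rows disagree, so $C$ depends on $K$. With the fair message they match.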

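This is the one-time pad property: Any bias $\oplus$ Uniform $=$ Uniform. The row argument is symmetric in $K$ and $M$: split by columns of $M$ instead, and $C$ looks the same in every column exactly when $K$ is uniform. Even if $M$ is 90% zeros, a uniform $K$ washes it away completely:

$$P(C = c) = \sum_{m} P(M = m) \cdot P(K = c \oplus m) = \sum_{m} P(M = m) \cdot \frac{1}{N} = \frac{1}{N}$$

Here $N$ is the number of possible key values ($N = 2$ for a single bit). The $1/N$ factors out of the sum, and the remaining $\sum_m P(M = m) = 1$.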
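A quick numeric check of the sum, assuming 2-bit messages and keys ($N = 4$) and a message that is 90% zeros. The specific probabilities are made up for illustration:

```python
from fractions import Fraction

N = 4  # assumed: 2-bit messages and keys, values 0..3

# Heavily biased message, uniform key, independent of each other.
p_m = {0: Fraction(9, 10), 1: Fraction(1, 30), 2: Fraction(1, 30), 3: Fraction(1, 30)}
p_k = {k: Fraction(1, N) for k in range(N)}

# P(C = c) = sum over m of P(M = m) * P(K = c XOR m)
p_c = {c: sum(p_m[m] * p_k[c ^ m] for m in range(N)) for c in range(N)}

# P(C = c | M = m) = P(K = c XOR m), which uses the independence of K and M.
p_c_given_m = {(c, m): p_k[c ^ m] for c in range(N) for m in range(N)}

print(p_c)  # every value maps to 1/4
```

Every entry of `p_c_given_m` is also $1/N$, so seeing the ciphertext does not change the distribution of the message.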

See otp-security-proof for how this drives the full security argument.

Independence Matters, Not Just Uniformity

$K$ being uniform isn't enough. $K$ must also be independent of $M$.

Counterexample: let $K = M$ (perfect dependence). $K$ is still uniform since $M$ is fair. But $C = M \oplus K = M \oplus M = 0$ always. The ciphertext is a constant, not uniform, so the "bias $\oplus$ uniform $=$ uniform" conclusion fails even though $K$ is uniform on its own.

The proof above used $P(K = c \oplus m)$ without conditioning on $M = m$. That step requires $K \perp M$. Without independence, you'd need $P(K = c \oplus m \mid M = m)$, and you can't conclude uniformity.
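The counterexample is easy to reproduce from the joint table. This sketch builds two joint PMFs with identical uniform marginals, one independent and one with $K = M$, and derives the ciphertext distribution from each; `ciphertext_pmf` and `marginal_k` are illustrative names:

```python
from fractions import Fraction

def ciphertext_pmf(joint):
    """Distribution of C = K XOR M from a joint PMF stored as {(k, m): probability}."""
    out = {}
    for (k, m), p in joint.items():
        out[k ^ m] = out.get(k ^ m, 0) + p
    return out

def marginal_k(joint, k):
    """P(K = k): sum of row k."""
    return sum(p for (kk, _), p in joint.items() if kk == k)

half = Fraction(1, 2)
independent = {(k, m): half * half for k in (0, 1) for m in (0, 1)}
dependent = {(0, 0): half, (1, 1): half}  # K = M always; off-diagonal cells have mass 0

# K is uniform in both tables, yet the ciphertext distributions differ.
print(marginal_k(independent, 1), marginal_k(dependent, 1))  # 1/2 1/2
print(ciphertext_pmf(independent))  # C = 0 and C = 1 each with probability 1/2
print(ciphertext_pmf(dependent))    # all mass on C = 0
```

The marginals cannot tell the two tables apart. Only the joint table carries the dependence.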