cjyuResearch

Watermark Z-Score Bounds Visualization

Interactive visualization of theoretical bounds for watermark detection z-scores and robustness to edits.

Parameters

Adjust the watermark detection parameters

Default slider values shown on the page (labels not preserved): 0.50, 2.0, 0.050, 10, 5

Z-Score Bounds vs Text Length (n)

Variance V(y) Calculation

V(y) = n × γ × (1 - γ)

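This is the variance of the number of green-list tokens in a random sequence of length n under the null hypothesis (non-watermarked text). Under that hypothesis each token falls in the green list independently with probability γ, so the count follows Binomial(n, γ) and its variance is n·γ·(1-γ).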
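As a minimal sketch of the calculation above, the variance can be computed directly. The z_score helper is an assumption, not something the page states: it uses the common standardization (green count minus its null mean γ·n, divided by √V(y)). Both function names are hypothetical.

```python
import math


def green_count_variance(n: int, gamma: float) -> float:
    """Variance of a Binomial(n, gamma) green-token count: n * gamma * (1 - gamma)."""
    return n * gamma * (1.0 - gamma)


def z_score(green_count: int, n: int, gamma: float) -> float:
    """Standardized green-token count.

    Assumed definition (not given on the page): (count - gamma * n) / sqrt(V(y)).
    """
    return (green_count - gamma * n) / math.sqrt(green_count_variance(n, gamma))


if __name__ == "__main__":
    print(green_count_variance(100, 0.5))
    print(z_score(60, 100, 0.5))
```

With n = 100 and γ = 0.5 the variance is 25, so a count of 60 green tokens sits two standard deviations above the null mean of 50.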

Lower Bound (Watermarked Text)

z_y ≥ Ω((e^δ - 1)√(n·γ·(1-γ)))

This bound shows that the z-score of watermarked text grows at least in proportion to √n and to e^δ - 1, so it increases exponentially with the watermark strength δ. The Ω(·) notation hides a constant factor.
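The scaling term inside the Ω(·) can be evaluated as follows. This is only a sketch: the hidden constant is not given on the page, so the parameter c is a hypothetical placeholder set to 1, and the function name is illustrative.

```python
import math


def lower_bound_scale(n: int, gamma: float, delta: float, c: float = 1.0) -> float:
    """Evaluate c * (e^delta - 1) * sqrt(n * gamma * (1 - gamma)).

    c stands in for the unknown constant hidden by the Omega notation;
    c = 1.0 is an assumption used only to show the scaling.
    """
    return c * (math.exp(delta) - 1.0) * math.sqrt(n * gamma * (1.0 - gamma))


if __name__ == "__main__":
    for n in (100, 400, 1600):
        print(n, lower_bound_scale(n, gamma=0.5, delta=2.0))
```

Quadrupling n doubles the value, which is the √n growth described above.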

Upper Bound (Non-watermarked Text)

z_y ≤ √(64V(y)·log(9/α)/(1-γ)) + 16C_max·log(9/α)/√(n·γ·(1-γ))

This bound caps the z-score of non-watermarked text, which gives a theoretical guarantee against false positives at a level controlled by α. Here V(y) = n·γ·(1-γ), the null-hypothesis variance defined above.
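A direct transcription of the upper bound, read literally from the formula above, might look like this. It assumes log is the natural logarithm and that C_max is a positive constant supplied by the user; the function name is hypothetical.

```python
import math


def upper_bound(n: int, gamma: float, alpha: float, c_max: float) -> float:
    """sqrt(64 V log(9/alpha) / (1 - gamma)) + 16 C_max log(9/alpha) / sqrt(V),

    with V = n * gamma * (1 - gamma). Natural log is an assumption.
    """
    v = n * gamma * (1.0 - gamma)
    log_term = math.log(9.0 / alpha)
    first = math.sqrt(64.0 * v * log_term / (1.0 - gamma))
    second = 16.0 * c_max * log_term / math.sqrt(v)
    return first + second


if __name__ == "__main__":
    print(upper_bound(n=100, gamma=0.5, alpha=0.05, c_max=10.0))
```

The first term grows with √n while the second shrinks as 1/√n, so the second term matters mainly for short texts.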

Robustness to Edits

z_u ≥ z_y - max((1+γ/2)η/√n, (1-γ/2)η/√(n-η))

Here z_u is the z-score of the text after η edits. The watermark remains detectable as long as the subtracted penalty is small relative to z_y. The penalty depends on the ratio η/√n, meaning the scheme can tolerate O(√n) edits while maintaining detectability.
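The robustness bound can be evaluated numerically as a quick check. This sketch takes the formula above at face value and assumes 0 ≤ η < n so that √(n-η) is defined; the function name is hypothetical.

```python
import math


def z_after_edits_lower_bound(z_y: float, n: int, eta: int, gamma: float) -> float:
    """Lower bound on the post-edit z-score:

    z_y - max((1 + gamma/2) * eta / sqrt(n), (1 - gamma/2) * eta / sqrt(n - eta)).
    """
    if not 0 <= eta < n:
        raise ValueError("eta must satisfy 0 <= eta < n")
    penalty_a = (1.0 + gamma / 2.0) * eta / math.sqrt(n)
    penalty_b = (1.0 - gamma / 2.0) * eta / math.sqrt(n - eta)
    return z_y - max(penalty_a, penalty_b)


if __name__ == "__main__":
    print(z_after_edits_lower_bound(z_y=10.0, n=100, eta=5, gamma=0.5))
```

For z_y = 10, n = 100, η = 5, and γ = 0.5, the first penalty term (0.625) dominates, so the bound is 9.375.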

Key Insights

Variance Calculation: V(y) = n·γ·(1-γ) is the variance of the green-token count under the null hypothesis

Separation: The gap between watermarked and non-watermarked bounds grows with √n

Quality vs Detection: Higher δ improves detection but may reduce text quality

Robustness: The scheme can tolerate on the order of √n edits while maintaining detectability

Parameter Tuning: γ ≈ 0.5 often provides good balance between detection power and robustness