Conditional Probability and Bayes' Theorem | Probability & Statistical Literacy

When a headline hits, you do not throw away yesterday’s belief—you condition on new information. Module 01’s price is a snapshot of collective belief; arbitrage reacts within seconds when gaps are real; spreads widen when uncertainty spikes. Conditional probability and Bayes’ theorem are the update engine behind “the market moved on news” and behind your own edge when you disagree with the tape.

Conditional probability

P(A | B) is the probability A is true given that B is true. Read it as: restrict the sample space to worlds where B happened, then renormalize so probabilities again sum to one. Formally, when P(B) > 0, P(A | B) = P(A and B) / P(B).

P(A | B) is not the same as P(B | A). “High chance of winning Pennsylvania given Michigan won” does not automatically reverse. The confusion of the inverse lives in that gap: coherent stories feel symmetric; the math is not.

Examples in words: P(YES resolves | candidate leads poll) is how you reprice after a survey. P(arb wins | Kalshi 58¢ and Poly 52¢) is whether the gap is real after fees. P(manipulation | thin book) separates fake moves from information.

A joint-counting example

Imagine 500 equally weighted paths for “wins Pennsylvania” (A) and “wins Michigan” (B). Suppose 120 paths have both, 80 have A only, 90 have B only, and 210 have neither.

Then P(A) = 200/500 = 0.40, P(B) = 210/500 = 0.42, and P(A and B) = 120/500 = 0.24. Conditioning on B: P(A | B) = 120/210 ≈ 0.571. Conditioning on A: P(B | A) = 120/200 = 0.60. Same data, different questions, so always state which event sits to the right of the bar.
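The counting above can be reproduced in a few lines. This is a minimal sketch that uses only the path counts from the example; the dictionary layout and the helper name `prob` are illustrative choices, not part of the module.

```python
from fractions import Fraction

# Path counts from the example: 500 equally weighted paths.
# Keys are (Pennsylvania outcome, Michigan outcome).
counts = {
    ("A", "B"): 120,          # wins both
    ("A", "not B"): 80,       # Pennsylvania only
    ("not A", "B"): 90,       # Michigan only
    ("not A", "not B"): 210,  # neither
}
total = sum(counts.values())


def prob(pred):
    """Probability of the set of paths selected by pred."""
    return Fraction(sum(n for key, n in counts.items() if pred(key)), total)


p_a = prob(lambda k: k[0] == "A")
p_b = prob(lambda k: k[1] == "B")
p_ab = prob(lambda k: k == ("A", "B"))

# Conditioning restricts the sample space, then renormalizes.
p_a_given_b = p_ab / p_b
p_b_given_a = p_ab / p_a

print(f"P(A)={float(p_a):.2f}  P(B)={float(p_b):.2f}  P(A and B)={float(p_ab):.2f}")
print(f"P(A|B)={float(p_a_given_b):.3f}  P(B|A)={float(p_b_given_a):.2f}")
```

Exact fractions are used so that 120/210 stays 4/7 until the final print, which makes the asymmetry between the two conditionals easy to see.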

Bayes’ theorem

Prior P(H) is belief before evidence. Likelihood P(E | H) is how often you see evidence E if hypothesis H is true. Posterior P(H | E) is belief after E.

Standard form: P(H | E) = P(E | H) × P(H) / P(E). The denominator expands as P(E) = P(E | H)P(H) + P(E | not H)P(not H)—the base rate of E matters. Likelihood ratio P(E | H) / P(E | not H) tells you how strongly E pushes odds; in odds form, posterior odds equal prior odds times the likelihood ratio.

Quick odds math: prior 50% with likelihood ratio 2 gives posterior odds 2:1 → 66.7%. Prior 75% with LR 0.5 gives 1.5:1 → 60%. Prior 40% with LR 1.5 lands at 50%. Use odds form when chaining several weak signals.
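The three quick updates above can be checked with a small odds-form helper. This is a sketch; the function names `to_odds`, `to_prob`, and `update` are hypothetical, chosen for this illustration.

```python
def to_odds(p):
    """Convert a probability (0 < p < 1) to odds in favor."""
    return p / (1.0 - p)


def to_prob(odds):
    """Convert odds in favor back to a probability."""
    return odds / (1.0 + odds)


def update(prior, likelihood_ratio):
    """Posterior probability: posterior odds = prior odds x likelihood ratio."""
    return to_prob(to_odds(prior) * likelihood_ratio)


# The three cases from the text.
for prior, lr in [(0.50, 2.0), (0.75, 0.5), (0.40, 1.5)]:
    print(f"prior {prior:.2f}, LR {lr}: posterior {update(prior, lr):.3f}")
```

Chaining several signals is then a matter of calling `update` repeatedly, which is only appropriate when the signals are reasonably independent of each other.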

Binary contract update

Suppose H is “incumbent wins” and yesterday’s Kalshi mid implied P(H) = 0.52. Evidence E is a new poll showing incumbent +4; from your backtest you estimate P(E | H) = 0.70 and P(E | not H) = 0.35.

Then P(E) = 0.70×0.52 + 0.35×0.48 = 0.532, and P(H | E) = 0.70×0.52 / 0.532 ≈ 0.684. Fair YES under your model is about 68.4¢. If YES ask is still 54¢ after the poll, you may have Bayesian edge—subject to poll quality, resolution match, and fees. Compare to the ask, not the mid.
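A minimal sketch of this update in standard form follows, using the numbers from the example. The `fee_per_contract` value is a placeholder assumption set to zero; real venue fees are not specified in this section and would need to be filled in.

```python
def posterior(prior, p_e_given_h, p_e_given_not_h):
    """Bayes' theorem for a binary hypothesis H and observed evidence E."""
    p_e = p_e_given_h * prior + p_e_given_not_h * (1.0 - prior)
    return p_e_given_h * prior / p_e


prior = 0.52            # yesterday's mid, used as P(H)
p_e_given_h = 0.70      # backtest estimate of P(E | H)
p_e_given_not_h = 0.35  # backtest estimate of P(E | not H)

fair_yes = posterior(prior, p_e_given_h, p_e_given_not_h)

yes_ask = 0.54           # executable price, not the mid
fee_per_contract = 0.0   # hypothetical placeholder, replace with the venue's fee
edge = fair_yes - yes_ask - fee_per_contract

print(f"fair YES = {fair_yes:.3f}, edge vs ask = {edge:.3f}")
```

The edge figure is only as good as the two likelihood estimates, so it is worth rerunning the calculation with pessimistic values for both before acting on it.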

Base-rate neglect

Historically, a scandal ends a campaign only 2% of the time, so the base rate is P(scandal) = 0.02. A tip arrives; you assign P(tip | scandal) = 0.80 and P(tip | no scandal) = 0.10. Then P(tip) = 0.80×0.02 + 0.10×0.98 = 0.114, and P(scandal | tip) = 0.80×0.02 / 0.114 ≈ 14%. The headline screams scandal; the posterior is still modest because tips fire often when nothing happens. Traders who skip the denominator buy “resign” markets too high, and information cascades amplify the mistake.
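One way to make the denominator hard to skip is to count cases instead of multiplying probabilities. The sketch below restates the same example as natural frequencies; the population of 10,000 campaigns is an arbitrary illustrative number, not data.

```python
# Natural-frequency version of the scandal example.
campaigns = 10_000            # arbitrary illustrative population size
base_rate = 0.02              # P(scandal)
p_tip_given_scandal = 0.80
p_tip_given_clean = 0.10

scandals = round(campaigns * base_rate)             # campaigns with a real scandal
clean = campaigns - scandals                        # campaigns with none
true_tips = round(scandals * p_tip_given_scandal)   # tips backed by a scandal
false_tips = round(clean * p_tip_given_clean)       # tips with nothing behind them

# Of all the tips that fire, what share come from real scandals?
p_scandal_given_tip = true_tips / (true_tips + false_tips)

print(f"{true_tips} true tips vs {false_tips} false tips")
print(f"P(scandal | tip) = {p_scandal_given_tip:.3f}")
```

False tips outnumber true ones by roughly six to one here, which is why the posterior stays near 14% even though the tip is eight times more likely under a scandal.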

Independence and market prices

A and B are independent if P(A and B) = P(A) × P(B). “Fed cuts” and “BTC up the same day” are usually linked by macro, so they are not independent. Two unrelated tags on Polymarket the same hour might be closer to independent, but check the narrative. Parent and child contracts on the same tree are not independent. Assuming independence where it does not hold lets you multiply probabilities that should not be multiplied, and that overstates edge on baskets.
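As a quick check, the product rule can be tested against the path counts from the joint-counting example earlier in this module (200, 210, and 120 paths out of 500). This is a sketch; the helper name `is_independent` and its tolerance are illustrative assumptions.

```python
def is_independent(p_a, p_b, p_ab, tol=1e-9):
    """True when the joint probability matches the product of the marginals."""
    return abs(p_ab - p_a * p_b) <= tol


# Marginals and joint from the Pennsylvania / Michigan path counts.
p_a = 200 / 500   # wins Pennsylvania
p_b = 210 / 500   # wins Michigan
p_ab = 120 / 500  # wins both

product = p_a * p_b   # what you would get by assuming independence
gap = p_ab - product  # how far the naive product is from the actual joint

print(f"product of marginals = {product:.3f}, actual joint = {p_ab:.3f}")
print(f"independent? {is_independent(p_a, p_b, p_ab)}  gap = {gap:.3f}")
```

In that example the naive product (0.168) misses the actual joint (0.24) by about seven points, which is larger than many of the edges traders chase.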

Treat the market mid as a crowded prior. You can replace it with only your model (hubris risk), blend ad hoc with your view, or run full Bayes with likelihood from your model and prior from the market. When Polymarket jumps eight cents on a tweet and Kalshi lags ninety seconds, ask whether the move is information on one venue versus stale quotes on another—that feeds cross-market arbitrage thinking.

Updating discipline on a news day

State H and not-H exactly as the contract resolves. Record prior (yesterday’s mid or your last forecast). Estimate P(E | H) and P(E | not H) from history, not vibe. Compute posterior and compare to ask for YES or NO. If posterior ≈ price, edge is zero after spread. Cross-check sister venues for rule mismatch, not just gap size. Log the update for calibration in the Brier chapter.

Kalshi CLOBs often refresh fast on liquid U.S. politics but queue at the touch; Polymarket pools can jump on-chain with slippage on size. Media reports use the mid; your Bayes comparison should use executable prices.

Errors to avoid

Confusing P(A | B) with P(B | A): “market up given poll” is not “poll likely given market up.” Double-counting the same poll retweeted five ways as five pieces of evidence. Ignoring P(E) when evidence is rare but not diagnostic; an event that is equally rare under H and not-H should not move your estimate. Conditioning on the future: “given we win,” stated before the vote, is not a valid information set.

Likelihood ratios without tables

If prior odds on H are 3:1 (75% chance) and evidence is twice as likely under H as under not-H, posterior odds become 6:1 → about 85.7%. If the same evidence is only 1.2× more likely under H, posterior odds are 3.6:1 → about 78.3%. Small likelihood ratios move confident priors slowly; that is why long-shot contracts rarely deserve a jump from 8¢ to 40¢ on one anecdote.
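It can help to run the odds form backwards and ask how strong the evidence would have to be to justify a given price move. The sketch below does that; `required_lr` is a hypothetical helper name, and the 8¢ and 40¢ prices are read as probabilities of 0.08 and 0.40.

```python
def odds(p):
    """Odds in favor for a probability 0 < p < 1."""
    return p / (1.0 - p)


def required_lr(prior, posterior):
    """Likelihood ratio needed to move a prior to a target posterior."""
    return odds(posterior) / odds(prior)


# Sanity check against the text: 75% -> 6:1 odds needs LR 2.
print(f"75% to 85.7%: LR = {required_lr(0.75, 6 / 7):.2f}")

# The long-shot jump: 8 cents to 40 cents.
print(f"8% to 40%:    LR = {required_lr(0.08, 0.40):.2f}")
```

The long-shot jump needs a likelihood ratio of roughly 7.7, meaning the anecdote would have to be almost eight times more likely if YES is true than if it is not.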

Stale price as evidence

Treat “Kalshi still 54¢ while Polymarket is 68¢ after the same poll” as a question about which venue updated and whether rules differ, not as automatic edge on the stale side. Conditional thinking: P(true 68% | Poly moved, Kalshi stale) versus P(rule mismatch | persistent gap). Bayes is not only for polls—it is for microstructure and oracle risk too.

Pre-commitment discipline

Write H, prior, and likelihoods before you see the post-trade mid. After you click, the brain rewrites the prior to match the fill. That habit protects the calibration metrics in the Brier chapter and keeps Kelly sizing from becoming a story about conviction rather than probability.

Prosecutor’s fallacy preview

“If the incumbent wins, this poll pattern appears 70% of the time” is P(poll | win). You trade P(win | poll). Markets move when traders confuse the two—buying YES after a pattern that is common even when your candidate loses. Bayes forces the base rate and the inverse into one line. The fallacies chapter closes the loop on how often this mistake appears in headlines.

Worked odds form without a grid

Start at 40% prior on YES (odds of 2:3 in favor, that is, 3:2 against). Evidence with likelihood ratio 2 doubles prior odds to 4:3 → posterior 4/7 ≈ 57%. If the YES ask is still 48¢, you have roughly nine points of edge before fees, not because the poll “sounded good,” but because the arithmetic said so. If the ask is 58¢, you have no trade even though the poll was “bullish.”

Multiple signals

Two independent-ish signals multiply likelihood ratios. LR 1.5 and LR 1.5 combine to 2.25 on odds—not 3.0 by addition. When five tweets repeat one poll, you have one LR, not five. Bayes rewards distinct evidence, punishes duplicated noise.
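A small sketch of this rule: multiply likelihood ratios across distinct sources, and count each source once no matter how many times it is repeated. The source labels and the `combine` helper are hypothetical, and the multiplication is only valid to the extent the sources really are independent.

```python
from math import prod


def combine(signals):
    """Combine (source_id, likelihood_ratio) pairs into one likelihood ratio.

    Repeats of the same source are collapsed to a single entry, so five
    retweets of one poll contribute one LR, not five.
    """
    unique = {}
    for source, lr in signals:
        unique.setdefault(source, lr)  # keep the first LR seen per source
    return prod(unique.values())


# Two distinct signals, each with LR 1.5 (labels are illustrative).
two_polls = [("poll_a", 1.5), ("poll_b", 1.5)]
print(combine(two_polls))

# One poll repeated by five tweets.
five_tweets = [("poll_a", 1.5)] * 5
print(combine(five_tweets))
```

Deciding what counts as the same source is the hard part in practice; the code only enforces the rule once that judgment has been made.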

Conditioning on partial results

“Given leading in early returns” is valid only when those returns are in your information set and the contract rules count them. Partial election data changes P(win | partial) dramatically; contracts that ignore partials should not be updated as if they were exit polls. Match the event in the formula to the event in the rulebook.

Market as prior, you as likelihood

A practical workflow: let the opening mid be the prior P_mkt(H), let your private data supply the likelihood ratio, and let the posterior P_you(H) drive the trade. If the posterior equals P_mkt(H), pass. You are not obligated to beat the crowd every hour, only when your likelihood is informed and not already in price.

What comes next

Next: odds to probability—converting sportsbook decimal, fractional, and moneyline quotes into probabilities you can compare to Kalshi cents and Polymarket prices, including vig.