Modules / Module 12 / Chapter 2

Internal Prediction Markets: Employee Wisdom Aggregation

Professional Applications & Career Pathways

Corporate teams often have private information that cannot be listed on Kalshi or Polymarket without leakage or morale risk. Internal prediction markets let employees trade play-money or point-weighted claims on operational milestones while leadership consumes aggregated beliefs. The design goal is psychological safety, fast revision, and alignment with HR policy—not TVL on a leaderboard.

Internal versus external: a clean split

External venues aggregate global capital on elections, macro, and celebrity headlines. Internal markets aggregate people who know the backlog, the quota model, and the incident queue. Use external feeds for exogenous shocks; use internal markets for milestones defined in systems you already trust—Jira epic closed, Salesforce quota snapshot, PagerDuty incident duration.

Collateral is usually points, subsidized log-scoring market maker liquidity, or small non-cash rewards—not employee paychecks tied to individual trading profit in the pilot phase. Resolution comes from manager attestation only when necessary; better is system attestation against named fields and timestamps.

Mechanism choice without crypto drama

Cold start kills internal pilots. On day one you need a price on screen. A log-scoring market maker (or vendor equivalent) is the default: always a quote, bounded house subsidy, no need for volunteer market makers. A central limit order book with play money teaches limit-order thinking but starves without dedicated makers. Pure polls score easiest with compliance but weaken incentives. Hybrids—AMM seed, book tighten later—fit scale-up after participation proves out.

The mechanism chapter’s AMM-versus-book tradeoff applies without blockchain: you are choosing how employees discover price, not how tokens settle.

Question design inside the firewall

Good questions are boring in the best way: “Will API v2 hit GA by March 31 per Jira epic CLOSED?” “Will EMEA beat Q2 quota per Salesforce snapshot?” “Will Sev-1 close in under four hours per PagerDuty?” Bad questions are moral hazards: CEO firing pools, layoff percentages, or vague “is Project Phoenix good?”

Every question needs a resolution source, UTC cutoff, invalid handling when data is missing, and a version log when wording changes after HR comms. Conditional forms force scenario discipline: “Will we hit Q3 revenue if the competitor launches before July 1?”—useful for strategy offsites without pretending the crowd can see the future of personnel decisions.

Incentives that do not backfire

Participation nudges—lunch lottery for five forecasts a quarter—can reward volume over skill. Public Brier leaderboards can humiliate and invite sandbagging. Team-level calibration published internally; individual ranks visible only to the employee and manager unless works councils object. Executives should forecast anonymously in the same pool or “boss price” becomes the only signal.

Reward calibration and team aggregates, not raw trading PnL, when quotas and promotions sit in the background. Sandbagging sales forecasts to buy cheap YES is an internal flavor of manipulation; historical calibration weighting beats celebrating the biggest point winner.

Worked example: LMSR on a launch date

Two hundred engineers receive 1,000 play points each. The house runs a binary market on GA before November 15 with liquidity parameter chosen so prices move but subsidy stays capped weekly. Day one prints near 41% skepticism; day twelve rises toward 58% as integration risk clears; day twenty peaks near 67% and leadership reviews dependencies; day twenty-eight falls to 49% after a late bug—before the executive narrative catches up. Settlement: GA on November 18. The market lost the binary but beat the verbal “definitely next week” by almost a week of warning.

Facilitators can post a prior from historical slip data so day one is not pure herding on whatever the loudest engineer said in standup.

Worked example: sales cluster surfaces pipeline hygiene

A categorical market asks which of four regions hits quota first, linked in planning to a parent revenue scenario. West and LATAM align with CRM; East shows 29% market share versus 18% CRM forecast. Review finds double-counted deals; the market surfaced an inside-view error before quarter close without anyone betting on a colleague’s firing.

Anonymity, ethics, and compliance

Employees need pseudonymous IDs and a no-retaliation policy with an ombuds path. Legal may block questions that amount to insider trading on listed securities. Gambling characterization varies by jurisdiction—play-money pilots with counsel memos are standard. Block public-ticker outcomes; block layoff and personality markets hard.

Eight-week rollout sketch

Week one: counsel memo and union consult where required. Week two: pick vendor or IT build. Week three: one-hour calibration training. Week four: single division pilot question. Week five: retro on wording bugs. Week six: at most one additional question. Week seven: executives join anonymously. Week eight: Brier versus CRM baseline report to FP&A. Do not blast company-wide day one—density beats vanity participation counts.

Integrate read-only feeds from Jira, Linear, Salesforce aggregates, PagerDuty; keep HRIS headcount milestones out of trader-visible resolution unless policy explicitly allows.

Failure patterns to study

Publicity spikes then participation decay without recurring questions. Layoff markets destroy morale. Manager-resolved outcomes breed distrust. Uncapped LMSR subsidy blows the pilot budget. Skipping training yields 50/50 noise dressed as wisdom.

Early large-firm experiments taught both sides: aggregation can work, but question selection and leadership behavior determine whether the tool survives past a press cycle. Copying the headline without copying governance repeats the failure.

Communication that reduces fear

Launch copy should state what the market is for (“early signal on milestone X”) and what it is not (“not input to layoff decisions”). Explain anonymity in plain language. Show an example resolve tied to Jira, not manager opinion. Where policy allows opt-out, track whether opt-out clusters by team—that can signal toxicity, not privacy success.

Executives who refuse to participate anonymously while demanding honesty from staff undermine the design. The market models the behavior you want from the organization.

Categorical and tree questions

Not every pilot is a single binary. Region-race categoricals surface pipeline hygiene errors. Shallow scenario trees link revenue to policy or launch dependencies. Keep trees small until participants maintain logical consistency—deep trees with sloppy parents teach bad planning.

Settlement on categoricals needs mutually exclusive buckets and an explicit refund rule when none win. Otherwise disputes consume the pilot.

Measuring success without gamification

Useful metrics include forecast error versus CRM baseline on the pilot question, time from market warning to mitigation, participation breadth across departments, and post-pilot trust surveys. Vanity metrics—total trade count, loudest trader leaderboard—work against the cultural goal.

Points might redeem to charity or learning stipends in phase one; tying points to individual performance review in the same quarter is a policy choice counsel should review explicitly.

Public market literacy still applies inside

Internal prices approximate probabilities only when participation is broad enough that no single desk sets the mid. Incentives beat unfunded polls—the chapter on why crowds aggregate information applies behind the firewall. If internal and external signals disagree, document both: low internal GA price with calm macro row is a different story than both flashing red.

Tie-back to corporate forecasting

Internal prices are the private inside view in the fusion workbook from the prior chapter. External macro and policy contracts remain the exogenous half. Together they give committees both “what the world might do” and “what we secretly fear about our own plan”—without conflating the two.

Vendor versus build (practical default)

Unless oracle definitions are truly unique, buy vendor LMSR or hybrid SaaS with SSO and audit logs. Internal engineering time is better spent on integrations—Jira, Salesforce, PagerDuty—than on reinventing matching. If you later build, copy rules-first discipline from the builder module: subsidy caps, dispute playbooks, and versioned wording before you debate chain choice.

Facilitator role

Someone must own weekly review: compare market to plan, log wording bugs, and shield participants from managers who treat price as performance review. That role is often PMO or a senior IC, not HR alone. Facilitators post priors from historical base rates so day-one prices are not pure hallway lore.

When to kill a pilot

Kill when participation is hollow, when questions keep resolving ambiguously, when subsidy burns without forecast error improvement, or when culture turns punitive. Scale when Brier or simple hit-rate beats baseline on the pilot KPI and counsel approves a second question. Patience beats company-wide blast radius.

Next: Geopolitical Risk Analysis Using Markets