A new kind of foresight

Toby Shevlane

Ben Day
Launching Mantic
At Mantic we build AI systems for predicting events in the messy world of human affairs: geopolitics, business, policy, technology, and culture. In these domains, a purely data-driven modelling approach is insufficient. Flexible reasoning and research are required, and so human superforecasters outperform automated methods. Mantic’s goal is to change that, delivering automated predictions at unprecedented accuracy and scale.
Today, Mantic is coming out of stealth. We’ve assembled a world-class team of AI technical staff, coming from Google DeepMind, Citadel, the universities of Cambridge and Oxford, and other AI startups. We raised $4m in our pre-seed round, led by Episode 1, with backing from DRW and top AI researchers at Anthropic and Google DeepMind.
Mantic is pushing the frontier of AI forecasting accuracy. We won the most prize money in the Q1 2025 Metaculus AI Benchmark Tournament, and our latest system sets a new state of the art when backtested on the 348 questions from Q2 2025.
The challenge: solve judgmental forecasting
Good decision-making is the ultimate meta-skill that lets humanity prosper. However, our decision-making is severely bottlenecked by our prediction capabilities. Imagine if Western governments had proactively allocated the appropriate level of attention and resources to coronavirus pandemics, deterrence of Russian aggression in Ukraine, the rise of generative AI, and supply chains for rare earths.
At Mantic, our goal is to solve judgmental forecasting. An example question:
Will Iran close the Strait of Hormuz before 2027?
To answer this question, the forecaster must understand the state of the world and reason about how things will play out. Both are uncertain, so – rather than selecting a single outcome – the forecaster assigns probabilities over the set of possible outcomes.
To measure a forecaster’s performance, you must ask them hundreds of questions, wait to see what happens, and apply a scoring function that penalizes the discrepancy between each prediction and the observed outcome.
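For concreteness, here is a minimal sketch of one standard scoring rule of this kind, the Brier score (squared error between the probability and the binary outcome); this post doesn’t pin down exactly which scoring function we or the tournaments use.

```python
def brier_score(predicted_prob: float, outcome: bool) -> float:
    """Squared error between a predicted probability and the observed binary
    outcome (1.0 if the event happened, 0.0 if not): 0 is perfect, 1 is worst."""
    return (predicted_prob - float(outcome)) ** 2

# Averaging over many resolved questions gives a comparable measure of accuracy.
resolved = [(0.85, True), (0.10, False), (0.60, False)]  # (prediction, outcome)
mean_score = sum(brier_score(p, o) for p, o in resolved) / len(resolved)
print(f"Mean Brier score: {mean_score:.3f}")  # lower is better
```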
Human “superforecasters” are the best judgmental forecasters.
- Domain experts often struggle to beat superforecasters because forecasting itself is a difficult skill. Forecasting tournaments organized by the US intelligence community from 2010 to 2015 found that superforecasters (generalist civilians, with no access to classified information) matched or even outperformed intelligence analysts on geopolitical forecasting questions.
- Statistical models (alone) are insufficient for questions with limited data (e.g. Iran has never closed the strait) and where it’s necessary to understand the strategic or political context.
The forecasting landscape also includes prediction markets like Polymarket and Kalshi. Like financial markets, these platforms incentivize information-gathering and timely, calibrated predictions.
However, superforecasting and prediction markets haven’t yet lived up to their proponents’ ambitions. Superforecasters’ predictions don’t regularly inform strategy in the C-suite or the White House, and it’s not clear they are getting more accurate over time. Nor are prediction markets deeply integrated into economic and policy decision-making, as was envisioned in Hanson’s Futarchy and more recently Vitalik Buterin’s info finance.
Contrast the stunning progress of weather forecasting, both technically and commercially. Today’s 5-day weather forecasts are as accurate as 1-day forecasts were in 1985, per the “one day per decade” rule of progress. Weather forecasting is highly integrated into the economy, providing billions of dollars of value by informing decisions in agriculture, energy production, aviation and shipping, construction, and military planning.

We believe automation will radically transform judgmental forecasting, like it did weather forecasting. It’s now becoming possible to build an AI forecasting system that can rival strong human forecasters. Our system is competitive in the ongoing Metaculus Cup, the premier tournament. From now on, technical and commercial progress in judgmental forecasting will look qualitatively different, thanks to the natural advantages of automated systems.
Unlock 1: Faster feedback
It takes a long time for human forecasters to develop and verify new techniques. If you want to tweak your approach, you can’t wipe your memory and redo all your predictions from last year. In the scientific literature on judgmental forecasting, it’s common for a single experiment to take 6+ months.

In contrast to humans, AI systems can be backtested: we pick a cutoff date, say 1 September 2024, and only let the AI system access information published before then. This lets us pose forecasting questions from a past vantage point whose outcomes we already know. It reduces the latency of evaluating a prediction’s accuracy from months to milliseconds, and it means we can replay the past year of world events over and over.
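As a rough illustration of the mechanics (a sketch under assumed data structures; the document type, corpus, and forecaster interface below are hypothetical, not our actual pipeline):

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class Document:
    published_at: datetime
    text: str

def visible_at(corpus: list[Document], vantage: datetime) -> list[Document]:
    """Keep only documents published before the vantage date, so the system
    forecasts from the past without leaking knowledge of the outcome."""
    return [doc for doc in corpus if doc.published_at < vantage]

# Pretend it is 1 September 2024: the forecaster researches the question using
# only pre-cutoff documents, and its answer is scored against the known outcome.
vantage = datetime(2024, 9, 1, tzinfo=timezone.utc)
# allowed = visible_at(news_archive, vantage)        # hypothetical news archive
# prediction = forecast(question, allowed)           # hypothetical forecaster
# score = brier_score(prediction, resolved_outcome)  # feedback in milliseconds
```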
Backtesting is a difficult engineering challenge, but it unlocks an iteration speed that makes the existing science look glacial. We’re also using reinforcement learning to train models to predict ground truth outcomes, which would be impossibly slow without backtesting. We’ve created a training dataset of >10,000 high-quality forecasting questions from 2024-2025. We want to create AI systems that embody a much greater volume of forecasting experience than any human could accumulate.
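One way to picture how backtesting feeds that training signal, as a hedged sketch rather than a description of our actual objective: each resolved question becomes a training example whose reward reflects how well-calibrated the model’s probability was against the now-known outcome.

```python
def forecasting_reward(predicted_prob: float, outcome: bool) -> float:
    """Illustrative reward for training on resolved questions: the negative
    Brier score, so better-calibrated probabilities earn higher reward."""
    return -(predicted_prob - float(outcome)) ** 2

# A confident, correct forecast earns a reward near 0; a confident, wrong
# forecast earns a strongly negative reward.
print(forecasting_reward(0.9, True), forecasting_reward(0.9, False))
```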
Unlock 2: Speed and scale
The top human forecasters are limited in number, and procuring their services can take weeks or months. Kalshi and Polymarket have their own scale problem: most markets have <$100k total volume, and the distribution of questions is biased towards topics that people want to bet on, like crypto, sports, elections, and the return of Jesus. Moreover, these markets don’t require forecasters to give detailed explanations, even though that’s often what the consumers of forecasts want.
With AI forecasting, we can build a much better experience, where predictions are delivered rapidly and tailored to the client’s needs. We can scale up:
- The number of predictions made in parallel
- The quantity of information the forecaster uncovers
- The frequency at which predictions refresh over time
Illustrating these last two advantages: there was a question in the Metaculus Cup about how much photovoltaic capacity China would install in July 2025, following the end of subsidies. Our system has made a fresh prediction every day since the question opened in early July. Its research uncovered key developments early, suggesting a big drop from the May figures that were initially available, and the community average prediction has slowly come to agree:
How much photovoltaic capacity will China install in July 2025?
Industrial-scale prediction
What could you do with a team of 1,000 superforecasters working for you? To take advantage of this new level of supply, we need to innovate in the product experiences we build around forecasting.
Forecasting abundance means we can move beyond the level of a single prediction, finding new ways to combine predictions. For illustration, inspired by the US airstrikes on Iranian nuclear facilities in June, here is an animation of the likelihood of a US strike inside the territory of each Middle Eastern country from January to August 2025:

Another illustration: for each company listed on the German DAX index, here is the likelihood that their CEO is removed or resigns in the next 12 months, predicted from the perspective of each month in 2025:

At this scale and frequency, the predictions can function as a radar, scanning the world and picking out what's worth paying attention to.
With forecasting abundance, we can also scale up the depth of prediction concentrated on a single scenario. If the CEO of Adidas is removed, who will replace them? And when? The goal is to build a much richer picture of the future than has ever been possible.
The future is large
We need industrial-scale prediction. Part of Toby’s research at Google DeepMind was about forecasting the future of AI capabilities and geopolitics. However, there are so many possible trajectories that this becomes cognitively overwhelming: the future is too big to fit inside one human brain. Meanwhile, we’ve seen that AlphaFold can do the work of a whole PhD-length project in minutes. At Mantic, we want to do the same for understanding the future.
Work with us
Excited by our mission? We are hiring.