Methodology — How We Measure Pokémon Popularity

At a glance

The Dataset in Numbers

1,025 Pokémon tracked

264 months of data

2004: earliest month

2025: latest month

The database contains monthly Google search interest data for all 1,025 Pokémon in the National Pokédex. Each Pokémon has a normalized interest score for every month from January 2004 through December 2025 — 270,600 data points in total.

Data source

Google Trends

All data comes from Google Trends, Google's public tool for exploring the relative search volume of any query over time. It is the only freely available source with consistent, long-term (20+ year) coverage of named entities across hundreds of languages and regions.

We query Google Trends by Knowledge Graph entity (a unique identifier like /m/014wp5 for Mewtwo), not by plain keyword. This is an important distinction: a topic search for the Mewtwo entity only counts searches that Google has confidently classified as being about the Pokémon, filtering out unrelated searches that happen to contain the word.

Why entity search, not keyword search?

Keyword searches for names like "Persian," "Golem," or "Electrode" would be polluted by searches for the cat breed, the creature of folklore, and the electrical component respectively. Querying the Knowledge Graph entity for each Pokémon is far more accurate, though it introduces its own limitations (see Ambiguous Names below).
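To make the distinction concrete, the sketch below sends either an entity ID or a plain keyword through pytrends' `build_payload` and `interest_over_time` calls. The helper name `fetch_interest`, the default timeframe string, and the choice to pass the client in as an argument are assumptions for illustration; this is not the site's actual collection code.

```python
def fetch_interest(client, terms, timeframe="2004-01-01 2025-12-31"):
    """Request interest over time for up to five terms in one batch.

    `client` is expected to behave like pytrends.request.TrendReq.
    Each term is either a Knowledge Graph ID such as "/m/014wp5"
    (a topic search) or a plain keyword such as "mewtwo".
    """
    if not 1 <= len(terms) <= 5:
        raise ValueError("Google Trends compares at most five terms per query")
    # geo="" leaves the geographic filter unset, which means worldwide.
    client.build_payload(kw_list=list(terms), timeframe=timeframe, geo="")
    return client.interest_over_time()


# Usage (needs the third-party pytrends package and network access):
#   from pytrends.request import TrendReq
#   trends = TrendReq(hl="en-US", tz=360)
#   topic = fetch_interest(trends, ["/m/014wp5"])  # Mewtwo as an entity
#   keyword = fetch_interest(trends, ["mewtwo"])   # Mewtwo as a raw keyword
```

Only the string passed in `kw_list` changes between the two modes, which is why a wrong or missing entity ID can silently turn into a noisier keyword query.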

Collection process

How We Queried the Data

Google Trends normalizes scores within each query batch: the most-searched item in the batch scores 100, and every other item is scaled proportionally. This means a single query for one Pokémon tells you nothing about how it compares to any other. To make scores cross-comparable, we used an anchor approach.

1. Anchor selection: We chose Mewtwo (/m/014wp5) as the anchor because it has a stable, consistent search signal across the full 22-year window, with no gaps or anomalous spikes that would distort normalization.
2. Standalone anchor baseline: We first queried Mewtwo alone to establish its true baseline score over the full time period.
3. Batched queries: Every other Pokémon was queried in batches of five: four Pokémon plus Mewtwo. This gave us the raw score for each Pokémon relative to Mewtwo within that batch.
4. Cross-batch normalization: We applied the anchor correction formula below to put all scores on the same scale.

# For each month and each Pokémon:
normalized = raw_score × (mewtwo_standalone / mewtwo_in_batch)

# Where:
# raw_score = Google's 0–100 score for the Pokémon in its batch
# mewtwo_standalone = Mewtwo's score when queried alone (the true baseline)
# mewtwo_in_batch = Mewtwo's score when it appeared in the same batch

A Pokémon that scores higher than Mewtwo in its batch will have a normalized value above Mewtwo's baseline. Pikachu, for example, consistently outscores Mewtwo, so its normalized scores exceed Mewtwo's baseline across most months. The resulting scale is not capped at 100 — scores can go higher.
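The batching and correction steps above can be sketched in a few lines of Python. The function names, the `None` return for a zero anchor score, and the sample numbers are assumptions made for illustration; the text does not say how a zero in-batch anchor score is handled.

```python
ANCHOR = "/m/014wp5"  # Mewtwo's entity ID, as given in the text


def make_batches(entity_ids, anchor=ANCHOR, size=4):
    """Split the non-anchor IDs into groups of `size` and append the anchor."""
    others = [e for e in entity_ids if e != anchor]
    return [others[i:i + size] + [anchor] for i in range(0, len(others), size)]


def normalize(raw_score, mewtwo_standalone, mewtwo_in_batch):
    """Apply normalized = raw_score * (mewtwo_standalone / mewtwo_in_batch).

    Returns None when the anchor scored 0 in the batch, because the
    ratio is undefined there (an assumed convention).
    """
    if mewtwo_in_batch == 0:
        return None
    return raw_score * (mewtwo_standalone / mewtwo_in_batch)


# Hypothetical month: Mewtwo scores 40 alone and 20 in a batch whose
# top item scores 100, so the whole batch is rescaled by 40 / 20 = 2.
print(normalize(100, 40, 20))  # 200.0
print(normalize(20, 40, 20))   # 40.0, Mewtwo maps back onto its baseline
```

The second call shows the property the correction relies on: within any batch, Mewtwo's own normalized score equals its standalone score, so every batch ends up on the same scale.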

Interpreting the numbers

What the Scores Actually Mean

The normalized score is a relative measure of Google search interest, not an absolute count of searches. A score of 50 does not mean "50 million searches". Because the scale is set by Mewtwo's standalone query, where Mewtwo's peak month scores 100, a score of 50 means that Pokémon received roughly half as much search interest in that month as Mewtwo did in its peak month.

What the scores are good for:

Comparing Pokémon to each other within the same dataset
Tracking a single Pokémon's interest trend over time
Identifying spikes tied to game releases, anime events, or cultural moments
Ranking Pokémon within categories (starters, legendaries, by type, etc.)

What the scores should not be used for:

Estimating raw search volumes or audience sizes
Comparing scores across different datasets or time windows that weren't normalized together
Drawing conclusions about regions outside the United States (see Geographic Scope below)

Known limitations

Caveats to Keep in Mind

Search interest ≠ Popularity

High search volume can reflect many things: genuine popularity, competitive relevance, controversy, a new game release, a viral moment, or simply that a Pokémon has a name people frequently misspell. Charizard spikes every time a new game features it prominently. Pikachu never stops being searched because it is the mascot of the brand. A newly released Legendary might spike hard on launch and fade quickly. Use the trend shape, not just the average, to understand what the data is actually showing.

Generation Bias — Newer Pokémon start at zero

The data begins in January 2004. Pokémon introduced before 2004 (Generations I–III, released 1996–2003) have a full 22-year trend window. Generation IV Pokémon (2006) start appearing mid-dataset. Generation IX Pokémon (2022) only have about 3 years of data. When comparing average scores across generations, newer Pokémon are naturally penalized, because the months before their debut contribute little or nothing to the average. Treat cross-generation comparisons with care.
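One way a reader might soften this bias is to average only over the months since a Pokémon's debut. The sketch below is illustrative: the function name, the "YYYY-MM" key format, and the sample values are assumptions, and nothing here is described as part of the site's own scoring.

```python
def mean_since_debut(monthly_scores, debut_month):
    """Average scores over months on or after `debut_month`.

    `monthly_scores` maps "YYYY-MM" strings to normalized scores.
    "YYYY-MM" strings sort chronologically, so string comparison works.
    """
    values = [v for month, v in monthly_scores.items() if month >= debut_month]
    return sum(values) / len(values) if values else None


# Hypothetical Pokémon that debuts in November 2022.
scores = {"2022-09": 0.0, "2022-10": 0.0, "2022-11": 60.0, "2022-12": 40.0}
print(sum(scores.values()) / len(scores))   # 25.0 over the whole window
print(mean_since_debut(scores, "2022-11"))  # 50.0 over months since debut
```

This removes the pre-debut zeros but introduces the opposite distortion: launch-month spikes weigh heavily in a short window, so the two averages are best read side by side.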

Normalization Introduces Compounding Uncertainty

The anchor correction assumes Mewtwo's search interest is stable enough to serve as a consistent reference point. In practice, Mewtwo's own signal can fluctuate — particularly around Mewtwo-heavy game releases (e.g. Pokémon GO's Mewtwo raids in 2017–2018). When Mewtwo's standalone score spikes, the correction factor temporarily inflates the normalized scores of whatever batch was collected during that period. This effect is small but real, and it accumulates across hundreds of batches.

Geographic Scope: Worldwide Trends, U.S.-Only Regional Breakdown

All trend data was collected without a geographic filter, which in Google Trends defaults to worldwide. However, the regional breakdown (state-level data) is U.S.-only. Worldwide data for a franchise like Pokémon is heavily influenced by English-speaking markets, particularly the United States. Search patterns in Japan, Europe, and other major Pokémon markets may differ significantly and are not captured in the state-level regional analysis.

Monthly Granularity Smooths Short Spikes

Google Trends returns monthly data when the time range spans more than about five years; shorter ranges come back at weekly or daily resolution. This means event-driven spikes (a Pokémon trending on Twitter for a week, a surprise reveal at a Nintendo Direct) are averaged into the month's score rather than visible as a discrete peak. Daily or weekly granularity would tell a different story for many Pokémon.
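A small worked example shows how much a short spike can shrink. It assumes the monthly value behaves like a plain average of daily interest, which is a simplification, and the daily numbers are invented for illustration.

```python
def monthly_mean(daily_scores):
    """Collapse a month of daily scores into one average value."""
    return sum(daily_scores) / len(daily_scores)


# Hypothetical 30-day month: baseline interest of 10, 7-day spike to 100.
days = [10] * 23 + [100] * 7
print(max(days))           # 100, the peak a daily chart would show
print(monthly_mean(days))  # 31.0, what a single monthly point would show
```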

Ambiguous Names — Some Data May Be Noisy

Even with Knowledge Graph entity queries, Google's topic classification is imperfect for Pokémon whose names overlap with well-known real-world entities. Pokémon like Persian (the cat breed), Golem (the mythological creature), and Electrode (the electrical component) had their MIDs verified manually, but we cannot guarantee that all classified searches are genuinely about the Pokémon. Their trend data should be treated as approximate.

Coverage notes

Keyword Fallbacks & Data Gaps

15 Pokémon tracked via keyword search, not entity ID

For 15 Pokémon we could not reliably identify a unique Google Knowledge Graph entity — typically because the name is too short, too generic, or heavily shared with a real-world entity. Rather than exclude them, we fell back to plain keyword searches (e.g. "absol pokemon") which are slightly noisier but still directionally accurate for low-ambiguity names.

Absol, Abra, Aron, Durant, Electrode, Farfetch'd, Golem, Klang, Klink, Natu, Paras, Persian, Seel, Sirfetch'd, Volbeat

Topic entity gaps for 15 Pokémon (~2016–2021)

A handful of well-known Pokémon — including Rayquaza, Blastoise, Umbreon, and Dragonite — show zero search interest in Google Trends topic queries for stretches of roughly 2016–2021, despite clearly having real search activity during that period. This appears to be a limitation in Google's Knowledge Graph topic coverage for those years: the entity ID stops receiving attributed searches even though keyword searches for those names continued normally.

We confirmed this by comparing topic queries directly against keyword queries in the Google Trends console. Because the gaps are an artifact of the data source rather than genuine zero interest, these months are flagged but not imputed — the trend charts for affected Pokémon will show a flat zero during those windows.
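Flagging without imputing could look something like the sketch below, which marks long runs of zeros and leaves the values untouched. The function name and the `min_length` threshold are assumptions; the text does not state the rule actually used to flag these months.

```python
def flag_zero_runs(scores, min_length=6):
    """Return one boolean per month, True inside long runs of zeros.

    `min_length` is an assumed threshold: shorter zero runs stay
    unflagged because they may reflect genuinely low interest.
    """
    flags = [False] * len(scores)
    run_start = None
    # The trailing None is a sentinel that closes a run ending at the last month.
    for i, value in enumerate(list(scores) + [None]):
        if value == 0:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start >= min_length:
                flags[run_start:i] = [True] * (i - run_start)
            run_start = None
    return flags


series = [12, 0, 0, 0, 9, 0, 11]
print(flag_zero_runs(series, min_length=3))
# [False, True, True, True, False, False, False]
```

Keeping the flags separate from the scores means the charts can still draw the flat zero while marking it as a data-source artifact.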

Technical details

Tools & Stack

Data was collected using pytrends (v4.9.2), an unofficial Python wrapper for the Google Trends API. Collection ran in batches with exponential backoff to handle Google's rate limits (429 errors), with automatic resume checkpointing so interrupted runs could pick up where they left off.
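The retry and resume behaviour described above might be structured roughly as follows. This is a sketch under stated assumptions: `RateLimited` stands in for whatever error the client raises on a 429, and the checkpoint format (a JSON list of finished batch indexes), the delays, and the function names are all hypothetical.

```python
import json
import random
import time


class RateLimited(Exception):
    """Stand-in for whatever error the client raises on an HTTP 429."""


def with_backoff(fetch, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Call `fetch()`, doubling the wait after each rate-limit error."""
    for attempt in range(max_retries + 1):
        try:
            return fetch()
        except RateLimited:
            if attempt == max_retries:
                raise
            # Wait 1x, 2x, 4x ... the base delay, plus a little jitter.
            sleep(base_delay * 2 ** attempt + random.uniform(0, 0.5))


def run_batches(batches, fetch_batch, checkpoint_path, sleep=time.sleep):
    """Fetch each batch once, recording finished indexes so a rerun resumes.

    `fetch_batch(batch)` is expected to store its own results.
    """
    try:
        with open(checkpoint_path) as fh:
            done = set(json.load(fh))
    except FileNotFoundError:
        done = set()
    for index, batch in enumerate(batches):
        if index in done:
            continue  # already collected in an earlier run
        with_backoff(lambda: fetch_batch(batch), sleep=sleep)
        done.add(index)
        with open(checkpoint_path, "w") as fh:
            json.dump(sorted(done), fh)
```

Writing the checkpoint after every batch costs a small file write each time, but it means an interrupted run loses at most the batch that was in flight.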

All data is stored in a local SQLite database with three tables: monthly_normalized (the main trend data), monthly_raw (pre-normalization scores), and regional_state (U.S. state-level interest). The site is served by a lightweight Flask app that queries the database on demand.
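For orientation, a schema along these lines would fit the three tables named above. Only the table names come from the text; every column name, type, and key, along with the helper functions, is an assumption for illustration.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS monthly_raw (
    pokemon_id INTEGER NOT NULL,
    month      TEXT    NOT NULL,   -- "YYYY-MM"
    raw_score  REAL    NOT NULL,
    PRIMARY KEY (pokemon_id, month)
);
CREATE TABLE IF NOT EXISTS monthly_normalized (
    pokemon_id INTEGER NOT NULL,
    month      TEXT    NOT NULL,
    score      REAL    NOT NULL,
    PRIMARY KEY (pokemon_id, month)
);
CREATE TABLE IF NOT EXISTS regional_state (
    pokemon_id INTEGER NOT NULL,
    state      TEXT    NOT NULL,   -- e.g. "CA"
    score      REAL    NOT NULL,
    PRIMARY KEY (pokemon_id, state)
);
"""


def open_db(path=":memory:"):
    """Open (or create) the database and make sure the tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def trend_for(conn, pokemon_id):
    """Return [(month, score), ...] in chronological order for one Pokémon."""
    rows = conn.execute(
        "SELECT month, score FROM monthly_normalized "
        "WHERE pokemon_id = ? ORDER BY month",
        (pokemon_id,),
    )
    return rows.fetchall()
```

A query like `trend_for(conn, 150)` is the kind of on-demand lookup a Flask route could run for a single Pokémon's chart; the composite primary key keeps that lookup indexed.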

Pokémon metadata (names, types, generations, sprite URLs, legendary/mythical flags) was sourced from the PokéAPI and stored as a local JSON file to avoid runtime API calls.
