@brunoAge (Contributor) commented Jun 2, 2025

Fix incorrect base_experience and base_happiness values using Bulbapedia data

📋 Summary

This PR fixes inaccurate base_experience and base_happiness values in the Pokémon dataset, using Bulbapedia as the authoritative source for validation.

🔧 Changes Made

Base Experience Corrections (data/v2/csv/pokemon.csv)

  • 129 Pokémon with incorrect base_experience values fixed
  • 🎯 Updated values to Generation V+ standards
  • 📊 Examples:
    • Charizard: 267 → 240
    • Mewtwo: 340 → 306
    • Dragonite: 300 → 270

Base Happiness Corrections (data/v2/csv/pokemon_species.csv)

  • 454 Pokémon with incorrect base_happiness values fixed
  • 🌟 Starter Pokémon: 50 → 70 (standard friendship value)
  • 🆕 Recent DLC Pokémon: 0 → 50
    • Dipplin, Poltchageist, Sinistcha, Ogerpon
  • 🔒 Maintained correct values:
    • Legendary Pokémon: 0, 35
    • Baby Pokémon: 140

🧪 Validation Process

| Metric | Base Experience | Base Happiness |
|---|---|---|
| Data Source | 1,025 Bulbapedia HTML pages | 1,025 Bulbapedia HTML pages |
| Validation Rate | 78.7% coverage | 100% coverage |
| Corrections Applied | 129 Pokémon | 454 Pokémon |
| Testing Method | Cross-referenced with official pages | Automated extraction + manual verification |

🔍 Extraction Method

# Automated scraping from Bulbapedia HTML tables
# Prioritized "base friendship" sections
# Applied Pokemon-type specific heuristics
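
The extraction scripts themselves are not shown in this conversation. As a rough illustration of the "base friendship" step only, here is a minimal sketch using the Python standard library. The function name, the sample HTML, and the assumption that the value is the first integer shortly after the label are all hypothetical; real Bulbapedia markup may differ.

```python
import re
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collects the non-empty text nodes of an HTML document."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)


def extract_base_friendship(html):
    """Return the first integer found shortly after a 'base friendship' label.

    Returns None when the label (or a number near it) is missing.
    Assumption: the value appears within 80 non-digit characters of the label.
    """
    collector = TextCollector()
    collector.feed(html)
    text = " ".join(collector.chunks)
    match = re.search(r"base friendship\D{0,80}?(\d+)", text, re.IGNORECASE)
    return int(match.group(1)) if match else None


# Hypothetical fragment, not copied from a real page.
SAMPLE = "<table><tr><th><a>Base friendship</a></th></tr><tr><td>50</td></tr></table>"
print(extract_base_friendship(SAMPLE))
```

Returning None instead of a default makes missing sections visible, which matters when only part of the pages can be validated automatically.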

🎯 Key Corrections

Iconic Pokémon Fixed

  • Pikachu: 50 → 70 (base_happiness)
  • Eevee: 50 → 70 (base_happiness)
  • All starter lines: Now have correct friendship values

Legendary Pokémon Updated

  • Mewtwo: 340 → 306 (base_experience)
  • Lugia: 340 → 306 (base_experience)
  • Rayquaza: 340 → 306 (base_experience)

Recent DLC Pokémon

  • Dipplin: 0 → 50 (base_happiness)
  • Poltchageist: 0 → 50 (base_happiness)
  • Sinistcha: 0 → 50 (base_happiness)
  • Ogerpon: 0 → 50 (base_happiness)

📁 Files Modified

  • data/v2/csv/pokemon.csv - 129 base_experience corrections
  • data/v2/csv/pokemon_species.csv - 454 base_happiness corrections
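
For illustration, applying a set of corrections to one of these CSV files could look roughly like the sketch below. It is not the script used for this PR. The helper name and the three-column sample are hypothetical, and the sketch assumes the file has an `id` column plus the column being corrected.

```python
import csv
import io


def apply_corrections(csv_text, column, corrections, key="id"):
    """Rewrite `column` for the rows whose `key` value appears in `corrections`.

    `corrections` maps key values (as strings) to the new value.
    Returns the updated CSV text and the number of rows actually changed.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    changed = 0
    for row in reader:
        new_value = corrections.get(row[key])
        # Only count rows whose stored value really differs.
        if new_value is not None and row[column] != str(new_value):
            row[column] = str(new_value)
            changed += 1
        writer.writerow(row)
    return out.getvalue(), changed


# Reduced, hypothetical sample; the values are the examples quoted in this PR.
sample = (
    "id,identifier,base_experience\n"
    "6,charizard,267\n"
    "149,dragonite,270\n"
    "150,mewtwo,340\n"
)
updated, changed = apply_corrections(
    sample, "base_experience", {"6": 240, "149": 270, "150": 306}
)
print(changed)
print(updated)
```

Counting only rows that actually change gives a number that can be checked against the totals reported above.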

✅ Verification

All corrections can be manually verified against the corresponding Bulbapedia pages:

  • Base experience: https://bulbapedia.bulbagarden.net/wiki/[Pokemon_name]_(Pokémon)
  • Base happiness: Same URL, "base friendship" section
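
For anyone spot-checking entries, the URL pattern above can be filled in programmatically. This is a small sketch; the helper name is hypothetical, and it assumes spaces in a Pokémon name map to underscores in the page title.

```python
from urllib.parse import quote

BASE_URL = "https://bulbapedia.bulbagarden.net/wiki/"


def bulbapedia_url(pokemon_name):
    """Build the page URL for a Pokémon following the pattern listed above."""
    title = pokemon_name.replace(" ", "_") + "_(Pokémon)"
    # Keep parentheses readable; non-ASCII characters are percent-encoded.
    return BASE_URL + quote(title, safe="()_")


print(bulbapedia_url("Pikachu"))
```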

🔄 Reproducibility

The automated extraction scripts ensure:

  • ✅ Consistency across all Pokémon
  • ✅ Reproducibility for future updates
  • ✅ Comprehensive logging and validation

Note: This follows the same data validation approach used in previous corrections and maintains full backward compatibility with existing API consumers.

Related: Similar to previous data corrections that used Bulbapedia as the authoritative source for Pokémon data validation.

@Naramsim (Member) commented Jun 2, 2025

Whoa! Thanks for the great contribution! Later I'll check some entries!

Do you think we could automate your process and create a cron job running on GitHub Actions? Just thinking out loud.

@brunoAge (Contributor, Author) commented Jun 2, 2025

Thanks! Glad you liked it! 🎉

About the GitHub Actions automation - yeah, it's definitely doable but comes with some headaches:

The main issues:

  • We're talking about scraping 1,000+ pages from Bulbapedia (and that number keeps growing with new Pokémon)
  • I had to save all the HTML files locally first because hitting that many pages in sequence would take forever
  • GitHub Actions would probably time out or get rate-limited pretty quickly

The bigger problem:

Since we're scraping HTML, any time Bulbapedia changes their page structure (which happens), the script breaks. So we'd constantly be fixing it.
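
The "save the HTML locally first" workaround mentioned above can be sketched as a small cache-then-fetch helper. This is only an illustration under stated assumptions: `fetch` is a caller-supplied function (hypothetical) that downloads one page, and the one-second default pause is an arbitrary politeness delay, not a documented Bulbapedia limit.

```python
import time
from pathlib import Path


def fetch_cached(name, cache_dir, fetch, delay_seconds=1.0):
    """Return the HTML for `name`, downloading it only if no local copy exists.

    `fetch` is any callable that takes a page name and returns its HTML.
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    path = cache_dir / f"{name}.html"
    if path.exists():
        # Cache hit: no network request and no delay.
        return path.read_text(encoding="utf-8")
    html = fetch(name)
    path.write_text(html, encoding="utf-8")
    # Pause after each real download to space out requests.
    time.sleep(delay_seconds)
    return html
```

With a cache like this, re-running the extraction after a parser fix re-reads the saved files instead of hitting the site again, which also helps when a page-structure change forces the parsing code to be reworked.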

@Naramsim Naramsim merged commit dfe55f1 into PokeAPI:master Jun 2, 2025
9 checks passed
@Naramsim (Member) commented Jun 2, 2025

Fair enough, I overlooked a possible rate limit. Maybe in the future we can create a script that fetches all kinds of info from Bulbapedia or any other trustworthy source and run it in batches regularly.
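
One way the "run it in batches regularly" idea could be sketched: each scheduled run processes one slice of the Pokémon IDs and wraps around once every slice has been visited. The function name and the batch size of 100 are hypothetical choices, not something agreed in this thread.

```python
def batch_for_run(ids, batch_size, run_index):
    """Return the slice of `ids` that scheduled run number `run_index` should process.

    Runs cycle through the batches, so run 0 and run `n_batches` get the same slice.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    n_batches = -(-len(ids) // batch_size)  # ceiling division
    if n_batches == 0:
        return []
    start = (run_index % n_batches) * batch_size
    return ids[start:start + batch_size]


# 1,025 pages, as in the validation table of this PR.
all_ids = list(range(1, 1026))
print(len(batch_for_run(all_ids, 100, 0)))
print(len(batch_for_run(all_ids, 100, 10)))
```

With a batch size of 100, the full set of 1,025 pages would be covered in 11 runs, the last of which handles the remaining 25.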

@pokeapi-machine-user

A PokeAPI/api-data refresh has started. In ~45 minutes the staging branch of PokeAPI/api-data will be pushed with the new generated data.

The staging branch will be deployed in our staging environment and the entire API will be ready to review.

A Pull Request (master<-staging) will also be created at PokeAPI/api-data and assigned to the PokeAPI Core team for review. If approved and merged, the new data will soon be available worldwide at pokeapi.co.

@pokeapi-machine-user

The updater script has finished its job and has now opened a Pull Request towards PokeAPI/api-data with the updated data.

The Pull Request can be seen deployed in our staging environment once the CircleCI deploy has finished (check the start time of the last build).

Naramsim pushed a commit to PokeAPI/api-data that referenced this pull request Jun 3, 2025