A collection of Python and Node.js scripts for automatically downloading, organising, and managing Pokémon TCG card images from multiple official sources, with integration to the TCGdex API.
Built to support a card image repository across multiple languages including Japanese, Korean, Chinese (Traditional & Simplified), Thai, and Indonesian.
- Scrapes card images from the official Pokémon TCG Asia website, the official Japanese site, Pokellector, Serebii, and pcg-search.com
- Automatically detects the correct output folder based on the URL region code
- Filters downloads using TCGdex CSV data so only missing cards are downloaded
- Organises images by language and set with sequential positional numbering (001.png, 002.png…)
- Automatically moves completed sets to Collected/ on 100% success
- Packages sets into upload-ready batch zips under 2 GB
- Cleans zip files of non-image files before upload
- Moves CSV logs out of image folders so they don't interfere with uploads
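As an illustration of the sequential positional numbering mentioned in the list above, a minimal sketch might look like the following. The function names `positional_filename` and `assign_filenames` are hypothetical and not taken from the scripts; the three-digit padding simply follows the 001.png pattern shown above.

```python
def positional_filename(position: int, ext: str = ".png") -> str:
    """Return a zero-padded filename such as 001.png for a 1-based position."""
    if position < 1:
        raise ValueError("position must be 1 or greater")
    return f"{position:03d}{ext}"


def assign_filenames(card_ids: list[str], ext: str = ".png") -> dict[str, str]:
    """Map each card identifier to a filename based on its position in the set."""
    return {
        card_id: positional_filename(index, ext)
        for index, card_id in enumerate(card_ids, start=1)
    }
```

Numbering by position rather than by printed card number keeps filenames predictable even when a set uses non-numeric card identifiers.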
| Script | Source | Description |
|---|---|---|
| scrape_pokemon_images.py | asia.pokemon-card.com | Main scraper. Paste one or more URLs, auto-detects language folder, downloads missing card images |
| scrape_official_japanese.py | pokemon-card.com | Scans Japanese Need folders, checks each set against the official API, downloads available sets as JPG |
| scrape_pokellector_images.py | jp.pokellector.com | Scrapes Japanese card images from Pokellector, visits each card detail page to get image URLs |
| scrape_serebii_images.py | serebii.net | Scrapes Japanese card images from Serebii, downloads missing card images by set |
| scrape_pcgsearch_images.py | pcg-search.com | Scrapes vintage Japanese sets (PMCG, e, PCG eras). Auto-detects sets from the Need folder and downloads images directly, with no detail-page visit needed |
| Script | Description |
|---|---|
| batch_zip.py | Groups set folders from Collected/ into upload-ready batch zips (max 1.9 GB each) in Uploaded/ |
| clean_zips.py | Scans all zips in Collected/ and Uploaded/, removes non-image files, moves CSVs to logs |
| move_collection_csvs.py | Moves CSVs out of Collected/ set folders into the logs folder. Auto-creates logs and Uploaded/ folders if missing |
| check_missing_images.py | Reads the TCGdex status page and creates folders for sets with missing images |
| run_missing_reports.py | Runs missing-images-report.js across all set folders in bulk |
| create_set_folders.py | Reads TCGdex CSVs and creates matching set folders in the correct language directory |
| missing-images-report.js | Node.js script that queries the TCGdex API and generates a CSV of missing images per set |
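To make the CSV-moving step described for move_collection_csvs.py more concrete, here is a rough sketch of how such a step could work. It is not the script's actual code: the function name `move_csvs_to_logs` is hypothetical, the logs folder name is taken from the folder layout in this README, and prefixing each moved file with its set folder name is an assumption made here to avoid name collisions.

```python
import shutil
from pathlib import Path

# Folder name taken from the layout in this README (assumed to be "the logs folder").
LOGS_FOLDER = "Missing Reports and Collection Logs"


def move_csvs_to_logs(language_dir: Path) -> list[Path]:
    """Move CSVs out of Collected/<set>/ into the logs folder of one language."""
    collected = language_dir / "Collected"
    logs = language_dir / LOGS_FOLDER
    # Create the logs and Uploaded/ folders if they are missing.
    logs.mkdir(parents=True, exist_ok=True)
    (language_dir / "Uploaded").mkdir(parents=True, exist_ok=True)

    moved: list[Path] = []
    if not collected.is_dir():
        return moved
    for csv_file in sorted(collected.glob("*/*.csv")):
        # Assumption: prefix with the set folder name so files do not collide.
        target = logs / f"{csv_file.parent.name}_{csv_file.name}"
        shutil.move(str(csv_file), str(target))
        moved.append(target)
    return moved
```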
- Python 3
- Node.js
Install Python dependencies (run once):
# Mac / Linux
pip3 install requests beautifulsoup4
# Windows
pip install requests beautifulsoup4

Install Node.js dependencies (run once):
npm install

The scripts are designed to work with the following layout. Set it up once and everything is detected automatically.
Missing Images CSVs/
├── scrape_pokemon_images.py
├── scrape_official_japanese.py
├── scrape_pokellector_images.py
├── batch_zip.py
├── clean_zips.py
├── check_missing_images.py
├── run_missing_reports.py
├── create_set_folders.py
├── move_collection_csvs.py
├── missing-images-report.js
├── package.json
└── MissingImages/
    └── Language Name/
        ├── Need/
        │   └── SetID/          ← place the TCGdex CSV for this set here
        ├── Collected/          ← sets move here automatically on success
        ├── Uploaded/           ← batch zips go here, ready for upload
        └── Missing Reports and Collection Logs/
Mac (prevents sleep during long runs):
caffeinate -i python3 "/path/to/scrape_pokemon_images.py"

Linux:
systemd-inhibit python3 scrape_pokemon_images.py

Windows:
python scrape_pokemon_images.py

Tip: Disable sleep manually in Power Settings before a long run on Windows.
When prompted, paste one or more URLs (one per line) then press Enter on a blank line to start:
https://asia.pokemon-card.com/tw/card-search/list/?expansionCodes=SV9
https://asia.pokemon-card.com/tw/card-search/list/?expansionCodes=SV9a
The script will:
- Auto-detect the correct language folder from the URL region code
- Load the filter CSV if one exists in the set folder
- Download only the missing card images with polite delays
- Move completed sets to Collected/ on success
- Skip sets already in Collected/ or Uploaded/
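The filter step in the list above could be sketched roughly as follows. This is an illustration under stated assumptions, not the scraper's real code: the function names are hypothetical, and the CSV column name `id` is a guess that would need to match the actual header of the TCGdex CSV.

```python
import csv
from pathlib import Path
from typing import Iterable, Optional


def load_filter_ids(csv_path: Path, column: str = "id") -> Optional[set[str]]:
    """Return the card IDs listed in a filter CSV, or None if no CSV exists.

    The column name "id" is an assumption; adjust it to the real CSV header.
    """
    if not csv_path.is_file():
        return None
    with csv_path.open(newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        return {row[column].strip() for row in reader if row.get(column)}


def select_missing(card_ids: Iterable[str], filter_ids: Optional[set[str]]) -> list[str]:
    """Keep only cards named in the filter; with no filter, keep every card."""
    if filter_ids is None:
        return list(card_ids)
    return [card_id for card_id in card_ids if card_id in filter_ids]
```

Returning `None` when no CSV is present lets the caller distinguish "no filter, download everything" from "filter present but empty, download nothing".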
caffeinate -i python3 "/path/to/scrape_official_japanese.py"
Press Enter to auto-scan all folders in Japanese/Need/. The script checks each set code against the official API and skips sets not available on the site (older vintage sets). Only sets that return results are downloaded.

caffeinate -i python3 "/path/to/scrape_pokellector_images.py"
Paste the Pokellector set page URL when prompted.

caffeinate -i python3 "/path/to/scrape_serebii_images.py"

caffeinate -i python3 "/path/to/scrape_pcgsearch_images.py"
Automatically scans your Japanese/Need/ folder and downloads any sets it recognises from pcg-search.com. No URL input required. Supported sets: PMCG1/3/4/5/6, E1–E5, PCG1–PCG9.
To add a new set: right-click a card image on pcg-search.com, copy the image address, then add one line to SET_CONFIG in the script:
"SETCODE": ("folder", "prefix", "日本語名"),

Once scraping is complete, run batch_zip.py to package all sets in Collected/ into upload-ready batch zips:
python3 batch_zip.py
Each batch zip contains set folders with images directly inside, with no zip-in-zip. Batches are capped at 1.9 GB to stay safely under common 2 GB upload limits. Safe to re-run: sets already included in a previous batch are skipped automatically.
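One way to picture the 1.9 GB cap is a simple greedy grouping of set folders by size, sketched below. This is not necessarily how batch_zip.py does it: the function name `plan_batches` is hypothetical, the cap is assumed to mean 1.9 decimal gigabytes, and on-disk folder sizes are used as an estimate of zipped size (image files usually compress very little).

```python
# Assumption: "1.9 GB" is read as decimal gigabytes.
MAX_BATCH_BYTES = 1_900_000_000


def plan_batches(set_sizes: dict[str, int], limit: int = MAX_BATCH_BYTES) -> list[list[str]]:
    """Group set folders into batches whose total size stays within the limit.

    set_sizes maps a set folder name to its size in bytes. A single set larger
    than the limit is still emitted, alone in its own batch.
    """
    batches: list[list[str]] = []
    current: list[str] = []
    current_size = 0
    for name, size in sorted(set_sizes.items()):
        # Start a new batch when adding this set would exceed the limit.
        if current and current_size + size > limit:
            batches.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += size
    if current:
        batches.append(current)
    return batches
```

Processing sets in sorted order keeps the batch contents stable between runs, which makes it easier to skip sets that were already packaged.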
If you have zips that may contain non-image files (__MACOSX, .DS_Store, CSVs, etc.):
python3 clean_zips.py
Scans all zips in Collected/ and Uploaded/ across every language, removes non-image files, and rewrites the zip cleanly.
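A minimal sketch of this kind of cleanup, using only the standard `zipfile` module, is shown below. The names `is_image_entry` and `clean_zip` and the list of image extensions are assumptions for illustration; the real script may decide differently which entries count as images.

```python
import zipfile
from pathlib import Path, PurePosixPath

# Assumed set of image extensions; the real script may use a different list.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp"}


def is_image_entry(name: str) -> bool:
    """Return True for archive entries that look like real image files."""
    path = PurePosixPath(name)
    if "__MACOSX" in path.parts:
        return False  # macOS resource-fork folder
    if path.name.startswith("."):
        return False  # hidden files such as .DS_Store or ._001.png
    return path.suffix.lower() in IMAGE_EXTENSIONS


def clean_zip(zip_path: Path) -> int:
    """Rewrite a zip keeping only image entries; return how many files were dropped."""
    tmp_path = zip_path.with_name(zip_path.name + ".tmp")
    removed = 0
    with zipfile.ZipFile(zip_path) as src, zipfile.ZipFile(tmp_path, "w") as dst:
        for info in src.infolist():
            if info.is_dir():
                continue  # directory entries are implied by the file paths
            if is_image_entry(info.filename):
                dst.writestr(info, src.read(info.filename))
            else:
                removed += 1
    # Swap the cleaned archive into place only after it was written completely.
    tmp_path.replace(zip_path)
    return removed
```

Writing to a temporary file first means an interrupted run leaves the original zip untouched.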
| URL Region | Language Folder |
|---|---|
| /hk/ | Chin (t) |
| /hk-en/ | English |
| /tw/ | Chinese (simplified) |
| /th/ | Thai |
| /id/ | Indonesian |
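The region-to-folder lookup in the table above could be implemented along these lines. This is only a sketch: the function names are hypothetical, the folder names are copied verbatim from the table, and the real scraper may parse URLs differently.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Region code -> language folder, copied verbatim from the table above.
REGION_FOLDERS = {
    "hk": "Chin (t)",
    "hk-en": "English",
    "tw": "Chinese (simplified)",
    "th": "Thai",
    "id": "Indonesian",
}


def detect_language_folder(url: str) -> Optional[str]:
    """Return the language folder for a URL, based on its first path segment."""
    segments = [segment for segment in urlparse(url).path.split("/") if segment]
    if not segments:
        return None
    return REGION_FOLDERS.get(segments[0].lower())


def expansion_codes(url: str) -> list[str]:
    """Return the values of the expansionCodes query parameter, if any."""
    return parse_qs(urlparse(url).query).get("expansionCodes", [])
```

Matching on the whole first path segment, rather than on a substring, keeps `/hk-en/` from being mistaken for `/hk/`.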
- Scrapers use polite delays (2–4 seconds) between downloads to avoid overloading servers
- If a run is interrupted, simply rerun — already downloaded images are skipped automatically
- The TCGdex API is maintained by a single developer. Bulk report scripts are rate-limited (--concurrency 3 --rps 5)
- Older Japanese sets (pre-SV era) are not available on the official Japanese site; the scraper detects and skips these automatically
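The polite-delay and resume behaviour described in the notes above might be sketched like this. The helper names are hypothetical and the code does no network access; it only shows a randomised 2 to 4 second pause and a check that skips files already present on disk. Treating an empty file as "not downloaded" is an assumption added here.

```python
import random
import time
from pathlib import Path
from typing import Callable


def polite_delay(
    min_seconds: float = 2.0,
    max_seconds: float = 4.0,
    sleep: Callable[[float], None] = time.sleep,
) -> float:
    """Pause for a random interval between downloads and return its length."""
    delay = random.uniform(min_seconds, max_seconds)
    sleep(delay)
    return delay


def pending_downloads(targets: dict[str, str], dest: Path) -> dict[str, str]:
    """Return only the filename -> URL pairs that still need downloading.

    A file counts as done when it exists in dest and is not empty
    (assumption: an empty file is a leftover from an interrupted run).
    """
    pending: dict[str, str] = {}
    for filename, url in targets.items():
        existing = dest / filename
        if existing.is_file() and existing.stat().st_size > 0:
            continue
        pending[filename] = url
    return pending
```

A download loop would call `pending_downloads` once per set and `polite_delay()` after each file it fetches.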
Card data provided by TCGdex. Card images sourced from the official Pokémon TCG Asia website, the official Japanese Pokémon Card site, and Pokellector.