Scraper Guide
This project includes a Python-based scraper that extracts officer data from stfc.space. Follow this guide to run it locally and generate your own dataset.
Prerequisites
- Python 3.10 or higher
- Node.js (for the website)
- A terminal or command prompt
Installation
# Clone the repository
git clone https://gitlab.com/your-repo/stfc-space-scraper.git
cd stfc-space-scraper
# Set up virtual environment
python3 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -e ./scripts/stfc_scraper
playwright install chromium
Running the Scraper
Run the scraper with the following command:
stfc-scraper --output officers.csv
Options
| Flag | Description | Default |
|---|---|---|
| --output, -o | Output CSV path | officers.csv |
| --no-headless | Run with a visible browser window | Headless |
| --delay, -d | Delay between requests (seconds) | 1.0 |
| --lang | Language for logs (en, es) | en |
| --page-start | Page to start scraping from | 1 |
| --page-end | Page to stop scraping at | 14 |
| --limit, -l | Limit the number of officers to scrape | None |
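To make the flags and defaults in the table concrete, here is a minimal sketch of a parser that would accept them. It is an assumption for illustration, not the project's actual code: the real CLI may be built with a different library, and `build_parser` and the `headless` destination name are hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented flags and defaults."""
    parser = argparse.ArgumentParser(prog="stfc-scraper")
    parser.add_argument("--output", "-o", default="officers.csv",
                        help="Output CSV path")
    # store_false: args.headless stays True unless --no-headless is passed.
    parser.add_argument("--no-headless", dest="headless", action="store_false",
                        help="Run with a visible browser window")
    parser.add_argument("--delay", "-d", type=float, default=1.0,
                        help="Delay between requests (seconds)")
    parser.add_argument("--lang", choices=["en", "es"], default="en",
                        help="Language for logs")
    parser.add_argument("--page-start", type=int, default=1,
                        help="Page to start scraping from")
    parser.add_argument("--page-end", type=int, default=14,
                        help="Page to stop scraping at")
    parser.add_argument("--limit", "-l", type=int, default=None,
                        help="Limit the number of officers to scrape")
    return parser
```

Under these assumptions, `build_parser().parse_args(["--page-start", "3", "--page-end", "5", "-l", "10"])` would describe a run over pages 3 to 5 that stops after ten officers, with every other option left at its default.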
How it Works
The scraper visits the officer list on stfc.space, collects the link to each officer's detail page, and then navigates to each profile in turn. It uses Playwright to load the pages and injected JavaScript to extract:
- Basic info: Name, Rarity, Group, Faction, Avatar Image.
- Abilities: Captain, Officer, and Below Decks abilities.
- Stats: Attack, Defense, Health (Max Level).
- Traits and Synergy Officers.
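The per-officer step described above can be sketched as follows. This is a rough illustration under stated assumptions, not the project's implementation: the column names in `FIELDS`, the helper names, and especially the selectors inside `EXTRACT_JS` are hypothetical and will not match the real stfc.space markup. The sketch also assumes the detail-page URLs have already been collected from the officer list.

```python
import csv
import time
from typing import Any, Iterable, Optional, TextIO

# Hypothetical column names derived from the field list above.
FIELDS = [
    "name", "rarity", "group", "faction", "avatar",
    "captain_ability", "officer_ability", "below_decks_ability",
    "attack", "defense", "health", "traits", "synergy_officers",
]

# Placeholder script: these selectors are invented for illustration only.
EXTRACT_JS = """
() => {
  const text = (sel) => document.querySelector(sel)?.textContent.trim() ?? "";
  return { name: text("h1"), rarity: text("[data-field='rarity']") };
}
""".strip()


def scrape_officer(page: Any, url: str) -> dict:
    """Load one profile and return a row containing every column in FIELDS."""
    page.goto(url)
    data = page.evaluate(EXTRACT_JS) or {}
    row = {}
    for field in FIELDS:
        value = data.get(field, "")  # missing fields become empty cells
        if isinstance(value, list):  # e.g. traits, synergy officers
            value = "; ".join(str(item) for item in value)
        row[field] = value
    return row


def scrape_all(page: Any, urls: Iterable[str], out: TextIO,
               delay: float = 1.0, limit: Optional[int] = None) -> int:
    """Write one CSV row per officer URL and return the number of rows."""
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    count = 0
    for url in urls:
        if limit is not None and count >= limit:
            break
        writer.writerow(scrape_officer(page, url))
        count += 1
        time.sleep(delay)  # pause between requests
    return count


def run(urls: Iterable[str], output: str = "officers.csv",
        headless: bool = True, delay: float = 1.0,
        limit: Optional[int] = None) -> int:
    """Drive the helpers above with a real Chromium page."""
    # Imported lazily so the helpers can be exercised without Playwright.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=headless)
        try:
            page = browser.new_page()
            with open(output, "w", newline="", encoding="utf-8") as fh:
                return scrape_all(page, urls, fh, delay=delay, limit=limit)
        finally:
            browser.close()
```

Keeping `scrape_officer` and `scrape_all` independent of the Playwright import means they accept any object with `goto` and `evaluate` methods, so the CSV logic can be checked with a stub page before running against the live site.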