Web scraper + public API for all BWF singles badminton player/match information from 2008-present, whose conception prompted a swift IP ban (tournamentsoftware pls unban me 😢)

oscarlaaaa/badminton-api

BWF Badminton API

Landing page

A badminton singles match API serving match data scraped from tournamentsoftware.com and stored in a SQL database, covering 2007 to the present day. The database is updated automatically every month. The application exposes endpoints for players, matches, and tournaments.

Right now the database only supports Singles events (Men's and Women's), but the API may expand to accommodate Doubles events in the future.

Current Features

  • Monthly database updates scraped directly from TournamentSoftware
  • Multiple endpoints for retrieving player, match, and tournament data
  • Flexible endpoint queries to provide limits, parameters, and more
  • Landing page with detailed API usage instructions + FastAPI generated /docs page
  • Async-focused design for robust responsiveness
  • A very cool creator 😎

Motivation

This project provides the data for a data analysis/visualization project that is in the works. Stay tuned!

How to Use

Visit here for detailed information on how to access the various endpoints of the API.

Visit here instead for the FastAPI-generated documentation, or to test out the various endpoints.

Progress Roadmap

  • Scrape matches from relevant event and return list of Matches
  • Compile list of BWF Tournaments either manually or through web-scraping
  • Store richer match data to allow for more data points (e.g. time of day, BWF tournament level, etc.)
  • Concurrent scraping for tournament gatherer
  • Concurrent scraping for match gatherer
  • Concurrent scraping for player gatherer
  • Establish benchmarking to determine bottlenecks within the scraping/data insertion process
  • Build foundation for MySQL-scraper interface to insert scraped data
  • Refactor and clean-up scraper code
  • Establish back-end API foundation for periodic DB updates using FastAPI
  • Set-up SQLAlchemy models and DB connection
  • Establish API endpoints to facilitate simple JSON get requests
  • Refactor and clean-up API code
  • Build simple static landing page to show people how to use the API
  • Build Docker Image
  • Load all scraped data onto hosted AWS MySQL server
  • Deploy onto a cloud service like AWS or Heroku

Technologies Used

  • Python3 (BeautifulSoup, Aiohttp, SQLAlchemy)
  • FastAPI
  • MySQL (AWS RDS)
  • AWS Lambda/Amplify/Gateway

What I've Learned

  • Web-scraping and data processing
  • Asynchronous design and concurrency
  • API development and design
  • ORMs and how useful they are!
  • To space out your webscraping so you don't get IP banned on a site you actually use regularly 🤡
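The last lesson above can be illustrated with a minimal sketch of throttled concurrent fetching. This is not the project's actual scraper code: `fetch_page`, `polite_fetch`, `scrape_all`, the example URLs, and the default limits are all hypothetical names and values chosen for illustration, and `fetch_page` only simulates a request rather than calling a real HTTP library. The idea is to cap the number of requests in flight with a semaphore and pause after each one.

```python
import asyncio
import time


async def fetch_page(url):
    """Hypothetical stand-in for a real HTTP request (e.g. one made with aiohttp)."""
    await asyncio.sleep(0.01)  # simulate network latency
    return f"<html>{url}</html>"


async def polite_fetch(url, semaphore, delay):
    """Fetch one URL while holding a concurrency slot, then pause."""
    async with semaphore:
        page = await fetch_page(url)
        # Hold the slot during the pause so consecutive requests stay spaced out.
        await asyncio.sleep(delay)
        return page


async def scrape_all(urls, max_concurrent=2, delay=0.05):
    """Fetch every URL with at most max_concurrent requests in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)
    tasks = [polite_fetch(url, semaphore, delay) for url in urls]
    # gather returns results in the same order as the input URLs.
    return await asyncio.gather(*tasks)


if __name__ == "__main__":
    # Placeholder URLs, not real endpoints.
    urls = [f"https://example.com/match/{i}" for i in range(6)]
    start = time.perf_counter()
    pages = asyncio.run(scrape_all(urls))
    print(f"fetched {len(pages)} pages in {time.perf_counter() - start:.2f}s")
```

Lowering `max_concurrent` or raising `delay` trades scraping speed for a gentler load on the target site; suitable values depend on the site and are not something this sketch can prescribe.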

How Can I Contribute?

If you'd like more or different endpoints for the project, feel free to clone the project, establish local database credentials in a .env file in the root folder, and submit a pull request. You can also test out the various endpoints and let me know if there are any bugs or convenience issues!
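As a rough illustration of the .env step, the sketch below parses `KEY=VALUE` lines and assembles a database URL from them. The variable names (`DB_USER`, `DB_PASSWORD`, `DB_HOST`, `DB_PORT`, `DB_NAME`) and the `mysql+pymysql` driver prefix are assumptions made for this example; check the project's source for the names and driver it actually expects.

```python
from pathlib import Path
from urllib.parse import quote_plus


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def load_env(path=".env"):
    """Read a .env file if it exists; return an empty dict otherwise."""
    env_file = Path(path)
    return parse_env(env_file.read_text()) if env_file.exists() else {}


def build_db_url(env):
    """Build a SQLAlchemy-style MySQL URL from the hypothetical DB_* keys."""
    return "mysql+pymysql://{user}:{password}@{host}:{port}/{name}".format(
        user=quote_plus(env["DB_USER"]),
        password=quote_plus(env["DB_PASSWORD"]),  # escape characters such as '@'
        host=env["DB_HOST"],
        port=env["DB_PORT"],
        name=env["DB_NAME"],
    )


if __name__ == "__main__":
    env = load_env()
    if env:
        print(build_db_url(env))
    else:
        print("No .env file found in the current directory.")
```

Keeping credentials in a .env file that stays out of version control means a pull request never has to contain them.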

License

This product is licensed under the MIT license.

Credits

Special thanks to Vivian for helping me debug stuff
