A wiki crawler that populates a Neo4j graph database.
- Download Neo4j
- After the download, you will be taken to a setup guide for your OS. Follow it (setting your password to "password") up to the "Explore Sample Datasets" step. (Sample walkthrough link for OS X)
- At this point, you should have a graph database running.
- It's recommended that you use a virtual environment. Install the dependencies:
pip install -r requirements.txt
- Run the pokemon graph extractor:
python wiki2graph/pokemon/go.py
For now, install as described above (development installation).
The pokemon extractor is a plug-in for the crawler that lets it recognize pokemon-related entities and relations.
Creating an extractor for another domain takes two steps:
- Create an extractor that inherits from Extractor (in wiki2graph.graph)
class PokemonExtractor(Extractor):
    ...  # entity and relation recognition goes here
- When you create your crawler, pass it a list of the extractors you created (NOTE: this does not work yet, but it is the intended interface)
c = Crawler('http://pokemon.wikia.com', '/wiki/Bulbasaur', limit=800, extractors=[YOUR EXTRACTORS HERE])
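The two steps above can be sketched roughly as follows. This is a self-contained toy, not the real wiki2graph code: the `Extractor` base class here is a hypothetical stand-in for the one in wiki2graph.graph, and the `extract` method, the `Relation` type, `EvolutionExtractor`, and `ToyCrawler` are all assumed names made up for illustration. The toy crawler walks an in-memory dict of pages instead of fetching a wiki or writing to Neo4j.

```python
import re
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    """A directed edge: (source) -[kind]-> (target). Hypothetical type."""
    source: str
    kind: str
    target: str


class Extractor:
    """Hypothetical stand-in for wiki2graph.graph.Extractor."""

    def extract(self, title, text):
        """Return the relations found on one page (assumed interface)."""
        raise NotImplementedError


class EvolutionExtractor(Extractor):
    """Toy domain extractor: picks up 'evolves into X' statements."""

    PATTERN = re.compile(r"evolves into (\w+)")

    def extract(self, title, text):
        return [Relation(title, "EVOLVES_INTO", name)
                for name in self.PATTERN.findall(text)]


class ToyCrawler:
    """Hypothetical crawler over an in-memory {title: text} dict."""

    def __init__(self, pages, start, limit=800, extractors=()):
        self.pages = pages
        self.start = start
        self.limit = limit
        self.extractors = list(extractors)

    def crawl(self):
        queue = deque([self.start])
        seen = set()
        relations = []
        # Breadth-first walk, visiting at most `limit` pages.
        while queue and len(seen) < self.limit:
            title = queue.popleft()
            if title in seen or title not in self.pages:
                continue
            seen.add(title)
            # Every registered extractor gets a chance at every page.
            for extractor in self.extractors:
                for rel in extractor.extract(title, self.pages[title]):
                    relations.append(rel)
                    queue.append(rel.target)  # follow the relation
        return relations


# Made-up page contents, only for this example.
pages = {
    "Bulbasaur": "Bulbasaur evolves into Ivysaur.",
    "Ivysaur": "Ivysaur evolves into Venusaur.",
    "Venusaur": "Venusaur is the final form.",
}
crawler = ToyCrawler(pages, "Bulbasaur", limit=800,
                     extractors=[EvolutionExtractor()])
relations = crawler.crawl()
for rel in relations:
    print(rel.source, rel.kind, rel.target)
```

Keeping the domain knowledge inside the extractor means the crawler itself never needs to know what a pokemon is. Supporting another wiki would then only require a new `Extractor` subclass in the list.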