- [Docker for Mac](https://download.docker.com/mac/stable/Docker.dmg)
- docker pull redis
- docker pull sequenceiq/hadoop-docker:2.4.1
- docker pull kbastani/docker-neo4j:2.2.1
- docker pull kbastani/neo4j-graph-analytics:1.1.0
- docker run --name redis-frontierindex -d -p 6379:6379 redis
- docker run --name redis-docindex -d -p 6380:6379 redis
- docker run --name redis-urlresolver -d -p 6381:6379 redis
- docker run --name redis-titleindex -d -p 6382:6379 redis
- docker run -i -t --name hdfs sequenceiq/hadoop-docker:2.4.1 /etc/bootstrap.sh -bash
- docker run -i -t --name mazerunner --link hdfs:hdfs kbastani/neo4j-graph-analytics:1.1.0
In the next command, replace the host path (the part before `:/opt/data`) with the location of your existing Neo4j database store directory:
- docker run -d -P -v /Users///data:/opt/data --name graphdb --link mazerunner:mazerunner --link hdfs:hdfs kbastani/docker-neo4j:2.2.1
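
The four Redis steps above can also be sketched as a single loop. This is a minimal sketch, not part of the project: it assumes Docker's `-p host:container` order, with host ports 6379 to 6382 each mapped to the container's default Redis port 6379. The function names `redis_run_cmd` and `start_redis_containers` and the `DRY_RUN` variable are hypothetical helpers introduced here for illustration.

```shell
#!/bin/sh
# Sketch: start the four Redis containers in one loop.
# Assumption: host ports 6379-6382 map to container port 6379 (Redis default).
# By default the commands are only printed; set DRY_RUN=0 to actually run docker.

# Build the docker command for one index. $1 = index name, $2 = host port.
redis_run_cmd() {
  echo "docker run --name redis-$1 -d -p $2:6379 redis"
}

# Print (or run) the command for each index, incrementing the host port.
start_redis_containers() {
  port=6379
  for name in frontierindex docindex urlresolver titleindex; do
    cmd=$(redis_run_cmd "$name" "$port")
    if [ "${DRY_RUN:-1}" = "0" ]; then
      $cmd
    else
      echo "$cmd"
    fi
    port=$((port + 1))
  done
}

start_redis_containers
```

Running the script as-is only prints the four commands so they can be reviewed first; `DRY_RUN=0 sh start-redis.sh` (the file name is an assumption) would execute them.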
NOTE: To run the project at this point, run the WebCrawlerController.java file located in the webcrawler package. I'm still working out the kinks, so for now, stop the project manually when you are done. Then check the various databases to view the results of your crawl.