
maxlamberti/orderbook-crypto-scraper


Cloud scraper for cryptocurrency order book data.


Creates an event schedule that scrapes order book data from a public exchange API (implemented here with the Kraken API), compresses it into relevant features, and inserts it into a private database. The scraper runs serverless on AWS Lambda, making it virtually free to operate.
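For a concrete picture of what each scheduled run does, here is a minimal sketch of such a snapshot-and-compress step, using krakenex against Kraken's public Depth endpoint. The trading pair, depth, and feature names are illustrative choices for this example, not taken from the repository's code.

import time

import krakenex

api = krakenex.API()  # public order book queries need no API key

# Pull one order book snapshot (pair and depth are example choices)
resp = api.query_public('Depth', {'pair': 'XXBTZUSD', 'count': 10})
book = next(iter(resp['result'].values()))  # asks/bids as [price, volume, timestamp]

best_bid = float(book['bids'][0][0])
best_ask = float(book['asks'][0][0])
bid_vol = sum(float(v) for _, v, _ in book['bids'])
ask_vol = sum(float(v) for _, v, _ in book['asks'])

# Compress the raw book into a handful of summary features
features = {
    'timestamp': int(time.time()),
    'mid_price': (best_bid + best_ask) / 2,
    'spread': best_ask - best_bid,
    'volume_imbalance': (bid_vol - ask_vol) / (bid_vol + ask_vol),
}
print(features)  # in the Lambda handler, a row like this would be inserted into the database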

Resources

Tech

  • Zappa - deploy Python Lambdas and schedule events
  • krakenex - Python client for the Kraken exchange API
  • AWS Lambda - serverless compute service
  • AWS RDS - relational database service

Installation and Setup

Clone

Clone this repo to your local machine.

$ git clone https://github.com/maxlamberti/orderbook-crypto-scraper.git

Install Requirements

Zappa requires an active virtual environment to deploy. Either install and activate your own virtual environment or execute the following steps.

$ cd orderbook-crypto-scraper
$ virtualenv -p python3 venv
$ source venv/bin/activate
$ pip install -r requirements.txt

Configure Database

Write your database credentials into the corresponding fields of the database configuration file located at orderbook-crypto-scraper/db_config.py.

WARNING: The scraper was written, used, and tested with a PostgreSQL database only. For compatibility, make sure you are also running a Postgres instance. To set up a low-cost RDS Postgres instance on AWS, check out this tutorial.
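The exact variable names in db_config.py are defined by the file itself; the sketch below is only a hypothetical illustration of the kind of values a standard Postgres connection needs.

# Hypothetical example of db_config.py -- use the field names the file actually defines
DB_HOST = 'your-instance.abc123.us-east-1.rds.amazonaws.com'  # RDS endpoint
DB_PORT = 5432              # default PostgreSQL port
DB_NAME = 'orderbook'       # target database
DB_USER = 'scraper'         # database user
DB_PASSWORD = 'change-me'   # database password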

Zappa Settings (Optional)

The zappa_settings.json file was initialized with sensible defaults and will run fine without further changes. However, here are some easy customizations you can make before deploying (see the example excerpt after this list):

  • Specify the rate at which the data scraping event is executed by changing the rate expression located in zappa_settings.json > events > expression.
  • Specify the AWS region of your deploy in the aws_region field.
  • Specify a custom name for your S3 bucket using the s3_bucket field.
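For reference, a scheduled Zappa event typically has the shape below. It is written here as a Python dict rather than raw JSON for consistency with the other examples; the handler path, region, bucket name, and rate are placeholders, not the repository's defaults.

# Shape of the relevant fields in zappa_settings.json, shown as a Python dict
zappa_settings = {
    "scrape_event": {                                # stage name used by the zappa commands below
        "aws_region": "us-east-1",                   # placeholder region
        "s3_bucket": "my-orderbook-scraper-bucket",  # placeholder bucket name
        "events": [
            {
                "function": "scraper.main",          # hypothetical module.function handler
                "expression": "rate(1 minute)"       # CloudWatch rate or cron expression
            }
        ]
    }
}

Valid expressions follow the CloudWatch Events syntax, for example rate(5 minutes) or cron(0/5 * * * ? *).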

Deploy and Schedule

Initial Deploy

Use the following command for the initial deploy only.

$ zappa deploy scrape_event

Zappa will print the deployment information to your terminal and let you know whether the deploy was successful. If it was, your data scraper should now be up and running.

Subsequent deploys can be made by calling zappa update.

$ zappa update scrape_event

Schedule

If you decide to change the rate expression in zappa_settings.json, you can easily reschedule your scraper.

$ zappa schedule scrape_event

Undeploy

This will remove the Lambda function.

$ zappa undeploy scrape_event

Logs

You can stream your scraper's AWS CloudWatch logs directly in your terminal.

$ zappa tail scrape_event
