Source code accompanying book: Data Science on the Google Cloud Platform, Valliappa Lakshmanan, O'Reilly 2017
Updated May 1, 2024 - Jupyter Notebook
A collection of handy Bash One-Liners and terminal tricks for data processing and Linux system maintenance.
A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.
A tool that uses advanced Monte Carlo simulations and Turbit parallel processing to create possible Bitcoin prediction scenarios.
Toolkit for Machine Learning, Natural Language Processing, and Text Generation, in TensorFlow. This is part of the CASL project: http://casl-project.ai/
Large-scale pretraining for dialogue
Google Cloud Dataflow provides a simple, powerful model for building both batch and streaming parallel data processing pipelines.
A light-weight, flexible, and expressive statistical data testing library
Machine Learning notebooks for refreshing concepts.
Miller is like awk, sed, cut, join, and sort for name-indexed data such as CSV, TSV, and tabular JSON
Lineage metadata API, artifacts streams, sandbox, API, and spaces for Polyaxon
Template repository for writing data ingestion pipelines for projects and tenders data
A list about Apache Kafka
Concurrent and multi-stage data ingestion and data processing with Elixir
Extract Transform Load for Python 3.5+
Select, put and delete data from JSON, TOML, YAML, XML and CSV files with a single tool. Supports conversion between formats and can be used as a Go package.
Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.
Elastic data processing with Apache Pulsar and Apache Flink
Integrating the Best of TF into PyTorch, for Machine Learning, Natural Language Processing, and Text Generation. This is part of the CASL project: http://casl-project.ai/
Large-scale pretrained models for goal-directed dialog