apache / spark
Apache Spark - A unified analytics engine for large-scale data processing
See what the GitHub community is most excited about today.
Source code for Twitter's Recommendation Algorithm
An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs
Apache Livy is an open source REST interface for interacting with Apache Spark from anywhere.
Scala language server with rich IDE features 🚀
The Daml smart contract language
TheHive: a Scalable, Open Source and Free Security Incident Response Platform
Removes large or troublesome blobs like git-filter-branch does, but faster, and written in Scala
A Git platform powered by Scala with easy installation, high extensibility & GitHub API compatibility
sbt, the interactive build tool
An open protocol for secure data sharing
Hybrid visual and textual functional programming.
Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
♞ lichess.org: the forever free, adless and open source chess server ♞
Functional GraphQL library for Scala
FEEL parser and interpreter written in Scala
Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.
Spark: The Definitive Guide's Code Repository
A Spark plugin for reading and writing Excel files
Modern Load Testing as Code