

A complete computer science study plan to become a software engineer.

304,814 76,486 Updated Sep 13, 2024

DuckDB is an analytical in-process SQL database management system

C++ 23,027 1,832 Updated Sep 27, 2024

A high-performance database dumping application

Java 1 1 Updated Mar 14, 2024

Apache Cassandra®

Java 8,739 3,606 Updated Sep 27, 2024

Uber Rides Python SDK (beta)

Python 174 66 Updated Mar 19, 2023

Petastorm library enables single machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as Tensorflow, Pytorch…

Python 1,784 285 Updated Dec 2, 2023

A GPU-powered real-time analytics storage and query engine.

Go 3,021 234 Updated Jul 13, 2024

Generic Data Ingestion & Dispersal Library for Hadoop

Java 479 111 Updated Mar 19, 2023

Uber's cross-platform mobile architecture framework.

Kotlin 7,742 903 Updated Jul 21, 2024

This is a repo with links to everything you'd ever want to learn about data engineering

10,476 1,452 Updated Sep 11, 2024

Upserts, Deletes And Incremental Processing on Big Data.

Java 5,342 2,419 Updated Sep 27, 2024

TeamCity docker images

Dockerfile 103 59 Updated Sep 26, 2024

An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs

Scala 7,471 1,680 Updated Sep 27, 2024

The source code for the book Modern Data Engineering with Apache Spark

Scala 31 36 Updated Jul 26, 2022

Exploit scripts

Python 12 3 Updated Apr 10, 2022

This repository provides a command line interface (CLI) utility that replicates an Amazon Managed Workflows for Apache Airflow (MWAA) environment locally.

Shell 672 683 Updated Sep 27, 2024

Official source for Docker configurations, images, and examples of Dockerfiles for Oracle products and projects

Shell 1 Updated Mar 7, 2022

Official source of container configurations, images, and examples for Oracle products and projects

Shell 6,530 5,423 Updated Sep 27, 2024

A Spark plugin for reading and writing Excel files

Scala 463 147 Updated Sep 21, 2024

Curated list of resources about Apache Airflow

Shell 3,656 493 Updated Aug 20, 2024

BigQuery data source for Apache Spark: Read data from BigQuery into DataFrames, write DataFrames into BigQuery tables.

Java 373 194 Updated Sep 6, 2024

A curated list of docker-compose files prepared for testing data engineering tools, databases, and open source libraries.

Jupyter Notebook 568 58 Updated Sep 17, 2023

Easy to maintain open source documentation websites.

TypeScript 55,792 8,356 Updated Sep 27, 2024

This repository contains the basic definition for the AWS Glue DataCatalog Database

HCL 2 3 Updated Dec 28, 2021

This repository contains the basic definition for the AWS Glue job deployment

HCL 1 1 Updated Dec 26, 2021

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows

Python 36,483 14,131 Updated Sep 27, 2024

Git repo to accompany the AWS DevOps Blog: Using AWS DevOps Tools to model and provision AWS Glue workflows

Python 17 18 Updated Nov 16, 2021

Example 📓 Jupyter notebooks that demonstrate how to build, train, and deploy machine learning models using 🧠 Amazon SageMaker.

Jupyter Notebook 10,018 6,745 Updated Sep 27, 2024

easy-rsa - Simple shell based CA utility

Shell 4,015 1,190 Updated Sep 14, 2024

Demo for exploiting SSRF in AWS ec2 instance.

CSS 1 2 Updated May 10, 2020