AlexIoannides
Stars

Data Engineering

14 repositories
Python 116 4 Updated May 30, 2023

Represent, send, store and search multimodal data

Python 3,023 234 Updated Feb 25, 2025

Malloy is an experimental language for describing data relationships and transformations.

TypeScript 2,087 79 Updated Mar 9, 2025

Transforms PDF, Documents and Images into Enriched Structured Data

JavaScript 5,935 312 Updated Dec 3, 2023

Query SQLite files in S3 using s3fs

Python 500 8 Updated Sep 14, 2022

BlackJAX is a Bayesian Inference library designed for ease of use, speed and modularity.

Python 885 109 Updated Feb 19, 2025

Simple, modern and fast file watching and code reload in Python.

Python 1,934 108 Updated Jan 10, 2025

This is a guide to PySpark code style presenting common situations and the associated best practices based on the most frequent recurring topics across the PySpark repos we've encountered.

Python 1,108 138 Updated Sep 19, 2024

Tool for probabilistically linking the records of individual entities (e.g. people) within and across datasets

Python 109 3 Updated Dec 4, 2024

A modular SQL linter and auto-formatter with support for multiple dialects and templated code.

Python 8,644 785 Updated Mar 8, 2025

Modern columnar data format for ML and LLMs implemented in Rust. Convert from parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, Du…

Rust 4,241 262 Updated Mar 8, 2025

A simple and efficient tool to parallelize Pandas operations on all available CPUs

Python 3,727 216 Updated Jul 9, 2024

Recipes for using Python's polars library

Jupyter Notebook 257 12 Updated Sep 8, 2024

SQLAlchemy driver for DuckDB

Python 393 47 Updated Mar 7, 2025