# preprocessing-data

Here are 54 public repositories matching this topic...

vanderschaarlab / hyperimpute

A framework for prototyping and benchmarking imputation methods

python data-science machine-learning scikit-learn imputation machine-learning-prerequisites imputation-algorithm preprocessing-data

Updated Apr 4, 2023
Python

ELHoussineT / AutoDataCleaner

Simple and automatic data cleaning in one line of code! It performs one-hot encoding, casts date and time columns to datetime dtype, detects binary columns, safely converts non-numeric columns to numeric dtypes, cleans dirty/empty values, normalizes values, and removes unwanted columns. Get your data ready for model training an…

data-science machine-learning data-analysis cleaning-data preprocessing-data

Updated May 22, 2021
Python

dlite-tools / NLPiper

NLPiper is a package that agglomerates different NLP tools and applies their transformations in the target document.

nlp text text-analysis text-processing preprocessing nlp-parsing nlp-library preprocessing-data

Updated Aug 25, 2023
Python

ArthurMangussi / pymdatagen

A Python Library for the Generation of Artificial Missing Data

machine-learning missing-data amputation preprocessing-data

Updated Feb 26, 2026
Python

Brokttv / food101-preprocessing

A clean and modular pipeline for preprocessing the Food-101 dataset using both folder-based and CSV-based workflows.

datasets food-101 preprocessing-data dataset-parser

Updated Jul 15, 2025
Python

courtois-neuromod / ds_prep

All the scripts to prepare the Courtois-Neuromod dataset

preprocessing-data

Updated Dec 12, 2025
Python

XuanyiJennyMa / pupil_cloud_data_preprocessing_Phase_1

Scripts for pre-processing eye-tracker data from pupil cloud

eye-tracking pupillometry preprocessing-data

Updated Apr 26, 2024
Python

Sabaudian / Music_Genre_Classification_project

Audio Pattern Recognition project - Music Genres Classification

python machine-learning neural-network random-forest svm audio-analysis artificial-intelligence music-information-retrieval preprocessing music-genre-classification audio-classification svm-classifier audio-processing k-nearest-neighbours k-nn genre-classification genres-classification random-forest-classification preprocessing-data

Updated Jul 27, 2025
Python

nlqthinh / WeaviateAnime

Explore your favorite anime with this interactive search app! 🚀 This project leverages Weaviate for vector search and Gradio for a seamless user interface. Using embeddings from a custom anime dataset, you can perform quick and accurate similarity searches for anime titles.

python docker anime gradio weaviate preprocessing-data vectordb

Updated Feb 26, 2025
Python

alvaro-concha / animal-behavior-preprocessing

animal-behavior-preprocessing is a Python repository to preprocess animal behavior data. It works on the output spreadsheets from video-tracking of animal body parts with LEAP or DeepLabCut. It applies a Median Filter, an Ensemble Kalman Filter, transforms data to joint angles and computes their Morlet Wavelet Spectra.

pipeline data-engineering feature-extraction filtering cleaning-data preprocessing-data

Updated Dec 12, 2024
Python

lawl2 / object-detection-and-spatial-relation

PhilaController / gun-violence-dashboard-data

Python toolkit for preprocessing data for the City Controller's Gun Violence Dashboard

philadelphia python3 web-scraping python-toolkit gun-violence preprocessing-data

Updated Jan 27, 2025
Python

BirchKwok / spinesUtils

A library that provides template code for Python development to shorten the project development cycle.

data-science machine-learning machine-learning-algorithms preprocessing-data

Updated Mar 8, 2025
Python

Multiomics-Analytics-Group / acore

Functionality to preprocess and analyse multi-omics data

analysis omics omics-data-integration preprocessing-data

Updated Mar 12, 2026
Python

Ryannn06 / Analysis-of-DepEd-Schools-Masterlist

This project uses the S.Y. 2020-2021 DepEd Schools Masterlist, which contains information on 64,000+ schools across the Philippines, including location, sector, and classification details.

python sql preprocessing-data sql-analysis

Updated Dec 28, 2025
Python

gaurav-singh7092 / ResuMatch

An AI-powered resume and job description matching application using natural language processing and machine learning techniques. This application provides intelligent analysis of resume-job compatibility with detailed scoring and recommendations.

python nlp nextjs keyword-extraction similarity-score tailwind fastapi preprocessing-data

Updated Jul 9, 2025
Python

EslamElbassel / MNIST-Dataset-Classification-with-KNN-using-centroid-preprocessing

MNIST is a dataset of images of handwritten digits. Classification with KNN, extracting features using centroids.

machine-learning mnist-classification mnist-dataset knn-classification centroid preprocessing-data

Updated May 11, 2021
Python

sorrychoe / pyBigKinds

BigKinds Data Analysis Toolkit for python

python text-mining journalism newsdata preprocessing-data

Updated Jan 3, 2026
Python

anishdeshmukh9 / AI-model-Training-Disease-prognosis

This was an academic project that showcases my pre- and post-modeling ML knowledge, such as data collection, data preprocessing, AI (ML) model training, and model fine-tuning.

data ml preprocessing preprocessing-data

Updated Jun 4, 2025
Python

Hareeswar2006 / PickMyModel

PickMyModel is an end-to-end AutoML and meta-learning system that automatically analyzes user-uploaded datasets and recommends suitable models based on patterns learned from previous datasets. The system extracts rich statistical meta-features, applies reusable preprocessing pipelines, and trains and evaluates multiple models.

machine-learning scikit-learn regression classification data-analysis feature-engineering fastapi preprocessing-data

Updated Feb 10, 2026
Python
