Easy scaffolding for machine learning pipelines in Scikit-Learn
neuropipe is an easy-to-use microframework for machine learning pipelines in Scikit-Learn. Simply answer a series of questions about your project on the command line, and a custom template will be created, fitted with the preprocessing functions and models that are most likely to be relevant to your task.
The questions are based on the Scikit-Learn cheatsheet for selecting the right learning algorithm.
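As a rough illustration of how cheatsheet-style questions can map to an estimator, here is a minimal sketch. It is not neuropipe's actual questionnaire: the function name `suggest_estimator` and its parameters are hypothetical, and the branches are a simplified reading of only the first-choice nodes on the Scikit-Learn cheatsheet (the clustering branch assumes the number of categories is known).

```python
def suggest_estimator(n_samples, predicting_category, has_labels=True,
                      predicting_quantity=False, few_features_important=False):
    """Return the name of a first-choice estimator, or None if more data is needed.

    Hypothetical helper: a simplified subset of the Scikit-Learn cheatsheet.
    """
    if n_samples <= 50:
        return None  # the cheatsheet's advice here is to get more data

    if predicting_category:
        if has_labels:
            # Classification branch
            return "LinearSVC" if n_samples < 100_000 else "SGDClassifier"
        # Clustering branch (assumes the number of categories is known)
        return "KMeans" if n_samples < 10_000 else "MiniBatchKMeans"

    if predicting_quantity:
        # Regression branch
        if n_samples >= 100_000:
            return "SGDRegressor"
        return "Lasso" if few_features_important else "Ridge"

    # "Just looking" branch: dimensionality reduction
    return "PCA"


if __name__ == "__main__":
    print(suggest_estimator(n_samples=1_000, predicting_category=True))
```

Returning estimator names as strings keeps the sketch free of imports; a real tool would more likely map each answer to a template containing the corresponding Scikit-Learn class.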
Install neuropipe with pip:
$ pip3 install neuropipe
Then go through the questionnaire and create your custom project folder by running:
$ neuropipe my_project
This will generate a pipeline…
A new project has the following file structure:
├── pipeline.py
├── engine/
│ ├── data_bus.py
│ ├── preprocessing.py
│ ├── model.py
│ └── analytics.py
├── data/
│ ├── raw.csv
│ └── processed.csv
├── models/
└── figures/
The project is designed to be modular.
(Not finished yet, do not read)
Where:
pipeline.py: defines the entire pipeline, from fetching raw data to model evaluation
engine/: holds all of the modules used in your pipeline
engine/data_bus.py: defines the functions for fetching and loading raw data into the pipeline
engine/preprocessing.py:
engine/model.py: defines objects for learning models selected by neuropipe
engine/analytics.py:
data/: cache of your pre- and post-processed data
models/: holds serialized…
figures/: holds visualizations (e.g. correlation heatmaps, confusion matrices, histograms)

All of the pipeline templates are based on my own personal projects, and are therefore very limited. If you’ve built an easily generalizable pipeline that isn’t featured here, or have an idea for one, I strongly encourage you to add it to the template library. Read the contributing guidelines to learn how.
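To make the module roles in the file structure more concrete, here is a minimal single-file sketch of how such a pipeline might be wired together with Scikit-Learn. It is not the code that neuropipe generates: every function name is hypothetical, each function merely stands in for one of the modules described above, and synthetic data replaces data/raw.csv so that the example is self-contained.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC


def load_data(seed=0):
    """Stand-in for engine/data_bus.py: return features X and labels y."""
    # Synthetic data (an assumption); a real project would read data/raw.csv.
    return make_classification(n_samples=200, n_features=8, random_state=seed)


def build_preprocessor():
    """Stand-in for engine/preprocessing.py."""
    return StandardScaler()


def build_model(seed=0):
    """Stand-in for engine/model.py."""
    return LinearSVC(random_state=seed)


def evaluate(model, X_test, y_test):
    """Stand-in for engine/analytics.py: return accuracy on held-out data."""
    return accuracy_score(y_test, model.predict(X_test))


def run_pipeline(seed=0):
    """Stand-in for pipeline.py: fetch, preprocess, fit, evaluate."""
    X, y = load_data(seed)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed
    )
    pipe = Pipeline([
        ("preprocess", build_preprocessor()),
        ("model", build_model(seed)),
    ])
    pipe.fit(X_train, y_train)
    return evaluate(pipe, X_test, y_test)


if __name__ == "__main__":
    print(f"test accuracy: {run_pipeline():.2f}")
```

Keeping each stage behind its own function is what makes the layout modular: swapping the scaler or the estimator touches one function and leaves the rest of the pipeline unchanged.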
Pending features: