Mltooler


We’ve all been there (haven’t we?)… spending hours trying to improve the accuracy of a model, switching out different scalers, transforming the data every which way, all in a bid to improve our score (and probably move up the Kaggle ladder).

Mltooler helps take some of the headache out of the process by automatically and iteratively improving the cross-validation score of your chosen machine learning model.

Why not put your feet up and squeeze out those last few percent you thought might not even be possible?
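To make that concrete, here is a minimal, plain scikit-learn sketch of the kind of loop Mltooler automates for you: try a few scalers, cross-validate each candidate pipeline, and keep the best. The dataset, model and scoring choices below are illustrative only and are not part of Mltooler itself.

    # Plain scikit-learn illustration of the manual loop Mltooler automates:
    # swap scalers, cross-validate each pipeline, keep the best score.
    from sklearn.datasets import load_breast_cancer
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import MinMaxScaler, RobustScaler, StandardScaler

    X, y = load_breast_cancer(return_X_y=True)

    best_name, best_score = None, float("-inf")
    for name, scaler in [
        ("standard", StandardScaler()),
        ("minmax", MinMaxScaler()),
        ("robust", RobustScaler()),
    ]:
        pipeline = make_pipeline(scaler, LogisticRegression(max_iter=5000))
        score = cross_val_score(pipeline, X, y, cv=5, scoring="accuracy").mean()
        print(f"{name}: {score:.4f}")
        if score > best_score:
            best_name, best_score = name, score

    print(f"best scaler: {best_name} ({best_score:.4f})")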

What has it got, and what does it do?

Contained within Mltooler is a selection of tools to help you improve your model with little to no effort. If you want to squeeze out every last bit you can, then you have come to the right place.

SelfImprovingEstimator
The star of the show - a pipeline wrapper for a selection of estimators and data preprocessing steps. Do you want to improve your cross-validation score? Then this is the place to start (see the usage sketch after this list).
VotingClassifier
A voting classifier with added bang - calculate the best breakdown of estimator weights with ease (a weight-search sketch follows this list).
StackingEnsemble
A stacking estimator with a little magic.
HyperSearch
A wrapper for hyperopt - making parameter searching easier than ever, with timed early stopping and preset parameter grids.
RandomSearch
A random parameter grid search.
EnsembleEstimator

The home for a stack of SelfImprovingEstimators - includes a DIRTY way to speed up the tuning of multiple estimators at once.

Please note this will eventually use threads rather than the current method of spawning the code in separate Python consoles.
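Because Mltooler aims to mirror the scikit-learn estimator interface (see the usage notes below), a session with the SelfImprovingEstimator might look roughly like the sketch that follows. Treat it purely as an illustration: the import path and constructor arguments are assumptions, not the documented API, so check the reference documentation before copying it.

    # HYPOTHETICAL sketch -- the import path and constructor arguments are
    # assumed from the scikit-learn conventions Mltooler says it follows.
    from sklearn.datasets import load_breast_cancer
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    from mltooler import SelfImprovingEstimator  # assumed import path

    X, y = load_breast_cancer(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # Wrap a base estimator; the wrapper then iterates over preprocessing
    # steps, keeping any change that improves the cross-validation score.
    model = SelfImprovingEstimator(RandomForestClassifier())  # arguments assumed
    model.fit(X_train, y_train)          # sklearn-style fit
    print(model.score(X_test, y_test))   # sklearn-style score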
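For the voting-weights idea specifically, the brute-force search below shows the kind of breakdown Mltooler's VotingClassifier is described as working out for you. It uses standard scikit-learn objects only (no Mltooler classes), so it runs as-is.

    # Plain scikit-learn illustration of searching for the best voting weights.
    from itertools import product

    from sklearn.datasets import load_breast_cancer
    from sklearn.ensemble import RandomForestClassifier, VotingClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score
    from sklearn.naive_bayes import GaussianNB

    X, y = load_breast_cancer(return_X_y=True)
    estimators = [
        ("lr", LogisticRegression(max_iter=5000)),
        ("rf", RandomForestClassifier(random_state=0)),
        ("nb", GaussianNB()),
    ]

    best_weights, best_score = None, float("-inf")
    for weights in product([1, 2, 3], repeat=len(estimators)):
        clf = VotingClassifier(estimators, voting="soft", weights=list(weights))
        score = cross_val_score(clf, X, y, cv=5).mean()
        if score > best_score:
            best_weights, best_score = weights, score

    print(f"best weights {best_weights} -> CV accuracy {best_score:.4f}")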

Warning

Usage Notes:

  • Don’t forget that “garbage in equals garbage out” - Mltooler does what it says on the tin: it’s a tool. It’s not magic and it cannot perform miracles with your data. If you are interested in performing miracles, please consult your nearest bible.
  • Because of the number of possible pipeline steps, training and improvement can take a substantial amount of time, and is probably best left running overnight for the least amount of hair pulling or nail biting. A larger number of input features, or a sheer volume of data, slows runs down significantly.
  • Mltooler tries to recreate sklearn estimators in essence, but they are not one and the same, and some functions usually associated with sklearn algorithms will not work or do not exist.
  • All code has been written in my spare time as a project while I learn to code in Python. Please report any errors or irregularities and I will try to fix them when possible.
  • And finally… Although all efforts have been made to make mltooler as robust as possible, responsibility cannot be accepted for inconsistent or incorrect results.