Project author: godatadriven

Project description: Prediction Intervals with specific value prediction

Programming language: Python

Project address: git://github.com/godatadriven/piven.git

Created: 2021-01-05T08:41:28Z

Project community: https://github.com/godatadriven/piven

Open source license: MIT License



Piven

This is an implementation of the model described in the following paper:

Simhayev, Eli, Gilad Katz, and Lior Rokach. “PIVEN: A Deep Neural Network for Prediction Intervals with Specific Value Prediction.” arXiv preprint arXiv:2006.05139 (2020).

I have copied some of the code from the paper's code base, and I cite the authors' paper where this is the case.




NN with a piven output layer, from Simhayev, Katz, and Rokach (2020).

In short

A neural network with a Piven (Prediction Intervals with specific value prediction) output layer returns a point
prediction as well as a lower and an upper prediction interval (PI) bound for each target in a regression problem.

This is useful because it allows you to quantify the uncertainty in the point predictions.

Using Piven

Using the piven module is quite straightforward. For a simple MLP with a piven output, you can run:

    import numpy as np
    from piven.models import PivenMlpModel
    from sklearn.preprocessing import StandardScaler

    # Create some data
    seed = 26783
    np.random.seed(seed)
    n_samples = 500
    x = np.random.uniform(low=-2.0, high=2.0, size=(n_samples, 1))
    y = 1.5 * np.sin(np.pi * x[:, 0]) + np.random.normal(
        loc=0.0, scale=1 * np.power(x[:, 0], 2)
    )
    x_train = x[:400, :].reshape(-1, 1)
    y_train = y[:400]
    x_valid = x[400:, :].reshape(-1, 1)
    y_valid = y[400:]

    # Build the piven model
    model = PivenMlpModel(
        input_dim=1,
        dense_units=(64, 64),
        dropout_rate=(0.0, 0.0),
        lambda_=25.0,
        bias_init_low=-3.0,
        bias_init_high=3.0,
        lr=0.0001,
    )
    # Normalize the input data
    model.build(preprocess=StandardScaler())
    # You can pass any arguments that you would also pass to a keras model
    model.fit(x_train, y_train, model__epochs=200, model__validation_split=0.2)

The image below shows how the lower and upper PI change as training progresses.

You can score the model by calling the score() method:

    # Predict on the held-out data from the example above
    y_pred, y_ci_low, y_ci_high = model.predict(x_valid, return_prediction_intervals=True)
    model.score(y_valid, y_pred, y_ci_low, y_ci_high)
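
The two standard PI quality metrics used throughout this project are PICP (coverage) and MPIW (width); the package exposes tensorflow versions of these as picp and mpiw (imported in a later example). As a point of reference, here is a minimal numpy sketch of what they measure (picp_np and mpiw_np are illustrative helpers, not part of the package):

    import numpy as np

    def picp_np(y_true, y_low, y_high):
        # Prediction Interval Coverage Probability: the fraction of
        # observations that fall inside their prediction interval.
        return np.mean((y_true >= y_low) & (y_true <= y_high))

    def mpiw_np(y_low, y_high):
        # Mean Prediction Interval Width: the average interval width.
        return np.mean(y_high - y_low)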

To persist the model on disk, call the save() method:

    model.save("path-to-model-folder", model=True, predictions=True)

This will save the metrics, keras model, and model predictions to the folder.

If you want to load the model from disk, you need to pass the model build function (see below for more information).

    from piven.models import piven_mlp_model
    model = PivenMlpModel.load("path-to-model-folder", build_fn=piven_mlp_model)

For additional examples, see the ‘tests’ and ‘notebooks’ folders.

Creating your own model with a piven output layer

You can use a Piven layer on any neural network architecture. The authors of the Piven paper use it on top of
a pre-trained CNN to predict people’s age.
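
As a hedged sketch of that idea (not code from the paper or this repository), one might put a Piven head on top of a frozen, pre-trained backbone; MobileNetV2, the layer sizes, and the input shape below are arbitrary stand-ins:

    import tensorflow as tf
    from piven.layers import Piven

    # Sketch only: an arbitrary pre-trained backbone with a Piven head.
    backbone = tf.keras.applications.MobileNetV2(
        include_top=False, pooling="avg", input_shape=(96, 96, 3)
    )
    backbone.trainable = False  # keep the pre-trained weights fixed
    x = tf.keras.layers.Dense(64, activation="relu")(backbone.output)
    o = Piven()(x)
    cnn_piven = tf.keras.models.Model(inputs=backbone.input, outputs=o)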

Suppose that you want to create a model with a Piven output layer. Because this module uses the
KerasRegressor wrapper from the tensorflow library to make scikit-learn-compatible keras models,
you would first specify a build function like so:

    import tensorflow as tf
    from piven.layers import Piven
    from piven.metrics.tensorflow import picp, mpiw
    from piven.loss import piven_loss

    def piven_model(input_size, hidden_units):
        i = tf.keras.layers.Input((input_size,))
        x = tf.keras.layers.Dense(hidden_units)(i)
        o = Piven()(x)
        model = tf.keras.models.Model(inputs=i, outputs=o)
        model.compile(
            optimizer="rmsprop",
            metrics=[picp, mpiw],
            loss=piven_loss(lambda_in=15.0, soften=160.0, alpha=0.05),
        )
        return model

The most straightforward way of running your model is to subclass the PivenBaseModel class. This requires you
to define a build() method, in which you can add preprocessing pipelines and the like.

    from piven.models.base import PivenBaseModel
    from piven.scikit_learn.wrappers import PivenKerasRegressor
    from piven.scikit_learn.compose import PivenTransformedTargetRegressor
    from sklearn.preprocessing import StandardScaler

    class MyPivenModel(PivenBaseModel):
        def build(self, build_fn=piven_model):
            model = PivenKerasRegressor(build_fn=build_fn, **self.params)
            # Finally, normalize the output target
            self.model = PivenTransformedTargetRegressor(
                regressor=model, transformer=StandardScaler()
            )
            return self

To initialize the model, call:

    MyPivenModel(
        input_size=3,
        hidden_units=32,
    )

Note that the inputs to MyPivenModel must match the inputs to the piven_model function.

You can now call all methods defined in the PivenBaseModel class, as sketched below. Check the
PivenMlpModel class
for a more detailed example.
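
For instance, assuming MyPivenModel inherits the same fit/predict interface shown for PivenMlpModel above, and reusing the one-feature data from the first example, a minimal usage sketch would be:

    # Usage sketch, assuming the PivenBaseModel interface shown above.
    model = MyPivenModel(input_size=1, hidden_units=32)
    model.build()
    model.fit(x_train, y_train, model__epochs=100)
    y_pred, y_ci_low, y_ci_high = model.predict(
        x_valid, return_prediction_intervals=True
    )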

Details: loss function

The piven loss function is more complicated than a regular loss function in that it combines three objectives:

  1. The coverage (the proportion of observations falling within the lower and upper PI) should be
     approximately 1 - alpha, where alpha is the desired significance level.
  2. The PI should not be too wide.
  3. The point prediction should be as accurate as possible.

The piven loss function combines these objectives into a single loss. The loss function takes three arguments
(named as in the piven_loss call above):

  1. alpha: the desired significance level. Given this value, we aim for PIs such that, if we re-ran
     our experiments many times, the PIs would include the true value of our outcome
     (1 - alpha) * 100% of the time.
  2. lambda_in: a hyperparameter controlling the relative importance of PI width versus PI coverage.
     As lambda_in shrinks toward 0, you will observe narrower PIs at the cost of lower coverage.
  3. soften: a technicality, primarily used to ensure that the loss function can be optimized
     using a gradient-based solver.

The default settings are those used by the authors of the paper. You should probably leave them as they are unless you
know what you are doing. For further details, see [1, pp. 4-5].
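
To make these knobs concrete, here is how the defaults from the build-function example above would be varied; the alternative values are illustrative only, not recommendations:

    from piven.loss import piven_loss

    # Defaults from the example above (alpha=0.05 targets 95% coverage).
    loss_default = piven_loss(lambda_in=15.0, soften=160.0, alpha=0.05)

    # Illustrative variations: a smaller lambda_in trades coverage for
    # narrower intervals; a smaller alpha targets wider, 99% intervals.
    loss_narrower = piven_loss(lambda_in=5.0, soften=160.0, alpha=0.05)
    loss_wider = piven_loss(lambda_in=15.0, soften=160.0, alpha=0.01)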

Details: uncertainty

In statistics/ML, uncertainty is often subdivided into ‘aleatoric’ and ‘epistemic’ uncertainty. The former is associated
with randomness in the sense that any experiment that is not deterministic shows variability in its outcomes. The latter
type is associated with a lack of knowledge about the best model. Unlike aleatoric uncertainty, epistemic uncertainty
can be reduced by acquiring more information [2].

Prediction intervals are always wider than confidence intervals, since confidence intervals try to capture epistemic
uncertainty only whereas prediction intervals seek to capture both types. See pages 2 and 5 in [1] for a discussion
on quantifying uncertainty.
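
A textbook illustration of this (a standard result for the Gaussian linear model, not specific to piven): at a new input x_0, the confidence and prediction intervals around the fitted value differ only by an extra variance term for the irreducible noise,

    \text{CI}: \; \hat{y}_0 \pm t_{1-\alpha/2,\,n-p} \, \hat{\sigma} \sqrt{x_0^\top (X^\top X)^{-1} x_0}
    \text{PI}: \; \hat{y}_0 \pm t_{1-\alpha/2,\,n-p} \, \hat{\sigma} \sqrt{1 + x_0^\top (X^\top X)^{-1} x_0}

The x_0^T (X^T X)^{-1} x_0 term is the epistemic part and shrinks as the sample size grows; the extra 1 is the aleatoric part and does not, which is why the PI is always the wider of the two.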

References

[1] Simhayev, Eli, Gilad Katz, and Lior Rokach. “PIVEN: A Deep Neural Network for Prediction Intervals with Specific Value Prediction.” arXiv preprint arXiv:2006.05139 (2020).

[2] Hüllermeier, Eyke, and Willem Waegeman. “Aleatoric and epistemic uncertainty in machine learning: A tutorial introduction.” arXiv preprint arXiv:1910.09457 (2019).