---
title: 'Hyperas: Simple Hyperparameter Tuning for Keras Models'
tags:
  - Python
  - Hyperparameter Tuning
  - Deep Learning
  - Keras
  - Hyperopt
authors:
  - name: Max Pumperla
    affiliation: "1, 2"
affiliations:
  - name: IU Internationale Hochschule
    index: 1
  - name: Pathmind Inc.
    index: 2
date: 19 November 2021
bibliography: paper.bib

---

# Summary

Hyperas is an extension of [Keras](https://keras.io/) [@chollet2015keras] that allows you to run hyperparameter optimization of your models using [Hyperopt](http://hyperopt.github.io/hyperopt/) [@bergstra2012hyperopt].
It was built to enable fast experimentation cycles for researchers and software developers.
With hyperas, you can set up your Keras models as you're used to and specify your hyperparameter search spaces in a convenient way, following the design principles suggested by the [Jinja project](https://jinja.palletsprojects.com/en/3.0.x/) [@jinja2008].

This way, researchers can use the full power of hyperopt without sacrificing experimentation speed.
Its documentation is hosted on [GitHub](https://github.com/maxpumperla/hyperas) and comes with a suite of [examples](https://github.com/maxpumperla/hyperas/tree/master/examples) to get users started.


# Statement of need

Hyperas is in active use in the Python community and still sees [thousands of weekly downloads](https://pypistats.org/packages/hyperas), which shows a clear need for this experimentation library.
Over the years, hyperas has been used and cited in [research papers](https://scholar.google.com/scholar?cluster=1375058734373368171&hl=en&oi=scholarr), mostly by [referring to GitHub](https://scholar.google.com/scholar?hl=de&as_sdt=0%2C5&q=hyperas+keras&btnG=).
Researchers who want to focus on their deep learning model definitions can leverage hyperas to speed up their experiments, without getting bogged down in maintaining separate hyperparameter search spaces and configurations.
Since hyperas was published, tools like Optuna [@akiba2019optuna] have adopted a similar approach to hyperparameter tuning.
KerasTuner [@omalley2019kerastuner] is officially supported by Keras itself, but does not offer the same variety of hyperparameter search algorithms as hyperas.

# Design and API

Hyperas uses a Jinja-style template language to define search spaces implicitly within Keras model specifications.
Essentially, a regular configuration value in a Keras layer, such as `Dropout(0.2)`, gets replaced by a [suitable distribution](https://github.com/maxpumperla/hyperas/blob/master/hyperas/distributions.py) like `Dropout({{uniform(0, 1)}})`.
To define a hyperas model, you proceed in two steps.
First, you set up a function that returns the data you want to train on, which could include features and labels for training, validation and test sets.
Schematically, this would look as follows:

```python
def data():
    # Load your data here
    return x_train, y_train, x_test, y_test
```
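
For instance, a concrete `data` function could load the MNIST digits that ship with Keras and return flattened, normalized features with one-hot encoded labels. The following is only a minimal sketch under that assumption; any other way of producing the four arrays works just as well:

```python
from keras.datasets import mnist
from keras.utils import to_categorical


def data():
    # Load MNIST, flatten the 28x28 images and one-hot encode the labels.
    (x_train, y_train), (x_test, y_test) = mnist.load_data()
    x_train = x_train.reshape(60000, 784).astype('float32') / 255
    x_test = x_test.reshape(10000, 784).astype('float32') / 255
    y_train = to_categorical(y_train, 10)
    y_test = to_categorical(y_test, 10)
    return x_train, y_train, x_test, y_test
```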

Next, you specify a function that takes your data as input arguments, defines a Keras model with hyperas template handles (`{{}}`), and fits the model to your data.
This function has to return a dictionary containing at least a `loss` value for hyperopt to minimize, e.g. the validation loss or the negative of the test accuracy, together with the hyperopt `status` of the experiment.

```python
from keras.models import Sequential
from keras.layers import Activation, Dense, Dropout
from hyperas.distributions import uniform
from hyperopt import STATUS_OK


def create_model(x_train, y_train, x_test, y_test):
    model = Sequential()
    model.add(Dense(512, input_shape=(784,)))
    model.add(Activation('relu'))
    model.add(Dropout({{uniform(0, 1)}}))
    # ... add more layers
    model.add(Dense(10))
    model.add(Activation('softmax'))

    # compile and fit the model (arguments elided here)
    model.fit(x_train, y_train, ...)

    # evaluate the model and return the negative test accuracy as loss
    score = model.evaluate(x_test, y_test, verbose=0)
    accuracy = score[1]
    return {'loss': -accuracy, 'status': STATUS_OK, 'model': model}
```
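
Besides continuous distributions like `uniform`, the [distributions module](https://github.com/maxpumperla/hyperas/blob/master/hyperas/distributions.py) also exposes discrete choices. As a hypothetical variation of the model body above, the width of a hidden layer and the optimizer could be drawn from candidate lists:

```python
from hyperas.distributions import choice

# Hypothetical lines inside create_model: draw the layer width and the
# optimizer from discrete candidate sets instead of a continuous range.
model.add(Dense({{choice([256, 512, 1024])}}))
model.compile(loss='categorical_crossentropy',
              optimizer={{choice(['rmsprop', 'adam', 'sgd'])}},
              metrics=['accuracy'])
```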

Lastly, you simply prompt the `optim` module of hyperas to `minimize` the model loss defined in `create_model`, using `data`, with a hyperparameter optimization algorithm like TPE or any other algorithm supported by hyperopt [@pmlr-v28-bergstra13].

```python
from hyperas import optim
from hyperopt import Trials, tpe

best_run, best_model = optim.minimize(model=create_model,
                                      data=data,
                                      algo=tpe.suggest,
                                      max_evals=10,
                                      trials=Trials())
```
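
After the search finishes, `best_run` holds the best hyperparameter assignment that was found and `best_model` the corresponding trained Keras model, since `create_model` returns the model in its result dictionary. A small sketch of how these could be used, reusing the `data` function from above:

```python
x_train, y_train, x_test, y_test = data()

print("Best performing hyperparameters:")
print(best_run)
print("Evaluation of best performing model:")
print(best_model.evaluate(x_test, y_test, verbose=0))
```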

Furthermore, note that hyperas can run [hyperparameter tuning in parallel](https://github.com/maxpumperla/hyperas#running-hyperas-in-parallel), using hyperopt's distributed MongoDB backend, as sketched below.
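Roughly, and assuming a MongoDB instance is reachable under the address given below, this amounts to swapping `Trials` for hyperopt's `MongoTrials` and starting one or more `hyperopt-mongo-worker` processes that point at the same database; the database name and experiment key here are placeholders, and the linked documentation describes the exact setup:

```python
from hyperas import optim
from hyperopt import tpe
from hyperopt.mongoexp import MongoTrials

# Hypothetical MongoDB address, database name and experiment key.
trials = MongoTrials('mongo://localhost:27017/hyperas_db/jobs', exp_key='exp1')

best_run, best_model = optim.minimize(model=create_model,
                                      data=data,
                                      algo=tpe.suggest,
                                      max_evals=10,
                                      trials=trials)
```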

# Acknowledgements

We would like to thank all the open-source contributors who helped make `hyperas` what it is today.
It is a great honor to see this software continually used by the [community](https://github.com/maxpumperla/hyperas/network/dependents?package_id=UGFja2FnZS01MjIwODQ4OA%3D%3D).

# References