Added Leaky ReLU activation function (1D & 3D) #123


Merged
8 commits merged into modern-fortran:main on Mar 17, 2023

Conversation

@Spnetic-5 (Collaborator) commented Mar 7, 2023

@milancurcic added the enhancement (New feature or request) label on Mar 7, 2023
@milancurcic (Member) commented Mar 7, 2023

Thanks @Spnetic-5! A few things:

  • We need a leaky_relu_prime as well.
  • You also need to include it in the select case here:
    https://github.com/modern-fortran/neural-fortran/blob/main/src/nf/nf_dense_layer_submodule.f90#L166-L169
    and similarly in the nf_conv2d_layer_submodule.f90.
  • We shouldn't hardcode alpha (0.01); how best to do it? In the current implementation of how activation functions are passed to the layers, all activation functions need to have the same interface. One approach I've been thinking about is to add an optional alpha parameter to all activation functions; alpha would only be used by leaky_relu and would be ignored by the other functions. That way the interfaces of all activation functions would match and it should just work (see the sketch below).
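
A rough sketch of what the 1D pair could look like under this optional-alpha approach; the local alpha_ fallback and the 0.01 default value are illustrative assumptions rather than the final implementation:

pure function leaky_relu(x, alpha) result(res)
  !! Leaky ReLU: returns x for positive x and alpha*x otherwise.
  real, intent(in) :: x(:)
  real, intent(in), optional :: alpha
  real :: res(size(x))
  real :: alpha_
  ! Fall back to a default negative slope when alpha is not supplied.
  alpha_ = 0.01
  if (present(alpha)) alpha_ = alpha
  res = max(alpha_ * x, x)
end function leaky_relu

pure function leaky_relu_prime(x, alpha) result(res)
  !! First derivative of the Leaky ReLU activation function.
  real, intent(in) :: x(:)
  real, intent(in), optional :: alpha
  real :: res(size(x))
  real :: alpha_
  alpha_ = 0.01
  if (present(alpha)) alpha_ = alpha
  where (x > 0)
    res = 1
  elsewhere
    res = alpha_
  end where
end function leaky_relu_prime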

@Spnetic-5 (Collaborator, Author)

@milancurcic Please review; I have made the suggested changes.

@milancurcic (Member)

Thank you, we're almost there. See my third comment above:

One approach that I've been thinking about is to make an optional alpha parameter in all activation functions, except alpha would only be used in leaky_relu but would be ignored in the other functions. This way the interfaces of all activation functions would match and it should just work.

The build currently fails because leaky_relu has a different interface. Let's make all activation functions take a real, intent(in), optional :: alpha argument so that they have the same interface. This is a somewhat ugly hack, but it should work. You'll also need to add the optional alpha to the activation interface here:

interface
  pure function activation_function(x)
    real, intent(in) :: x(:)
    real :: activation_function(size(x))
  end function activation_function
end interface
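
For reference, a hedged sketch of what that interface block might become once the optional alpha is added:

interface
  pure function activation_function(x, alpha)
    real, intent(in) :: x(:)
    ! Only leaky_relu reads alpha; other activations ignore it.
    real, intent(in), optional :: alpha
    real :: activation_function(size(x))
  end function activation_function
end interface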

@jvdp1 (Collaborator) commented Mar 7, 2023

The build currently fails because leaky_relu has a different interface. Let's make all activation functions take a real, intent(in), optional :: alpha argument, to make them have the same interface. This is a somewhat ugly hack, but it should work. You'll also need to add the optional alpha to the activation interface here:

Looks good.

Re: alpha, wouldn't it be better to define a derived type that contains alpha and pass only the derived type? Then, if other parameters need to be added for other activation functions in the future, only the contents of the derived type would change and the API would stay the same.

@milancurcic (Member)

Thanks, Jeremie, I think that's a good idea. Having an activation_params type would also make it easier to pass this info to individual layers.
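
A minimal sketch of what such a type might look like; the name activation_params follows the comment above, while the single alpha component, its default value, and the revised leaky_relu signature are assumptions:

type :: activation_params
  !! Parameters shared by all activation functions; each activation
  !! reads only the components it needs and ignores the rest.
  real :: alpha = 0.01  ! negative-input slope used by leaky_relu
end type activation_params

! An activation would then receive the whole type instead of loose arguments:
pure function leaky_relu(x, params) result(res)
  real, intent(in) :: x(:)
  type(activation_params), intent(in) :: params
  real :: res(size(x))
  res = max(params % alpha * x, x)
end function leaky_relu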

@Spnetic-5 (Collaborator, Author)

alpha, wouldn't it be better to define a derived type that contains alpha, and pass only the derived type?

Where and how should I define the derived type? Could you please give an example or the steps to implement this approach?

@Spnetic-5 (Collaborator, Author)

@milancurcic @jvdp1

  • By using a derived type to contain the activation parameters, we can add new parameters in the future without changing the function signature. However, this approach adds some overhead, since an extra object needs to be created and passed around.
  • By adding the optional alpha argument to the activation interface, we allow all the activation functions to have the same interface, which should resolve the issue for now.

As I'm still not familiar with the entire Fortran codebase, I tried implementing the optional alpha approach. Please review the changes made so far; significant changes are still needed in nf_conv2d_layer_submodule.f90, but I'm having trouble figuring them out. Could you please guide me?

pure function leaky_relu_prime(x, alpha) result(res)
  ! First derivative of the Leaky Rectified Linear Unit (Leaky ReLU) activation function.
  real, intent(in) :: x(:,:,:)
  real, intent(in) :: alpha
@jvdp1 (Collaborator) commented on the snippet above:

It should be optional everywhere.
In this case, I would use something like the stdlib optval function.
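
Applied to the 3D leaky_relu_prime quoted above, the suggested optval pattern might look like the following sketch; the 0.01 default is an assumption carried over from the earlier discussion, and stdlib_optval is the Fortran stdlib module that provides optval:

pure function leaky_relu_prime(x, alpha) result(res)
  use stdlib_optval, only: optval
  !! First derivative of the Leaky ReLU activation function.
  real, intent(in) :: x(:,:,:)
  real, intent(in), optional :: alpha
  real :: res(size(x, 1), size(x, 2), size(x, 3))
  real :: alpha_
  ! optval returns alpha when it is present and the default otherwise.
  alpha_ = optval(alpha, 0.01)
  where (x > 0)
    res = 1
  elsewhere
    res = alpha_
  end where
end function leaky_relu_prime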

@milancurcic (Member) commented Mar 8, 2023

I'd like to evaluate the performance impact of using optval, since an activation function is invoked many times and sits at the very bottom of the call stack. It would be a good opportunity to use stdlib as a dependency, and via fpm it would be easy. But we'd also need to add it as a CMake dependency, which would require a bit more CMake code here. I'm wary of making this PR any larger, because it's @Spnetic-5's first PR.

Once we introduce the activation_params derived type (in a separate PR), there's a technique that will let us avoid relying on present or optval.

@jvdp1 (Collaborator):

Good points! I agree with you.

@milancurcic (Member)

I think it's fine that we do just an optional alpha as a stopgap solution for this PR, and we can generalize it to activation_params in a separate PR.

@milancurcic (Member)

@Spnetic-5 Let's just make alpha optional everywhere as @jvdp1 wrote and get it to build successfully (leaky_relu still won't work yet), and we can tackle a well-designed solution with an activation parameters type in a separate PR (but before the next minor release).

A comment from @Spnetic-5 was marked as outdated.

@milancurcic (Member)

What needs changing in nf_conv2d_layer_submodule.f90?

@Spnetic-5 (Collaborator, Author) commented Mar 8, 2023

Thanks for clearing things up @milancurcic & @jvdp1. Now I'm facing the following error while building:

f951: Fatal Error: Reading module ‘/home/csgo/Desktop/GSoC/neural-fortran/src/nf/nf_conv2d_layer.mod’ at line 181 column 36: Expected left parenthesis
compilation terminated.
make[2]: *** [CMakeFiles/neural.dir/build.make:348: CMakeFiles/neural.dir/src/nf/nf_layer_constructors_submodule.f90.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:279: CMakeFiles/neural.dir/all] Error 2
make: *** [Makefile:166: all] Error 2

@milancurcic (Member)

  1. Define the alpha parameter as an attribute on the dense_layer and conv2d_layer concrete layer types, e.g. here:

type, extends(base_layer) :: dense_layer
  !! Concrete implementation of a dense (fully-connected) layer type
  integer :: input_size
  integer :: output_size
  real, allocatable :: weights(:,:)
  real, allocatable :: biases(:)
  real, allocatable :: z(:) ! matmul(x, w) + b
  real, allocatable :: output(:) ! activation(z)
  real, allocatable :: gradient(:) ! matmul(w, db)
  real, allocatable :: dw(:,:) ! weight gradients
  real, allocatable :: db(:) ! bias gradients
  procedure(activation_function), pointer, nopass :: &
    activation => null()
  procedure(activation_function), pointer, nopass :: &
    activation_prime => null()

You can add real :: alpha there, so that in type-bound methods it will be accessible via self % alpha.

  2. In the layer constructor functions, rather than passing alpha to set_activation(), assign it to the type attribute alpha that you defined in step 1 (if present(alpha), of course). Also add a run-time check requiring the alpha argument to be present if the leaky_relu activation is requested.

  3. In the forward and backward pass methods of dense_layer and conv2d_layer, pass self % alpha (which is now set in step 2 in the case of leaky_relu) to self % activation().

After you do these steps, I believe it should work correctly.
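
A rough sketch of what steps 2 and 3 might look like for dense_layer; the constructor name dense_layer_cons, the character-valued activation argument, the set_activation call syntax, and the forward signature are hypothetical placeholders rather than the actual neural-fortran API:

! Step 2 (sketch): store alpha on the layer type in the constructor.
function dense_layer_cons(output_size, activation, alpha) result(res)
  integer, intent(in) :: output_size
  character(*), intent(in) :: activation
  real, intent(in), optional :: alpha
  type(dense_layer) :: res

  res % output_size = output_size
  if (present(alpha)) then
    res % alpha = alpha
  else if (activation == 'leaky_relu') then
    ! Run-time check: leaky_relu must be given an explicit alpha.
    error stop 'dense_layer_cons: alpha is required for leaky_relu'
  else
    res % alpha = 0.01
  end if
  call res % set_activation(activation)
end function dense_layer_cons

! Step 3 (sketch): pass the stored alpha to the activation in the forward pass.
pure subroutine forward(self, input)
  class(dense_layer), intent(in out) :: self
  real, intent(in) :: input(:)
  ! Components z, weights, biases, and output follow the type quoted above.
  self % z = matmul(input, self % weights) + self % biases
  self % output = self % activation(self % z, self % alpha)
end subroutine forward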

@milancurcic (Member)

To keep this PR small, let's implement the activation params approach in a separate PR. Thanks @Spnetic-5!

@milancurcic merged commit f328c8d into modern-fortran:main on Mar 17, 2023.
Labels: enhancement (New feature or request)
3 participants