Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add static/dynamic functionality for topologies #164

Merged
merged 2 commits into from
Jul 17, 2018
Merged

Add static/dynamic functionality for topologies #164

merged 2 commits into from
Jul 17, 2018

Conversation

whzup
Copy link
Collaborator

@whzup whzup commented Jul 14, 2018

Description

I added a new functionality to the topologies with which one can decide if the topology is static (i.e. the neighbours don't change over the course of the optimization) or dynamic (i.e. the neighbours change over the course of the optimization). This new functionality includes the addition of two new attributes to the topology classes that can be static or dynamic:

  • static: a boolean that decides whether the topology is static or not
  • neighbor_idx: a matrix with the neighbor indices for every particle.

To initialize a topology it is now necessary to set the static attribute like so:

Pyramid(static=True)

The other functionalities remain unchanged.

I also changed the tests to have new parametrizations so it works without a lot of pytest marks. The documentation has been updated according to the changes.
The local and global optimizers have now dynamic topologies (same as before but now explicitly stated in the documentation and the code).

Related Issue

#129

Motivation and Context

This new feature makes it possible to choose whether the neighbours are computed every iteration or just in the first one. This reduces the computation steps for every iteration if one does not consider a dynamic topology as important.

How Has This Been Tested?

The tests were extended with the new constructor parameter so all the test now check the dynamic and the static variant.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (a non-breaking change which adds functionality)
  • Breaking change (fix or feature that would mildly cause existing functionality to change)

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document.
  • I have added tests to cover my changes.
  • All new and existing tests pass.

@whzup whzup requested a review from ljvmiranda921 July 14, 2018 08:50
@whzup
Copy link
Collaborator Author

whzup commented Jul 14, 2018

@ljvmiranda921 I can add the Von Neumann topology as well if you want. Since there is no one reacting to my comments in the Implement Topologies issue I don't know if someone will do it other than me 🙈. The VonNeumann class itself is 14 lines of code and I'd just have to add the tests and documentation.

@ljvmiranda921
Copy link
Owner

ljvmiranda921 commented Jul 14, 2018 via email

Copy link
Owner

@ljvmiranda921 ljvmiranda921 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As usual, awesome PR @whzup ! Really happy for this!

Just two things:

  1. Make static=False be the default value. Traditionally, a dynamic topology is often used, so let's just make it the default value in our Topology classes. So please edit those in the __init__ method and mention it in the docstring

  2. No need to make a static fixture. Usually we use fixtures as mock-ups for classes, methods, and etc. But for parameters that we want to change, we simply call the pytest.mark.parameterize decorator on top of the unit tests.

This is awesome! Once this is merged, then it's my turn to update the documentation, etc.

@@ -22,8 +22,18 @@


class Pyramid(Topology):
def __init__(self):
def __init__(self, static):
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest to have a default value for static, i.e., static=False.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was actually a very conscious choice to implement the static attribute it in that way. There are two reasons:

  • An active choice: If we implement it without a default statement the user has actively to choose which topology he actually wants. There are arguments for both: static uses less computation time and saves some resources on the user's side, dynamic is the traditional way. So I chose to not influence the user by defaults (there is an interesting book about this called Nudge)

  • Readability: By implementing it so the user has to choose whether the topology is static or dynamic he has more control over his code. You can look at it and immediately see that the topology is static or dynamic. This improves the readability and comprehensibility of the code.

Copy link
Owner

@ljvmiranda921 ljvmiranda921 Jul 15, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @whzup , I understand your point about having the user check if he/she needs a static or dynamic topology. However, this requires a lot of cognitive overhead on the user-side to differentiate between the two. If we can provide a reasonable default and it works, then it's all good. I still maintain that we set a default value for this parameter. Look at convention over configuration paradigm

Lastly, if you think of static as a light switch, its default value is False/off. It is up to the user to turn it True/on.

  1. On active choices

If we implement it without a default statement the user has actively to choose which topology he actually wants.

Default values provide a baseline from which a user can adapt his/her system: "If a dynamic topology didn't work, then maybe a static topology will." as opposed to asking "What's the difference between dynamic and static?" right off the bat. Sure, I can try one first randomly, but why not just set a default?

Our main philosophy is to keep things simple. We only expose the API whenever needed.
The library requires a necessary input from the user, and it gives the user's expected output. All customization parameters should either be defaulted or unexposed. The static=False option is the most reasonable for most use-cases, so we set it that way.

We want to make the users familiar with the input -> output (or f(x) = y) scheme before introducing other parameters. Again, in this manner, the user checks if his system works given a baseline, then he/she adapts the algorithm until it clicks.

A notable example for this would be the SVM classifier of scikit-learn. Sensible defaults are given and it provides a baseline for experimentation (e.g. if RBF doesn't work, maybe a polynomial kernel will, etc.).

There are arguments for both: static uses less computation time and saves some resources on the user's side

I might need some supporting evidence for this aside from looking at the code. Static may be "faster," but is it significantly faster? If you can profile this that would be nice.

  1. On readability

By implementing it so the user has to choose whether the topology is static or dynamic he has more control over his code.

We have solved the "more control over his code" part when we implemented the parameter itself. Again, actively choosing something still requires a lot of thought. Let's give them a sensible default and have them iterate if it didn't work.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see your point. Although:

Default values provide a baseline from which a user can adapt his/her system: "If a dynamic topology didn't work, then maybe a static topology will." as opposed to asking "What's the difference between dynamic and static?"
I'd argue, this has actually the opposite effect. If the default is set and the implementation looks like that:

topology = Pyramid()
GeneralOptimizer(10, 2, function, topology)

Where would you search if the code does not work? I guess first in the code you've already written. So you won't immediately ask yourself whether you can just change the static parameter and then it works, you'd rather go through the code you have already written and search for something that is missing.
Additionally, at this point, the user has already chosen the GeneralOptimizer and knows that he has to choose the topology himself so he's prepared for that. By adding the static attribute as a "must-write", the user has an active knowledge of the state of the topology and fully gives him the control over the optimization process which is the purpose of the GeneralOptimizer.
For exactly the simplicity you're describing we're still keeping the GlobalOptimizer and the LocalOptimizer classes.
I think in the SVM example you gave, the choice to have default values makes sense because you have to make very complex decisions about the parameters and you have a lot of parameters. Our topology classes do not share this property with the SVM example. The user has to ask one simple question "static or dynamic?" and that's it. I think even with the active choice of the parameter it still remains very simple as long as we don't add any more.

All customization parameters should either be defaulted or unexposed.

If we strictly follow this philosophy we'd also have to set defaults for the options, bounds, swarm size etc. in the optimizers, wouldn't we?

I might need some supporting evidence for this aside from looking at the code. Static may be "faster," but is it significantly faster? If you can profile this that would be nice.

I was actually thinking about profiling the whole GeneralOptimizer class with every topology for v0.4.0 so we can show some examples in the README, what do you think about that? Might be a good first-timers issue as well.

PS: I heard that in many Asian cultures there is a very strong hierarchical structure in projects and that there is little questioning about decisions of the supervisor. This is a very European thing to do. I hope I don't insult you in any way by doing so. Just want this project to become a great product 😄.

Copy link
Owner

@ljvmiranda921 ljvmiranda921 Jul 15, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where would you search if the code does not work?

That's what the documentation is for. 👍

Not if it does not work, but if it does not optimize to an optima you are expecting. By that time you would probably check what can be done with the topology. The keyword is to provide sensible defaults. In this case our defaults will be the traditional approach, as you've mentioned. 💯

Also, look at how staticis defined: True or False. Think of this as a switch with a default state (in this case, turned-off). What a user can do is to turn-on a switch so that they get out of that traditional scheme and lock themselves into a fixed set of neighbors. Ok, sure we can change this to connection='static' or 'dynamic', but if we have a discrete choice between 1 or 2, why not just make it 0 or 1?

I think in the SVM example you gave, the choice to have default values makes sense because you have to make very complex decisions about the parameters and you have a lot of parameters.

Think of shrinking, probability, or class_weights. I can call SVM, provide C and gamma, and care less about what shrinking does. I trust that it is a sensible default, and I just focus on building the model. If I want to further optimize, then I look at what turning it off means.

Same with probability, I know that by setting it to True, I sacrifice training speed (according to the docs), I just turn it on when I need it and focus on my model.

The user has to ask one simple question "static or dynamic?" and that's it. I think even with the active choice of the parameter it still remains very simple as long as we don't add any more.

Again, there's a cognitive overhead. I still need to check for what static means or what dynamic does. We can provide the traditional default first, and have them change that whenever needed.

If we strictly follow this philosophy we'd also have to set defaults for the options, bounds, swarm size etc. in the optimizers, wouldn't we?

We already have defaults for bounds. We can keep the swarm size flexible but you're correct to set a default options dictionary. I've been thinking of doing that for some time, but I realized that the common workflow in PSO is that researchers choose values for the cognitive and social parameters and so we follow what they usually do.

I was actually thinking about profiling the whole GeneralOptimizer class with every topology for v0.4.0 so we can show some examples in the README, what do you think about that? Might be a good first-timers issue as well.

This is a good idea.

PS: I heard that in many Asian cultures there is a very strong hierarchical structure in projects and that there is little questioning about decisions of the supervisor. This is a very European thing to do. I hope I don't insult you in any way by doing so. Just want this project to become a great product

All fine, but no need to mention this to be honest.

For exactly the simplicity you're describing we're still keeping the GlobalOptimizer and the LocalOptimizer classes.

Again, simplicity is key. We want the user to use our methods right off the bat. Customization happens when needed and not at every step of the way. I'm still trying to find the right balance between this philosophy and flexibility. We're flexible, sure. That's the reason why we try to implement as many options as possible. 💯

Thanks for raising these thought-provoking questions!

Copy link
Owner

@ljvmiranda921 ljvmiranda921 Jul 16, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @whzup ! Yes, I'm expecting that Zen of Python argument. It's nontrivial, to be honest. However, the inverse of convention over configuration violates "Simple is better than complex." This goes to show that software design principles can sit in tension with each other. Again, our principle would be sensible defaults and to just stick with convention unless really needed. 👍

Yes, the log information should be a good compromise! 💯

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can I just add code like that to the GeneralOptimizer:

if not static:
    logger.info(dynamic_message)

Or is there a special place where I'd have to include this? I haven't looked into the logging too much, to be honest 😄.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to that, yes. So we have cli_print function here. You can just use the function.

Yes, maybe you can put it at GeneralOptimizer's __init__ method? Or in optimize? Depends on where you think it's most applicable. 👍

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops, sorry because I thought of something @whzup...

What if we put it inside the init method of the Topology class itself?

try:
    if self.static:
        cli_print(message)
except AttributeError: # in case it does not exist such as in Star
    pass

With this, we don't need to put the logger in all optimizers. It's all in the base topology class

Copy link
Collaborator Author

@whzup whzup Jul 16, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should I include the two attributes in the Topology class? I think it's a bit dirty to just leave the Star topology out. The neighbor_idx attribute could be useful for future PSO algorithms (e.g. for accessing the neighbours outside the topology). In the Topology class it'd look like this:

class Topology(object):
    def __init__(self, static=False, **kwargs):
        """Initializes the class"""

        # Initialize logger
        self.logger = logging.getLogger(__name__)

        self.static = static
        self.neighbor_idx = None

        try:
            if self.static:
                cli_print("Running on `dynamic` topology, neighbors are updated regularly."
                          "Set `static=True` for fixed neighbors.",
                          1,
                          0,
                          self.logger)
        except AttributeError:
            pass

It also adds to DRY 😄. We might have to set static in the Star topology to something else so we can throw an exception 🤔.


Parameters
----------
static : bool
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just add a note that there is a default value, i.e.

def __init__(self, static=False):
    """Initialize the class
    
    Parameters
    ---------------
    static : bool (default is :code:`False`)

    """

@@ -24,8 +24,17 @@


class Ring(Topology):
def __init__(self):
def __init__(self, static):
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here too, static=False 👍


Parameters
----------
static : bool
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same with the documentation, static : bool (default is :code:False)

@@ -172,7 +172,7 @@ def __init__(
# Initialize the resettable attributes
self.reset()
# Initialize the topology
self.top = Ring()
self.top = Ring(static=False)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Keep this so that we are explicit 👍

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @whzup , you can also add a cli_print somewhere here so that they know about it.
I forgot to tell you to add a static parameter here. We can add control to LocalBestPSO by letting the user to choose between a static or dynamic Ring topology. 👍

@@ -9,9 +9,9 @@
from pyswarms.backend.topology import Pyramid


def test_compute_gbest_return_values(swarm):
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can parameterize this such that:

@pytest.mark.parameterize("static", [True, False])
def test_compute_gbest_return_values(swarm, static):
      """Test if compute_gbest() gives the expected return values"""
      topology = Pyramid()
      topology = Pyramid(static=static)
      expected_cost = 1
      expected_pos = np.array([1, 2, 3])
      pos, cost = topology.compute_gbest(swarm)
      assert cost == expected_cost
      assert (pos == expected_pos).all()

So we can eliminate the need for the static fixture. Usually we use fixtures to create mockups: classes, dictionaries, etc. If we're just changing the parameters, we can simply parameterize this easily 💯 Hail pytest!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought a fixture might be cleaner because I'd have to write @pytest.mark.parametrize("static", [True, False]) above every function 😄.

@@ -20,9 +20,9 @@ def test_compute_gbest_return_values(swarm):


@pytest.mark.parametrize("clamp", [None, (0, 1), (-1, 1)])
def test_compute_velocity_return_values(swarm, clamp):
def test_compute_velocity_return_values(swarm, clamp, static):
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here in terms of parameterization. 👍

@@ -33,9 +33,9 @@ def test_compute_velocity_return_values(swarm, clamp):
"bounds",
[None, ([-5, -5, -5], [5, 5, 5]), ([-10, -10, -10], [10, 10, 10])],
)
def test_compute_position_return_values(swarm, bounds):
def test_compute_position_return_values(swarm, bounds, static):
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And here 👍

@@ -10,9 +10,9 @@

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'd be better to just parameterize the static variable than making it a fixture 👍

@@ -11,9 +11,9 @@

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'd be better to just parameterize the static variable than making it a fixture 👍

@whzup
Copy link
Collaborator Author

whzup commented Jul 15, 2018

Thanks, @ljvmiranda921, that was a quick review! I'll go ahead and adjust the code tomorrow evening. Think about my proposal about the static implementation 😋.

@ljvmiranda921
Copy link
Owner

Sure, take your time! I snatched some free time this weekend so I looked over your PR. Thanks a lot for this!

@ljvmiranda921
Copy link
Owner

ljvmiranda921 commented Jul 16, 2018 via email

@ljvmiranda921
Copy link
Owner

ljvmiranda921 commented Jul 16, 2018 via email

Added two new attributes to the topologies. "static" decides whether the topology is a static or a dynamic one, it is initialized together with the class. "neighbor_idx" is an array that stores the indices of the neighbors of every particle. Changed all occurrences of topologies to fit the new initialization. Reworked the tests to fit the new functionality and added more parametrization of fixture functions. Updated the documentation to incorporate the new functionality.

See also: #129
@whzup
Copy link
Collaborator Author

whzup commented Jul 16, 2018

@ljvmiranda921 got it what do you think? 😄
FYI, I saw some deprecation warnings by matplotlib when running the tests.

@ljvmiranda921
Copy link
Owner

Hi @whzup , not sure if you have committed the changes already but I still can't see them. No need to rebase them in your branch because I will squash them into one commit when merging 👍

  1. Sorry if I didn't understand your email earlier. Do you mean adding self.static and self.neighbor_idx to the Topology class itself? Go ahead. But for printing, I think there's no need for a try-except statement.

I only placed a try-except statement to catch cases (i.e, the Star topology) where self.static does not exist. That's why we have an AttributeError. Since everyone will now have a self.static attribute (with Star automatically set to False), we just use an if-else statement. 👍

  1. For the tests, just parameterize the static variable similar to my last email 👍

  2. What were the deprecation warnings? Let's just deal with them in another PR

@whzup
Copy link
Collaborator Author

whzup commented Jul 17, 2018

Huh, I don't know where my changes are 😮. They're on my fork but they don't appear here. Maybe we have to wait a while. GitHub had some issues when I worked on the fork yesterday.
These are the deprecation warnings:

============================== warnings summary ===============================
tests/utils/plotters/test_plotters.py::test_plot_cost_history_return_type[cost_history]
  c:\users\aaron\appdata\local\programs\python\python37-32\lib\site-packages\matplotlib\cbook\__init__.py:2347: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated, and in 3.8 it will stop working
    if isinstance(obj, collections.Iterator):
  c:\users\aaron\appdata\local\programs\python\python37-32\lib\site-packages\matplotlib\cbook\__init__.py:2364: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated, and in 3.8 it will stop working
    return list(data) if isinstance(data, collections.MappingView) else data

I just wanted to note these so they don't become forgotten 😄.

@ljvmiranda921
Copy link
Owner

ljvmiranda921 commented Jul 17, 2018 via email

Added a static parameter to the LocalBest class.
@whzup
Copy link
Collaborator Author

whzup commented Jul 17, 2018

There they are 😋.

@ljvmiranda921
Copy link
Owner

LGTM! Will merge in a while 👍

@ljvmiranda921 ljvmiranda921 merged commit c95661b into ljvmiranda921:development Jul 17, 2018
@ljvmiranda921
Copy link
Owner

Merged. Thank you for this @whzup ! Adding new topologies + the static/dynamic setting is really helpful. Glad to have you here!

@whzup whzup deleted the static_dynamic branch July 17, 2018 16:56
@ljvmiranda921 ljvmiranda921 mentioned this pull request Jul 31, 2018
12 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Feature requests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants