Skip to content

Conversation

reivilibre
Copy link
Collaborator

@reivilibre reivilibre commented Oct 13, 2021

The best description of this template is the README.

I have tested it and everything seems to be in order.

(That said, Mypy complains and will continue to do so until we release Synapse 1.46 with matrix-org/synapse#11054).

@reivilibre
Copy link
Collaborator Author

Signed-off-by: Olivier Wilkinson (reivilibre) <oliverw@matrix.org>
@reivilibre reivilibre requested review from babolivier and a team October 14, 2021 12:59
@reivilibre
Copy link
Collaborator Author

I've put this on for review; I don't know if it's worth reviewing or just putting right in?

@babolivier babolivier self-assigned this Oct 14, 2021
Copy link
Contributor

@babolivier babolivier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I finished with the readme file, meaning some comments I made on the code file where written before I saw it was documented.

@staticmethod
def parse_config(config: Dict[Any, Any]) -> {{ cookiecutter.module_class_name }}Config:
try:
return cattr.structure(config, {{ cookiecutter.module_class_name }}Config)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just {{ cookiecutter.module_class_name }}Config(**config)?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This works recursively and is smart enough to convert types.
For example, if my struct has a Set[str] as a field, that will accept lists from the JSON/YAML and convert it properly.

If it's too weird, I can probably take it away?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, this seems like a good enough reason to let it in, I was a bit puzzled because I've never encountered cattr.structure in the past. It sounds good to leave this in, maybe with a comment that explains what it does?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, actually, I thought it validated the types too but it doesn't seem to. Maybe I should revert to just attrs.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is cool, but I think trying to be a little bit too magic. I also don't think it's a good idea to encourage Config(**config) as that's just going to ignore any validation (unless people use attrs validators, which they won't unless they know attrs intimately). Personally, I think its enough to do something like:

# This is where we pull out and parse any config from the provided dict, e.g.:
#
#     some_option = config.get("some_option")
#     if some_option is None:
#          raise ConfigError("Missing option  'some_option'")
#      if not isinstance(some_option, str):
#          raise ConfigError("Config option  'some_option' must be a string")
#
return Config()

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this is something we need to fix in the module documentation (eg https://matrix-org.github.io/synapse/develop/modules/writing_a_module.html) rather than in here?
I'm thinking that if we think there's value in making modules use attrs then maybe the documentation should give basic explanations on how to use it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I almost want an excuse to use ConfigError in the code so that the import for it isn't redundant (and therefore has to be removed).
Maybe this will be improved in Synapse 1.46, but until then, we don't export the error types so if you use your IDE to import ConfigError, you'll import it from its internal path rather than from synapse.module_api.errors.

However, it's hard to see an excuse to do so if we're cutting out the example... thoughts? x)


@attr.s(auto_attribs=True, frozen=True)
class {{ cookiecutter.module_class_name }}Config:
example_option: bool = False
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest making it a pass rather than trying to make it an example.

Copy link
Collaborator Author

@reivilibre reivilibre Oct 14, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel a bit more strongly about this one; it's fine for people like us who know what attrs is but the idea was to provide an example to reduce some of that intimidation factor for other people.
(also the README follows this example and I was hoping to set a good example of documenting the module's options ;p)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair, then it probably needs a comment to explain what's going on.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're worried that people don't know how to use attrs then I think that's more of an argument to not use it at all?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not use it at all

Yeah, that's fair I suppose. I find it quite nice but suppose this is more of a personal thing and not really suited for a template. Will cut it out

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, if we cut out attrs entirely, I'll have to put something worse in its place (like passing back an untyped dict), which is not really what I hoped for in terms of keeping everything mypy-strict.

What do you think would be best?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it'd be worth pointing to our own repos that use this template as examples

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd still keep the Config class, and just leave it as empty at first?


from {{ cookiecutter.package_name }}.module import {{ cookiecutter.module_class_name }}

__all__ = ["{{ cookiecutter.module_class_name }}"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just include the class in this file rather than make a different file for it? Since we already do that for quite a few modules (as most of the time the complexity of a module's code is low enough to allow it).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It feels like bad practice to put anything too beefy in __init__.py (circular import issues spring to mind as an actual consequence you usually feel).

I think there's a discontinuity; in single-file projects, beefy __init__ seems OK but if you stretch into a multiple-file project, I think one generally wants __init__ to be little more than structure.

It also feels more like a structural file, e.g. __version__ = ... and so on. When looking through a project, I tend to glaze over (ignore) __init__.py files at first glance. However, maybe that's just me... 😇

I'm curious what other people think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there's a discontinuity; in single-file projects, beefy __init__ seems OK but if you stretch into a multiple-file project, I think one generally wants __init__ to be little more than structure.

I think the same, my point was that in practice we're way more often in the former case than the latter.

@babolivier
Copy link
Contributor

Oh also, would it be interesting to add a CODEOWNERS file to automatically request reviews from synapse-core on new PRs? I think there could be value in trying to standardise this across our repos

@reivilibre
Copy link
Collaborator Author

Oh also, would it be interesting to add a CODEOWNERS file to automatically request reviews from synapse-core on new PRs? I think there could be value in trying to standardise this across our repos

Hmmmm. That sounds a bit like a plan but it doesn't really apply to third-party users of the template. Wonder if I can make it an option...?

@babolivier
Copy link
Contributor

Hmmmm. That sounds a bit like a plan but it doesn't really apply to third-party users of the template. Wonder if I can make it an option...?

Ah fair enough, it's true the approach I used was us using it rather it being used for third-party modules, my bad.

@reivilibre
Copy link
Collaborator Author

It makes sense (though I'm so used to requesting Synapse Core at this point (assuming everyone else is too?) that maybe it's not worth spending too much time worrying about it), I'll see how hard it is to make it an option though.

Copy link

@callahad callahad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're using this for new projects, should we take the opportunity to move away from legacy imperative setup.py and to a declarative setup.cfg? Keeps us with setuptools, just... more maintainable. Happy to provide a diff against this PR if that'd be useful.

@clokep
Copy link
Member

clokep commented Oct 14, 2021

If we're using this for new projects, should we take the opportunity to move away from legacy imperative setup.py and to a declarative setup.cfg?

Do you have any articles on the benefits of this? This is the first I've heard setup.py being called legacy.

@callahad
Copy link

If we're using this for new projects, should we take the opportunity to move away from legacy imperative setup.py and to a declarative setup.cfg?

Do you have any articles on the benefits of this? This is the first I've heard setup.py being called legacy.

I've posted a longer discussion privately, but the central point is that Setuptools itself wants you to use it declaratively. Its own quickstart docs default to setup.cfg and state:

To avoid executing arbitrary scripts and boilerplate code, we are transitioning into a full-fledged setup.cfg to declare your package information instead of running setup().

@erikjohnston
Copy link
Member

Thanks for driving this @reivilibre, and would be a marked improvement on our current ad hoc stuff, and generally seems sane.

One thing that worries me a bit is the maintainability of the architecture: i.e. what happens if we want to change the template (e.g. add a new python version to CI)? Right now we'd have to change the template and all existing modules, which is going to lead to drift in how our modules are set up.

I'm not really sure what we can do about that. We could aim to use reusable github pipelines, maybe bundle some of the binaries into a separate dev package that we can pull in and so more easily update? It also makes me more sympathetic to the idea that we should try and be as minimal as possible, e.g. is using towncrier giving enough benefit that its worth maintaining across many modules?

@babolivier
Copy link
Contributor

babolivier commented Oct 15, 2021

One thing that worries me a bit is the maintainability of the architecture: i.e. what happens if we want to change the template (e.g. add a new python version to CI)? Right now we'd have to change the template and all existing modules, which is going to lead to drift in how our modules are set up.

I'm not really sure what we can do about that. We could aim to use reusable github pipelines, maybe bundle some of the binaries into a separate dev package that we can pull in and so more easily update? It also makes me more sympathetic to the idea that we should try and be as minimal as possible, e.g. is using towncrier giving enough benefit that its worth maintaining across many modules?

On this, separately to this repo, I want to spend some time (either today or next week) looking into how to use reusable github action pipelines for modules. I agree that the current situation makes it a faff to keep the CI up to date, but I suspect reusable pipelines might be the solution to this problem.

@@ -0,0 +1,11 @@
from importlib.metadata import PackageNotFoundError, version
Copy link

@callahad callahad Oct 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

importlib.metadata is only available in Python >= 3.8. We should either fall back to importlib_metadata or set a minimum python version in the package metadata.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah grumble, I used importlib_metadata originally but it doesn't have type annotations.

Thinking it might be best to go with Erik's plan and pull it from here into setup.cfg if that's a supported way of doing business.

@reivilibre
Copy link
Collaborator Author

I don't think GitHub reusable pipelines are the answer here, because they don't apply to GitLab and also don't apply to local development.

Someone mentioned https://earthly.dev but I'm struggling to get a sense of what it feels like and don't know if that really applies to what we're trying to do.

I wonder if git submodules may be useful for pulling in reusable scaffolding: linter configurations, development scripts, and so on?

It might be useful to have a point-by-point list of what we think is concerning to us from a maintenance point of view, because it's hard to fully appreciate the concerns without some specifics.

The things that come to my mind:

  • CI configs are bound to specific Python versions, which we will want to increment (e.g. to follow Synapse's version) as they age
    • yes
  • Linters are bound to specific versions
    • Not the best for standardisation but not the end of the world; all the scripts will still work, CI still works, the developer experience is the same.
    • (Mypy might be a bit more of a special case)

I'm noticing quickly that we might end up wanting our own variant of this template because our needs are slightly different to that of a third-party module (mostly around the reusability, but also e.g. CODEOWNERS has been mentioned).
If possible, I would like to separate out first-party concerns from this PR into issue #3.

@reivilibre
Copy link
Collaborator Author

reivilibre commented Oct 15, 2021

So we now have three competing suggestions for doing the package version :P :

  • setuptools_scm (rely on git tags) (→ __version__ using importlib)
  • setup.cfg (→ __version__ using importlib)
  • __version__ (→ setup.cfg using version = attr: package.__version__)
    • slightly undermines the credibility of 'setup.cfg is meant to be declarative', but importlib is feeling a little bit nastier

Will wait for a round of opinions on that.

As a side question: Do we care about __version__ anyway? Our other packages might be doing it for monitoring's sake. https://github.com/matrix-org/synapse-freeze-room/blob/main/freeze_room/__init__.py, my spiritual reference, doesn't seem to care about that.

@babolivier
Copy link
Contributor

babolivier commented Oct 15, 2021

I don't think GitHub reusable pipelines are the answer here, because they don't apply to GitLab and also don't apply to local development.

Someone mentioned https://earthly.dev but I'm struggling to get a sense of what it feels like and don't know if that really applies to what we're trying to do.

I wonder if git submodules may be useful for pulling in reusable scaffolding: linter configurations, development scripts, and so on?

It might be useful to have a point-by-point list of what we think is concerning to us from a maintenance point of view, because it's hard to fully appreciate the concerns without some specifics.

The things that come to my mind:

  • CI configs are bound to specific Python versions, which we will want to increment (e.g. to follow Synapse's version) as they age
    • yes
  • Linters are bound to specific versions
    • Not the best for standardisation but not the end of the world; all the scripts will still work, CI still works, the developer experience is the same.
    • (Mypy might be a bit more of a special case)

I'm noticing quickly that we might end up wanting our own variant of this template because our needs are slightly different to that of a third-party module (mostly around the reusability, but also e.g. CODEOWNERS has been mentioned). If possible, I would like to separate out first-party concerns from this PR into issue #3.

I think an issue with this PR as it is, is that the template tries to be everything at once and lacks a precise target.

A company writing a Synapse module, us as the Synapse team writing a Synapse module, and a community member writing a Synapse module, are three very different situations that will produce very different code.

I don't think we can realistically make this template appeal to everyone without making it overly complicated and difficult to maintain, and even then it wouldn't be fully possible (because people have e.g. their own preferences for coding style/linting/etc).

Since all of the files in this template seem to be centered around how we, as the Synapse team, write modules, why not make this template "the template the Synapse team uses, which might or might not fit your needs"? I think if we want to make it easier for people that aren't us to write a module, the best way is more documentation/tutorials.

@reivilibre
Copy link
Collaborator Author

@babolivier

After thinking it through, my current plan is to make this PR a baseline template, so that we have something for community members. I don't want to make perfect the enemy of the good here.
I personally really appreciate this sort of template when I'm starting using something; it seems only fair to provide something since I've gone to the effort of making this.
(I think I'll probably cut Towncrier and releasing steps out of it; probably also disregard most of the things that don't make sense for someone on the more casual end of things — snipping __version__ out might be an idea. Will probably leave GitHub/GitLab CI in because those two systems are popular.)
We'll then call that 'done' and branch off to our own variant, which supports us more directly (and I'd like to refer discussion about that to #3).

Other 'companies' (I'm taking this to mean: other maintainers of many modules) will be left on their own, and I don't think we really have much ability to help them out, except for the fact that: the 'community' template will get them going, and they can look at our solutions to things as a reference.

@babolivier
Copy link
Contributor

babolivier commented Oct 15, 2021

After thinking it through, my current plan is to make this PR a baseline template, so that we have something for community members. I don't want to make perfect the enemy of the good here.

If that's the case then I guess we should remove everything that's not a Python file, since everything else is very much a copy of how we do things to meet our own needs.

Other 'companies' (I'm taking this to mean: other maintainers of many modules) will be left on their own, and I don't think we really have much ability to help them out, except for the fact that: the 'community' template will get them going, and they can look at our solutions to things as a reference.

Again, I think the solution here is more documentations, with more examples in it. I don't think a template can go much further than this in this case.

@reivilibre
Copy link
Collaborator Author

If that's the case then I guess we should remove everything that's not a Python file, since everything else is very much a copy of how we do things to meet our own needs.

I would say that's a bit extreme.
Code formatting and type checking is not a requirement that's unique to us, and looking at other templates, it seems in order to provide a way of doing that. Similarly for unit tests and CI configurations (even if they aren't, unfortunately, neutral).
Having a read-me file with instructions to install and set up a development environment are useful to have.
On the other hand, the way we do Towncrier is very specific to us, especially the way we link it to PRs (judging from yesterday's meeting).

Other 'companies' (I'm taking this to mean: other maintainers of many modules) will be left on their own, and I don't think we really have much ability to help them out, except for the fact that: the 'community' template will get them going, and they can look at our solutions to things as a reference.

Again, I think the solution here is more documentations, with more examples in it. I don't think a template can go much further than this in this case.

Yes (that's what I'm saying, hence 'I don't think we really have much ability to help them out'). In other words, this template is not trying to cater to that use.

@babolivier
Copy link
Contributor

git defaults to master and I read that some prefer other terms like trunk

fwiw, git now defaults to main, and trunk is usually for projects that have been exported from SVN (or other VCSs where main is trunk, not sure how many there are out there), so it sounds to me like adding extra configuration just for fixing a fairly unlikely edge case

@callahad
Copy link

git defaults to master and I read that some prefer other terms like trunk

fwiw, git now defaults to main

It doesn't yet, but git man page notes "If not specified, fall back to the default name (currently master, but this is subject to change in the future)"

Though looking more closely... this template does not initialize the Git repo, so it would be subject to config drift between the user's local machine (likely master) and hosting services (likely main). Ugh. ...Can we init the repo?

@babolivier
Copy link
Contributor

git defaults to master and I read that some prefer other terms like trunk

fwiw, git now defaults to main

It doesn't yet, but git man page notes "If not specified, fall back to the default name (currently master, but this is subject to change in the future)"

Ugh, indeed, looks like I got confused because my git is configured to automatically create with main, as per the big yellow warning git shows you when doing init without setting init.defaultBranch:

$ git init
hint: Using 'master' as the name for the initial branch. This default branch name
hint: is subject to change. To configure the initial branch name to use in all
hint: of your new repositories, which will suppress this warning, call:
hint: 
hint: 	git config --global init.defaultBranch <name>
hint: 
hint: Names commonly chosen instead of 'master' are 'main', 'trunk' and
hint: 'development'. The just-created branch can be renamed via this command:
hint: 
hint: 	git branch -m <name>

Ignore me then 😅

@reivilibre
Copy link
Collaborator Author

fwiw, git now defaults to main

Maybe my version is a few months out of date, but mine doesn't. I imagine the version in LTS distros is older still.
(I don't think it's an edge case at all.)

I would really prefer to leave it as-is for the self-contained template, it's not as though it's harming anything and it could save someone the hassle of having to change it. (As a data point: none of my tools (excl GitHub which I don't use personally) default to main at this point in time.)

I will instead take it out of the 'team' template because I assume we will like to standardise on main.

this template does not initialize the Git repo, so it would be subject to config drift between the user's local machine (likely master) and hosting services (likely main). Ugh. ...Can we init the repo?

The hosting services that I'm aware of don't create a branch unless you ask them to (usually along the lines of 'Initialise repo with a README'). Initialising the repo with the chosen branch locally seems reasonable. I will give that a go.

Comment on lines +68 to +80
### Set up Git

What you do from here on is up to you!

However, it's probably worth initialising a Git repository and making the first
commit be exactly what is produced by this template.
(Doing this may make it easier to upgrade an existing project to a later version
of the template in the future: you could branch off from the initial commit,
re-run the template generator, commit, and then merge that branch into your main
branch.)

If you use GitHub or GitLab, you should use its web interface to enable the
included CI configuration.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No longer necessary now that the repo is initialized for us?

(GitHub Actions is also on by default, so no need to enable anything there. Not sure about GitLab, but presumably people using GitLab would know.)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GHA confused me because I definitely had to manually enable it once before, but that might have been because I forked a repository rather than created a fresh one.

I'll check GLCI in a little bit (I'm going to do a full run-through on both GHA and GLCI) and will update that.

Comment on lines 43 to 50
line_length = 88
sections=FUTURE,STDLIB,THIRDPARTY,FIRSTPARTY,TESTS,LOCALFOLDER
default_section=THIRDPARTY
known_first_party={{ cookiecutter.package_name }}
known_tests=tests
multi_line_output=3
include_trailing_comma=true
combine_as_imports=true

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can replace a bunch of this with just profile = black

https://black.readthedocs.io/en/stable/guides/using_black_with_other_tools.html

Also the sections and known_tests could probably be dropped? Default is identical, minus a separate section for TESTS -- how critical is that?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
line_length = 88
sections=FUTURE,STDLIB,THIRDPARTY,FIRSTPARTY,TESTS,LOCALFOLDER
default_section=THIRDPARTY
known_first_party={{ cookiecutter.package_name }}
known_tests=tests
multi_line_output=3
include_trailing_comma=true
combine_as_imports=true
profile = black
combine_as_imports=true
sections=FUTURE,STDLIB,THIRDPARTY,FIRSTPARTY,TESTS,LOCALFOLDER
known_first_party={{ cookiecutter.package_name }}
known_tests=tests

cite: https://pycqa.github.io/isort/docs/configuration/profiles.html

Also, default_section defaults to THIRDPARTY, so that can be removed, too

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whether we want / need combine_as_imports and the custom stuff around sections remains an open question

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for investigating these options.

profile = black

That's just lovely!

combine_as_imports

Neither here nor there, I have never seen an as import in Python so far. Will probably just cut it out because it's an odd thing to make a fuss about.

custom stuff around sections

I'm tempted just to cut it out. The isort default can't be that bad.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/tmp/synapse-my-module/.tox/check_codestyle/lib/python3.9/site-packages/isort/settings.py:411: UserWarning: `known_tests` setting is defined, but TESTS is not included in `sections` config option: ('FUTURE', 'STDLIB', 'THIRDPARTY', 'FIRSTPARTY', 'LOCALFOLDER').

See: https://pycqa.github.io/isort/#custom-sections-and-ordering.
  warn(

This is probably why a custom sections attribute was used. I will reinstate it.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm. Do we know why we need known_tests? What happens if we cut that out?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added tests to known_first_party; will that do?

@callahad
Copy link

We've gone enough rounds on this that, at this point, I wouldn't block merging just so we can have a clean slate to discuss changes. Though if you want to resolve any of my outstanding comments first, more power to you :)

@erikjohnston
Copy link
Member

I think this is a broadly good set of things to include. Thanks!

I'm a bit sad at not having a __version__, as it's quite common and is how I always check for the version of a package. I trust that slightly more than importlib, since __version__ is from the actual code running, importlib will return the wrong version for e.g. a) editable installs and b) packages that have been upgraded since the app was started. (Plus you don't have to worry about figuring out what package a module comes from, i.e. synapse.__version__ vs version("matrix-synapse")). It's not a biggie though.

@erikjohnston erikjohnston removed their request for review November 1, 2021 09:56
@reivilibre reivilibre merged commit e11c02c into main Nov 1, 2021
@reivilibre reivilibre deleted the rei/first_go branch November 1, 2021 11:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants