Add Synapse module template #1

reivilibre · 2021-10-13T13:43:04Z

The best description of this template is the README.

I have tested it and everything seems to be in order.

(That said, Mypy complains and will continue to do so until we release Synapse 1.46 with matrix-org/synapse#11054).

reivilibre · 2021-10-13T16:32:59Z

Example generation: https://github.com/matrix-org/synapse-user-restrictions/tree/f672717db2ff94f6edf7c8ed39ffce8fab25d7cc

Signed-off-by: Olivier Wilkinson (reivilibre) <oliverw@matrix.org>

reivilibre · 2021-10-14T13:01:38Z

I've put this on for review; I don't know if it's worth reviewing or just putting right in?

babolivier

Sorry I finished with the readme file, meaning some comments I made on the code file where written before I saw it was documented.

{{ cookiecutter.directory_name }}/README.md

{{ cookiecutter.directory_name }}/{{ cookiecutter.package_name }}/module.py

babolivier · 2021-10-14T13:00:35Z

{{ cookiecutter.directory_name }}/{{ cookiecutter.package_name }}/module.py

+    @staticmethod
+    def parse_config(config: Dict[Any, Any]) -> {{ cookiecutter.module_class_name }}Config:
+        try:
+            return cattr.structure(config, {{ cookiecutter.module_class_name }}Config)


Why not just {{ cookiecutter.module_class_name }}Config(**config)?

This works recursively and is smart enough to convert types.
For example, if my struct has a Set[str] as a field, that will accept lists from the JSON/YAML and convert it properly.

If it's too weird, I can probably take it away?

Right, this seems like a good enough reason to let it in, I was a bit puzzled because I've never encountered cattr.structure in the past. It sounds good to leave this in, maybe with a comment that explains what it does?

yeah, actually, I thought it validated the types too but it doesn't seem to. Maybe I should revert to just attrs.

This is cool, but I think trying to be a little bit too magic. I also don't think it's a good idea to encourage Config(**config) as that's just going to ignore any validation (unless people use attrs validators, which they won't unless they know attrs intimately). Personally, I think its enough to do something like:

# This is where we pull out and parse any config from the provided dict, e.g.: # # some_option = config.get("some_option") # if some_option is None: # raise ConfigError("Missing option 'some_option'") # if not isinstance(some_option, str): # raise ConfigError("Config option 'some_option' must be a string") # return Config()

Maybe this is something we need to fix in the module documentation (eg https://matrix-org.github.io/synapse/develop/modules/writing_a_module.html) rather than in here?
I'm thinking that if we think there's value in making modules use attrs then maybe the documentation should give basic explanations on how to use it.

I almost want an excuse to use ConfigError in the code so that the import for it isn't redundant (and therefore has to be removed).
Maybe this will be improved in Synapse 1.46, but until then, we don't export the error types so if you use your IDE to import ConfigError, you'll import it from its internal path rather than from synapse.module_api.errors.

However, it's hard to see an excuse to do so if we're cutting out the example... thoughts? x)

babolivier · 2021-10-14T13:00:55Z

{{ cookiecutter.directory_name }}/{{ cookiecutter.package_name }}/module.py

+
+@attr.s(auto_attribs=True, frozen=True)
+class {{ cookiecutter.module_class_name }}Config:
+    example_option: bool = False


I'd suggest making it a pass rather than trying to make it an example.

I feel a bit more strongly about this one; it's fine for people like us who know what attrs is but the idea was to provide an example to reduce some of that intimidation factor for other people.
(also the README follows this example and I was hoping to set a good example of documenting the module's options ;p)

Fair, then it probably needs a comment to explain what's going on.

If we're worried that people don't know how to use attrs then I think that's more of an argument to not use it at all?

not use it at all

Yeah, that's fair I suppose. I find it quite nice but suppose this is more of a personal thing and not really suited for a template. Will cut it out

Hm, if we cut out attrs entirely, I'll have to put something worse in its place (like passing back an untyped dict), which is not really what I hoped for in terms of keeping everything mypy-strict.

What do you think would be best?

Maybe it'd be worth pointing to our own repos that use this template as examples

I'd still keep the Config class, and just leave it as empty at first?

babolivier · 2021-10-14T13:03:18Z

{{ cookiecutter.directory_name }}/{{ cookiecutter.package_name }}/__init__.py

+
+from {{ cookiecutter.package_name }}.module import {{ cookiecutter.module_class_name }}
+
+__all__ = ["{{ cookiecutter.module_class_name }}"]


Why not just include the class in this file rather than make a different file for it? Since we already do that for quite a few modules (as most of the time the complexity of a module's code is low enough to allow it).

It feels like bad practice to put anything too beefy in __init__.py (circular import issues spring to mind as an actual consequence you usually feel).

I think there's a discontinuity; in single-file projects, beefy __init__ seems OK but if you stretch into a multiple-file project, I think one generally wants __init__ to be little more than structure.

It also feels more like a structural file, e.g. __version__ = ... and so on. When looking through a project, I tend to glaze over (ignore) __init__.py files at first glance. However, maybe that's just me... 😇

I'm curious what other people think?

I think there's a discontinuity; in single-file projects, beefy __init__ seems OK but if you stretch into a multiple-file project, I think one generally wants __init__ to be little more than structure.

I think the same, my point was that in practice we're way more often in the former case than the latter.

README.md

babolivier · 2021-10-14T13:20:00Z

Oh also, would it be interesting to add a CODEOWNERS file to automatically request reviews from synapse-core on new PRs? I think there could be value in trying to standardise this across our repos

reivilibre · 2021-10-14T13:40:07Z

Oh also, would it be interesting to add a CODEOWNERS file to automatically request reviews from synapse-core on new PRs? I think there could be value in trying to standardise this across our repos

Hmmmm. That sounds a bit like a plan but it doesn't really apply to third-party users of the template. Wonder if I can make it an option...?

babolivier · 2021-10-14T13:42:14Z

Hmmmm. That sounds a bit like a plan but it doesn't really apply to third-party users of the template. Wonder if I can make it an option...?

Ah fair enough, it's true the approach I used was us using it rather it being used for third-party modules, my bad.

reivilibre · 2021-10-14T13:43:55Z

It makes sense (though I'm so used to requesting Synapse Core at this point (assuming everyone else is too?) that maybe it's not worth spending too much time worrying about it), I'll see how hard it is to make it an option though.

callahad

If we're using this for new projects, should we take the opportunity to move away from legacy imperative setup.py and to a declarative setup.cfg? Keeps us with setuptools, just... more maintainable. Happy to provide a diff against this PR if that'd be useful.

clokep · 2021-10-14T16:03:58Z

If we're using this for new projects, should we take the opportunity to move away from legacy imperative setup.py and to a declarative setup.cfg?

Do you have any articles on the benefits of this? This is the first I've heard setup.py being called legacy.

callahad · 2021-10-14T19:17:30Z

If we're using this for new projects, should we take the opportunity to move away from legacy imperative setup.py and to a declarative setup.cfg?

Do you have any articles on the benefits of this? This is the first I've heard setup.py being called legacy.

I've posted a longer discussion privately, but the central point is that Setuptools itself wants you to use it declaratively. Its own quickstart docs default to setup.cfg and state:

To avoid executing arbitrary scripts and boilerplate code, we are transitioning into a full-fledged setup.cfg to declare your package information instead of running setup().

erikjohnston · 2021-10-15T09:04:18Z

Thanks for driving this @reivilibre, and would be a marked improvement on our current ad hoc stuff, and generally seems sane.

One thing that worries me a bit is the maintainability of the architecture: i.e. what happens if we want to change the template (e.g. add a new python version to CI)? Right now we'd have to change the template and all existing modules, which is going to lead to drift in how our modules are set up.

I'm not really sure what we can do about that. We could aim to use reusable github pipelines, maybe bundle some of the binaries into a separate dev package that we can pull in and so more easily update? It also makes me more sympathetic to the idea that we should try and be as minimal as possible, e.g. is using towncrier giving enough benefit that its worth maintaining across many modules?

{{ cookiecutter.directory_name }}/{{ cookiecutter.package_name }}/__init__.py

babolivier · 2021-10-15T09:09:50Z

One thing that worries me a bit is the maintainability of the architecture: i.e. what happens if we want to change the template (e.g. add a new python version to CI)? Right now we'd have to change the template and all existing modules, which is going to lead to drift in how our modules are set up.

I'm not really sure what we can do about that. We could aim to use reusable github pipelines, maybe bundle some of the binaries into a separate dev package that we can pull in and so more easily update? It also makes me more sympathetic to the idea that we should try and be as minimal as possible, e.g. is using towncrier giving enough benefit that its worth maintaining across many modules?

On this, separately to this repo, I want to spend some time (either today or next week) looking into how to use reusable github action pipelines for modules. I agree that the current situation makes it a faff to keep the CI up to date, but I suspect reusable pipelines might be the solution to this problem.

{{ cookiecutter.directory_name }}/{{ cookiecutter.package_name }}/module.py

callahad · 2021-10-15T09:47:22Z

{{ cookiecutter.directory_name }}/{{ cookiecutter.package_name }}/__init__.py

@@ -0,0 +1,11 @@
+from importlib.metadata import PackageNotFoundError, version


importlib.metadata is only available in Python >= 3.8. We should either fall back to importlib_metadata or set a minimum python version in the package metadata.

Ah grumble, I used importlib_metadata originally but it doesn't have type annotations.

Thinking it might be best to go with Erik's plan and pull it from here into setup.cfg if that's a supported way of doing business.

reivilibre · 2021-10-15T10:32:10Z

I don't think GitHub reusable pipelines are the answer here, because they don't apply to GitLab and also don't apply to local development.

Someone mentioned https://earthly.dev but I'm struggling to get a sense of what it feels like and don't know if that really applies to what we're trying to do.

I wonder if git submodules may be useful for pulling in reusable scaffolding: linter configurations, development scripts, and so on?

It might be useful to have a point-by-point list of what we think is concerning to us from a maintenance point of view, because it's hard to fully appreciate the concerns without some specifics.

The things that come to my mind:

CI configs are bound to specific Python versions, which we will want to increment (e.g. to follow Synapse's version) as they age
- yes
Linters are bound to specific versions
- Not the best for standardisation but not the end of the world; all the scripts will still work, CI still works, the developer experience is the same.
- (Mypy might be a bit more of a special case)

I'm noticing quickly that we might end up wanting our own variant of this template because our needs are slightly different to that of a third-party module (mostly around the reusability, but also e.g. CODEOWNERS has been mentioned).
If possible, I would like to separate out first-party concerns from this PR into issue #3.

reivilibre · 2021-10-15T10:46:56Z

So we now have three competing suggestions for doing the package version :P :

setuptools_scm (rely on git tags) (→ __version__ using importlib)
setup.cfg (→ __version__ using importlib)
__version__ (→ setup.cfg using version = attr: package.__version__)
- slightly undermines the credibility of 'setup.cfg is meant to be declarative', but importlib is feeling a little bit nastier

Will wait for a round of opinions on that.

As a side question: Do we care about __version__ anyway? Our other packages might be doing it for monitoring's sake. https://github.com/matrix-org/synapse-freeze-room/blob/main/freeze_room/__init__.py, my spiritual reference, doesn't seem to care about that.

(hopefully)

babolivier · 2021-10-15T11:08:59Z

I don't think GitHub reusable pipelines are the answer here, because they don't apply to GitLab and also don't apply to local development.

Someone mentioned https://earthly.dev but I'm struggling to get a sense of what it feels like and don't know if that really applies to what we're trying to do.

I wonder if git submodules may be useful for pulling in reusable scaffolding: linter configurations, development scripts, and so on?

It might be useful to have a point-by-point list of what we think is concerning to us from a maintenance point of view, because it's hard to fully appreciate the concerns without some specifics.

The things that come to my mind:

CI configs are bound to specific Python versions, which we will want to increment (e.g. to follow Synapse's version) as they age

yes

Linters are bound to specific versions

Not the best for standardisation but not the end of the world; all the scripts will still work, CI still works, the developer experience is the same.

(Mypy might be a bit more of a special case)

I'm noticing quickly that we might end up wanting our own variant of this template because our needs are slightly different to that of a third-party module (mostly around the reusability, but also e.g. CODEOWNERS has been mentioned). If possible, I would like to separate out first-party concerns from this PR into issue #3.

I think an issue with this PR as it is, is that the template tries to be everything at once and lacks a precise target.

A company writing a Synapse module, us as the Synapse team writing a Synapse module, and a community member writing a Synapse module, are three very different situations that will produce very different code.

I don't think we can realistically make this template appeal to everyone without making it overly complicated and difficult to maintain, and even then it wouldn't be fully possible (because people have e.g. their own preferences for coding style/linting/etc).

Since all of the files in this template seem to be centered around how we, as the Synapse team, write modules, why not make this template "the template the Synapse team uses, which might or might not fit your needs"? I think if we want to make it easier for people that aren't us to write a module, the best way is more documentation/tutorials.

reivilibre · 2021-10-15T11:21:13Z

@babolivier

After thinking it through, my current plan is to make this PR a baseline template, so that we have something for community members. I don't want to make perfect the enemy of the good here.
I personally really appreciate this sort of template when I'm starting using something; it seems only fair to provide something since I've gone to the effort of making this.
(I think I'll probably cut Towncrier and releasing steps out of it; probably also disregard most of the things that don't make sense for someone on the more casual end of things — snipping __version__ out might be an idea. Will probably leave GitHub/GitLab CI in because those two systems are popular.)
We'll then call that 'done' and branch off to our own variant, which supports us more directly (and I'd like to refer discussion about that to #3).

Other 'companies' (I'm taking this to mean: other maintainers of many modules) will be left on their own, and I don't think we really have much ability to help them out, except for the fact that: the 'community' template will get them going, and they can look at our solutions to things as a reference.

babolivier · 2021-10-15T11:34:39Z

After thinking it through, my current plan is to make this PR a baseline template, so that we have something for community members. I don't want to make perfect the enemy of the good here.

If that's the case then I guess we should remove everything that's not a Python file, since everything else is very much a copy of how we do things to meet our own needs.

Other 'companies' (I'm taking this to mean: other maintainers of many modules) will be left on their own, and I don't think we really have much ability to help them out, except for the fact that: the 'community' template will get them going, and they can look at our solutions to things as a reference.

Again, I think the solution here is more documentations, with more examples in it. I don't think a template can go much further than this in this case.

reivilibre · 2021-10-15T13:29:48Z

If that's the case then I guess we should remove everything that's not a Python file, since everything else is very much a copy of how we do things to meet our own needs.

I would say that's a bit extreme.
Code formatting and type checking is not a requirement that's unique to us, and looking at other templates, it seems in order to provide a way of doing that. Similarly for unit tests and CI configurations (even if they aren't, unfortunately, neutral).
Having a read-me file with instructions to install and set up a development environment are useful to have.
On the other hand, the way we do Towncrier is very specific to us, especially the way we link it to PRs (judging from yesterday's meeting).

Other 'companies' (I'm taking this to mean: other maintainers of many modules) will be left on their own, and I don't think we really have much ability to help them out, except for the fact that: the 'community' template will get them going, and they can look at our solutions to things as a reference.

Again, I think the solution here is more documentations, with more examples in it. I don't think a template can go much further than this in this case.

Yes (that's what I'm saying, hence 'I don't think we really have much ability to help them out'). In other words, this template is not trying to cater to that use.

babolivier · 2021-10-20T09:47:56Z

git defaults to master and I read that some prefer other terms like trunk

fwiw, git now defaults to main, and trunk is usually for projects that have been exported from SVN (or other VCSs where main is trunk, not sure how many there are out there), so it sounds to me like adding extra configuration just for fixing a fairly unlikely edge case

callahad · 2021-10-20T10:11:30Z

git defaults to master and I read that some prefer other terms like trunk

fwiw, git now defaults to main

It doesn't yet, but git man page notes "If not specified, fall back to the default name (currently master, but this is subject to change in the future)"

Though looking more closely... this template does not initialize the Git repo, so it would be subject to config drift between the user's local machine (likely master) and hosting services (likely main). Ugh. ...Can we init the repo?

babolivier · 2021-10-20T10:19:10Z

git defaults to master and I read that some prefer other terms like trunk

fwiw, git now defaults to main

It doesn't yet, but git man page notes "If not specified, fall back to the default name (currently master, but this is subject to change in the future)"

Ugh, indeed, looks like I got confused because my git is configured to automatically create with main, as per the big yellow warning git shows you when doing init without setting init.defaultBranch:

$ git init
hint: Using 'master' as the name for the initial branch. This default branch name
hint: is subject to change. To configure the initial branch name to use in all
hint: of your new repositories, which will suppress this warning, call:
hint: 
hint: 	git config --global init.defaultBranch <name>
hint: 
hint: Names commonly chosen instead of 'master' are 'main', 'trunk' and
hint: 'development'. The just-created branch can be renamed via this command:
hint: 
hint: 	git branch -m <name>

Ignore me then 😅

reivilibre · 2021-10-20T10:19:59Z

fwiw, git now defaults to main

Maybe my version is a few months out of date, but mine doesn't. I imagine the version in LTS distros is older still.
(I don't think it's an edge case at all.)

I would really prefer to leave it as-is for the self-contained template, it's not as though it's harming anything and it could save someone the hassle of having to change it. (As a data point: none of my tools (excl GitHub which I don't use personally) default to main at this point in time.)

I will instead take it out of the 'team' template because I assume we will like to standardise on main.

this template does not initialize the Git repo, so it would be subject to config drift between the user's local machine (likely master) and hosting services (likely main). Ugh. ...Can we init the repo?

The hosting services that I'm aware of don't create a branch unless you ask them to (usually along the lines of 'Initialise repo with a README'). Initialising the repo with the chosen branch locally seems reasonable. I will give that a go.

I have opened a [PR] against aiounittest to resolve this [PR]: kwarunek/aiounittest#19

callahad · 2021-10-22T13:27:42Z

README.md

+### Set up Git
+
+What you do from here on is up to you!
+
+However, it's probably worth initialising a Git repository and making the first
+commit be exactly what is produced by this template.
+(Doing this may make it easier to upgrade an existing project to a later version
+of the template in the future: you could branch off from the initial commit,
+re-run the template generator, commit, and then merge that branch into your main
+branch.)
+
+If you use GitHub or GitLab, you should use its web interface to enable the
+included CI configuration.


No longer necessary now that the repo is initialized for us?

(GitHub Actions is also on by default, so no need to enable anything there. Not sure about GitLab, but presumably people using GitLab would know.)

GHA confused me because I definitely had to manually enable it once before, but that might have been because I forked a repository rather than created a fresh one.

I'll check GLCI in a little bit (I'm going to do a full run-through on both GHA and GLCI) and will update that.

{{ cookiecutter.directory_name }}/.github/workflows/ci.yml

{{ cookiecutter.directory_name }}/README.md

{{ cookiecutter.directory_name }}/pyproject.toml

callahad · 2021-10-22T13:54:50Z

{{ cookiecutter.directory_name }}/setup.cfg

+line_length = 88
+sections=FUTURE,STDLIB,THIRDPARTY,FIRSTPARTY,TESTS,LOCALFOLDER
+default_section=THIRDPARTY
+known_first_party={{ cookiecutter.package_name }}
+known_tests=tests
+multi_line_output=3
+include_trailing_comma=true
+combine_as_imports=true


I think we can replace a bunch of this with just profile = black

https://black.readthedocs.io/en/stable/guides/using_black_with_other_tools.html

Also the sections and known_tests could probably be dropped? Default is identical, minus a separate section for TESTS -- how critical is that?

Suggested change

line_length = 88

sections=FUTURE,STDLIB,THIRDPARTY,FIRSTPARTY,TESTS,LOCALFOLDER

default_section=THIRDPARTY

known_first_party={{ cookiecutter.package_name }}

known_tests=tests

multi_line_output=3

include_trailing_comma=true

combine_as_imports=true

profile = black

combine_as_imports=true

sections=FUTURE,STDLIB,THIRDPARTY,FIRSTPARTY,TESTS,LOCALFOLDER

known_first_party={{ cookiecutter.package_name }}

known_tests=tests

cite: https://pycqa.github.io/isort/docs/configuration/profiles.html

Also, default_section defaults to THIRDPARTY, so that can be removed, too

Whether we want / need combine_as_imports and the custom stuff around sections remains an open question

Thanks for investigating these options.

profile = black

That's just lovely!

combine_as_imports

Neither here nor there, I have never seen an as import in Python so far. Will probably just cut it out because it's an odd thing to make a fuss about.

custom stuff around sections

I'm tempted just to cut it out. The isort default can't be that bad.

/tmp/synapse-my-module/.tox/check_codestyle/lib/python3.9/site-packages/isort/settings.py:411: UserWarning: `known_tests` setting is defined, but TESTS is not included in `sections` config option: ('FUTURE', 'STDLIB', 'THIRDPARTY', 'FIRSTPARTY', 'LOCALFOLDER'). See: https://pycqa.github.io/isort/#custom-sections-and-ordering. warn(

This is probably why a custom sections attribute was used. I will reinstate it.

Hm. Do we know why we need known_tests? What happens if we cut that out?

I've added tests to known_first_party; will that do?

callahad · 2021-10-22T13:56:33Z

We've gone enough rounds on this that, at this point, I wouldn't block merging just so we can have a clean slate to discuss changes. Though if you want to resolve any of my outstanding comments first, more power to you :)

This reverts commit 925e537.

…ts as known first-party

erikjohnston · 2021-11-01T09:56:47Z

I think this is a broadly good set of things to include. Thanks!

I'm a bit sad at not having a __version__, as it's quite common and is how I always check for the version of a package. I trust that slightly more than importlib, since __version__ is from the actual code running, importlib will return the wrong version for e.g. a) editable installs and b) packages that have been upgraded since the app was started. (Plus you don't have to worry about figuring out what package a module comes from, i.e. synapse.__version__ vs version("matrix-synapse")). It's not a biggie though.

Add module project template

e6ccb25

Fix GitHub CI configuration

e62af52

Signed-off-by: Olivier Wilkinson (reivilibre) <oliverw@matrix.org>

reivilibre requested review from babolivier and a team October 14, 2021 12:59

babolivier self-assigned this Oct 14, 2021

babolivier reviewed Oct 14, 2021

View reviewed changes

callahad reviewed Oct 14, 2021

View reviewed changes

erikjohnston reviewed Oct 15, 2021

View reviewed changes

{{ cookiecutter.directory_name }}/{{ cookiecutter.package_name }}/__init__.py Show resolved Hide resolved

erikjohnston reviewed Oct 15, 2021

View reviewed changes

{{ cookiecutter.directory_name }}/{{ cookiecutter.package_name }}/module.py Outdated Show resolved Hide resolved

callahad reviewed Oct 15, 2021

View reviewed changes

reivilibre mentioned this pull request Oct 15, 2021

Concerns about maintaining many modules #3

Open

reivilibre added 2 commits October 15, 2021 12:05

Make the template less of an example and more of a template

3aec296

Configuration dictionary is stringly-keyed

7302b75

(hopefully)

Don't bother with __version__

f95f74a

reivilibre added 3 commits October 21, 2021 11:23

HOPEFULLY TEMPORARY: Skip mypy imports of aiounittest

925e537

I have opened a [PR] against aiounittest to resolve this [PR]: kwarunek/aiounittest#19

Fix pyproject.toml/setup.cfg-driven build

c7ccc6b

Initialise the git repository once the template has been used

0f11662

reivilibre requested review from callahad and babolivier October 21, 2021 11:09