Skip to content

Conversation

@oliver408i
Copy link
Contributor

This PR removed the old use of exec for loading python dictionaries to instead using importlib. No changes are required to any dictionaries nor Plover.

Why

  • Improves safety and maintainability by avoiding exec
  • Integrates plugins/modules cleanly with Python’s import system (sys.modules)
  • Better debugging support (tracebacks, see Traceback only show "File <string>" in case an error happen in the dictionary #10 )
  • Follows Python's module system, instead of exec dumping everything into a flat dict (this is how additional python code should be loaded)
  • exec can run any arbitrary Python code, importlib constrains behavior to well-formed Python modules
  • Most other plugin systems use this same system, including setuptools

Overall this makes the code safer, helps dictionary developers, and makes it generally easier to maintain, with zero loss in functionality and no required changes to any other code including dictionaries and Plover.

Copy link
Contributor

@mkrnr mkrnr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a huge improvement, thanks a lot! And sorry for not responding earlier, I wasn't watching this repo.

Some small suggestions below.

Edit: I tested it locally with the latest Plover and it works great for my Python dictionary files.

@oliver408i
Copy link
Contributor Author

This will also make #11 obsolete (in my mind its a better solution than compile() )

@oliver408i
Copy link
Contributor Author

Additionally, since we are using importlib already, we should consider limiting scopes passed to the module loaded. This includes blocking most modules such as os or sys to prevent malicious code from being executed.

@mkrnr
Copy link
Contributor

mkrnr commented Jul 9, 2025

Additionally, since we are using importlib already, we should consider limiting scopes passed to the module loaded. This includes blocking most modules such as os or sys to prevent malicious code from being executed.

That's a tricky one. I see the benefits from a security standpoint but it would restrict the possibilities of what a python dictionary can do.

I would like there to be something like a "safe mode" config for this plugin that is set to true by default but that "power users" could set to false. If "safe mode" is enabled, such modules would be blocked. Certainly something for another PR though.

What do you think?

@oliver408i
Copy link
Contributor Author

Additionally, since we are using importlib already, we should consider limiting scopes passed to the module loaded. This includes blocking most modules such as os or sys to prevent malicious code from being executed.

That's a tricky one. I see the benefits from a security standpoint but it would restrict the possibilities of what a python dictionary can do.

I would like there to be something like a "safe mode" config for this plugin that is set to true by default but that "power users" could set to false. If "safe mode" is enabled, such modules would be blocked. Certainly something for another PR though.

What do you think?

I understand what you mean but there is zero reason why any python dictionary will need to access os or sys. I would expect stuff like math re json to be used at the most.

@mkrnr
Copy link
Contributor

mkrnr commented Jul 9, 2025

Additionally, since we are using importlib already, we should consider limiting scopes passed to the module loaded. This includes blocking most modules such as os or sys to prevent malicious code from being executed.

That's a tricky one. I see the benefits from a security standpoint but it would restrict the possibilities of what a python dictionary can do.
I would like there to be something like a "safe mode" config for this plugin that is set to true by default but that "power users" could set to false. If "safe mode" is enabled, such modules would be blocked. Certainly something for another PR though.
What do you think?

I understand what you mean but there is zero reason why any python dictionary will need to access os or sys. I would expect stuff like math re json to be used at the most.

I can see examples like:

  • Get the home directory (either for direct print-out or to be used for something): os.path.expanduser("~")
  • Get the OS-specific line separator: os.linesep
  • Make the plugin behave differently based on operating system: if sys.platform == "darwin"
  • Get environment variables: os.getenv(name, "")

@oliver408i
Copy link
Contributor Author

Yea I see, it's not going to be a perfect solution. Anyway this would be for further discussion. This PR should stay as is for now

@oliver408i
Copy link
Contributor Author

All changes have been done. Please review

Copy link
Contributor

@mkrnr mkrnr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! Thanks again for the contribution and for the security discussion; it's great to see people care about small open-source projects like this one 🙂

@oliver408i
Copy link
Contributor Author

No problem! Thanks for the review and merge

@user202729
Copy link
Member

user202729 commented Jul 14, 2025

it's questionable how importlib is safer than exec exactly. But there isn't too much harm either except slightly longer code

more serious issues:

  • sys.modules[module_name] = mod # Optional but allows re-imports and traceback clarity how exactly? The traceback lists the file path, never the module name. Also normally module code doesn't contain a /, which makes it impossible to re-import it with a import X statement anyway.
  • normally error message starts with a lowercase character i.e. the old convention is more correct in this aspect. Try running 1/0 in Python. (changing to f-string is harmless though.)

@mkrnr
Copy link
Contributor

mkrnr commented Jul 14, 2025

it's questionable how importlib is safer than exec exactly. But there isn't too much harm either except slightly longer code

more serious issues:

  • sys.modules[module_name] = mod # Optional but allows re-imports and traceback clarity how exactly? The traceback lists the file path, never the module name. Also normally module code doesn't contain a /, which makes it impossible to re-import it with a import X statement anyway.
  • normally error message starts with a lowercase character i.e. the old convention is more correct in this aspect. Try running 1/0 in Python. (changing to f-string is harmless though.)

Thanks for your input!

Initially, the module name was just the file name but then I suggested that there could be issues when having two python files with the same file name, see this discussion. We could change it back to file name only or find another way to make the name unique.

@user202729
Copy link
Member

user202729 commented Jul 14, 2025

There should be no need to inject it into sys.modules at all. Likely this is not what the user wants/needs.

Unless the user wants to import one Python dictionary from another Python dictionary using the file name... I think that case was not (never) supported, and I suppose the user could figure out how to do that themselves.

@oliver408i
Copy link
Contributor Author

oliver408i commented Jul 14, 2025

it's questionable how importlib is safer than exec exactly. But there isn't too much harm either except slightly longer code

more serious issues:

* `sys.modules[module_name] = mod # Optional but allows re-imports and traceback clarity` how exactly? The traceback lists the file path, never the module name. Also normally module code doesn't contain a `/`, which makes it impossible to re-import it with a `import X` statement anyway.

* normally error message starts with a **lowercase** character i.e. the old convention is more correct in this aspect. Try running `1/0` in Python. (changing to f-string is harmless though.)

The sys.modules was there in the original version before I changed how modules were named. It doesn't make sense with the current naming change to the full path, so unless that is changed, yes, it can be removed.

As for the current module naming system, using just filename is not the best solution but its the easiest non-conflicting way. As you mentioned we aren't expecting others dicts to import dicts so it's fine since we aren't importing the module anywhere else.

And yes, while importlib is not inherently safer than exec it is just generally better practice. Nowhere should exec be used in a production application (or plugin, in this case). Additionally it allows tracing errors in modules, which is a known concern as per the issues.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants