Treat file types consistently across all components #842

Open
eaon opened this issue Oct 25, 2022 · 6 comments

Comments

@eaon
Contributor

eaon commented Oct 25, 2022

In response to how we figure out which files get converted to PDF for printing as well as other conversations about the client's current inability to distinguish which files can be printed and which cannot, this conversation happened:

… [@zenmonkeykstop and I] came to the conclusion that "how we treat files in SDW" should probably be a larger conversation with a more coordinated effort, because [it] would also affect the client: the template the client uses does not have LibreOffice installed, but it should also be aware which file types are supported for printing.
Originally posted by @eaon in freedomofpress/securedrop-export#108 (comment)

Because it affects multiple components of the system, I thought this was the best repository to track this discussion.

Status quo

  • The client itself does not distinguish between file types at all; it only knows about operations that are available for all files:
    • Export: handled by securedrop-export, an operation mostly irrelevant to this conversation.
    • Print: Center stage to this. Also uses securedrop-export, which prints anything if given the instruction via metadata.json.
      • securedrop-export's print action does not care about file types other than a file suffix list that decides whether LibreOffice is used to convert a file to a PDF before it's handed off to the printing panel
      • If the client sends an Ogg Vorbis file to be printed, the printer will print (lots of) garbage
    • View
      • 100% delegated to a DispVM, the client merely copies a file over
      • The DispVM (template sd-viewer) makes all the decisions as to what to do with that file via our own mimeapps.list files
    • We maintain 3 different mimeapps lists manually
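
To make the print problem concrete, here is a rough sketch of what a suffix-only decision looks like. It is illustrative only and is not the actual securedrop-export code: the suffix list and both function names are hypothetical.

```python
from pathlib import Path

# Hypothetical suffix list; the real list in securedrop-export may differ.
LIBREOFFICE_SUFFIXES = {".doc", ".docx", ".odt", ".rtf", ".xls", ".xlsx"}


def needs_libreoffice_conversion(filename: str) -> bool:
    """Return True if the file suffix is on the conversion list."""
    return Path(filename).suffix.lower() in LIBREOFFICE_SUFFIXES


def print_plan(filename: str) -> str:
    """Describe what a suffix-only check would do with a file."""
    if needs_libreoffice_conversion(filename):
        return "convert to PDF, then print"
    # No further type check happens here, so everything else, including
    # audio files, would be handed to the printer unchanged.
    return "print as-is"
```

Under these assumptions, `print_plan("interview.ogg")` falls through to the "print as-is" branch, which is the garbage-output case described above.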

High level "ideal" (proposed)

  • SDW as a system should be aware of applications it supports
  • From applications, we can extrapolate features (e.g. LibreOffice: view, print, Audacious: view) and mime types (using mimeinfo.cache?)
  • securedrop-client and securedrop-export, as well as sd-viewer's mimeapps.list ought to be able to use that information to make the correct decisions for their respective responsibilities

An effect of the above would be that if, for example, eog added native .webp support (see #614), we would inherit this automatically instead of having to keep track of it and merge/release changes to support it as well.

Possible implementation

sd-devices and sd-viewer are currently based on the large template, which has LibreOffice installed, so mimeinfo.cache ought to already have the right kind of information to work with. sd-app, however, is based on the small template, so it would need some way of being "informed" of what is supported by which application in the large template. @zenmonkeykstop pointed out that this only needs to be done when updates are run, so this may be relatively easy to pull off.

  • Ship a script to sd-large-*-template that reads mimeinfo.cache data and picks out mime type/application pairs
  • Script transforms pairs into a file (probably JSON 🤷) that can be ingested by securedrop-client (e.g. {"print": ["application/rtf", "application/pdf", …], "view": ["video/ogg", …]})
  • Script also transforms pairs into a mimeapps.list for sd-viewer
  • Could be run by dpkg-triggers (@legoktm?)
  • Once updates have run, dom0 can copy the json file for securedrop-client over to sd-app. Ideally, this would happen after updates have run on sd-large-*-template but before the VM is shut down (though that may not be entirely feasible at the moment)
  • The same code used by the script to map applications and mime types could be used by securedrop-export to make a call whether to convert a file with LibreOffice before it is sent off to the printer or not.
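
The first three steps above could be sketched roughly as follows. This is a minimal sketch under assumptions, not a proposed implementation: `APP_FEATURES` is a hypothetical hand-written table of which applications SDW supports and for what, the function names are made up for illustration, and the sample text only mimics the `[MIME Cache]` layout of mimeinfo.cache (`mime/type=app1.desktop;app2.desktop;`).

```python
import json

# Hypothetical table of supported applications and the features they provide.
APP_FEATURES = {
    "libreoffice-writer.desktop": ["view", "print"],
    "audacious.desktop": ["view"],
}

# Sample input in the mimeinfo.cache layout; a real script would read the file.
SAMPLE_CACHE = """\
[MIME Cache]
application/rtf=libreoffice-writer.desktop;
application/ogg=audacious.desktop;
text/plain=libreoffice-writer.desktop;org.gnome.gedit.desktop;
"""


def parse_mime_cache(text):
    """Return {mime_type: [desktop entries]} from mimeinfo.cache-style text."""
    pairs = {}
    in_section = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("["):
            in_section = line == "[MIME Cache]"
            continue
        if in_section and "=" in line:
            mime, _, apps = line.partition("=")
            pairs[mime] = [app for app in apps.split(";") if app]
    return pairs


def build_feature_map(pairs, app_features):
    """Return {action: sorted MIME types} for the supported applications."""
    result = {}
    for mime, apps in pairs.items():
        for app in apps:
            for action in app_features.get(app, []):
                result.setdefault(action, set()).add(mime)
    return {action: sorted(mimes) for action, mimes in sorted(result.items())}


def build_mimeapps_list(pairs, app_features):
    """Return mimeapps.list text using the first supported viewer per type."""
    lines = ["[Default Applications]"]
    for mime in sorted(pairs):
        viewers = [a for a in pairs[mime] if "view" in app_features.get(a, [])]
        if viewers:
            lines.append(f"{mime}={viewers[0]};")
    return "\n".join(lines) + "\n"


pairs = parse_mime_cache(SAMPLE_CACHE)
client_json = json.dumps(build_feature_map(pairs, APP_FEATURES), indent=2)
viewer_mimeapps = build_mimeapps_list(pairs, APP_FEATURES)
```

Applications that are installed but missing from the table (gedit in the sample) are ignored, which is one possible way to handle the edge cases mentioned in the comment below, such as a mail client claiming message/rfc822.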

Comment

While the "ideal" here sounds nice in theory, I am certain there are edge cases that aren't accounted for yet. Examples: message/rfc822 (see freedomofpress/securedrop-client#2158) would be associated with Thunderbird (if that were installed) and would likely not open the way that we'd want it to; compressed archives are another case, which we probably want special handling for in any event.

Overall, this issue intends to kick off a conversation, since it'd touch a lot of different components at the same time and is an example of something we'd want deeper understanding of across the team before we move on it.

References

Client: freedomofpress/securedrop-client#918

@eloquence
Member

Is there a case for a more naive implementation of mapping common MIME types to "printable" (and maybe "viewable"), without the application context, so that the SecureDrop Client can offer the print/open options only when appropriate?

If I understand correctly, this would imply more ongoing maintenance of that list to ensure that changed application capabilities are accurately reflected, but it doesn't seem to me that those capabilities are changing so rapidly that it's necessarily a large maintenance burden.

@eaon
Contributor Author

eaon commented Oct 25, 2022

If we would prefer to manually maintain a list of mime types, I'm OK with that. However, I don't think the automation part of the application awareness would add much effort on top of what I see as the core need. My main concern is that how we handle files is spread across different repos and packages, with no integration between them. I think we want integration between sd-viewer's mimeapps.list, securedrop-export and securedrop-client, which is where I believe the bulk of the effort would go regardless of whether we use an application awareness approach or a manually maintained list approach.

@gonzalo-bulnes
Contributor

gonzalo-bulnes commented Oct 25, 2022

If we agree on the format of the file(s), we could fix freedomofpress/securedrop-client#918 using a manually maintained list, then follow up to generate the file automatically as a transparent improvement.

@rocodes
Contributor

rocodes commented Oct 26, 2022

One additional thing to consider is the eventual support for bulk actions (bulk export, but maybe also bulk print?), which will send an archive containing (potentially) many filetypes to sd-devices [sd-export] for print or export. For this, the correct print behaviour per filetype might need to live in the VM responsible for printing, rather than sending a long config file with each submission and the action to take with it.

But I guess you're saying that if the client were aware of the supported actions, those could be passed to sd-devices, rather than reimplemented in sd-devices?

@gonzalo-bulnes
Contributor

gonzalo-bulnes commented Oct 26, 2022

It seems to me that it could be considered the responsibility of the Client to only send files for printing that are printable, etc. From that point of view, I think the purveyor of the service (sd-devices in this example) could keep knowledge of what it can or cannot do, but more for "consistency in depth" or security purposes, not for user-facing error handling.

@eloquence
Member

My intuition would be to approach this problem as follows:

  • create an allow-list that enumerates supported actions for each file type (view/print)
  • add this allow-list to the securedrop-workstation-config package, which is already installed in both templates, and which contains other MIME type hardening and configuration files
  • query the list in SecureDrop Client as the primary mechanism for constraining user interactions with files
  • query the list in sd-devices (and perhaps sd-viewer, though that might require introducing a new wrapper script) as a secondary mechanism for "consistency in depth" as Gonzalo put it above

To update the list, we would build and release a new securedrop-workstation-config package.
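
As a rough illustration of what such an allow-list and its two kinds of query might look like (a sketch only: the JSON layout, the sample entries, and all function names are assumptions, and nothing here reflects an existing file in securedrop-workstation-config):

```python
import json

# Hypothetical allow-list contents, keyed by MIME type.
ALLOW_LIST_JSON = """
{
  "application/pdf": ["view", "print"],
  "application/rtf": ["view", "print"],
  "audio/ogg": ["view"]
}
"""


def load_allow_list(text):
    """Parse the allow-list into {mime_type: frozenset of actions}."""
    return {mime: frozenset(actions) for mime, actions in json.loads(text).items()}


def is_allowed(allow_list, mime_type, action):
    """Return True only if the action is listed for this MIME type."""
    return action in allow_list.get(mime_type, frozenset())


def client_actions(allow_list, mime_type):
    """Primary mechanism: the actions the client would offer for a file."""
    return sorted(allow_list.get(mime_type, frozenset()))


def service_check(allow_list, mime_type, action):
    """Secondary mechanism: refuse anything the list does not permit."""
    if not is_allowed(allow_list, mime_type, action):
        raise PermissionError(f"{action!r} is not allowed for {mime_type!r}")


allow_list = load_allow_list(ALLOW_LIST_JSON)
```

In this sketch, unknown MIME types get no actions at all, so the client would offer neither print nor view for them, and a service-side call such as `service_check(allow_list, "audio/ogg", "print")` raises instead of printing.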

Does that approach seem viable for managing/updating this config & querying it, or am I overlooking significant drawbacks, use cases, or logic problems?
