
Email scraper #1

Closed

PianSom opened this issue Aug 24, 2024 · 5 comments

PianSom commented Aug 24, 2024

First off - thanks for sharing. I've been thinking for ages about trying to do something similar.

Is your email scraper in a state where it could be shared? Even if it's for a different platform/setup it's always easier to build on someone else's work.

gcoan (Contributor) commented Aug 24, 2024

> First off - thanks for sharing. I've been thinking for ages about trying to do something similar.
>
> Is your email scraper in a state where it could be shared? Even if it's for a different platform/setup it's always easier to build on someone else's work.

I came here to say exactly the same thing: could the email scraper be shared so that others in other regions can set it up?

Does it run under HA or under another host operating system?

8none1 (Owner) commented Aug 24, 2024

I've just uploaded the code here: https://github.com/8none1/octopus_powerups/blob/main/gapps_scripts/powerups_email_finder.gs

The code runs inside Google Apps Script and depends on having read-only access to a GMail account. I recommend that you create a separate GMail account to run this in, and then use some filtering in your personal email account to find the Power-ups emails and forward them to this new GMail account.

The workflow is:

  1. Publish the gs code to the web via GApps Script so that when you call a "secret" URL the code gets run (the entry point is doGet()). Give it read-only rights to your new GMail account.
  2. In your original email account, the Power-ups email from Octopus arrives, a filter matches it (e.g. from: hello@octopus, subject: power-ups), and the filter forwards the email to your separate GMail account.
  3. Whenever you GET the secret URL (where the GApps script is published), the code runs and returns a JSON object with zero to three entries in an array. This URL is the source for Home Assistant; it looks like a normal URL which serves a JSON object.
  4. If you want to publish the JSON object for other people to use, I would recommend adding another layer of indirection. Run a cron job somewhere every 15 mins or so which GETs the secret URL and stashes the results somewhere with a bit more bandwidth. That way, each fetch of the data serves the pre-downloaded file (via the cron job) instead of running the GApps script. In this case do not publish the secret URL; instead publish the URL where the JSON response has been stashed. In my case I'm backing it off to GitHub because I figure they can handle it.
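To illustrate step 3, the response shape can be sketched roughly as below. This is a minimal sketch, not the actual script: the field names (start, end) and the findPowerUpEmails() helper are hypothetical, and the real parsing happens against GMail via GmailApp inside Apps Script.

```javascript
// Build the response array: zero to three upcoming power-up sessions.
// `sessions` is whatever the (hypothetical) Gmail-parsing helper extracted.
function buildResponse(sessions, now) {
  return sessions
    .filter((s) => new Date(s.end) > now) // drop sessions that already ended
    .slice(0, 3);                         // the endpoint returns at most 3 entries
}

// In Apps Script the web entry point would look roughly like this:
// function doGet() {
//   const sessions = findPowerUpEmails(); // hypothetical Gmail-parsing helper
//   return ContentService
//     .createTextOutput(JSON.stringify(buildResponse(sessions, new Date())))
//     .setMimeType(ContentService.MimeType.JSON);
// }
```

Home Assistant then only needs a plain REST sensor pointed at the published URL; it never touches GMail itself.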

I did consider having GH Actions run every 15 mins to fetch the JSON object from the GApps script, but I think that would have leaked the secret URL and I didn't want to do that, so instead I'm running a cron job on a Pi in my house (the same Pi running HA).

The risks that I can imagine are:

  • a GApps script running against a GMail account could steal your email
  • someone could send you malicious emails which match your filter causing the script to break and do untold bad things

8none1 (Owner) commented Aug 24, 2024

I would welcome improvements to the GS code 😄

Plus I noticed that the checking of the headers doesn't actually do anything. Guess I never hooked it up, probably because it wasn't reliable.

8none1 (Owner) commented Aug 24, 2024

and if you didn't already see it; there is a bit more information here: https://www.whizzy.org/2024-01-24-powerups-api/

8none1 (Owner) commented Aug 28, 2024

Marking as resolved. I added a note to the README to link to this issue for people who would like more info on the scraper.
