[Feature]: Remove newlines from RSS post titles #34

Closed
nil0x42 opened this issue Oct 10, 2020 · 10 comments
Labels
enhancement New feature or request

Comments

@nil0x42

nil0x42 commented Oct 10, 2020

Is your feature request related to a problem? Please describe.
I use a Twitter RSS feed as input, and titles sometimes contain newlines, which breaks the markdown formatting.

Describe the solution you'd like
Replace occurrences of \r\n and \n in the title with a single space.

Describe alternatives you've considered
n/a

Additional context
Here's how titles get formatted:
[Screenshot: Screenshot_2020-10-10_13-22-26]
[Screenshot: Screenshot_2020-10-10_13-21-04]

nil0x42 added the enhancement label Oct 10, 2020
@gautamkrishnar
Owner

gautamkrishnar commented Oct 10, 2020

You can now use the item_exec parameter to do advanced text manipulation via JavaScript if required. Each post item is available as the post variable:

const post = {
  title: item.title.trim(),
  url: item.link.trim(),
  date: new Date(item.pubDate.trim()),
  ...customTags
};
// Advanced content manipulation using javascript code
if (ITEM_EXEC) {
  try {
    // Run the user-supplied snippet with `post` in scope
    eval(ITEM_EXEC);
  } catch (e) {
    core.error('Failure in executing `item_exec` parameter');
    core.error(e);
    process.exit(1);
  }
}
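
To make the mechanism concrete, here is a minimal standalone sketch of the same idea. It assumes the snippet is run with a direct eval so that it can see the local post variable, as in the excerpt above. The item object, the example URL, and the buildPost helper are hypothetical stand-ins, not part of the action.

```javascript
// Hypothetical feed item; the real action gets this from the RSS parser.
const item = {
  title: '  Hello\nworld  ',
  link: ' https://example.com/post ',
  pubDate: 'Sat, 10 Oct 2020 12:00:00 GMT'
};

// Hypothetical item_exec value, already decoded into JavaScript source.
const ITEM_EXEC = "post.title = post.title.replace(/\\n/g, ' ');";

// buildPost is an illustrative helper, not a function of the action.
function buildPost(item, itemExec) {
  const post = {
    title: item.title.trim(),
    url: item.link.trim(),
    date: new Date(item.pubDate.trim())
  };
  if (itemExec) {
    // Direct eval runs in this function's scope, so the snippet can mutate `post`.
    eval(itemExec);
  }
  return post;
}

console.log(JSON.stringify(buildPost(item, ITEM_EXEC).title)); // "Hello world"
```

Because the snippet only mutates properties of post, this works even though post is declared with const.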

You can do something like this:

name: Latest Tweets workflow
on:
  schedule: # Run workflow automatically
    - cron: '0 * * * *' 
jobs:
  update-readme-with-blog:
    name: Update this repo's README with latest blog posts
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: gautamkrishnar/blog-post-workflow@master
        with:
          feed_list: "https://rss.app/feeds/7YyygtUxDsIaLQ3s.xml"
          item_exec: |
                    post.title = post.title.replace('\n',' '); post.title = post.title.replace('\r\n',' ');
                    console.log("test\n world");

Another interesting use case example:
https://github.com/ayushi7rawat/ayushi7rawat/blob/master/.github/workflows/youtube.yml

name: Latest youtube videos
on:
  schedule: # Run workflow automatically
    - cron: '5 * * * *' # Runs every hour, at 5 minutes past the hour
  workflow_dispatch: # Run workflow manually (without waiting for the cron to be called), through the GitHub Actions Workflow page directly
jobs:
  update-readme-with-youtube:
    name: Update this repo's README with latest blog posts
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: gautamkrishnar/blog-post-workflow@master
        with:
          feed_list: "https://www.youtube.com/feeds/videos.xml?channel_id=UCvmONGrUQxL3B3PmSv1JQqQ"
          item_exec: "post.title = post.title.split('|')[0]"
          comment_tag_name: "YOUTUBE"
          commit_message: "Updated with the latest youtube video"
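
As a quick illustration of what the item_exec line in this workflow does, the sketch below applies the same split to two hypothetical titles. Note that the part before the separator keeps its trailing space, so appending .trim() may be worth considering; that is a suggestion, not something the linked workflow does.

```javascript
// Hypothetical video titles; only the first contains the '|' separator.
const withSeparator = 'My video title | Channel Name';
const withoutSeparator = 'My video title';

// Same expression as the item_exec above: keep everything before the first '|'.
const head = withSeparator.split('|')[0];
const unchanged = withoutSeparator.split('|')[0];

console.log(JSON.stringify(head));        // "My video title " (trailing space kept)
console.log(JSON.stringify(head.trim())); // "My video title"
console.log(JSON.stringify(unchanged));   // "My video title"
```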

Example of stripping HTML out of the description:
#74 (comment)

name: Latest blog post workflow
on:
  schedule:
    - cron: 0 * * * *
  push:
    branches:
      - main
jobs:
  update-readme-with-blog:
    name: Update this repo's README with latest blog posts
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: gautamkrishnar/blog-post-workflow@master
        with:
          max_post_count: '5'
          feed_list: 'https://medium.com/feed/@nicko170'
          template: '- [$title]($url): $description $newline'
          date_format: 'UTC:ddd yyyy-mm-dd h:MM:ss TT Z'
          filter_comments: medium
          tag_post_pre_newline: 'true'
          item_exec: |
            post.description = post.description.replace(/<\/?[^>]+(>|$)/g, ""); 
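
For reference, this is roughly what that regular expression does to a description. The input strings are made up for illustration. The pattern removes anything that looks like a tag, but it does not decode HTML entities, so it is a rough cleanup and not a full HTML parser.

```javascript
// Same pattern as in the item_exec above.
const TAG_PATTERN = /<\/?[^>]+(>|$)/g;

// Hypothetical descriptions as they might appear in a feed.
const html = '<p>Hello <b>world</b></p>';
const withEntity = '<p>Tom &amp; Jerry</p>';

const stripped = html.replace(TAG_PATTERN, '');
const entityKept = withEntity.replace(TAG_PATTERN, '');

console.log(stripped);   // Hello world
console.log(entityKept); // Tom &amp; Jerry (entities are left untouched)
```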


@nil0x42
Author

nil0x42 commented Oct 10, 2020

Wow! I'm amazed at how quickly you responded!

@nil0x42
Author

nil0x42 commented Oct 10, 2020

Just for the record: there is a problem with \n and \r literals in item_exec, so their backslashes must be escaped.

As an example, for my case, the correct item_exec is:

item_exec: "post.title = post.title.replace(/(?:\\r\\n|\\r|\\n)/g,' ');"
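
Once the YAML escaping is resolved, the JavaScript that actually runs is a global regex replace. The sketch below, using a made-up title, shows why the regex form is preferable to replace('\n', ' '): a string pattern only replaces the first match, and listing \r\n first keeps a Windows line ending from turning into two spaces.

```javascript
// Hypothetical title with mixed line endings, as a tweet-based feed might produce.
const rawTitle = 'first line\r\nsecond line\nthird line\rfourth line';

// A string pattern replaces only the first match, so later newlines survive.
const firstOnly = rawTitle.replace('\n', ' ');

// The global regex handles \r\n, \r and \n everywhere in the title.
const normalized = rawTitle.replace(/(?:\r\n|\r|\n)/g, ' ');

console.log(JSON.stringify(firstOnly));  // still contains \r and \n
console.log(JSON.stringify(normalized)); // "first line second line third line fourth line"
```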

@gautamkrishnar
Owner

Thanks for pointing it out @nil0x42

@nil0x42
Author

nil0x42 commented Oct 11, 2020

@gautamkrishnar, is it possible to use this new feature to ignore an entry?
Let's say I want the first 5 entries whose titles do not contain a specific string. Can item_exec simply skip an entry?

@gautamkrishnar
Owner

gautamkrishnar commented Oct 11, 2020

@nil0x42 thanks for the suggestion, I just released this feature. You can ignore any item by setting the post variable to null in JavaScript via the item_exec param.

Eg:

item_exec: "if (post.title.indexOf('example title to ignore') > -1) post = null;"

You may need to add proper escaping for some special characters.

@nil0x42
Author

nil0x42 commented Oct 11, 2020

Hi! Thank you again for the speed at which you address issues!
I tried it and it doesn't work because post is const, so I get a "TypeError: Assignment to constant variable." error.

Also, since item_exec might alter the title's length, it might be useful to move the TITLE_MAX_LENGTH trimming block after item_exec, so that if item_exec changes the title's length, the trimming happens afterwards.
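
A small sketch of why the ordering matters. The truncate helper, the limit of 20, and the itemExec function are hypothetical stand-ins for the action's own trimming code and for eval(ITEM_EXEC); the action's actual truncation logic may differ.

```javascript
const TITLE_MAX_LENGTH = 20; // hypothetical limit for this example

// Hypothetical truncation: cut at the limit and append an ellipsis.
const truncate = (title) =>
  title.length > TITLE_MAX_LENGTH ? title.slice(0, TITLE_MAX_LENGTH) + '...' : title;

// Stand-in for eval(ITEM_EXEC): a snippet that makes the title longer.
const itemExec = (post) => { post.title = '[Tweet] ' + post.title; };

// Trimming first: item_exec can push the title past the limit again.
function trimThenExec(title) {
  const post = { title: truncate(title) };
  itemExec(post);
  return post.title;
}

// item_exec first: the limit is applied to the final title.
function execThenTrim(title) {
  const post = { title };
  itemExec(post);
  post.title = truncate(post.title);
  return post.title;
}

const title = 'Exactly twenty chars';
console.log(trimThenExec(title)); // [Tweet] Exactly twenty chars
console.log(execThenTrim(title)); // [Tweet] Exactly twen...
```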

@gautamkrishnar
Owner

@nil0x42 oopsie, I will fix that by updating the code to use let instead. 👍 Happy to help.
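
The difference can be reproduced in isolation. This sketch assumes the snippet is run with a direct eval, as in the excerpt earlier in the thread. The helper names and the latestPosts loop are hypothetical, and the way the action counts posts after skipping may differ.

```javascript
const ITEM_EXEC = "if (post.title.indexOf('example title to ignore') > -1) post = null;";

// With const, reassigning `post` inside eval throws a TypeError.
function runWithConst(itemExec) {
  const post = { title: 'example title to ignore' };
  try {
    eval(itemExec);
  } catch (e) {
    return e.name;
  }
  return post;
}

// With let, the snippet can set `post` to null to mark the item as ignored.
function runWithLet(title, itemExec) {
  let post = { title };
  eval(itemExec);
  return post;
}

// Hypothetical loop: keep the first maxCount posts that were not set to null.
function latestPosts(titles, itemExec, maxCount) {
  const kept = [];
  for (const title of titles) {
    const post = runWithLet(title, itemExec);
    if (post !== null) kept.push(post.title);
    if (kept.length === maxCount) break;
  }
  return kept;
}

console.log(runWithConst(ITEM_EXEC)); // TypeError
console.log(latestPosts(['a', 'example title to ignore', 'b', 'c'], ITEM_EXEC, 2)); // [ 'a', 'b' ]
```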

@gautamkrishnar
Owner

@nil0x42 thanks for your suggestions, it's now released: https://github.com/gautamkrishnar/blog-post-workflow/releases/tag/1.3.4


2 participants