Readability mode drops hyperlinked paragraph #878

Open
@swethapillai

Hi all,
I am seeing an issue when applying Readability to the following webpage:
https://www.hartlepoolmail.co.uk/news/crime/seven-hartlepool-men-appear-in-court-charged-with-the-murder-of-michael-phillips-637809

This webpage has two paragraphs that are hyperlinked:

<div class="Markup__ParagraphWrapper-sc-13q6ywe-0 iWyzoK markup ">
<p><a href="https://www.hartlepoolmail.co.uk/news/crime/live-updates-seven-hartlepool-men-appear-court-charged-murder-michael-phillips-637633" rel="nofollow" target="_blank" data-vars-event="gaEvent" data-vars-ec="navigation" data-vars-ea="in article" data-vars-el="plain links" data-vars-aidclick="637633" data-vars-titleclick="On Friday September 27, John Musgrave, 54, Sean Musgrave, 30, both Wordsworth Avenue, Hartlepool, and Anthony Small, 39, of Rydal Street, Hartlepool all denied the charge of murder. " data-vars-urlclick="https://www.hartlepoolmail.co.uk/news/crime/live-updates-seven-hartlepool-men-appear-court-charged-murder-michael-phillips-637633"><strong>On Friday September 27, John Musgrave, 54, Sean Musgrave, 30, both Wordsworth Avenue, Hartlepool, and Anthony Small, 39, of Rydal Street, Hartlepool all denied the charge of murder. </strong></a></p>
</div>
<div class="Markup__ParagraphWrapper-sc-13q6ywe-0 iWyzoK markup ">
<p><strong><a href="https://www.hartlepoolmail.co.uk/news/crime/niramax-boss-neil-elliott-and-co-accused-face-trial-next-year-after-denying-murder-of-michael-phillips-472180" rel="nofollow" target="_blank" data-vars-event="gaEvent" data-vars-ec="navigation" data-vars-ea="in article" data-vars-el="plain links" data-vars-aidclick="472180" data-vars-titleclick="Previously, a director of Niramax waste management firm, Neil Elliott, 44, of Briarfield Close, Hartlepool, and co-accused Lee Darby, 31, of Ridley Court, Hartlepool denied killing Mr Phillips. " data-vars-urlclick="https://www.hartlepoolmail.co.uk/news/crime/niramax-boss-neil-elliott-and-co-accused-face-trial-next-year-after-denying-murder-of-michael-phillips-472180">Previously, a director of Niramax waste management firm, Neil Elliott, 44, of Briarfield Close, Hartlepool, and co-accused Lee Darby, 31, of Ridley Court, Hartlepool denied killing Mr Phillips. </a></strong></p>
</div>

However, Readability drops the second hyperlinked paragraph and keeps only the first one in the extracted text.

I suspect this is related to how Readability treats the density of commas in these link-wrapped paragraphs. When I remove a single comma from the first hyperlinked paragraph and then apply readability mode, both hyperlinked paragraphs show up.
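To make the comma hypothesis easier to check, here is a minimal sketch that counts commas and measures link density (the share of text that sits inside `<a>` tags) for the two paragraphs. It is a diagnostic aid, not Readability's actual code: the names `LinkStats`, `paragraph_stats`, `FIRST` and `SECOND` are hypothetical, the `href` values are placeholders, and the fragments are simplified copies of the markup above with the tracking attributes removed.

```python
from html.parser import HTMLParser


class LinkStats(HTMLParser):
    """Tallies text length, linked-text length and commas in an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.link_depth = 0   # > 0 while inside one or more <a> elements
        self.text_chars = 0   # all text characters seen
        self.link_chars = 0   # text characters seen inside <a>
        self.commas = 0

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.link_depth += 1

    def handle_endtag(self, tag):
        if tag == "a" and self.link_depth > 0:
            self.link_depth -= 1

    def handle_data(self, data):
        self.text_chars += len(data)
        self.commas += data.count(",")
        if self.link_depth > 0:
            self.link_chars += len(data)


def paragraph_stats(fragment):
    """Return (comma_count, link_density) for one HTML fragment."""
    parser = LinkStats()
    parser.feed(fragment)
    parser.close()
    density = parser.link_chars / parser.text_chars if parser.text_chars else 0.0
    return parser.commas, density


# Simplified copies of the two paragraphs; href values are placeholders.
FIRST = (
    '<p><a href="#"><strong>On Friday September 27, John Musgrave, 54, '
    "Sean Musgrave, 30, both Wordsworth Avenue, Hartlepool, and Anthony Small, "
    "39, of Rydal Street, Hartlepool all denied the charge of murder. "
    "</strong></a></p>"
)
SECOND = (
    '<p><strong><a href="#">Previously, a director of Niramax waste management '
    "firm, Neil Elliott, 44, of Briarfield Close, Hartlepool, and co-accused "
    "Lee Darby, 31, of Ridley Court, Hartlepool denied killing Mr Phillips. "
    "</a></strong></p>"
)

if __name__ == "__main__":
    for name, fragment in (("first", FIRST), ("second", SECOND)):
        commas, density = paragraph_stats(fragment)
        print(f"{name}: commas={commas}, link_density={density:.2f}")
```

Both paragraphs are entirely link text, so their link density is the same. The only difference this sketch finds is the comma count: 10 in the first paragraph and 9 in the second. That would fit a comma cutoff somewhere around 10 combined with a link-density check, but that is an assumption about the library's internals, and it does not by itself explain why removing a comma from the first paragraph brings both back.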

Steps to reproduce:

  1. Open the webpage in Firefox
  2. Click the readability icon

Is this an expected limitation of the package?
