The website might change, so here's a quick reproducible example from the current state of the page:
>>> import parsel
>>> html = '<h1 class="">LA Galaxy <@ San Jose Earthquakes </h1>'
>>> parsel.Selector(text=html).xpath('//h1').get()
'<h1 class="">LA Galaxy </h1>'
Though IMHO this was invalid HTML to begin with. Had the webpage encoded the < char as &lt;, it would have worked correctly:
>>> import parsel
>>> html = '<h1 class="">LA Galaxy &lt;@ San Jose Earthquakes&nbsp; </h1>'
>>> parsel.Selector(text=html).xpath('//h1').get()
'<h1 class="">LA Galaxy &lt;@ San Jose Earthquakes\xa0 </h1>'
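If the page cannot be fixed at the source, one possible stopgap is to escape stray < characters before handing the markup to the parser. The sketch below is only an illustration and is not part of parsel: the helper name escape_stray_lt is hypothetical, and it assumes that any < not followed by a letter, /, ! or ? is literal text rather than markup. It would also rewrite such characters inside script or style blocks, so it is probably only safe for simple fragments like the one above.

```python
import re

# Hypothetical helper (not a parsel API): a "<" that is not followed by a
# letter, "/", "!" or "?" cannot open a tag, end tag, comment or doctype,
# so it is assumed to be literal text and gets escaped.
_STRAY_LT = re.compile(r"<(?![A-Za-z/!?])")


def escape_stray_lt(html: str) -> str:
    """Replace stray "<" characters with the "&lt;" entity."""
    return _STRAY_LT.sub("&lt;", html)


html = '<h1 class="">LA Galaxy <@ San Jose Earthquakes </h1>'
fixed = escape_stray_lt(html)
print(fixed)
```

The escaped string can then be passed to parsel.Selector(text=fixed) in the same way as in the examples above.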
XPath //h1 can't extract data correctly from https://www.imdb.com/title/tt6757474/