-
Hello, I am new to Heritrix. I would like to have Heritrix crawl only one web page and the embedded content it contains. What I have tried so far: I set `maxHops` in `TooManyHopsDecideRule` to 1. → Did not work as desired. I used machine translation for this post.
Replies: 3 comments
-
Thanks for the advice!
-
Thanks to your advice, I can submit my graduation thesis!
-
Hi @nk-yuta,

To crawl only one page and its embedded content, try setting `maxHops` to 0, which should download your seed but not take any navigational links from that seed.

Then use `TransclusionDecideRule` to indicate you want to get embedded content.

Those are the main settings for getting one page with embeds. I recommend keeping maxDocumentsDow…
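
As a rough sketch of the two settings described above, the rules might look something like this inside the scope rule list of `crawler-beans.cxml`. The `maxHops` value of 0 comes from the advice above; the `maxTransHops` and `maxSpeculativeHops` values are assumptions chosen for illustration, so adjust them to your crawl.

```xml
<!-- Fragment of the scope DecideRuleSequence in crawler-beans.cxml. -->

<!-- Reject anything more than 0 hops from a seed, i.e. do not follow links. -->
<bean class="org.archive.modules.deciderules.TooManyHopsDecideRule">
  <property name="maxHops" value="0" />
</bean>

<!-- Re-accept embedded resources (images, CSS, scripts) of accepted pages. -->
<!-- Both values below are illustrative assumptions, not recommendations. -->
<bean class="org.archive.modules.deciderules.TransclusionDecideRule">
  <property name="maxTransHops" value="2" />
  <property name="maxSpeculativeHops" value="0" />
</bean>
```

Rule order should matter here: as far as I know, the default configuration lists the transclusion rule after the hops rule, which is what lets it re-accept embeds that the hops rule has just rejected.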