TransclusionDecideRule: meaning of maxTransHops and maxSpeculativeHops #504

Answered by ato
cgr71ii asked this question in Q&A
The first important thing to understand is that TransclusionDecideRule, as used in the default config, is an ACCEPT rule, not a REJECT rule. This means it allows URIs that would otherwise be rejected to be accepted. In other words, it strictly only widens the scope. If a URI is already accepted by another rule, for example because it is in SURT scope, TransclusionDecideRule has no effect on it.
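
To illustrate why an ACCEPT-only rule can only widen the scope, here is a minimal Python sketch. It is not Heritrix code. It assumes rules run in order and the last decision other than NONE wins. The rule functions and the `in_surt_scope` and `transcluded` keys are hypothetical names made up for the example.

```python
from enum import Enum


class Decision(Enum):
    ACCEPT = "ACCEPT"
    REJECT = "REJECT"
    NONE = "NONE"  # the rule has no opinion about this URI


def run_sequence(rules, uri):
    """Apply rules in order; the last decision other than NONE wins (assumed)."""
    decision = Decision.NONE
    for rule in rules:
        result = rule(uri)
        if result is not Decision.NONE:
            decision = result
    return decision


def reject_by_default(uri):
    # Hypothetical first rule: everything starts out rejected.
    return Decision.REJECT


def accept_if_in_surt_scope(uri):
    # Hypothetical stand-in for a SURT-prefix scope rule.
    return Decision.ACCEPT if uri["in_surt_scope"] else Decision.NONE


def accept_if_transcluded(uri):
    # Stand-in for an ACCEPT-only rule: returns ACCEPT or NONE, never REJECT.
    return Decision.ACCEPT if uri["transcluded"] else Decision.NONE


RULES = [reject_by_default, accept_if_in_surt_scope, accept_if_transcluded]
```

With these rules, a URI with `in_surt_scope=True` is accepted whether or not `transcluded` is set, so the last rule changes nothing for it. A URI outside SURT scope is accepted only when `transcluded` is true, which is the one case where the ACCEPT-only rule widens the scope.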

For the purposes of the maxTransHops setting, a transclusion hop is any hop that is not a regular navigation link ('L'), a form submission ('S'), or a sitemap ('M') link.

A speculative hop ('X') is where Heritrix finds something that looks like a URL in JavaScript source. Heritrix is not able to understand JavaScript code s…
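
The hop definitions above can be sketched as a small counting function. This is a simplified Python illustration, not the Heritrix implementation, and it rests on several assumptions: the hop path is a string of single-letter hop types with the most recent hop last, only the trailing run of hops after the last 'L', 'S' or 'M' is counted, 'E' denotes an embed hop, and the two limits combine as shown. The function names and the limit values in the usage note are illustrative only.

```python
# Hop-type letters that end a run of transclusion hops, per the text above:
# 'L' navigation link, 'S' form submission, 'M' sitemap link.
NON_TRANSCLUSION_HOPS = {"L", "S", "M"}
SPECULATIVE_HOP = "X"


def count_trailing_hops(hop_path):
    """Return (trans_hops, speculative_hops) for the trailing run of the path."""
    trans = 0
    spec = 0
    # Walk backwards from the most recent hop until a non-transclusion hop.
    for hop in reversed(hop_path):
        if hop in NON_TRANSCLUSION_HOPS:
            break
        trans += 1
        if hop == SPECULATIVE_HOP:
            spec += 1
    return trans, spec


def would_accept(hop_path, max_trans_hops, max_speculative_hops):
    """True if the rule would ACCEPT; False means 'no opinion', not REJECT."""
    trans, spec = count_trailing_hops(hop_path)
    # Assumed combination: at least one transclusion hop, both limits respected.
    return 0 < trans <= max_trans_hops and spec <= max_speculative_hops
```

For example, with illustrative limits of 2 transclusion hops and 1 speculative hop, `would_accept("LLE", 2, 1)` is true (one embed hop after the last link), while `would_accept("LXX", 2, 1)` is false because the trailing run contains two speculative hops, and `would_accept("LEEE", 2, 1)` is false because it contains three transclusion hops.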

This discussion was converted from issue #496 on September 30, 2022 00:33.