-
Notifications
You must be signed in to change notification settings - Fork 763
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Replace A_TIMESTAMP with A_FETCH_BEGAN_TIME #292
base: master
Are you sure you want to change the base?
Conversation
It seems that A_TIMESTAMP went out of favor quite a long time ago. A_FETCH_BEGAN_TIME is used within FetchHistoryProcessor and throws an exception as is because of it.
This seems reasonable but I'm not familiar enough with this module to confidently review this. I'm hoping someone more knowledgeable about this comments, otherwise I'll merge it in a few days if we don't hear anything. |
This is @kngenie's code. I don't know if it's still used. |
I was trying to use it in conjunction with the HBase recrawl modules but everything besides these class seems to use heritrix3/modules/src/main/java/org/archive/modules/recrawl/FetchHistoryProcessor.java Line 125 in 9927e13
|
We use |
@anjackson Sorry about that, I've updated the link to the code. Any explanation about how you make use of |
Thanks, but I think your original link refers to: heritrix3/contrib/src/main/java/org/archive/modules/recrawl/wbm/WbmPersistLoadProcessor.java Lines 519 to 521 in 0581170
? We use it in much the same way, I think. We use it for deduplication and for checking how recently we visited to see if we need to re-crawl yet. But it looks like I hit similar issues because I populate both fields. I'm not sure how important these semantics are -- the timestamp in question may not necessarily be the time the fetch began, but in practice I think it always is, so I think it's fine to reduce it all to that name. |
Yeah, I did reference that comment. |
Your use of this constant also makes it clear that it's a more dangerous change than I thought. It is Perhaps the proper solution is to clarify the semantics and use the proper value in |
It seems that A_TIMESTAMP went out of favor quite a long time ago.
A_FETCH_BEGAN_TIME is used within FetchHistoryProcessor and throws an
exception as is because of it.