CRON: Document update tasks #1052
                
     Merged
            
            
          
  
    
  
    
Two changes to the cron job for document updates:

- Skip updates to relatedPapers: Drop this rather heavy, long-running task (see Examining the database data / Root of large data dump size #1024 (comment)). It is not clear that updating these makes any tangible difference (e.g. the most recent papers of interest).
- Filter docs to update: Strip out documents without any paperId, that is, documents that were auto-created to notify referenced authors that their paper was cited by a Biofactoid document (viral email).

In this way, documents will get fresh data for the submitted article, and documents submitted for a paper that was not in PubMed at the time will be checked as well.
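The two changes can be illustrated with a minimal sketch. This is not the code from this PR: only the `paperId` and `relatedPapers` names come from the description above, while `select_docs_to_update`, `refresh_article`, `UPDATE_TASKS`, `run_update`, and the `articleRefreshed` flag are hypothetical names chosen for illustration, and documents are assumed to be plain dictionaries.

```python
from typing import Any, Callable, Dict, Iterable, List

Doc = Dict[str, Any]


def has_paper_id(doc: Doc) -> bool:
    """Return True when the document carries a non-empty paperId (assumed field)."""
    return bool(doc.get("paperId"))


def select_docs_to_update(docs: Iterable[Doc]) -> List[Doc]:
    """Keep only documents with a paperId, skipping auto-created ones."""
    return [doc for doc in docs if has_paper_id(doc)]


def refresh_article(doc: Doc) -> None:
    """Hypothetical stand-in for refreshing the submitted article's data."""
    doc["articleRefreshed"] = True


# Hypothetical task registry: the relatedPapers task is deliberately absent,
# mirroring the decision to drop that long-running update.
UPDATE_TASKS: Dict[str, Callable[[Doc], None]] = {
    "article": refresh_article,
}


def run_update(docs: Iterable[Doc]) -> List[Doc]:
    """Run every registered task on each selected document and return them."""
    selected = select_docs_to_update(docs)
    for doc in selected:
        for task in UPDATE_TASKS.values():
            task(doc)
    return selected
```

Filtering before the task loop means that skipped documents cost nothing per task, so adding a task later would not change which documents are touched.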
Some empirical analysis to follow up on (#1024 (comment)):
Refs #1024