Custom item classes #47
                
Merged
This PR adds the ability to define custom item classes and an option for item processors to only process certain items.
Custom Items
Custom items are simple PHP objects which extend the `RoachPHP\ItemPipeline\AbstractItem` class. The `AbstractItem` class implements all necessary interfaces in order to stand in for any other kind of item.

To yield a custom item from a spider, we can continue to use the spider's `item` method. Instead of passing in an array, however, we pass in an instance of our custom item class.

This can already be nice on its own if we want to structure our scraped data a little more instead of passing around raw arrays. The real value comes from combining this with the next feature, however: custom item processors.
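
As a rough illustration of a custom item being yielded next to a raw array, here is a minimal, self-contained sketch. The `AbstractItem` class and the `item()` method below are simplified stand-ins written for this example, not Roach's actual implementation, and `Team`, `MatchSpider`, and the sample values are hypothetical.

```php
<?php

declare(strict_types=1);

// Simplified stand-in for RoachPHP\ItemPipeline\AbstractItem (assumption:
// the real class implements a richer item interface than shown here).
abstract class AbstractItem
{
    /** Returns the item's properties as an associative array. */
    final public function all(): array
    {
        return get_object_vars($this);
    }
}

// Hypothetical custom item: a plain PHP object with typed properties.
final class Team extends AbstractItem
{
    public function __construct(
        public string $name,
        public string $league,
    ) {
    }
}

// Hypothetical spider. Only the yielding behaviour is sketched here.
final class MatchSpider
{
    // Stand-in for the spider's item() method. The real method wraps the
    // item in a parse result, which is omitted in this sketch.
    private function item(array|AbstractItem $item): array|AbstractItem
    {
        return $item;
    }

    public function parse(): Generator
    {
        // Previously: a raw array.
        yield $this->item(['name' => 'FC Example', 'league' => 'Sample League']);

        // With custom items: an instance of our own class.
        yield $this->item(new Team('FC Example', 'Sample League'));
    }
}

$items = iterator_to_array((new MatchSpider())->parse(), false);
```

Both yielded values carry the same data, but the second one has a concrete type that later pipeline stages can check against.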
Custom Item Processors
Up until now, every processor of a spider would run for each yielded item. This can become problematic if our spider yields multiple different types of data from the same parse callback.
Say we're parsing a match summary of a football match. We might want to yield a `Team` item for both the home and the away team. The teams should get saved to the database so we can reference them later when we store the matches themselves. However, we also want to yield a `FootballMatch` item containing the information about the match itself.

The issue now is that we probably want to process the two types of items completely differently. The only way to deal with this at the moment is to add `if-else` blocks to all of our processors to manually check which kind of item we're dealing with. This is really cumbersome because it often requires us to add additional metadata to our items for the sole purpose of being able to tell which kind of item we're dealing with in our processor.

This PR introduces a new `ConditionalItemProcessor` interface as well as a `CustomItemProcessor` base class. The `ConditionalItemProcessor` interface describes a processor which may not run for each yielded item.

The item pipeline will call the `shouldHandle` method of each processor that implements the `ConditionalItemProcessor` interface to check if this processor should handle the item.

The `CustomItemProcessor` base class is a convenience for one of the most common reasons we might do this: handling only a certain type of item. To create a processor like this, we extend the `CustomItemProcessor` class and implement the `getHandledItemClasses` method as well as the usual `processItem` method.

We can then define a separate item processor to only process `FootballMatch` items. We register these processors just like any other processor.

Note: Custom processors only process the item types defined in the `getHandledItemClasses` method. This means that non-custom items don't get processed by them.
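
To make the flow above concrete, the following is a minimal, self-contained sketch of how a pipeline might consult `shouldHandle` before running a processor. All classes here are simplified stand-ins that only borrow the names from this PR. The `ItemProcessor` interface, the `runPipeline` helper, the two example processors, and the `$saved` arrays (which stand in for database writes) are assumptions made for illustration, and the library's real signatures may differ.

```php
<?php

declare(strict_types=1);

// --- Simplified stand-ins (assumptions, not Roach's actual code) ---
abstract class AbstractItem
{
}

final class Team extends AbstractItem
{
    public function __construct(public string $name)
    {
    }
}

final class FootballMatch extends AbstractItem
{
    public function __construct(public string $home, public string $away)
    {
    }
}

interface ItemProcessor
{
    public function processItem(AbstractItem $item): AbstractItem;
}

// A processor that may decline to run for a given item.
interface ConditionalItemProcessor extends ItemProcessor
{
    public function shouldHandle(AbstractItem $item): bool;
}

// Convenience base class: handle only the listed item classes.
abstract class CustomItemProcessor implements ConditionalItemProcessor
{
    /** @return list<class-string> */
    abstract protected function getHandledItemClasses(): array;

    public function shouldHandle(AbstractItem $item): bool
    {
        foreach ($this->getHandledItemClasses() as $class) {
            if ($item instanceof $class) {
                return true;
            }
        }

        return false;
    }
}

// --- Hypothetical processors for the football example ---
final class SaveTeamProcessor extends CustomItemProcessor
{
    public array $saved = []; // stands in for a database table

    protected function getHandledItemClasses(): array
    {
        return [Team::class];
    }

    public function processItem(AbstractItem $item): AbstractItem
    {
        $this->saved[] = $item->name;

        return $item;
    }
}

final class SaveMatchProcessor extends CustomItemProcessor
{
    public array $saved = [];

    protected function getHandledItemClasses(): array
    {
        return [FootballMatch::class];
    }

    public function processItem(AbstractItem $item): AbstractItem
    {
        $this->saved[] = "{$item->home} vs {$item->away}";

        return $item;
    }
}

// Hypothetical pipeline loop: conditional processors are asked first.
function runPipeline(array $processors, AbstractItem $item): AbstractItem
{
    foreach ($processors as $processor) {
        if ($processor instanceof ConditionalItemProcessor && !$processor->shouldHandle($item)) {
            continue; // this processor opted out of the item
        }
        $item = $processor->processItem($item);
    }

    return $item;
}

$teams = new SaveTeamProcessor();
$matches = new SaveMatchProcessor();
$yielded = [new Team('Home FC'), new Team('Away United'), new FootballMatch('Home FC', 'Away United')];

foreach ($yielded as $item) {
    runPipeline([$teams, $matches], $item);
}
```

Neither processor needs an `if-else` block or extra metadata on the item: the type check lives in one place, and each processor only declares which classes it cares about.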