Skip extract_hyperlinks if not required
Revisiting #436.

This patch uses the relationships data to determine whether a sheet includes hyperlinks.

Because extract_hyperlinks loads the whole document into memory, it is problematic for each_row_streaming. This patch skips extract_hyperlinks when the sheet's relationships contain no hyperlink entry.
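
A minimal sketch of the relationship check, assuming a plain Hash stands in for the parsed relationships (in Roo the values are XML nodes). The standalone `include_type?` helper and the sample data are illustrative only, not Roo's API:

```ruby
# Relationship type URIs as they appear in a sheet's .rels part.
hyperlink_type = "http://schemas.openxmlformats.org/officeDocument/2006/relationships/hyperlink"
drawing_type   = "http://schemas.openxmlformats.org/officeDocument/2006/relationships/drawing"

# Illustrative helper: true if any relationship's "Type" contains the given substring.
# Safe navigation (&.) tolerates entries that have no "Type" attribute.
def include_type?(relationships, type)
  relationships.any? do |_id, rel|
    rel["Type"]&.include?(type)
  end
end

# Hypothetical sample data, keyed by relationship Id.
with_links    = { "rId1" => { "Type" => hyperlink_type }, "rId2" => { "Type" => drawing_type } }
without_links = { "rId1" => { "Type" => drawing_type }, "rId2" => {} }

puts include_type?(with_links, "hyperlink")
puts include_type?(without_links, "hyperlink")
```

When the check returns false, the sheet cannot reference any hyperlink targets, so the full-document hyperlink extraction can be skipped.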
chopraanmol1 committed Jan 22, 2019
1 parent 5bbda98 commit 6170cf9
Showing 2 changed files with 10 additions and 2 deletions.
10 changes: 9 additions & 1 deletion lib/roo/excelx/relationships.rb
@@ -1,3 +1,5 @@
+# frozen_string_literal: true
+
 require 'roo/excelx/extractor'

 module Roo
@@ -11,10 +13,16 @@ def to_a
     @relationships ||= extract_relationships
   end

+  def include_type?(type)
+    to_a.any? do |_, rel|
+      rel["Type"]&.include? type
+    end
+  end
+
   private

   def extract_relationships
-    return [] unless doc_exists?
+    return {} unless doc_exists?

     doc.xpath('/Relationships/Relationship').each_with_object({}) do |rel, hash|
       hash[rel['Id']] = rel
2 changes: 1 addition & 1 deletion lib/roo/excelx/sheet_doc.rb
@@ -22,7 +22,7 @@ def cells(relationships)

   def hyperlinks(relationships)
     # If you're sure you're not going to need this hyperlinks you can discard it
-    @hyperlinks ||= if @options[:no_hyperlinks]
+    @hyperlinks ||= if @options[:no_hyperlinks] || !relationships.include_type?("hyperlink")
       {}
     else
       extract_hyperlinks(relationships)
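
The change from `return []` to `return {}` in extract_relationships makes the missing-document case return the same type as the normal path, which builds a Hash keyed by relationship Id. A small sketch of why that matters, assuming callers look entries up by a string Id such as "rId1" (the variable names are illustrative):

```ruby
# Two possible "no relationships" defaults.
missing_as_array = []
missing_as_hash  = {}

# Indexing an Array with a String raises TypeError; capture the error class.
lookup_on_array =
  begin
    missing_as_array["rId1"]
  rescue TypeError => e
    e.class
  end

# Indexing an empty Hash with a missing key simply returns nil.
lookup_on_hash = missing_as_hash["rId1"]

puts lookup_on_array.inspect
puts lookup_on_hash.inspect
```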