Adds each_x_by_sql, Model.each_row features
* Removes in-line testing.
* Works for ActiveRecord (Rails) 4.1
* Still need to test AR 4.0
* Not planning on supporting AR 3.x anymore! Use gem v0.4.x
* Should be ready for use now, but more testing underway.
afair committed May 18, 2014
1 parent eeb30c4 commit 8744c65
Showing 5 changed files with 61 additions and 32 deletions.
11 changes: 6 additions & 5 deletions README.rdoc
@@ -13,10 +13,10 @@ around 20 rows is returned to the user. When you do this
Model.find(:all, :conditions=>["id>0"])

The database returns all matching result set rows to ActiveRecord, which instantiates each row with
the data returned. This function returns an array of all these rows to the caller.

Asynchronous, background, or offline processing may require processing a large amount of data.
When there is a very large number of rows, this requires a lot more memory to hold the data. Ruby
does not return that memory after processing the array, and this causes your process to "bloat". If you
don't have enough memory, it will cause an exception.

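To make the memory argument above concrete, here is a minimal pure-Ruby sketch that is not part of the gem. `fake_rows` is a hypothetical stand-in for a database result set, and both helper methods are invented for illustration. It contrasts materializing every row into one array with processing the same rows in fixed-size slices, counting how many rows are held at once in each case.

```ruby
# Hypothetical stand-in for a large result set: rows are generated on demand,
# the way a database cursor hands them out, rather than stored up front.
def fake_rows(count)
  Enumerator.new do |rows|
    count.times { |i| rows << { "id" => i + 1 } }
  end
end

# Approach 1: materialize everything into a single array.
# Returns the number of rows held in memory at once.
def rows_held_when_loading_all(count)
  fake_rows(count).to_a.size
end

# Approach 2: process the rows in slices of chunk_size.
# Returns [rows processed, largest number of rows held at once].
def rows_held_when_chunking(count, chunk_size)
  processed = 0
  peak = 0
  fake_rows(count).each_slice(chunk_size) do |chunk|
    peak = [peak, chunk.size].max
    processed += chunk.size
  end
  [processed, peak]
end

puts rows_held_when_loading_all(10_000)
processed, peak = rows_held_when_chunking(10_000, 1_000)
puts "#{processed} rows processed, at most #{peak} held at once"
```

Both approaches visit every row, but only the first needs room for the whole result set at the same time.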
@@ -46,7 +46,7 @@ to declare a cursor to run a given query returning "chunks" of rows to the appli
retaining the position of the full result set in the database. This overcomes all the disadvantages
of using find_each and find_in_batches.

Also, with PostgreSQL, you have the option to have raw hashes of the rows returned instead of the
instantiated models. An informal benchmark showed that returning instances is about 4 times
slower than returning hashes. If you can work with the data in this form, you will find better
performance.
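The cursor approach described in this README hunk can be sketched roughly as follows. This is an illustration under stated assumptions, not the gem's implementation: `RecordingConnection` and `each_row_with_cursor` are hypothetical names, and the stub records SQL and returns canned rows instead of talking to PostgreSQL. The statements themselves (`DECLARE ... CURSOR FOR`, `FETCH n FROM`, `CLOSE`) are standard PostgreSQL, and a cursor declared this way has to live inside a transaction.

```ruby
# Hypothetical stub connection: records each statement and serves canned rows.
class RecordingConnection
  attr_reader :statements

  def initialize(rows)
    @rows = rows.dup
    @statements = []
  end

  # Records the statement. For FETCH, returns the next chunk of canned rows;
  # for everything else, returns an empty array.
  def execute(sql)
    @statements << sql
    if (match = sql.match(/\AFETCH (\d+) FROM/i))
      @rows.shift(match[1].to_i)
    else
      []
    end
  end
end

# Declares a cursor for sql, fetches block_size rows at a time, and yields
# each row hash to the block. Returns the number of rows yielded.
# (Assumed helper; a real version would also roll back on errors.)
def each_row_with_cursor(conn, sql, block_size: 1000, name: "cursor_1")
  count = 0
  conn.execute("BEGIN")
  conn.execute("DECLARE #{name} CURSOR FOR #{sql}")
  loop do
    rows = conn.execute("FETCH #{block_size} FROM #{name}")
    break if rows.empty?
    rows.each do |row|
      yield row
      count += 1
    end
  end
  conn.execute("CLOSE #{name}")
  conn.execute("COMMIT")
  count
end

conn = RecordingConnection.new((1..5).map { |i| { "id" => i } })
total = each_row_with_cursor(conn, "select * from posts", block_size: 2) { |row| p row }
puts "#{total} rows, #{conn.statements.size} statements sent"
```

Only `block_size` rows are in the application at any moment, while the database keeps the position in the full result set.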
@@ -79,9 +79,10 @@ Allen Fair, allen.fair@gmail.com, http://github.com/afair
Thank you to:
* Iulian Dogariu, http://github.com/iulianu (Fixes)
* Julian Mehnle, julian@mehnle.net (Suggestions)
* ...And all the other contributors!

== Note on Patches/Pull Requests

* Fork the project.
* Make your feature addition or bug fix.
* Add tests for it. This is important so I don't break it in a
@@ -92,4 +93,4 @@ Thank you to:

== Copyright

-Copyright (c) 2010 Allen Fair. See LICENSE for details.
+Copyright (c) 2010-2014 Allen Fair. See LICENSE for details.
18 changes: 1 addition & 17 deletions lib/postgresql_cursor.rb
@@ -7,22 +7,6 @@
# ActiveRecord 4.x
require 'active_record'
require 'active_record/connection_adapters/postgresql_adapter'
-ActiveRecord::Base.include(PostgreSQLCursor::ActiveRecord::SqlCursor)
+ActiveRecord::Base.extend(PostgreSQLCursor::ActiveRecord::SqlCursor)
ActiveRecord::Relation.include(PostgreSQLCursor::ActiveRecord::Relation::CursorIterators)
ActiveRecord::ConnectionAdapters::PostgreSQLAdapter.include(PostgreSQLCursor::ActiveRecord::ConnectionAdapters::PostgreSQLTypeMap)
-
-# Temp test
-ActiveRecord::Base.establish_connection(
-  "postgres://#{ENV['USER']}:@localhost/#{ENV['USER']}"
-)
-
-class List < ActiveRecord::Base
-  self.table_name = 'list'
-end
-
-List.order("list_id").each_hash {|r| p r }
-List.order("list_id").each_instance {|r|
-  r.upd_ts
-  $r = r
-  p r
-}
Original file line number Diff line number Diff line change
@@ -3,7 +3,7 @@ module PostgreSQLCursor
module ActiveRecord
module ConnectionAdapters
module PostgreSQLTypeMap
-def get_type_map
+def get_type_map # :nodoc:
type_map
end
end
11 changes: 8 additions & 3 deletions lib/postgresql_cursor/active_record/relation/cursor_iterators.rb
@@ -7,12 +7,15 @@ module CursorIterators
# Public: Executes the query, returning each row as a hash
# to the given block.
#
# options - Hash to control
# fraction: 0.1..1.0 - The cursor_tuple_fraction (default 1.0)
# block_size: 1..n - The number of rows to fetch per db block fetch
# while: value - Exits loop when block does not return this value.
# until: value - Exits loop when block returns this value.
#
# Example:
# Post.where(user_id:123).each_row { |hash| Post.process(hash) }
#
# Returns the number of rows yielded to the block
def each_row(options={}, &block)
options = {:connection => self.connection}.merge(options)
@@ -22,13 +25,15 @@ def each_row(options={}, &block)

# Public: Like each_row, but returns an instantiated model object to the block
#
# Parameters: same as each_row
#
# Example:
# Post.where(user_id:123).each_instance { |post| post.process }
#
# Returns the number of rows yielded to the block
def each_instance(options={}, &block)
options = {:connection => self.connection}.merge(options)
options[:symbolize_keys] = false # Must be strings to initiate
-pgresult = nil
PostgreSQLCursor::Cursor.new(to_sql, options).each do |row, column_types|
model = instantiate(row, column_types)
yield model
51 changes: 45 additions & 6 deletions lib/postgresql_cursor/active_record/sql_cursor.rb
@@ -1,30 +1,69 @@
module PostgreSQLCursor
module ActiveRecord
module SqlCursor
# Public: Executes the query, returning each row as a hash
# to the given block.
#
# options - Hash to control
# fraction: 0.1..1.0 - The cursor_tuple_fraction (default 1.0)
# block_size: 1..n - The number of rows to fetch per db block fetch
# while: value - Exits loop when block does not return this value.
# until: value - Exits loop when block returns this value.
#
# Example:
# Post.each_row { |hash| Post.process(hash) }
#
# Returns the number of rows yielded to the block
def each_row(options={}, &block)
options = {:connection => self.connection}.merge(options)
all.each_row(options, &block)
end
alias :each_hash :each_row

# Public: Like each_row, but returns an instantiated model object to the block
#
# Parameters: same as each_row
#
# Example:
# Post.each_instance { |post| post.process }
#
# Returns the number of rows yielded to the block
def each_instance(options={}, &block)
options = {:connection => self.connection}.merge(options)
all.each_instance(options, &block)
end

# Public: Returns each row as a hash to the given block
#
# sql - Full SQL statement, variables interpolated
# options - Hash to control
# fraction: 0.1..1.0 - The cursor_tuple_fraction (default 1.0)
# block_size: 1..n - The number of rows to fetch per db block fetch
# while: value - Exits loop when block does not return this value.
# until: value - Exits loop when block returns this value.
#
# Example:
# Post.each_row_by_sql("select * from posts") { |hash| Post.process(hash) }
#
# Returns the number of rows yielded to the block
-def self.each_row_by_sql(sql, options={}, &block)
+def each_row_by_sql(sql, options={}, &block)
options = {:connection => self.connection}.merge(options)
-PostgreSQLCursor.new(sql, options).each(&block)
+PostgreSQLCursor::Cursor.new(sql, options).each(&block)
end
alias :each_hash_by_sql :each_row_by_sql

# Public: Returns each row as a model instance to the given block
# As this instantiates a model object, it is slower than each_row_by_sql
#
# Parameters: see each_row_by_sql
#
# Example:
# Post.each_instance_by_sql("select * from posts") { |post| post.process }
#
# Returns the number of rows yielded to the block
-def self.each_instance_by_sql(sql, options={}, &block)
+def each_instance_by_sql(sql, options={}, &block)
options = {:connection => self.connection}.merge(options)
-PostgreSQLCursor.new(sql, options).each do |row|
+PostgreSQLCursor::Cursor.new(sql, options).each do |row|
model = instantiate(row)
yield model
end

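The `while:` and `until:` options documented in the method comments above control early exit from the loop based on what the block returns. The pure-Ruby sketch below shows one way to read that description. `each_with_exit` is a hypothetical helper over a plain array, not the gem's `Cursor` class, and counting the row that triggers the exit is an assumption; the diff does not show how the gem counts it.

```ruby
# Hypothetical helper illustrating the documented while:/until: semantics.
#   while: value - stop once the block returns something other than value
#   until: value - stop once the block returns value
# Returns the number of rows yielded to the block (assumption: the row that
# triggers the exit is included in the count).
def each_with_exit(rows, options = {})
  count = 0
  rows.each do |row|
    result = yield row
    count += 1
    break if options.key?(:while) && result != options[:while]
    break if options.key?(:until) && result == options[:until]
  end
  count
end

rows = [{ "id" => 1 }, { "id" => 2 }, { "id" => 3 }, { "id" => 4 }]

# Stops after the block first returns true (the row with id 2).
puts each_with_exit(rows, until: true) { |row| row["id"] == 2 }

# Keeps going only while the block returns true; stops at the row with id 3.
puts each_with_exit(rows, while: true) { |row| row["id"] < 3 }
```

With neither option given, every row is yielded and the return value is the full row count.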