References and Values

Learning Goals

At the end of this lesson, students should be able to...

  • Discuss how objects are stored in Ruby
  • Differentiate between references and values
  • Compare modifying an object with reassigning a variable

Motivation

We're going to start today with some Ruby code that does something a little unexpected. It's a method that takes an array of strings as an argument, and truncates (chops off the end of) all the strings with more than three characters. Or at least that's the idea.

def short_strings(input)
  result = []
  input.each do |word|
    # Slice characters 0 to 2
    result << word[0..2]
  end
  input = result
end

pets = ['dog', 'parrot', 'cat', 'llama']
short_strings(pets)
puts "#{pets}"

Running this code results in ["dog", "parrot", "cat", "llama"]. The array is unchanged! Let's do some debugging:

def short_strings(input)
  result = []
  input.each do |word|
    # Slice characters 0 to 2
    result << word[0..2]
  end
  input = result
  puts "Inside short_strings, input is"
  puts "#{input}"
end

pets = ['dog', 'parrot', 'cat', 'llama']
short_strings(pets)
puts "After calling short_strings"
puts "#{pets}"

The results:

Inside short_strings, input is
["dog", "par", "cat", "lla"]
After calling short_strings
["dog", "parrot", "cat", "llama"]

Seems like our method is indeed creating a list of shortened words, but our outer variable isn't being updated. The reason why has to do with references and values, and how data is stored in a computer.

As an aside: one way to fix our method is to simply return the new array, and when calling it say pets = short_strings(pets). However, sometimes this isn't an option - for example, what if our method was supposed to return the number of words that had to be truncated?
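The return-value fix from the aside can be sketched as follows. The method name short_strings_copy is hypothetical, chosen only to keep it distinct from the lesson's short_strings.

```ruby
# Hypothetical variant of short_strings: builds and returns a new array
# instead of trying to change the caller's variable.
def short_strings_copy(input)
  result = []
  input.each do |word|
    # Slice characters 0 to 2
    result << word[0..2]
  end
  result # the last expression is the method's return value
end

pets = ['dog', 'parrot', 'cat', 'llama']
pets = short_strings_copy(pets) # the caller reassigns its own variable
puts "#{pets}"                  # ["dog", "par", "cat", "lla"]
```

This works because the caller does the reassignment itself, but it uses up the method's return value.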

References and Values

When we create an array in Ruby (or a string or a hash or any other complex data type), we're actually creating two things.

The first is the value of the array, which involves asking the operating system for a bit of memory and then putting our data in it. You can think of this as the actual object. Each piece of memory we get from the OS has an address identifying where it lives, which is how we get back to it later.

The second is a reference to the array, which ties together the address of that memory with a name for our program to use. References are sometimes called pointers (especially in C), and we say that a variable points to or references an object.

[Figure: references and variables]

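Every variable in Ruby consists of these two parts, a reference and a value. Normally when you type the variable's name, Ruby automatically goes and gets the object. If you want to tell objects apart, you can use the object_id method, which returns a number that is unique to each object. Treat it as a stand-in for the address rather than the address itself: older Ruby versions derived it from the memory address, but recent versions do not promise any relation to it.

# object_id returns an identifier that is unique to each object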
pets = ["dog", "parrot", "cat", "llama"]
puts "pets.object_id: #{pets.object_id}"

# Different objects have different IDs
veggies = ["turnip", "beet"]
puts "veggies.object_id: #{veggies.object_id}"
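As a small sketch of what object_id distinguishes: two arrays can hold equal contents and still be separate objects. The helper same_object? is a hypothetical name used only for this illustration; it wraps Ruby's built-in equal? method, which checks identity rather than contents.

```ruby
# Hypothetical helper: true only if both arguments are the very same object.
def same_object?(a, b)
  a.equal?(b) # equal? compares identity; == compares contents
end

first  = ["turnip", "beet"]
second = ["turnip", "beet"]

puts first == second                      # true: same contents
puts same_object?(first, second)          # false: two separate objects
puts same_object?(first, first)           # true
puts first.object_id == second.object_id  # false
```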

The = Operator

The = operator changes what a variable points at.

If we assign one variable to another variable, they will both reference the same underlying object.

# Two variables can point to the same object
repeat = veggies
puts "repeat.object_id: #{repeat.object_id}" # same as veggies.object_id

[Figure: referencing a variable twice]

If we make changes to the object through one variable, we can see the changes via the other. The variables are just names, but the underlying object is the same.

veggies[1] = "onion"
repeat.push("potato")
puts "#{veggies}"     # ["turnip", "onion", "potato"]
puts "#{repeat}"      # ["turnip", "onion", "potato"]

[Figure: modifying the underlying object]

When we use the = operator, we are not changing the underlying object but instead changing what our variable points to. This does not affect any other variables.

repeat = ["new", "array"]
puts "repeat.object_id: #{repeat.object_id}"
puts "value of repeat:"
puts "#{repeat}"    # ["new", "array"]
puts "value of veggies:"
puts "#{veggies}"   # ["turnip", "onion", "potato"]

[Figure: creating a new array]

So to summarize, if two variables point to the same underlying object:

  • Modifications to the object (the value) will be visible from both variables
  • Reassigning one variable (the reference) with = does not affect the other variable

Note that += and the other shorthand operators all involve reassignment. If we say veggies += ['rutabaga'], Ruby creates a new array, copies all the values from veggies, adds in rutabaga, and reassigns veggies to point to this new array. This is true of strings and numbers as well.

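In general, calling a method on an object like .concat() or .push() will change the underlying object, while assigning to a variable name with = (or a shorthand like +=) will result in reassignment. Element assignment such as veggies[1] = "onion" looks like reassignment but is not: it is a method call on the array, so it modifies the underlying object.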
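One way to check this distinction is to give an array a second name and see whether the two names still share an object after each operation. This is a minimal sketch; the method name compare_growth and the hash it returns are hypothetical, used only to collect the observations.

```ruby
# Hypothetical helper that runs the experiment and returns what it observed.
def compare_growth
  veggies = ['turnip', 'beet']
  repeat = veggies               # second name for the same array

  veggies.concat(['onion'])      # modifies the shared object
  after_concat = repeat.equal?(veggies)

  veggies += ['rutabaga']        # builds a new array and reassigns veggies only
  after_plus_equals = repeat.equal?(veggies)

  { after_concat: after_concat, after_plus_equals: after_plus_equals,
    veggies: veggies, repeat: repeat }
end

observed = compare_growth
puts observed[:after_concat]       # true
puts observed[:after_plus_equals]  # false
puts "#{observed[:veggies]}"       # ["turnip", "beet", "onion", "rutabaga"]
puts "#{observed[:repeat]}"        # ["turnip", "beet", "onion"]
```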

Passing Parameters

Question: When we pass a parameter to a method, what does the method get?

  • Is it the same underlying object?
  • Is it the same variable?
  • How can we find out?

Let's write some code that will help us investigate this.

def reassign_parameter(param)
  puts "  Inside reassign_parameter"
  puts "  at start, param.object_id is #{param.object_id}"

  # .push modifies the underlying object
  param.push('gecko')
  puts "  after modification, param.object_id is #{param.object_id}"

  # = changes the reference
  param = ["new", "array"]
  puts "  after reassignment, param.object_id is #{param.object_id}"
  puts "  with value #{param}"
  puts "  Finish reassign_parameter"
end

pets = ["dog", "parrot", "cat", "llama"]
puts "Before reassign_parameter"
puts "pets.object_id is #{pets.object_id}"
puts

reassign_parameter(pets)

puts
puts "After reassign_parameter"
puts "pets.object_id is #{pets.object_id}"
puts "with value #{pets}"

Before running this code, take a couple minutes to read through it. What is it doing? What do you expect the output to be?

Running the code yields (your object_ids may be different):

Before reassign_parameter
pets.object_id is 70144030241620

  Inside reassign_parameter
  at start, param.object_id is 70144030241620
  after modification, param.object_id is 70144030241620
  after reassignment, param.object_id is 70144030228060
  with value ["new", "array"]
  Finish reassign_parameter

After reassign_parameter
pets.object_id is 70144030241620
with value ["dog", "parrot", "cat", "llama", "gecko"]

We can make a few interesting observations about this output:

  • The parameter inside the method has the same object_id as the variable we passed from outside
  • Modifications to the underlying object are visible outside the method
  • Reassigning the parameter with = does not reassign the outer variable

This is exactly the same behavior we saw before, when we had two variables referencing the same object. From this we can conclude: when you pass a variable as a parameter, Ruby creates a new variable that references the same object.

Fixing the short_strings Method

Question: Given what we've learned, how can we modify our short_strings method to do what we want?

The answer is to modify the underlying object, rather than reassigning the parameter. Here's what the resulting code might look like:

def short_strings(input)
  input.each_with_index do |word, i|
    # Slice characters 0 to 2
    input[i] = word[0..2]
  end
end

pets = ['dog', 'parrot', 'cat', 'llama']
short_strings(pets)
puts "#{pets}"

This produces the expected output. Note that we can't just say word = word[0..2], for the same reason as above: that reassigns the block parameter word to a new string containing just the first 3 letters, but neither modifies nor reassigns the string in the array. Instead we reassign input[i], which does what we want: change the value stored in the array.
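To see why reassigning the block parameter has no effect, here is a sketch of the version that does not work. The method name short_strings_broken is hypothetical.

```ruby
# Hypothetical non-working variant: reassigns the block parameter only.
def short_strings_broken(input)
  input.each do |word|
    # word now references a new 3-letter string;
    # the element stored in the array is untouched
    word = word[0..2]
  end
end

pets = ['dog', 'parrot', 'cat', 'llama']
short_strings_broken(pets)
puts "#{pets}" # ["dog", "parrot", "cat", "llama"]
```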

We could also use the array's map! method, since that modifies the original. map (without a !) would not work, because it creates a new array.
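A sketch of the map! approach, which also covers the earlier case where the method needs its return value for something else. The name truncate_in_place! is hypothetical, as is the choice to return the number of truncated words.

```ruby
# Hypothetical variant: shortens the strings in the caller's array and
# returns how many words had to be truncated.
def truncate_in_place!(input)
  truncated = input.count { |word| word.length > 3 }
  # map! replaces each element of the existing array with the block's result
  input.map! { |word| word[0..2] }
  truncated
end

pets = ['dog', 'parrot', 'cat', 'llama']
count = truncate_in_place!(pets)
puts count     # 2
puts "#{pets}" # ["dog", "par", "cat", "lla"]
```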

Other Objects

We've talked a lot about arrays today, but this pattern holds true for all complex objects in Ruby: strings, hashes, instances of classes, etc. For example, consider the following code:

# Reassign a string using +=
def reassign_string(str)
  str += ' reassigned'
  puts "inside reassign_string, str is '#{str}'"
end

text = 'original'
reassign_string(text)
puts "outside reassign_string, text is '#{text}'"


# Modify a string using the .concat() method
def modify_string(str)
  # str << ' modified' would do the same thing
  str.concat(' modified')
  puts "inside modify_string, str is '#{str}'"
end

text = 'original'
modify_string(text)
puts "outside modify_string, text is '#{text}'"

Primitive types like numbers, booleans and nil follow basically the same rules. The catch is there's no way to change the underlying value of a primitive without reassignment. In programming lingo, we say that these types are immutable. This means that an operation like x += 1 never changes the existing object; it produces a different object and points the variable at it.
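A small sketch of what immutability means in practice; the method name add_one is hypothetical. frozen? is Ruby's built-in check for whether an object can be modified.

```ruby
# Hypothetical helper: += reassigns the local parameter to a different number.
def add_one(number)
  number += 1
  number
end

count = 5
result = add_one(count)
puts result        # 6
puts count         # 5: the caller's variable still references 5

# Numbers, booleans and nil report themselves as frozen (unmodifiable).
puts 5.frozen?     # true
puts true.frozen?  # true
puts nil.frozen?   # true
```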

Takeaway

  • A variable in Ruby consists of two things:
    • The variable itself, tying a name to an address in memory
    • The object at that memory address
    • We say a variable references or points to an object
  • Multiple variables can reference the same object
    • Changes to the underlying object will be reflected through both variables
      • Methods like .push() or .concat()
    • Changing what one variable points to does not affect any other variables
      • =, +=, etc.
  • Passing an argument to a method creates a new variable referencing the same object
  • Primitives (numbers, booleans and nil) are immutable, meaning the underlying object can't be modified

Additional Resources