Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lockfile to mitigate r10k race condition. #268

Merged
merged 3 commits into from
Jan 5, 2017
Merged

Lockfile to mitigate r10k race condition. #268

merged 3 commits into from
Jan 5, 2017

Conversation

binford2k
Copy link
Member

r10k has a small race condition which can cause failed deploys if two happen
more or less simultaneously. To mitigate, we just lock on a file and wait for
the other one to complete. Note that this doesn't catch the potential
race that we get due to the fork-exec, but it at least makes the window
smaller. This dropped failure rate in a busy classroom from ~30% to zero.

The proper fix is to lock in r10k, of course.

r10k has a small race condition which can cause failed deploys if two happen
more or less simultaneously. To mitigate, we just lock on a file and wait for
the other one to complete. Note that this doesn't catch the potential
race that we get due to the fork-exec, but it at least makes the window
smaller. This dropped failure rate in a busy classroom from ~30% to zero.

The proper fix is to lock in r10k, of course.
@rnelson0
Copy link
Member

+1, we have this issue with a Stash server, where you can choose to delete the branch at the time of merge or not at all (unlike github which leaves a delete button to click on later). Those events fire pretty much at the exact some moment and who knows which one wins, but everyone loses.

@acidprime
Copy link
Collaborator

@rnelson0 can you test this patch in your environment and report back to this PR?

@rnelson0
Copy link
Member

I'll try on Monday. Been running solo for a week and want to enjoy my weekend at least a little bit :)

@rnelson0
Copy link
Member

rnelson0 commented May 4, 2016

@acidprime @binford2k Yes and no. Here's the result in /var/log/webhook/access.log. The trace was reformatted because it had no line feeds.

[2016-05-04 18:57:02] DEBUG accept: AAA.BB.CC.DDD:49541
[2016-05-04 18:57:02] DEBUG Rack::Handler::WEBrick is invoked.
[2016-05-04 18:57:02] INFO  authenticated: puppet
[2016-05-04 18:57:02] DEBUG accept: AAA.BB.CC.DDD:27937
[2016-05-04 18:57:02] DEBUG Rack::Handler::WEBrick is invoked.
[2016-05-04 18:57:02] INFO  authenticated: puppet
[2016-05-04 18:57:07] INFO  message: triggered: umask 0022; r10k deploy environment production -pv

WARN     -> The r10k configuration file at /etc/r10k.yaml is deprecated.
WARN     -> Please move your r10k configuration to /etc/puppetlabs/r10k/r10k.yaml.
INFO     -> Deploying environment /etc/puppetlabs/puppet/environments/production
INFO     -> Environment production is now at bd850e7b16a185b1ca963241765fd4924f9e3502
INFO     -> Deploying module /etc/puppetlabs/puppet/environments/production/modules/dhcp
INFO     -> Deploying module /etc/puppetlabs/puppet/environments/production/modules/make
INFO     -> Deploying module /etc/puppetlabs/puppet/environments/production/modules/common
INFO     -> Deploying module /etc/puppetlabs/puppet/environments/production/modules/kickstart
INFO     -> Deploying module /etc/puppetlabs/puppet/environments/production/modules/nfs
INFO     -> Deploying module /etc/puppetlabs/puppet/environments/production/modules/docker
INFO     -> Deploying module /etc/puppetlabs/puppet/environments/production/modules/augeasproviders_core
INFO     -> Deploying module /etc/puppetlabs/puppet/environments/production/modules/augeasproviders_shellvar
INFO     -> Deploying module /etc/puppetlabs/puppet/environments/production/modules/hiera
INFO     -> Deploying module /etc/puppetlabs/puppet/environments/production/modules/staging
INFO     -> Deploying module /etc/puppetlabs/puppet/environments/production/modules/createrepo
INFO     -> Deploying module /etc/puppetlabs/puppet/environments/production/modules/community_kickstarts
INFO     -> Deploying module /etc/puppetlabs/puppet/environments/production/modules/windowsfeature
INFO     -> Deploying module /etc/puppetlabs/puppet/environments/production/modules/apache
INFO     -> Deploying module /etc/puppetlabs/puppet/environments/production/modules/concat
INFO     -> Deploying module /etc/puppetlabs/puppet/environments/production/modules/f5
INFO     -> Deploying module /etc/puppetlabs/puppet/environments/production/modules/firewall
INFO     -> Deploying module /etc/puppetlabs/puppet/environments/production/modules/gcc
INFO     -> Deploying module /etc/puppetlabs/puppet/environments/production/modules/git
INFO     -> Deploying module /etc/puppetlabs/puppet/environments/production/modules/inifile
INFO     -> Deploying module /etc/puppetlabs/puppet/environments/production/modules/lvm
INFO     -> Deploying module /etc/puppetlabs/puppet/environments/production/modules/mysql
INFO     -> Deploying module /etc/puppetlabs/puppet/environments/production/modules/ntp
INFO     -> Deploying module /etc/puppetlabs/puppet/environments/production/modules/pe_gem
INFO     -> Deploying module /etc/puppetlabs/puppet/environments/production/modules/puppetserver_gem
INFO     -> Deploying module /etc/puppetlabs/puppet/environments/production/modules/powershell
INFO     -> Deploying module /etc/puppetlabs/puppet/environments/production/modules/ruby
INFO     -> Deploying module /etc/puppetlabs/puppet/environments/production/modules/stdlib
INFO     -> Deploying module /etc/puppetlabs/puppet/environments/production/modules/tftp
INFO     -> Deploying module /etc/puppetlabs/puppet/environments/production/modules/vcsrepo
INFO     -> Deploying module /etc/puppetlabs/puppet/environments/production/modules/xinetd
INFO     -> Deploying module /etc/puppetlabs/puppet/environments/production/modules/certs
INFO     -> Deploying module /etc/puppetlabs/puppet/environments/production/modules/domain_join
INFO     -> Deploying module /etc/puppetlabs/puppet/environments/production/modules/hiera_resources
INFO     -> Deploying module /etc/puppetlabs/puppet/environments/production/modules/logrotate
INFO     -> Deploying module /etc/puppetlabs/puppet/environments/production/modules/squid3
INFO     -> Deploying module /etc/puppetlabs/puppet/environments/production/modules/motd
INFO     -> Deploying module /etc/puppetlabs/puppet/environments/production/modules/rsyslog
INFO     -> Deploying module /etc/puppetlabs/puppet/environments/production/modules/ssh
INFO     -> Deploying module /etc/puppetlabs/puppet/environments/production/modules/sudo
INFO     -> Deploying module /etc/puppetlabs/puppet/environments/production/modules/epel
INFO     -> Deploying module /etc/puppetlabs/puppet/environments/production/modules/zabbix
INFO     -> Deploying module /etc/puppetlabs/puppet/environments/production/modules/exports
INFO     -> Deploying module /etc/puppetlabs/puppet/environments/production/modules/r10k
INFO     -> Deploying module /etc/puppetlabs/puppet/environments/production/modules/kickstart_files
INFO     -> Deploying module /etc/puppetlabs/puppet/environments/production/modules/site_certificates
INFO     -> Removing unmanaged path /etc/puppetlabs/puppet/environments/noop
 branch: production
[2016-05-04 18:57:07] DEBUG close: AAA.BB.CC.DDD:49541
[2016-05-04 18:57:08] ERROR message:
WARN     -> The r10k configuration file at /etc/r10k.yaml is deprecated.
WARN     -> Please move your r10k configuration to /etc/puppetlabs/r10k/r10k.yaml.
ERROR    -> Environment(s) 'noop' cannot be found in any source and will not be deployed.
 trace: [
    "/usr/local/bin/webhook:197:in `block in run_command'", 
    "/usr/local/bin/webhook:183:in `open'", 
    "/usr/local/bin/webhook:183:in `run_command'", 
    "/usr/local/bin/webhook:237:in `deploy'", 
    "/usr/local/bin/webhook:156:in `block in <class:Server>'", 
    "/opt/puppetlabs/puppet/lib/ruby/gems/2.1.0/gems/sinatra-1.4.7/lib/sinatra/base.rb:1611:in `call'", 
    "/opt/puppetlabs/puppet/lib/ruby/gems/2.1.0/gems/sinatra-1.4.7/lib/sinatra/base.rb:1611:in `block in compile!'", 
    "/opt/puppetlabs/puppet/lib/ruby/gems/2.1.0/gems/sinatra-1.4.7/lib/sinatra/base.rb:975:in `[]'", 
    "/opt/puppetlabs/puppet/lib/ruby/gems/2.1.0/gems/sinatra-1.4.7/lib/sinatra/base.rb:975:in `block (3 levels) in route!'", 
    "/opt/puppetlabs/puppet/lib/ruby/gems/2.1.0/gems/sinatra-1.4.7/lib/sinatra/base.rb:994:in `route_eval'", 
    "/opt/puppetlabs/puppet/lib/ruby/gems/2.1.0/gems/sinatra-1.4.7/lib/sinatra/base.rb:975:in `block (2 levels) in route!'", 
    "/opt/puppetlabs/puppet/lib/ruby/gems/2.1.0/gems/sinatra-1.4.7/lib/sinatra/base.rb:1015:in `block in process_route'", 
    "/opt/puppetlabs/puppet/lib/ruby/gems/2.1.0/gems/sinatra-1.4.7/lib/sinatra/base.rb:1013:in `catch'", 
    "/opt/puppetlabs/puppet/lib/ruby/gems/2.1.0/gems/sinatra-1.4.7/lib/sinatra/base.rb:1013:in `process_route'", 
    "/opt/puppetlabs/puppet/lib/ruby/gems/2.1.0/gems/sinatra-1.4.7/lib/sinatra/base.rb:973:in `block in route!'", 
    "/opt/puppetlabs/puppet/lib/ruby/gems/2.1.0/gems/sinatra-1.4.7/lib/sinatra/base.rb:972:in `each'", 
    "/opt/puppetlabs/puppet/lib/ruby/gems/2.1.0/gems/sinatra-1.4.7/lib/sinatra/base.rb:972:in `route!'", 
    "/opt/puppetlabs/puppet/lib/ruby/gems/2.1.0/gems/sinatra-1.4.7/lib/sinatra/base.rb:1085:in `block in dispatch!'", 
    "/opt/puppetlabs/puppet/lib/ruby/gems/2.1.0/gems/sinatra-1.4.7/lib/sinatra/base.rb:1067:in `block in invoke'", 
    "/opt/puppetlabs/puppet/lib/ruby/gems/2.1.0/gems/sinatra-1.4.7/lib/sinatra/base.rb:1067:in `catch'", 
    "/opt/puppetlabs/puppet/lib/ruby/gems/2.1.0/gems/sinatra-1.4.7/lib/sinatra/base.rb:1067:in `invoke'", 
    "/opt/puppetlabs/puppet/lib/ruby/gems/2.1.0/gems/sinatra-1.4.7/lib/sinatra/base.rb:1082:in `dispatch!'", 
    "/opt/puppetlabs/puppet/lib/ruby/gems/2.1.0/gems/sinatra-1.4.7/lib/sinatra/base.rb:907:in `block in call!'", 
    "/opt/puppetlabs/puppet/lib/ruby/gems/2.1.0/gems/sinatra-1.4.7/lib/sinatra/base.rb:1067:in `block in invoke'", 
    "/opt/puppetlabs/puppet/lib/ruby/gems/2.1.0/gems/sinatra-1.4.7/lib/sinatra/base.rb:1067:in `catch'", 
    "/opt/puppetlabs/puppet/lib/ruby/gems/2.1.0/gems/sinatra-1.4.7/lib/sinatra/base.rb:1067:in `invoke'", 
    "/opt/puppetlabs/puppet/lib/ruby/gems/2.1.0/gems/sinatra-1.4.7/lib/sinatra/base.rb:907:in `call!'", 
    "/opt/puppetlabs/puppet/lib/ruby/gems/2.1.0/gems/sinatra-1.4.7/lib/sinatra/base.rb:895:in `call'", 
    "/opt/puppetlabs/puppet/lib/ruby/gems/2.1.0/gems/rack-protection-1.5.3/lib/rack/protection/xss_header.rb:18:in `call'", 
    "/opt/puppetlabs/puppet/lib/ruby/gems/2.1.0/gems/rack-protection-1.5.3/lib/rack/protection/path_traversal.rb:16:in `call'", 
    "/opt/puppetlabs/puppet/lib/ruby/gems/2.1.0/gems/rack-protection-1.5.3/lib/rack/protection/json_csrf.rb:18:in `call'", 
    "/opt/puppetlabs/puppet/lib/ruby/gems/2.1.0/gems/rack-protection-1.5.3/lib/rack/protection/base.rb:49:in `call'", 
    "/opt/puppetlabs/puppet/lib/ruby/gems/2.1.0/gems/rack-protection-1.5.3/lib/rack/protection/base.rb:49:in `call'", 
    "/opt/puppetlabs/puppet/lib/ruby/gems/2.1.0/gems/rack-protection-1.5.3/lib/rack/protection/frame_options.rb:31:in `call'", 
    "/opt/puppetlabs/puppet/lib/ruby/gems/2.1.0/gems/rack-1.6.4/lib/rack/nulllogger.rb:9:in `call'", 
    "/opt/puppetlabs/puppet/lib/ruby/gems/2.1.0/gems/rack-1.6.4/lib/rack/head.rb:13:in `call'", 
    "/opt/puppetlabs/puppet/lib/ruby/gems/2.1.0/gems/sinatra-1.4.7/lib/sinatra/show_exceptions.rb:25:in `call'", 
    "/opt/puppetlabs/puppet/lib/ruby/gems/2.1.0/gems/sinatra-1.4.7/lib/sinatra/base.rb:182:in `call'", 
    "/opt/puppetlabs/puppet/lib/ruby/gems/2.1.0/gems/sinatra-1.4.7/lib/sinatra/base.rb:2013:in `call'", 
    "/opt/puppetlabs/puppet/lib/ruby/gems/2.1.0/gems/sinatra-1.4.7/lib/sinatra/base.rb:1487:in `block in call'", 
    "/opt/puppetlabs/puppet/lib/ruby/gems/2.1.0/gems/sinatra-1.4.7/lib/sinatra/base.rb:1787:in `synchronize'", 
    "/opt/puppetlabs/puppet/lib/ruby/gems/2.1.0/gems/sinatra-1.4.7/lib/sinatra/base.rb:1487:in `call'", 
    "/opt/puppetlabs/puppet/lib/ruby/gems/2.1.0/gems/rack-1.6.4/lib/rack/handler/webrick.rb:88:in `service'", 
    "/opt/puppetlabs/puppet/lib/ruby/2.1.0/webrick/httpserver.rb:138:in `service'", 
    "/opt/puppetlabs/puppet/lib/ruby/2.1.0/webrick/httpserver.rb:94:in `run'", 
    "/opt/puppetlabs/puppet/lib/ruby/2.1.0/webrick/server.rb:295:in `block in start_thread'"
]
[2016-05-04 18:57:08] DEBUG close: AAA.BB.CC.DDD:27937

It does not crap out and blow up git/r10k, and the second run's error is present anytime I delete a branch, even if it's not at the same time. That seems like an existing issue, but wanted to report it to make sure it's not a regression.

@rnelson0
Copy link
Member

rnelson0 commented May 6, 2016

@acidprime If you can get a new forge release out with this fix, we can bump and start using it in production next week, for more rapid testing and feedback with it. We may be going to PE code manager at work soon, which will limit our ability to test (my home setup is much more static).

@rnelson0
Copy link
Member

@acidprime ping

@rnelson0
Copy link
Member

@acidprime Ping. Would really love to see a new release that includes this. If there's anything I can do to help with maintenance of this module, let me know.

@acidprime
Copy link
Collaborator

@rnelson0 since code manager has shipped in PE I have let this module fall out of my day to day job responsibilities , I have been meaning to look into migrating it to vox populi. In the interim I have added you as a collaborator to this repo. If our interested in testing/running beaker tests on the feature PRs I am okay with you merging them and updating the Change log . I can cut a forge release for you once we are sure there is no breaking changes on a given feature tag.

@rnelson0
Copy link
Member

rnelson0 commented Nov 8, 2016

@binford2k could you rebase this once more to make sure it still fits and passes? I know it probably will but we did revamp the tests a good bit. Thanks!

@rnelson0
Copy link
Member

@binford2k the modulesync is complete, ready to rebase. Hit me up on slack/irc if you need to, the modulesync changed a LOT.

@rnelson0
Copy link
Member

Hrm, maybe it's not needed? Is there any relevant test for this?

@binford2k
Copy link
Member Author

I'm not sure what a relevant test would be. Do you think that spec/acceptance/basic_webhook_spec.rb should fire up a few threads and all try to sync simultaneously? Is that overkill? It's certainly somewhat indeterminate.

@adrienthebo what do you think would be a appropriate test here?

# the other one to complete.
file.flock(File::LOCK_EX)

if Open3.respond_to?('capture3')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh god, can we please, please let Ruby 1.8.7 die? Ruby 1.9.3 has this (https://ruby-doc.org/stdlib-1.9.3/libdoc/open3/rdoc/Open3.html#method-c-capture3), and even that version is EOL'd.

Please, Ruby 1.8.7 is old and tired. It just wants to go out to pasture, graze quietly, and enjoy the twilight years of its life. Stop making Ruby 1.8.7 suffer on.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ruby 1.8 is not well

@adrienthebo
Copy link
Member

@binford2k something like this

4.times.map do
  Thread.new do
    # curl webhook thingy
  end
end.each(&:join)

Also I'm going to take this time to say PLEASE LET RUBY 1.8.7 DIE AND NO I WILL NOT STOP HARPING ON THAT POINT BECAUSE AT THIS POINT I'M ENTERTAINED BY MY OWN COMPLAINING

@rnelson0
Copy link
Member

Vox Pupuli does not support ruby 1.8.7 and only has to support 1.9.3 for

another 60ish days!

Rob Nelson

@adrienthebo
Copy link
Member

@rnelson0 OH MY GOD I LOVE YOU SO MUCH AND YOU HAVE BROUGHT SUCH JOY TO MY COLD, DEAD HEART

@rnelson0 rnelson0 added the needs-work not ready to merge just yet label Nov 27, 2016
This triggers 4 deploys in rapid fire. They should all be queued up and
eventually succeed.
@@ -58,5 +58,17 @@ class {'r10k::webhook':
expect(r.exit_code).to eq(0)
end
end

it 'should successfully lock when hammered with multiple requests' do
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Put the it tests inside the shell loop to remedy the rubocop warnings on too many expectations. The other rubocop warnings should be pretty explanatory except the array one, but I've not seen that before, sorry.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These aren't really good validations. Having four separate it blocks implies that it's testing four different things, and it's not. It's testing one.

Not using the word "should" is fine, but all the other tests do, so that becomes inconsistent.

Not allowing a newline to delineate different blocks is just asinine, especially when theres a cascade of ends, like below. Which one ends which block?

          end
        end
      end
    end
  end
end

Its style guide even praises using newlines. https://github.com/bbatsov/ruby-style-guide#empty-lines-between-methods but has nothing for or against newlines around blocks.

The regex rule linked to doesn't actually say to always use %r, it says "Use %r only for regular expressions matching at least one '/' character." This regex does not, it simply matches the word "success". I'd change it anyways to make the rule happy, but then this test would be inconsistent with the rest.

I fixed the times.map (wasn't really needed anyway), took out the readability whitespace, and added a disable comment for the multiple expectations. If you want the other rules to pass, you ought to make a single commit to change all of them at once for consistency.

@rnelson0
Copy link
Member

In discussions with Ben on Slack, this will me merged and then all the relevant rubocop tests will be fixed. See https://travis-ci.org/voxpupuli/puppet-r10k/jobs/181471454 for the actual failures.

Ready to merge, pending ability to quickly PR the rubocop failures afterward.

@rnelson0 rnelson0 added ready and removed needs-work not ready to merge just yet labels Dec 14, 2016
@rnelson0 rnelson0 merged commit 1b8a56e into voxpupuli:master Jan 5, 2017
@rnelson0 rnelson0 removed the ready label Jan 5, 2017
rnelson0 added a commit that referenced this pull request Jan 5, 2017
rnelson0 added a commit that referenced this pull request Jan 5, 2017
rnelson0 added a commit that referenced this pull request Jan 5, 2017
rnelson0 added a commit that referenced this pull request Jan 5, 2017
@stanleyz
Copy link

Is the race condition still there in R10K? If not, should the lock mechanism in webhook binary and MCollective R10K plugin be removed? plus is there a link to the R10K racing condition issue? I didn't find one sorry.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants