Use terraform to install the masters #135
Conversation
Nice, I was about to do something very similar to this :)
This looks great!
Build SUCCESS, see build http://10.8.144.11:8080/job/dev-tools/149/
Build SUCCESS, see build http://10.8.144.11:8080/job/dev-tools/156/
I rebased and squashed. Masters seem to deploy fine for me. TODO list before merging:
Anything I missed?
Build SUCCESS, see build http://10.8.144.11:8080/job/dev-tools/174/
I think we could defer destroy to a different JIRA ticket if we want to get this in sooner rather than later. Do we want to wait for gophercloud/utils#82? We could just leave the hack to pull from the PR until it's merged.
Actually, delete is easy enough to implement; it's in the terraform provider now.
Awesome, I'll try it out.
Surely we can figure out some way to make the terraform-provider-ironic build from a clean checkout without having to modify a version-controlled file? Can't we vendor your fork? I hadn't looked at go modules too closely before, but you'd hope this wouldn't be such a crazy thing to want to do.
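For reference, a minimal sketch of the Go modules route, assuming a hypothetical fork path and a placeholder pseudo-version (neither is taken from this PR): `go mod edit -replace` pins the dependency to the fork, and committing the resulting go.mod would let a clean checkout build without further manual edits.

```sh
# Hypothetical sketch: pin gophercloud/utils to a fork carrying the
# unmerged changes. The fork path and pseudo-version are placeholders.
cd terraform-provider-ironic
go mod edit -replace \
  github.com/gophercloud/utils=github.com/stbenjam/utils@v0.0.0-20190315000000-0123456789ab
go mod tidy    # refresh go.sum for the replaced module
go build ./...
```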
Looks to be just about working. Got the error below the first time, then re-ran it, and it completed with no errors:
That should be fixed now; I read Ironic's state diagram wrong. I'll have a look at vendoring my gophercloud changes until the PR gets merged. https://github.com/Masterminds/glide looks promising.
Build SUCCESS, see build http://10.8.144.11:8080/job/dev-tools/181/
I updated openshift-metal3/terraform-provider-ironic#2 to use the merged gophercloud/utils. Do you want to try removing the hack for utils? Then we can merge openshift-metal3/terraform-provider-ironic#2.
Yeah, it's quite a thing! Agree we can move back later.
Seems to be working for me. Thanks!
@markmc I merged openshift-metal3/terraform-provider-ironic#2, so you can remove the hack for that too, and we should be good to go!
Ok, I think this is good to go if CI passes. I'm re-testing locally also.
Build SUCCESS, see build http://10.8.144.11:8080/job/dev-tools/188/
This is a result from the earlier push, where we were still using openshift-metal3/terraform-provider-ironic#2.
Looks good from my local testing.
Build SUCCESS, see build http://10.8.144.11:8080/job/dev-tools/189/
This looks good to me.
Looks like we need a rebase, but I can pull/test when done if we think this is ready to go? Otherwise lgtm - seems like a great step towards driving the master deployment via kni-installer :)
Go for it, Steve, thanks!
Rebased but haven't tested yet |
Build FAILURE, see build http://10.8.144.11:8080/job/dev-tools/225/
Install terraform and terraform-provider-ironic, and use them to replace the Ironic interaction in 07_deploy_masters.sh, now that terraform-provider-ironic supports delete.
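For readers following along, a rough sketch of the kind of Terraform configuration that replaces the scripted Ironic calls. Apart from the `ironic_node_v1` resource type and the master names, which appear in the failure logs later in the thread, the attribute names and values here are assumptions rather than excerpts from the PR.

```hcl
# Illustrative only: attribute names and values are assumed, not copied
# from the PR's actual .tf files.
provider "ironic" {
  url = "http://localhost:6385/v1"   # assumed local Ironic API endpoint
}

resource "ironic_node_v1" "openshift-master-0" {
  name   = "openshift-master-0"
  driver = "ipmi"

  # Drive the node to the "active" provision state (i.e. deployed),
  # which is what 07_deploy_masters.sh previously scripted by hand.
  target_provision_state = "active"

  driver_info = {
    ipmi_address  = "192.168.111.1"   # placeholder BMC details
    ipmi_username = "admin"
    ipmi_password = "password"
  }
}
```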
Ok, works fine for me with openshift-metalkube#165.
@stbenjam Hey did the concerns about the Ironic API load/locking get resolved, or should we hold this pending further investigation? |
@hardys It's not resolved. Throughout testing I've deployed with terraform maybe a dozen times, and I've personally run into the sqlite3 locking issue twice. I have only seen it while I'm deploying and doing something like [...].

@markmc Have you seen it at all?
Build FAILURE, see build http://10.8.144.11:8080/job/dev-tools/228/
With this PR, Yurii is seeing:
even with:
Dmitri suggests:
No. I haven't. |
Ok, tested and this works for me; let's merge it and iterate on the sqlite thing if it recurs. One thing to note is that folks need to destroy with the old cleanup before consuming this patch, as (understandably) the new destroy script fails due to the missing ocp/tf-master.
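Spelling that caveat out as a sketch: only the ocp/tf-master path comes from the comment above, and the old cleanup script name below is hypothetical.

```sh
# Clusters deployed before this PR have no terraform state, so tear
# them down with the old cleanup first (hypothetical script name):
./ocp_cleanup.sh

# Deployments made with this PR keep state under ocp/tf-master, and
# destroy goes through terraform from then on:
cd ocp/tf-master
terraform destroy -auto-approve
```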
Failed like:
ironic_node_v1.openshift-master-2: Still creating... (2m50s elapsed)
ironic_node_v1.openshift-master-0: Still creating... (2m50s elapsed)
ironic_node_v1.openshift-master-1: Still creating... (2m50s elapsed)
2019/03/15 10:29:45 [ERROR] root: eval: *terraform.EvalApplyPost, err: 1 error(s) occurred:
* ironic_node_v1.openshift-master-1: Internal Server Error
2019/03/15 10:29:45 [ERROR] root: eval: *terraform.EvalSequence, err: 1 error(s) occurred:
* ironic_node_v1.openshift-master-1: Internal Server Error
2019/03/15 10:29:45 [TRACE] [walkApply] Exiting eval tree: ironic_node_v1.openshift-master-1
2019/03/15 10:29:48 [ERROR] root: eval: *terraform.EvalApplyPost, err: 1 error(s) occurred:
* ironic_node_v1.openshift-master-2: Internal Server Error
2019/03/15 10:29:48 [ERROR] root: eval: *terraform.EvalSequence, err: 1 error(s) occurred:
* ironic_node_v1.openshift-master-2: Internal Server Error
2019/03/15 10:30:02 [DEBUG] plugin: waiting for all plugin processes to complete...
Error: Error applying plan:
3 error(s) occurred:
* ironic_node_v1.openshift-master-0: 1 error(s) occurred:
* ironic_node_v1.openshift-master-0: Internal Server Error
* ironic_node_v1.openshift-master-1: 1 error(s) occurred:
* ironic_node_v1.openshift-master-1: Internal Server Error
* ironic_node_v1.openshift-master-2: 1 error(s) occurred:
2019-03-15T10:30:02.972+0200 [DEBUG] plugin.terraform-provider-ironic: 2019/03/15 10:30:02 [ERR] plugin: stream copy 'stderr' error: stream closed
* ironic_node_v1.openshift-master-2: Internal Server Error
Terraform does not automatically rollback in the face of errors.
Instead, your Terraform state file has been partially updated with
any resources that successfully completed. Please address the error
above and apply again to incrementally change your infrastructure.
2019-03-15T10:30:03.007+0200 [DEBUG] plugin.terraform-provider-ironic: 2019/03/15 10:30:03 [ERR] plugin: plugin server: accept unix /tmp/plugin005349256: use of closed network connection
2019-03-15T10:30:03.008+0200 [DEBUG] plugin: plugin process exited: path=/root/.terraform.d/plugins/terraform-provider-ironic
2019-03-15 08:29:25.649 44 ERROR wsme.api [req-747fc5e4-6050-463e-9d5d-8b7fa79a00f3 - - - - -] Server-side error: "(sqlite3.OperationalError) database is locked [SQL: u'SELECT anon_1.nodes_created_at AS anon_1_nodes_created_at, anon_1.nodes_updated_at AS anon_1_nodes_updated_at, anon_1.nodes_version AS anon_1_nodes_version, anon_1.nodes_id AS anon_1_nodes_id, anon_1.nodes_uuid AS anon_1_nodes_uuid, anon_1.nodes_instance_uuid AS anon_1_nodes_instance_uuid, anon_1.nodes_name AS anon_1_nodes_name, anon_1.nodes_chassis_id AS anon_1_nodes_chassis_id, anon_1.nodes_power_state AS anon_1_nodes_power_state, anon_1.nodes_target_power_state AS anon_1_nodes_target_power_state, anon_1.nodes_provision_state AS anon_1_nodes_provision_state, anon_1.nodes_target_provision_state AS anon_1_nodes_target_provision_state, anon_1.nodes_provision_updated_at AS anon_1_nodes_provision_updated_at, anon_1.nodes_last_error AS anon_1_nodes_last_error, anon_1.nodes_instance_info AS anon_1_nodes_instance_info, anon_1.nodes_properties AS anon_1_nodes_properties, anon_1.nodes_driver AS anon_1_nodes_driver, anon_1.nodes_driver_info AS anon_1_nodes_driver_info, anon_1.nodes_driver_internal_info AS anon_1_nodes_driver_internal_info, anon_1.nodes_clean_step AS anon_1_nodes_clean_step, anon_1.nodes_deploy_step AS anon_1_nodes_deploy_step, anon_1.nodes_resource_class AS anon_1_nodes_resource_class, anon_1.nodes_raid_config AS anon_1_nodes_raid_config, anon_1.nodes_target_raid_config AS anon_1_nodes_target_raid_config, anon_1.nodes_reservation AS anon_1_nodes_reservation, anon_1.nodes_conductor_affinity AS anon_1_nodes_conductor_affinity, anon_1.nodes_conductor_group AS anon_1_nodes_conductor_group, anon_1.nodes_maintenance AS anon_1_nodes_maintenance, anon_1.nodes_maintenance_reason AS anon_1_nodes_maintenance_reason, anon_1.nodes_fault AS anon_1_nodes_fault, anon_1.nodes_console_enabled AS anon_1_nodes_console_enabled, anon_1.nodes_inspection_finished_at AS anon_1_nodes_inspection_finished_at, anon_1.nodes_inspection_started_at AS anon_1_nodes_inspection_started_at, anon_1.nodes_extra AS anon_1_nodes_extra, anon_1.nodes_automated_clean AS anon_1_nodes_automated_clean, anon_1.nodes_protected AS anon_1_nodes_protected, anon_1.nodes_protected_reason AS anon_1_nodes_protected_reason, anon_1.nodes_owner AS anon_1_nodes_owner, anon_1.nodes_allocation_id AS anon_1_nodes_allocation_id, anon_1.nodes_description AS anon_1_nodes_description, anon_1.nodes_bios_interface AS anon_1_nodes_bios_interface, anon_1.nodes_boot_interface AS anon_1_nodes_boot_interface, anon_1.nodes_console_interface AS anon_1_nodes_console_interface, anon_1.nodes_deploy_interface AS anon_1_nodes_deploy_interface, anon_1.nodes_inspect_interface AS anon_1_nodes_inspect_interface, anon_1.nodes_management_interface AS anon_1_nodes_management_interface, anon_1.nodes_network_interface AS anon_1_nodes_network_interface, anon_1.nodes_raid_interface AS anon_1_nodes_raid_interface, anon_1.nodes_rescue_interface AS anon_1_nodes_rescue_interface, anon_1.nodes_storage_interface AS anon_1_nodes_storage_interface, anon_1.nodes_power_interface AS anon_1_nodes_power_interface, anon_1.nodes_vendor_interface AS anon_1_nodes_vendor_interface, node_traits_1.created_at AS node_traits_1_created_at, node_traits_1.updated_at AS node_traits_1_updated_at, node_traits_1.version AS node_traits_1_version, node_traits_1.node_id AS node_traits_1_node_id,
node_traits_1.trait AS node_traits_1_trait, node_tags_1.created_at AS node_tags_1_created_at, node_tags_1.updated_at AS node_tags_1_updated_at, node_tags_1.version AS node_tags_1_version, node_tags_1.node_id AS node_tags_1_node_id, node_tags_1.tag AS node_tags_1_tag \nFROM (SELECT nodes.created_at AS nodes_created_at, nodes.updated_at AS nodes_updated_at, nodes.version AS nodes_version, nodes.id AS nodes_id, nodes.uuid AS nodes_uuid, nodes.instance_uuid AS nodes_instance_uuid, nodes.name AS nodes_name, nodes.chassis_id AS nodes_chassis_id, nodes.power_state AS nodes_power_state, nodes.target_power_state AS nodes_target_power_state, nodes.provision_state AS nodes_provision_state, nodes.target_provision_state AS nodes_target_provision_state, nodes.provision_updated_at AS nodes_provision_updated_at, nodes.last_error AS nodes_last_error, nodes.instance_info AS nodes_instance_info, nodes.properties AS nodes_properties, nodes.driver AS nodes_driver, nodes.driver_info AS nodes_driver_info, nodes.driver_internal_info AS nodes_driver_internal_info, nodes.clean_step AS nodes_clean_step, nodes.deploy_step AS nodes_deploy_step, nodes.resource_class AS nodes_resource_class, nodes.raid_config AS nodes_raid_config, nodes.target_raid_config AS nodes_target_raid_config, nodes.reservation AS nodes_reservation, nodes.conductor_affinity AS nodes_conductor_affinity, nodes.conductor_group AS nodes_conductor_group, nodes.maintenance AS nodes_maintenance, nodes.maintenance_reason AS nodes_maintenance_reason, nodes.fault AS nodes_fault, nodes.console_enabled AS nodes_console_enabled, nodes.inspection_finished_at AS nodes_inspection_finished_at, nodes.inspection_started_at AS nodes_inspection_started_at, nodes.extra AS nodes_extra, nodes.automated_clean AS nodes_automated_clean, nodes.protected AS nodes_protected, nodes.protected_reason AS nodes_protected_reason, nodes.owner AS nodes_owner, nodes.allocation_id AS nodes_allocation_id, nodes.description AS nodes_description, nodes.bios_interface AS nodes_bios_interface, nodes.boot_interface AS nodes_boot_interface, nodes.console_interface AS nodes_console_interface, nodes.deploy_interface AS nodes_deploy_interface, nodes.inspect_interface AS nodes_inspect_interface, nodes.management_interface AS nodes_management_interface, nodes.network_interface AS nodes_network_interface, nodes.raid_interface AS nodes_raid_interface, nodes.rescue_interface AS nodes_rescue_interface, nodes.storage_interface AS nodes_storage_interface, nodes.power_interface AS nodes_power_interface, nodes.vendor_interface AS nodes_vendor_interface \nFROM nodes ORDER BY nodes.id ASC\n LIMIT ? OFFSET ?) AS anon_1 LEFT OUTER JOIN node_traits AS node_traits_1 ON node_traits_1.node_id = anon_1.nodes_id LEFT OUTER JOIN node_tags AS node_tags_1 ON node_tags_1.node_id = anon_1.nodes_id ORDER BY anon_1.nodes_id ASC'] [parameters: (1000, 0)] (Background on this error at: http://sqlalche.me/e/e3q8)". Detail:
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/wsmeext/pecan.py", line 85, in callfunction
result = f(self, *args, **kwargs)
File "/usr/lib/python2.7/site-packages/ironic/api/controllers/v1/node.py", line 1872, in get_all
**extra_args)
File "/usr/lib/python2.7/site-packages/ironic/api/controllers/v1/node.py", line 1684, in _get_nodes_collection
filters=filters)
File "/usr/lib/python2.7/site-packages/ironic/objects/node.py", line 313, in list
sort_dir=sort_dir)
File "/usr/lib/python2.7/site-packages/ironic/db/sqlalchemy/api.py", line 400, in get_node_list
sort_key, sort_dir, query)
File "/usr/lib/python2.7/site-packages/ironic/db/sqlalchemy/api.py", line 229, in _paginate_query
return query.all()
File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/query.py", line 2925, in all
return list(self)
File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/query.py", line 3081, in __iter__
return self._execute_and_instances(context)
File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/query.py", line 3106, in _execute_and_instances
result = conn.execute(querycontext.statement, self._params)
File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 980, in execute
return meth(self, multiparams, params)
File "/usr/lib64/python2.7/site-packages/sqlalchemy/sql/elements.py", line 273, in _execute_on_connection
return connection._execute_clauseelement(self, multiparams, params)
File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1099, in _execute_clauseelement
distilled_params,
File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1240, in _execute_context
e, statement, parameters, cursor, context
File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1456, in _handle_dbapi_exception
util.raise_from_cause(newraise, exc_info)
File "/usr/lib64/python2.7/site-packages/sqlalchemy/util/compat.py", line 296, in raise_from_cause
reraise(type(exception), exception, tb=exc_tb, cause=cause)
File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1236, in _execute_context
cursor, statement, parameters, context
File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/default.py", line 536, in do_execute
cursor.execute(statement, parameters)
Install terraform and terraform-provider-ironic. Add a script that you can use to experiment with using terraform to deploy the masters after 07_deploy_masters.sh exits early.
This uses the unmerged code in openshift-metal3/terraform-provider-ironic#2
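The experiment would look roughly like the following sketch; the directory name is taken from the destroy discussion above, and the commands are the standard terraform CLI rather than an excerpt from the script.

```sh
# After 07_deploy_masters.sh exits early, drive the masters with terraform.
cd ocp/tf-master
terraform init                 # fetch/locate the ironic provider plugin
terraform apply -auto-approve  # register and deploy the three masters
```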
TODO list before merging: