Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

build: use BUILD_TAGS instead of CEPH_VERSION #2626

Closed
wants to merge 1 commit into from

Conversation

Rakshith-R
Copy link
Contributor

Ceph base container images have predefined value
for CEPH_VERSION env variable. Hence, use BUILD_TAGS
as variable name instead to avoid mismatch.

Signed-off-by: Rakshith R rar@redhat.com

For example, in build run https://github.com/ceph/ceph-csi/runs/4144225306?check_suite_focus=true
build tag is set as pacific which should have been octopus sourced from build.env file

GOOS=linux go build -tags=pacific -mod vendor -a -ldflags ' -X github.com/ceph/ceph-csi/internal/util.GitCommit=f4a756eeb1c4d7215a90588fb9b401d5a49327df -X github.com/ceph/ceph-csi/internal/util.DriverVersion=canary' -o _output/cephcsi ./cmd/

Ceph base container images have predefined value
for CEPH_VERSION env variable. Hence, use BUILD_TAGS
as variable name instead to avoid mismatch.

Signed-off-by: Rakshith R <rar@redhat.com>
@mergify mergify bot added the component/build Issues and PRs related to compiling Ceph-CSI label Nov 9, 2021
@Rakshith-R Rakshith-R requested review from a team November 9, 2021 05:55
@Madhu-1 Madhu-1 requested a review from a team November 9, 2021 06:25
@Rakshith-R
Copy link
Contributor Author

/retest ci/centos/mini-e2e-helm/k8s-1.20

@Rakshith-R
Copy link
Contributor Author

@Rakshith-R
Copy link
Contributor Author

/retest ci/centos/mini-e2e-helm/k8s-1.21

@Rakshith-R
Copy link
Contributor Author

/retest ci/centos/mini-e2e-helm/k8s-1.21

rbd-nbd related failures

1109 06:04:54.855006   30030 cephcmds.go:62] ID: 72 Req-ID: 0001-0024-02d2e691-7358-4844-937b-4a16c8eee4f1-0000000000000001-079a4eb5-4122-11ec-ae55-1e64c3bd8654 command succeeded: rbd [device list --format=json --device-type nbd]
I1109 06:04:54.874131   30030 rbd_attach.go:343] ID: 72 Req-ID: 0001-0024-02d2e691-7358-4844-937b-4a16c8eee4f1-0000000000000001-079a4eb5-4122-11ec-ae55-1e64c3bd8654 rbd: map mon rook-ceph-mon-a.rook-ceph.svc.cluster.local:6789
I1109 06:04:55.555915   30030 cephcmds.go:62] ID: 72 Req-ID: 0001-0024-02d2e691-7358-4844-937b-4a16c8eee4f1-0000000000000001-079a4eb5-4122-11ec-ae55-1e64c3bd8654 command succeeded: rbd [--id cephcsi-rbd-node -m rook-ceph-mon-a.rook-ceph.svc.cluster.local:6789 --keyfile=***stripped*** --log-file /var/log/ceph/rbd-nbd-0001-0024-02d2e691-7358-4844-937b-4a16c8eee4f1-0000000000000001-079a4eb5-4122-11ec-ae55-1e64c3bd8654.log map replicapool/csi-vol-079a4eb5-4122-11ec-ae55-1e64c3bd8654 --device-type nbd --options try-netlink --options reattach-timeout=300 --options io-timeout=0]
I1109 06:04:55.555966   30030 nodeserver.go:401] ID: 72 Req-ID: 0001-0024-02d2e691-7358-4844-937b-4a16c8eee4f1-0000000000000001-079a4eb5-4122-11ec-ae55-1e64c3bd8654 rbd image: 0001-0024-02d2e691-7358-4844-937b-4a16c8eee4f1-0000000000000001-079a4eb5-4122-11ec-ae55-1e64c3bd8654/replicapool was successfully mapped at /dev/nbd0
I1109 06:04:55.556397   30030 mount_linux.go:463] Attempting to determine if disk "/dev/nbd0" is formatted using blkid with args: ([-p -s TYPE -s PTTYPE -o export /dev/nbd0])
I1109 06:04:55.575976   30030 mount_linux.go:466] Output: ""
E1109 06:04:55.606416   30030 nodeserver.go:684] ID: 72 Req-ID: 0001-0024-02d2e691-7358-4844-937b-4a16c8eee4f1-0000000000000001-079a4eb5-4122-11ec-ae55-1e64c3bd8654 failed to run mkfs error: exit status 1, output: mke2fs 1.45.6 (20-Mar-2020)
Creating filesystem with 262144 4k blocks and 65536 inodes
Filesystem UUID: ac7801f6-9067-4351-91f6-ad769e00438b
Superblock backups stored on blocks: 
	32768, 98304, 163840, 229376

Allocating group tables: 0/8���   ���done                            
Writing inode tables: 0/8���   ���done                            
Creating journal (8192 blocks): done
Writing superblocks and filesystem accounting information: 0/8���   ���mkfs.ext4: Input/output error while writing out and closing file system
I1109 06:04:56.169207   30030 cephcmds.go:62] ID: 72 Req-ID: 0001-0024-02d2e691-7358-4844-937b-4a16c8eee4f1-0000000000000001-079a4eb5-4122-11ec-ae55-1e64c3bd8654 command succeeded: rbd [unmap /dev/nbd0 --device-type nbd]
E1109 06:04:56.169500   30030 utils.go:186] ID: 72 Req-ID: 0001-0024-02d2e691-7358-4844-937b-4a16c8eee4f1-0000000000000001-079a4eb5-4122-11ec-ae55-1e64c3bd8654 GRPC error: rpc error: code = Internal desc = exit status 1

----
I1109 06:06:58.916624   30030 cephcmds.go:62] ID: 76 Req-ID: 0001-0024-02d2e691-7358-4844-937b-4a16c8eee4f1-0000000000000001-079a4eb5-4122-11ec-ae55-1e64c3bd8654 command succeeded: rbd [device list --format=json --device-type nbd]
I1109 06:06:58.934552   30030 rbd_attach.go:343] ID: 76 Req-ID: 0001-0024-02d2e691-7358-4844-937b-4a16c8eee4f1-0000000000000001-079a4eb5-4122-11ec-ae55-1e64c3bd8654 rbd: map mon rook-ceph-mon-a.rook-ceph.svc.cluster.local:6789
I1109 06:06:59.627988   30030 cephcmds.go:62] ID: 76 Req-ID: 0001-0024-02d2e691-7358-4844-937b-4a16c8eee4f1-0000000000000001-079a4eb5-4122-11ec-ae55-1e64c3bd8654 command succeeded: rbd [--id cephcsi-rbd-node -m rook-ceph-mon-a.rook-ceph.svc.cluster.local:6789 --keyfile=***stripped*** --log-file /var/log/ceph/rbd-nbd-0001-0024-02d2e691-7358-4844-937b-4a16c8eee4f1-0000000000000001-079a4eb5-4122-11ec-ae55-1e64c3bd8654.log map replicapool/csi-vol-079a4eb5-4122-11ec-ae55-1e64c3bd8654 --device-type nbd --options try-netlink --options reattach-timeout=300 --options io-timeout=0]
I1109 06:06:59.628044   30030 nodeserver.go:401] ID: 76 Req-ID: 0001-0024-02d2e691-7358-4844-937b-4a16c8eee4f1-0000000000000001-079a4eb5-4122-11ec-ae55-1e64c3bd8654 rbd image: 0001-0024-02d2e691-7358-4844-937b-4a16c8eee4f1-0000000000000001-079a4eb5-4122-11ec-ae55-1e64c3bd8654/replicapool was successfully mapped at /dev/nbd0
I1109 06:06:59.628485   30030 mount_linux.go:463] Attempting to determine if disk "/dev/nbd0" is formatted using blkid with args: ([-p -s TYPE -s PTTYPE -o export /dev/nbd0])
I1109 06:06:59.630855   30030 mount_linux.go:466] Output: ""
E1109 06:06:59.634577   30030 nodeserver.go:684] ID: 76 Req-ID: 0001-0024-02d2e691-7358-4844-937b-4a16c8eee4f1-0000000000000001-079a4eb5-4122-11ec-ae55-1e64c3bd8654 failed to run mkfs error: exit status 1, output: mke2fs 1.45.6 (20-Mar-2020)
mkfs.ext4: Device size reported to be zero.  Invalid partition specified, or
	partition table wasn't reread after running fdisk, due to
	a modified partition being busy and in use.  You may need to reboot
	to re-read your partition table.

I1109 06:07:00.203237   30030 cephcmds.go:62] ID: 76 Req-ID: 0001-0024-02d2e691-7358-4844-937b-4a16c8eee4f1-0000000000000001-079a4eb5-4122-11ec-ae55-1e64c3bd8654 command succeeded: rbd [unmap /dev/nbd0 --device-type nbd]
E1109 06:07:00.203575   30030 utils.go:186] ID: 76 Req-ID: 0001-0024-02d2e691-7358-4844-937b-4a16c8eee4f1-0000000000000001-079a4eb5-4122-11ec-ae55-1e64c3bd8654 GRPC error: rpc error: code = Internal desc = exit status 1

logs from https://jenkins-ceph-csi.apps.ocp.ci.centos.org/blue/rest/organizations/jenkins/pipelines/mini-e2e-helm_k8s-1.21/runs/1670/nodes/101/log/?start=0

cc @pkalever @Madhu-1

https://jenkins-ceph-csi.apps.ocp.ci.centos.org/blue/organizations/jenkins/mini-e2e-helm_k8s-1.21/detail/mini-e2e-helm_k8s-1.21/1670/pipeline

@Rakshith-R
Copy link
Contributor Author

/retest ci/centos/mini-e2e/k8s-1.20

@Rakshith-R
Copy link
Contributor Author

/retest ci/centos/mini-e2e/k8s-1.22

Copy link
Member

@nixpanic nixpanic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not understand why this is needed. The idea behind it is to use the CEPH_VERSION that the container provides. That is then used to construct the GO_TAGS for passing the right Ceph version (name) to go-ceph to limit/enable certain features. When setting a different BUILD_TAGS version, the container can have a different CEPH_VERSION that is not compatible.

Could you explain why the CEPH_VERSION from the container should not be used?

@nixpanic
Copy link
Member

nixpanic commented Nov 9, 2021

For example, in build run https://github.com/ceph/ceph-csi/runs/4144225306?check_suite_focus=true
build tag is set as pacific which should have been octopus sourced from build.env file

If this is the concern, add a comment above CEPH_VERSION in build.env and mention that the CEPH_VERSION is a fallback in case it is not set in the environment. When building inside Ceph containers, the environment variable should be set automatically. When not building in containers, but on developer's workstations, the version will be picked from build.env.

@Rakshith-R
Copy link
Contributor Author

I do not understand why this is needed. The idea behind it is to use the CEPH_VERSION that the container provides. That is then used to construct the GO_TAGS for passing the right Ceph version (name) to go-ceph to limit/enable certain features. When setting a different BUILD_TAGS version, the container can have a different CEPH_VERSION that is not compatible.
Could you explain the CEPH_VERSION from the container should not be used?

I hit this when trying to pass ceph_preview tag in addition to <ceph_version> tag to access rbd admin task apis.

And I really did not think of this The idea behind it is to use the CEPH_VERSION that the container provides..

I can modify the pr to append BUILD_TAGS to ceph builds if that's preferred.
(or make those changes in pr which introduces rbd admin task api)

@pkalever
Copy link

pkalever commented Nov 9, 2021

Test:

�[1mSTEP�[0m: perform IO on rbd-nbd volume after nodeplugin restart
[...]
Nov  9 05:58:17.382: INFO: PersistentVolume pvc-0afdf2e6-a84b-42da-b730-f456a71d4673 found and phase=Bound (2.537696ms)
Nov  9 05:58:17.393: INFO: Waiting up to csi-rbd-demo-pod to be in Running state
Nov  9 06:08:17.404: FAIL: failed to create PVC and application with error timed out waiting for the condition

Logs:

I1109 05:58:19.865440   30030 rbd_attach.go:343] ID: 45 Req-ID: 0001-0024-02d2e691-7358-4844-937b-4a16c8eee4f1-0000000000000001-079a4eb5-4122-11ec-ae55-1e64c3bd8654 rbd: map mon rook-ceph-mon-a.rook-ceph.svc.cluster.local:6789
I1109 05:58:20.597269   30030 cephcmds.go:62] ID: 45 Req-ID: 0001-0024-02d2e691-7358-4844-937b-4a16c8eee4f1-0000000000000001-079a4eb5-4122-11ec-ae55-1e64c3bd8654 command succeeded: rbd [--id cephcsi-rbd-node -m rook-ceph-mon-a.rook-ceph.svc.cluster.local:6789 --keyfile=***stripped*** --log-file /var/log/ceph/rbd-nbd-0001-0024-02d2e691-7358-4844-937b-4a16c8eee4f1-0000000000000001-079a4eb5-4122-11ec-ae55-1e64c3bd8654.log map replicapool/csi-vol-079a4eb5-4122-11ec-ae55-1e64c3bd8654 --device-type nbd --options try-netlink --options reattach-timeout=300 --options io-timeout=0]
I1109 05:58:20.597338   30030 nodeserver.go:401] ID: 45 Req-ID: 0001-0024-02d2e691-7358-4844-937b-4a16c8eee4f1-0000000000000001-079a4eb5-4122-11ec-ae55-1e64c3bd8654 rbd image: 0001-0024-02d2e691-7358-4844-937b-4a16c8eee4f1-0000000000000001-079a4eb5-4122-11ec-ae55-1e64c3bd8654/replicapool was successfully mapped at /dev/nbd0
I1109 05:58:20.597705   30030 mount_linux.go:463] Attempting to determine if disk "/dev/nbd0" is formatted using blkid with args: ([-p -s TYPE -s PTTYPE -o export /dev/nbd0])
I1109 05:58:20.610907   30030 mount_linux.go:466] Output: ""
E1109 05:58:20.622863   30030 nodeserver.go:684] ID: 45 Req-ID: 0001-0024-02d2e691-7358-4844-937b-4a16c8eee4f1-0000000000000001-079a4eb5-4122-11ec-ae55-1e64c3bd8654 failed to run mkfs error: exit status 1, output: mke2fs 1.45.6 (20-Mar-2020)
Creating filesystem with 262144 4k blocks and 65536 inodes
Filesystem UUID: 99ee613c-b99b-42c5-a17a-828a4c5783e6
Superblock backups stored on blocks: 
	32768, 98304, 163840, 229376

Allocating group tables: 0/8���   ���done                            
Writing inode tables: 0/8���   ���done                            
Creating journal (8192 blocks): done
Writing superblocks and filesystem accounting information: 0/8���mkfs.ext4: Input/output error while writing out and closing file system
I1109 05:58:21.183075   30030 cephcmds.go:62] ID: 45 Req-ID: 0001-0024-02d2e691-7358-4844-937b-4a16c8eee4f1-0000000000000001-079a4eb5-4122-11ec-ae55-1e64c3bd8654 command succeeded: rbd [unmap /dev/nbd0 --device-type nbd]

We have seen Input/Output errors before when the nodeplugin pod is restarted, which is anyway tracked with #2204

@Madhu-1 @nixpanic probably I'm missing who is attempting to do mkfs on /dev/nbd0 after nodeplugin restart? Not seen this before!

@Madhu-1
Copy link
Collaborator

Madhu-1 commented Nov 9, 2021

@pkalever during nodeStage operation, if the device format is empty a cephcsi will format the device and mount it.

@nixpanic
Copy link
Member

nixpanic commented Nov 9, 2021

I hit this when trying to pass ceph_preview tag in addition to <ceph_version> tag to access rbd admin task apis.

I can modify the pr to append BUILD_TAGS to ceph builds if that's preferred.

Adding support for additional BUILD_TAGS in the Makefile is fine, of course. Default to CEPH_VERSION if BUILD_TAGS is not set (which will be available from the environment, or has been read from build.env already). Note that you will need to pass BUILD_TAGS too on the make targets that use containers.

@Madhu-1
Copy link
Collaborator

Madhu-1 commented Nov 9, 2021

BUILD_TAGS makes more sense for golang build instead of CEPH_VERSION.

I hit this when trying to pass ceph_preview tag in addition to <ceph_version> tag to access rbd admin task apis.
I can modify the pr to append BUILD_TAGS to ceph builds if that's preferred.

Adding support for additional BUILD_TAGS in the Makefile is fine, of course. Default to CEPH_VERSION if BUILD_TAGS is not set (which will be available from the environment, or has been read from build.env already). Note that you will need to pass BUILD_TAGS too on the make targets that use containers.

BUILD_TAGS makes more sense for golang build instead of CEPH_VERSION. cant we git rid of one more variable here?

ifndef CEPH_VERSION
CEPH_VERSION = $(shell . $(CURDIR)/build.env ; echo $${CEPH_VERSION})
ifndef BUILD_TAGS
BUILD_TAGS = $(shell . $(CURDIR)/build.env ; echo $${BUILD_TAGS})
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

may be call this as `CEPH_BUILD_TAGS" as it is referring to Ceph Build ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do not need to prefix everything with CEPH_CSI_ or CEPH_ imho. BUILD_TAGS suggest any Golang build-tags can be included, which is the goal here.

@nixpanic
Copy link
Member

nixpanic commented Nov 9, 2021

BUILD_TAGS makes more sense for golang build instead of CEPH_VERSION. cant we git rid of one more variable here?

We do need to consume the CEPH_VERSION from the container where the build is done, that can not be removed (or renamed) as it is part of the base image. We can drop CEPH_VERSION from build.env, but in that case we need to add BUILD_TAGS with some default Ceph version in build.env, which gets overwritten by CEPH_VERSION when running in a container...

ifndef CEPH_VERSION
CEPH_VERSION = $(shell . $(CURDIR)/build.env ; echo $${CEPH_VERSION})
ifndef BUILD_TAGS
BUILD_TAGS = $(shell . $(CURDIR)/build.env ; echo $${BUILD_TAGS})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do not need to prefix everything with CEPH_CSI_ or CEPH_ imho. BUILD_TAGS suggest any Golang build-tags can be included, which is the goal here.

@@ -6,7 +6,7 @@
# options for analysis running
run:
build-tags:
- @@CEPH_VERSION@@
- @@BUILD_TAGS@@
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you pass octopus,ceph_preview here, or does it need to be a list like

build-tags:
  - @@CEPH_VERSION@@
  - ceph_preview

@Rakshith-R
Copy link
Contributor Author

BUILD_TAGS makes more sense for golang build instead of CEPH_VERSION. cant we git rid of one more variable here?

We do need to consume the CEPH_VERSION from the container where the build is done, that can not be removed (or renamed) as it is part of the base image. We can drop CEPH_VERSION from build.env, but in that case we need to add BUILD_TAGS with some default Ceph version in build.env, which gets overwritten by CEPH_VERSION when running in a container...

I guess adding BUILD_TAGS variable in addition to CEPH_VERSION(preserving its behaviour of it being picked up from base container) makes it easier, something like the following
In Makefile

ifndef CEPH_VERSION
	CEPH_VERSION = $(shell . $(CURDIR)/build.env ; echo $${CEPH_VERSION})
endif
ifdef CEPH_VERSION
	# pass -tags to go commands (for go-ceph build constraints)
	GO_TAGS = -tags=$(CEPH_VERSION)
endif
ifdef BUILD_TAGS
	# additional build tags
	GO_TAGS = GO_TAGS,$(BUILD_TAGS)
endif

In golangci.yaml.in

run:
  build-tags:
    - @@CEPH_VERSION@@
    - @@BUILD_TAGS@@

this sounds better?

@Madhu-1
Copy link
Collaborator

Madhu-1 commented Nov 9, 2021

CEPH_VERSION(preserving its behaviour of it being picked up from base container..)

why we want it to pick up from the base image as we are expertly passing when building the image? am I missing something here?

@Rakshith-R
Copy link
Contributor Author

Rakshith-R commented Nov 9, 2021

why we want it to pick up from the base image as we are expertly passing when building the image? am I missing something here?

BUILD_TAGS makes more sense for golang build instead of CEPH_VERSION. cant we git rid of one more variable here?
We do need to consume the CEPH_VERSION from the container where the build is done, that can not be removed (or renamed) as it is part of the base image. We can drop CEPH_VERSION from build.env, but in that case we need to add BUILD_TAGS with some default Ceph version in build.env, which gets overwritten by CEPH_VERSION when running in a container...

from @nixpanic comment here #2626 (comment)

in that case we need to add BUILD_TAGS with some default Ceph version in build.env, which gets overwritten by CEPH_VERSION when running in a container...

This part sounds very tedious, : exactly locating CEPH_VERSION withing BUILD_TAGS string and replacing/overwriting it.
ex: if BUILD_TAGS= pacific,ceph_preview .
Instead I thought it better to have another variable.

@Rakshith-R
Copy link
Contributor Author

closing this pr, since the proposed change here is introduced as part of #2633

@Rakshith-R Rakshith-R closed this Nov 16, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component/build Issues and PRs related to compiling Ceph-CSI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants