TFX samples without GCS #19

Is using TFX with Kubeflow Pipelines strictly tied to GCS access?

Comments
Hi @nidhidamodaran, no, it's not tied to GCS access at all. Kubeflow Pipelines itself is designed to run on-premise as well as on GKE, so you shouldn't need to use GCP at all. The example TFX Chicago Taxi pipeline on Kubeflow does use GCP services: GCS for storage, Dataflow for Beam jobs, and Cloud ML Engine for training at scale. You can, however, easily remove these dependencies and run the pipeline on-premise, though scalability will be a problem without a distributed runner for Beam. Hope that helps. Let me know if you have any more questions. |
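As a rough sketch of the runner choice mentioned above: the Beam jobs behind the TFX components are typically steered by standard Beam pipeline options, with DirectRunner for a single machine and a distributed runner such as Dataflow for scale. The project ID and bucket below are placeholders, not values from the example.

    # Sketch only: standard Apache Beam pipeline options for the two setups
    # discussed above; all project IDs and paths are placeholders.

    # On-prem / single machine: Beam's DirectRunner, no GCP services involved.
    beam_args_on_prem = [
        "--runner=DirectRunner",
    ]

    # On GCP: the Dataflow runner provides distributed execution for large data.
    beam_args_gcp = [
        "--runner=DataflowRunner",
        "--project=my-gcp-project",            # placeholder project ID
        "--temp_location=gs://my-bucket/tmp",  # placeholder GCS path
        "--region=us-central1",
    ]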
Hi @neuromage, thanks for the reply. I was trying out the Chicago Taxi sample to get hands-on with TFX:

    @PipelineDecorator(
    def _create_pipeline():

    pipeline = KubeflowRunner().run(_create_pipeline())

When I run the pipeline, pod creation fails with this error:

    Unable to mount volumes for pod "chicago-taxi-simple-wwxmc-4030983742_kubeflow(438e1b67-4c73-11e9-a7e6-0273ce6a77d4)": timeout expired waiting for volumes to attach or mount for pod "kubeflow"/"chicago-taxi-simple-wwxmc-4030983742". list of unmounted volumes=[gcp-credentials]. list of unattached volumes=[podmetadata docker-lib docker-sock gcp-credentials pipeline-runner-token-gk4d7]

Could you help me understand what I am doing wrong here? |
Are you running this in a Kubeflow cluster on GCP? It looks like that's not the case. The error indicates that it was unable to mount the GCP credentials. I think this brings up an important issue, though: we should allow non-GCP usage of the components, so mounting of GCP credentials should be user-configurable. I'll work on fixing this. In the meantime, if you're truly running on-prem, you can change the runner code to remove the GCP credentials secret mount.
|
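For context, this is roughly how a GCP service-account secret gets attached to a step through the Kubeflow Pipelines SDK; the step name, image, and command below are made up for illustration, and on an on-prem cluster you would simply not apply the secret.

    import kfp.dsl as dsl
    from kfp import gcp

    @dsl.pipeline(name="example", description="illustrative only")
    def example_pipeline():
        # A hypothetical step; the image and command are placeholders.
        step = dsl.ContainerOp(
            name="train",
            image="my-registry/trainer:latest",
            command=["python", "train.py"],
        )
        # On GCP this mounts the 'user-gcp-sa' secret as the gcp-credentials
        # volume mentioned in the error above; omit it for on-prem clusters.
        step.apply(gcp.use_gcp_secret("user-gcp-sa"))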
yes @neuromage I will try that. Thanks. |
Also, is there any option to use custom images for different pipeline stages in TFX? |
Right now, this isn't possible without some work. You'd probably need to write the pipeline using the Kubeflow Pipelines SDK instead, which would let you insert custom images/steps into your pipeline. However, this isn't straightforward, as you need to figure out how to pass around the metadata artifacts and use them in your custom step. I am planning to enable this use-case soon though, and will document it as a sample within the Kubeflow Pipelines repo when it's done. /cc @krazyhaas |
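For anyone exploring that route, here is a minimal sketch of a custom-image step written directly with the Kubeflow Pipelines SDK (the image, command, and arguments are placeholders); wiring such a step into the TFX metadata flow is exactly the part that still needs the work described above.

    import kfp
    import kfp.dsl as dsl

    @dsl.pipeline(name="custom-step-example", description="illustrative only")
    def pipeline_with_custom_step(input_path: str = "gs://my-bucket/examples"):
        # An arbitrary container step; how it would consume the TFX metadata
        # artifacts is the open question discussed in this thread.
        dsl.ContainerOp(
            name="my-custom-step",
            image="my-registry/my-image:latest",   # placeholder image
            command=["python", "process.py"],
            arguments=["--input", input_path],
        )

    if __name__ == "__main__":
        # Compile to a package that can be uploaded to Kubeflow Pipelines.
        kfp.compiler.Compiler().compile(pipeline_with_custom_step, "pipeline.tar.gz")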
I'd love to add that to the docs. We had found this PR and were wondering whether it was already possible to pass these artifacts around, whether that was encouraged, or whether a pipeline could only be constructed of TFX components in order to use the metadata store. Your comment seems to answer all of these questions! Thanks for all of these detailed responses @neuromage, they have been super helpful in connecting some dots while getting started. |
Thanks @MattMorgis. Yeah, right now, we record those artifacts and TFX components know how to pass them around in a Kubeflow pipeline, but we haven't made this easily accessible by custom components just yet. I am planning on enabling this over the next few weeks. I'll update this thread with more info then. |
We made some progress running this on AWS with Kubeflow, but we just hit one snag that is going to take a bit to overcome. It's interesting because it successfully connects to S3 to read the filename; however, I think the error that is raised is related to Apache Beam's Python SDK not having an S3 FileSystem: https://issues.apache.org/jira/browse/BEAM-2572 |
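A small sketch of that split, assuming a hypothetical s3://my-bucket path and AWS credentials in the environment: TensorFlow's filesystem layer knows about S3, while Beam resolves filesystems by URI scheme and, in releases without an S3 filesystem, the lookup simply fails.

    import tensorflow as tf
    from apache_beam.io.filesystems import FileSystems

    path = "s3://my-bucket/data/data.csv"  # hypothetical path

    # TensorFlow ships an S3 filesystem, so this lookup can succeed.
    print(tf.io.gfile.exists(path))

    # Beam picks a filesystem by the URI scheme; without an S3 filesystem
    # registered, this raises rather than returning a match.
    try:
        FileSystems.get_filesystem(path)
    except ValueError as err:
        print("Beam has no filesystem for s3:// paths:", err)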
That's correct. Until Beam's Python SDK supports S3, we can't run most of the TFX libraries on S3. We have a similar challenge with Azure Blob Storage. |
I've been working on it. I'm about 50% complete and working with the Beam project/team to get it merged. According to the ticket there is a Google Summer of Code student who may do the Azure Blob Storage file system as well. |
It's worth noting that we rely on both the Beam Python SDK and TensorFlow I/O for distributed filesystem support. I believe S3 is supported in TensorFlow I/O but not Azure. If someone plans to use Azure storage, they might also want to look at tensorflow/io#46, which seems to be a pending effort to add Azure Storage support to TensorFlow.
|
That is a very good point @zhitaoli. We realized TensorFlow itself had S3 support, and it was able to find the CSV file in the bucket we were pointing to; however, we then ran into the Beam unsupported S3 file system error. I didn't realize Azure Blob Storage wasn't supported in TensorFlow itself either, in addition to Beam. I'll mention that in the ticket. |
Looks like Beam support for S3 is close to being implemented (see https://issues.apache.org/jira/browse/BEAM-2572 and apache/beam#9955). I would just like to second what has been discussed here. There is a pretty large user community who are interested in TFX and/or Kubeflow but are currently struggling to get into those frameworks due to a lack of non-GCP examples (and sometimes core functionality). A TFX Chicago Taxi example on Kubeflow for AWS/Azure/on-prem would be a great starting point for those of us who are currently not on GCP! |
taxi_pipeline_kubeflow_local.py does not depend on GCP or GKE; it only depends on a Kubeflow Pipelines deployment (backend) on any Kubernetes cluster. |
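For what it's worth, submitting a compiled pipeline to such a deployment only needs the Kubeflow Pipelines endpoint; here is a sketch, with the host URL and package path as placeholders:

    import kfp

    # Any Kubeflow Pipelines backend works here; the host is a placeholder
    # (for example a port-forwarded ml-pipeline-ui service on-prem).
    client = kfp.Client(host="http://localhost:8080")

    # Upload and run a pipeline package produced by the TFX/KFP compiler.
    client.create_run_from_pipeline_package(
        "pipeline.tar.gz",  # placeholder package path
        arguments={},
        run_name="taxi-pipeline-local-test",
    )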
To capture the discussion: TFX examples can already use HDFS and GCS (although we don't have an example for the former), and once Beam 2.18 is picked up there is also S3 support. Azure Blob Storage support is tracked on the Beam side. We will file a separate feature request (#1185) for using separate images in each stage and discuss it there. We will close this issue as won't fix. Please let us know if you think otherwise. |
@zhitaoli I thought the Beam fix wouldn't be integrated until 2.19, per the last update from Pablo. Can you confirm it will be available in 2.18? |
@zhitaoli @gowthamkpr With Apache Beam 2.19, we still get the same error. Are there some TFX dependencies involved? We use TFX 0.21. |
I think TFX installs Apache Beam with the gcp extra only (apache-beam[gcp]); Beam's S3 filesystem lives behind the separate aws extra, so it may not be pulled in by default. |
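If the missing extra is indeed the cause, one quick check (a sketch, assuming Beam 2.19+ installed as apache-beam[aws], AWS credentials in the environment, and a placeholder bucket path) is whether Beam can now match an S3 pattern:

    from apache_beam.io.filesystems import FileSystems

    # With the aws extra installed, Beam registers an S3 filesystem and
    # this returns matches instead of raising for the s3:// scheme.
    matches = FileSystems.match(["s3://my-bucket/data/*.csv"])
    for metadata in matches[0].metadata_list:
        print(metadata.path, metadata.size_in_bytes)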