Update 454-dagster.txt #139

Merged: 1 commit, Mar 25, 2024
103 changes: 51 additions & 52 deletions transcripts/454-dagster.txt
@@ -4,11 +4,11 @@

00:00:06 I bet that data needs processed, filtered, transformed, distributed, and much more.

00:00:11 One of the biggest tools to create these data pipelines with Python is Daxter.
00:00:11 One of the biggest tools to create these data pipelines with Python is Dagster.

00:00:16 And we're fortunate to have Pedram Navid on the show to tell us about it.

00:00:20 Pedram is the head of data engineering and dev rel at Daxter Labs.
00:00:20 Pedram is the head of data engineering and dev rel at Dagster Labs.

00:00:24 And we're talking data pipelines this week here at Talk Python.

@@ -64,7 +64,7 @@

00:02:20 Python type hints are really starting to transform Python, especially from the ecosystem's perspective.

00:02:26 Think FastAPI, Pytantic, BearType, et cetera.
00:02:26 Think FastAPI, Pydantic, BearType, et cetera.

00:02:30 This course shows you the ins and outs of Python typing syntax, of course, but it also

@@ -108,7 +108,7 @@

00:03:27 >> Of course, yeah.

00:03:28 So my name is Pedram Naveed.
00:03:28 So my name is Pedram Navid.

00:03:29 I'm the head of data engineering and dev rel at Dagster.

@@ -212,21 +212,21 @@

00:05:55 like Airflow and tried to automate data pipelines instead of patches on a server.

00:06:01 That one day led to, I guess, making a long story short, a role at Daxter, where now I
00:06:01 That one day led to, I guess, making a long story short, a role at Dagster, where now I

00:06:06 contribute a little bit to Daxter.
00:06:06 contribute a little bit to Dagster.

00:06:08 I work on Daxter, the core project itself, but I also use Daxter internally to build
00:06:08 I work on Dagster, the core project itself, but I also use Dagster internally to build

00:06:11 our own data pipelines.

00:06:13 I'm sure it's interesting to see how you all both build Daxter and then consume Daxter.
00:06:13 I'm sure it's interesting to see how you all both build Dagster and then consume Dagster.

00:06:19 Yeah, it's been wonderful.

00:06:21 I think there's a lot of great things about it.

00:06:23 One is like getting access to Daxter before it's fully released, right?
00:06:23 One is like getting access to Dagster before it's fully released, right?

00:06:27 Internally, we dog food, new features, new concepts, and we work with the product team,

@@ -292,7 +292,7 @@

00:07:49 Not that people speak Python, but is it different in the sense that, "Hey, I could give them

00:07:53 a Jupyter notebook," or, "I could give them Streamlet," or one of these things, right?
00:07:53 a Jupyter notebook," or, "I could give them Streamlit," or one of these things, right?

00:07:58 A little more or less you building and just plug it in?

@@ -322,7 +322,7 @@

00:08:35 It's just not fun to write.

00:08:37 Streamlet makes it so easy to do that.
00:08:37 Streamlit makes it so easy to do that.

00:08:39 So it's something like retool and there's a thousand other ways now that you can bring

@@ -368,7 +368,7 @@

00:09:32 data orchestration, all those things.

00:09:33 We'll talk about Daxter and some of the trends and that.
00:09:33 We'll talk about Dagster and some of the trends and that.

00:09:36 So let's grab some random internet search for what does a data pipeline maybe look like?

@@ -486,11 +486,11 @@

00:12:20 Yes.

00:12:21 Daxter to me is a way to build a data platform.
00:12:21 Dagster to me is a way to build a data platform.

00:12:24 It's also a different way of thinking about how you build data pipelines.

00:12:28 Maybe it's good to compare it with kind of what the world was like, I think, before Daxter
00:12:28 Maybe it's good to compare it with kind of what the world was like, I think, before Dagster

00:12:32 and how it came about to be.

@@ -646,7 +646,7 @@

00:15:49 Fun.

00:15:50 So, Daxter is the company, but also is open source.
00:15:50 So, Dagster is the company, but also is open source.

00:15:54 What's the story around like, can I use it for free?

@@ -658,37 +658,37 @@

00:16:00 Okay.

00:16:01 So, Daxter Labs is the company.
00:16:01 So, Dagster Labs is the company.

00:16:02 Daxter open source is the product.
00:16:02 Dagster open source is the product.

00:16:03 It's 100% free.

00:16:04 We're very committed to the open source model.

00:16:06 I would say 95% of the things you can get out of Daxter are available through open source.
00:16:06 I would say 95% of the things you can get out of Dagster are available through open source.

00:16:11 And we tend to try to release everything through that model.

00:16:14 You can run very complex pipelines, and you can deploy it all on your own if you wish.

00:16:19 There is a Daxter cloud product, which is really the hosted version of Daxter.
00:16:19 There is a Dagster cloud product, which is really the hosted version of Dagster.

00:16:23 If you want hosted plain, we can do that for you through Daxter cloud, but it all runs
00:16:23 If you want hosted plain, we can do that for you through Dagster cloud, but it all runs

00:16:27 on the same code base and the modeling and the files all essentially look the same.

00:16:32 Okay.

00:16:33 So obviously you could get, like I talked about at the beginning, you could go down

00:16:36 the DevOps side, get your own open source Daxter set up, schedule it, run it on servers,
00:16:36 the DevOps side, get your own open source Dagster set up, schedule it, run it on servers,

00:16:41 all those things.

00:16:42 But if we just wanted something real simple, we could just go to you guys and say, "Hey,

00:16:47 I built this with Daxter.
00:16:47 I built this with Dagster.

00:16:48 Will you run it for me?" Pretty much.

@@ -698,7 +698,7 @@

00:16:52 So there's two options there.

00:16:53 You can do the serverless model, which says, "Daxter, just run it.
00:16:53 You can do the serverless model, which says, "Dagster, just run it.

00:16:55 We take care of the compute, we take care of the execution for you, and you just write

@@ -734,9 +734,9 @@

00:17:32 You can say, "We've got this Kubernetes cluster, this ECS cluster, but we still want to use

00:17:37 a Daxter Cloud product to sort of manage the control plane.
00:17:37 a Dagster Cloud product to sort of manage the control plane.

00:17:40 Daxter Cloud will do that." And then you can go off and execute things on your own environment if that's something
00:17:40 Dagster Cloud will do that." And then you can go off and execute things on your own environment if that's something

00:17:44 you wish to do.

@@ -752,7 +752,7 @@

00:17:57 Okay.

00:17:58 Well, let's maybe talk about Daxter for a bit.
00:17:58 Well, let's maybe talk about Dagster for a bit.

00:17:59 I want to talk about some of the trends as well, but let's just talk through maybe setting

@@ -794,7 +794,7 @@

00:18:58 to persist it, that's really up to you.

00:19:00 And then the resources is sort of where the power, I think, of a lot of Daxter comes in.
00:19:00 And then the resources is sort of where the power, I think, of a lot of Dagster comes in.

00:19:04 So the asset is sort of like declaration of the thing I'm going to create.

@@ -812,15 +812,15 @@

00:19:32 that data is going to be persistent.

00:19:34 Does Daxter know how to talk to those different platforms?
00:19:34 Does Dagster know how to talk to those different platforms?

00:19:37 Does it like natively understand DuckDB and Snowflake?

00:19:40 Yeah.

00:19:41 Interesting.

00:19:42 People often look to Daxter and like, "Oh, does it do X?" And the question is like, "Daxter does anything you can do Python with?"
00:19:42 People often look to Dagster and like, "Oh, does it do X?" And the question is like, "Dagster does anything you can do Python with?"

00:19:48 Which is most things, yeah.

@@ -838,7 +838,7 @@

00:19:59 You want to use S3, you need to find the S3 provider.

00:20:01 With Daxter, you kind of say you don't have to do any of that.
00:20:01 With Dagster, you kind of say you don't have to do any of that.

00:20:04 If you want to use Snowflake, for example, install the Snowflake connector package from

@@ -968,7 +968,7 @@

00:22:49 it or monitor it?

00:22:50 Everything in Daxter is written as code.
00:22:50 Everything in Dagster is written as code.

00:22:52 The UI reads that code and it interprets it as a DAG and then it displays that for you.

@@ -978,7 +978,7 @@

00:23:06 schedules.

00:23:07 But the core, we really believe this is Daxter, like the core declaration of how things are
00:23:07 But the core, we really believe this is Dagster, like the core declaration of how things are

00:23:11 done, it's always done through code.

@@ -1342,7 +1342,7 @@

00:31:26 All right.

00:31:27 So on the homepage at Daxter.io, you've got a nice graphic that shows you both how to
00:31:27 So on the homepage at Dagster.io, you've got a nice graphic that shows you both how to

00:31:33 write the code, like some examples of the code, as well as how that looks in the UI.

@@ -1540,7 +1540,7 @@

00:35:41 That's really hard to do.

00:35:42 But this debugger really is, is, is a structured log of every step that's been going on through
00:35:42 But this debugger really is, a structured log of every step that's been going on through

00:35:47 your pipeline, right?

@@ -1714,7 +1714,7 @@

00:39:47 There's memory concerns, but let's pretend the world is simple.

00:39:51 Anything that can be parallelized will be through Daxter.
00:39:51 Anything that can be parallelized will be through Dagster.

00:39:54 And that's really the benefit of writing these DAGs is that there's a nice algorithm for

@@ -1752,17 +1752,17 @@

00:40:40 It's a fun little way to break things apart.

00:40:43 - So if we run this on the Daxter cloud or even on our own, is this pretty much automatic?
00:40:43 - So if we run this on the Dagster cloud or even on our own, is this pretty much automatic?

00:40:49 We don't have to do anything?

00:40:51 I think Daxter just looks at it and says, this looks parallelizable and it'll go or?
00:40:51 I think Dagster just looks at it and says, this looks parallelizable and it'll go or?

00:40:55 - That's right.

00:40:56 Yeah.

00:40:57 As long as you've got the full deployment, whether it's OSS or cloud, Daxter will basically
00:40:57 As long as you've got the full deployment, whether it's OSS or cloud, Dagster will basically

00:41:00 parallelize it for you, which is possible.

@@ -1784,9 +1784,9 @@

00:41:24 I want to talk about some of the tools and some of the tools that are maybe at play here

00:41:29 when working with Daxter and some of the trends and stuff.
00:41:29 when working with Dagster and some of the trends and stuff.

00:41:31 But before that, maybe speak to where you could see people adopt a tool like Daxter,
00:41:31 But before that, maybe speak to where you could see people adopt a tool like Dagster,

00:41:37 but they generally don't.

@@ -1814,7 +1814,7 @@

00:42:18 I would say.

00:42:19 So probably the first like trigger for me of thinking of, you know, is Daxter a good
00:42:19 So probably the first like trigger for me of thinking of, you know, is Dagster a good

00:42:24 choice is like, am I trying to ingest data from somewhere?

@@ -1836,7 +1836,7 @@

00:42:46 probably fine.

00:42:47 I don't think you need to implement all of Daxter just to do that.
00:42:47 I don't think you need to implement all of Dagster just to do that.

00:42:51 But the more closer you get to data pipelining, I think the better your life will be if you're

@@ -1906,7 +1906,7 @@

00:44:39 Yes, exactly.

00:44:40 The Dutch have given us so much and they've asked nothing of us.
00:44:40 The Duck have given us so much and they've asked nothing of us.

00:44:42 So I'm always very thankful for them.

@@ -2152,9 +2152,9 @@

00:50:39 the way forward for everyone, but it is something we're trying.

00:50:42 And I think for Dexter, I think it's working pretty well.
00:50:42 And I think for Dagster, I think it's working pretty well.

00:50:44 And what I think is really powerful about Dexter is like the open source product is
00:50:44 And what I think is really powerful about Dagster is like the open source product is

00:50:48 really, really good.

@@ -2182,9 +2182,9 @@

00:51:27 To me, that's one of the more exciting parts, right?

00:51:29 A lot of the development that we do in Dexter open source is driven by people who are paid
00:51:29 A lot of the development that we do in Dagster open source is driven by people who are paid

00:51:35 through what happens on Dexter cloud.
00:51:35 through what happens on Dagster cloud.

00:51:37 And I think from what I can tell, there's no better way to build open source product

@@ -2380,7 +2380,7 @@

00:55:52 Like it's not, that's not the point.

00:55:53 The point is to look at the code and see, you know, how does Daxter use Daxter and what
00:55:53 The point is to look at the code and see, you know, how does Dagster use Dagster and what

00:55:56 does that kind of look like?

@@ -2394,15 +2394,15 @@

00:56:01 Yeah, I guess let's wrap it up with the final call to action.

00:56:05 People are interested in Daxter.
00:56:05 People are interested in Dagster.

00:56:06 How do they get started?

00:56:07 What do you tell them?

00:56:08 Oh, yeah.

00:56:09 Well, Daxter is probably the greatest place to start.
00:56:09 Well, Dagster is probably the greatest place to start.

00:56:11 You can try the cloud product.

@@ -2434,7 +2434,7 @@

00:56:43 Well, Pedram, thank you for being on the show.

00:56:44 Make sure the work on Daxter and sharing it with us.
00:56:44 Make sure the work on Dagster and sharing it with us.

00:56:47 Thank you, Michael.

@@ -2507,4 +2507,3 @@
00:58:27 [MUSIC ENDS]

00:58:30 We just recorded it.