zachs_posts.csv
Date,ShareLink,ShareCommentary,SharedURL,MediaURL,Visibility
2023-09-18 23:46:14,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7109684072821489665,"Seventy of my boot camp graduates qualified for three or six mentorship sessions by doing the work to get certified! ""
""""
""I've been picky about the group of mentors because I want the best of the best to mentor my students. ""
""""
""This is the list of rockstar mentors that we have so far. (I'll be getting one or two more since I didn't expect so many of my boot camp attendees to get certified!)""
""""
""- Stephanie Murphy, senior data engineer at Tesla, she attended the boot camp ""
""- Rimzim Thube, data engineer at Amazon, with over 10 years of experience""
""- Bhargavi Reddy Dokuru, senior data engineer at Netflix""
""- Ankit Biradar, data-focused software engineer at Uber ""
""- Lakshmi Srivalli Kristam, senior data engineer at Grubhub, she was a boot camp mentor too""
""- Lakshmi Malladi, data engineer at Salesforce, also ex-Meta, she attended the boot camp""
""- Venkatesh Selvaraj, senior data engineer at Meta""
""- Lenny A, data engineering leader with many years of experience at Meta and Amazon ""
""- Francesco Quaratino, senior data engineer at Unite with over 11 years of experience (for our Europe/Africa attendees)""
""- Ankit Shrivastava, senior software engineer at Uber (for our Asia attendees)""
""""
""#dataengineering ",,,MEMBER_NETWORK
2023-09-18 23:31:30,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7109680364901564416,Miguel was an amazing student! #dataengineering,,,MEMBER_NETWORK
2023-09-18 18:45:52,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7109608483347468288,"I'm excited to announce that I hired my first full-time employee! ""
""""
""JulieAnn is joining EcZachly Inc as a software engineer and business admin! ""
""""
""JulieAnn was one of my students in the first boot camp and she brought a lot of really positive energy and organization skills. Without her, the v1 would've been a very disorganized mess. ""
""""
""In the v2 boot camp, I hired her part-time as a community manager to help organize the Discord and curricula. She played a critical part in getting dataengineer.io to a great place!""
""""
""For the v3 boot camp that will start in early November, we'll be organizing and creating to level up the quality even higher! ""
""""
""Really excited to have you on the team JulieAnn! ""
""""
""#softwareengineering ""
""#dataengineering ",,,MEMBER_NETWORK
2023-09-17 21:13:35,https://www.linkedin.com/feed/update/urn%3Ali%3AugcPost%3A7109283267106795520,"Did Bill Inmon and Ralph Kimball ever consider cage fighting like Mark Zuckerberg and Elon Musk over who had the better data warehouse methodology? ""
""""
""I asked the man himself in my boot camp. ""
""""
""I hope you enjoy this clip! ""
""""
""#dataengineering ""
""#dataarchitecture ",,,MEMBER_NETWORK
2023-09-17 18:44:43,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7109245803503775744,"Here’s two hours of free data engineering content on LLM-driven data engineering. Code base and slides linked too. ""
""""
""The lecture:""
""""
""""
""https://lnkd.in/gaCs8NDz ""
""""
""The lab: ""
""""
""https://lnkd.in/g3nSPWq8""
""""
""I’ll be filming two more hours this Thursday as well! You can register to attend that session here: ""
""""
""https://lnkd.in/gg3MH4vR""
""""
""""
""#dataengineering",,,MEMBER_NETWORK
2023-09-15 23:30:12,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7108592874878889985,"Just finished adding certifications to my boot camp platform! Over 70% of my students in the v2 boot camp got the attendee-level certification which required that you attended at least 70% of the lectures! ""
""""
""Going to be sharing the people who got the excellence-level certification which is attendee + all the homework soon! Grading stuff ferociously right now! ""
""""
""#dataengineering",,,MEMBER_NETWORK
2023-09-15 23:27:07,https://www.linkedin.com/feed/update/urn%3Ali%3AugcPost%3A7108592096839663616,"Victor was a really great student! Glad to see he got certified! ""
""""
""#dataengineering",,,MEMBER_NETWORK
2023-09-14 19:20:18,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7108167598000807936,"The engineer who asks a stupid question looks dumb for a second. ""
""The engineer who doesn’t ask stupid questions looks dumb for a lifetime ""
""""
""#softwareengineering",,,MEMBER_NETWORK
2023-09-14 16:55:17,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7108131101629452288,"How will LLMs disrupt different data engineering tasks? ""
""""
""Here’s a scatter plot of almost every data engineering task based on how technical vs soft skill it is and how tactical vs strategic it is. ""
""""
""The main takeaways from this should be: ""
""""
""- tactical + technical tasks are going to be disrupted a lot""
""""
""- strategic + soft skill tasks are safe ""
""""
""- LLMs + agents will solve the pain of oncall for 80-90% of failures""
""""
""- just knowing SQL, Python and Spark puts your job at risk ""
""""
""""
""What would you add or change on this chart? Any additional takeaways? ""
""""
""#dataengineering",,,MEMBER_NETWORK
2023-09-13 21:40:48,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7107840567920795648,"Clement has the wisdom here. I’m glad I followed my heart in March! The future is so bright! ""
""""
""#mentalhealth",,,MEMBER_NETWORK
2023-09-13 01:49:30,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7107540764926074880,"The most important lesson I learned as a software engineer entrepreneur this year. ""
""""
""Write code to facilitate business not to generate it! ""
""""
""#softwareengineering",,,MEMBER_NETWORK
2023-09-13 00:01:19,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7107513541049339904,"The number of people DMing me about quitting their big tech jobs is so high right now! ""
""Let’s go!!! Big tech exodus! ""
""""
""#mentalhealth",,,MEMBER_NETWORK
2023-09-12 20:32:43,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7107461043034931200,"I started my Substack newsletter on June 15th. The 90 days since then have been wild. ""
""""
""Went from 0 paid subs to 117. ""
""Went from 9300 free subs to 21700. ""
""""
""Key lessons: ""
""""
""- my articles on data modeling and SQL interviews were my most successful. Providing new value is the most important thing! ""
""""
""- partner with people to grow! ""
""I partnered with Alex Xu, Benjamin Rogojan, Ananth Packkildurai, Sarah Floris, MS, and Ryan Peterman. We recommend each other and that brought in an additional 5,000 subs!!""
""""
""#dataengineering",,,MEMBER_NETWORK
2023-09-12 18:52:44,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7107435884483907584,"Finally someone seriously taking on Great Expectations! We can and must do better!""
""""
""Congrats, 🎯 Mark Freeman II! ""
""""
""#dataengineering",,,MEMBER_NETWORK
2023-09-12 03:25:46,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7107202604723535872,"Li is a highly technical founder who knows AI extremely well. This is an incredible opportunity!""
""""
""#machinelearning",,,MEMBER_NETWORK
2023-09-11 01:32:37,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7106811740004249600,"The data engineering SQL interview is the most common interview in big tech! ""
""""
""At most companies, you end up getting asked SQL questions for about two hours. ""
""""
""There are two rounds to be aware of:""
""- The screener round""
""Where they test that you have the fundamental knowledge of SQL and how to write code""
""- The onsite round""
""Where they test your depth of SQL and how to optimize with things like indexes and minimizing table scans!""
""""
""I wrote a free Substack article that goes into much more detail about how to pass these interviews here: https://lnkd.in/g_m9RZWH""
""""
""#dataengineering ",,,MEMBER_NETWORK
2023-09-10 23:39:18,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7106783223489200129,"Vitali completed the grind of v2 of EcZachly Inc’s boot camp! ""
""""
""#dataengineering",,,MEMBER_NETWORK
2023-09-10 21:11:55,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7106746135200858112,Great post by Li! Make sure to find breaks for your #mentalhealth while you're building your dreams! ,,,MEMBER_NETWORK
2023-09-10 18:32:22,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7106705980750495744,"My boot camp attendees who meet the attendee certification bar will be paired with a data engineering mentor who has worked in big tech at least 4 years to help them with interview prep, referrals, resume review, and anything else the boot camp attendee wants. ""
""""
""The mentors get paid from the boot camp tuition. Combined track students get six mentorship sessions and single track students get three mentorship sessions! ""
""""
""Six weeks of learning data engineering from me isn’t enough to get you to success. ""
""""
""Finding mentors who can help you and establish long term success is the ultimate goal of the boot camps! ""
""""
""#dataengineering",,,MEMBER_NETWORK
2023-09-10 02:51:20,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7106469161693822976,"My boot camp attendees were ENGAGED! ""
""""
""Kyle Dufrane attended 99.9% of the boot camp, he missed only 3 minutes of the 60+ hours of live content! ""
""""
""Jade Nguyen, Joseph Corrado, Lakshmi Malladi, Rushitaa Dattuluri attended 99.8% of the boot camp! ""
""""
""In total, about 60% of my students will meet the certification bar! ""
""""
""#dataengineering",,,MEMBER_NETWORK
2023-09-09 16:22:50,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7106310995047849984,"Zach promoted from junior to staff in 4 years""
""Ryan promoted from junior to staff in 3 years ""
""""
""I have a feeling it’ll happen eventually where Meta is promoting new grads to staff engineer in one year after joining.""
""""
""It’s wild how quickly a good manager and mentor can accelerate your career! ""
""""
""#softwareengineering",,,MEMBER_NETWORK
2023-09-08 16:50:09,https://www.linkedin.com/feed/update/urn%3Ali%3AugcPost%3A7105955482468552705,"When I was 20, I had a dream of becoming a mathematics professor. I applied to graduate schools, had a perfect quant GRE score, and I was so excited to study. ""
""""
""I abandoned the dream to study data science in industry instead.""
""""
""Going back to school has been on my mind a lot recently. I yearn for the depth and rigor that academics bring. ""
""""
""#datascience",,,MEMBER_NETWORK
2023-09-07 20:13:02,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7105644151701962752,"My startup is averaging $50k/month since I started it in March""
""""
""Here’s the tech stack I use: ""
""""
""- Languages: TypeScript, SQL""
""- Frontend: NextJS""
""- Backend: ExpressJS""
""- Database: Postgres ""
""- Payments: Stripe ""
""- Content: S3 and CloudFront ""
""- Emails: SparkPost""
""- Platform: Heroku ""
""- Cloud bill: $400/month ""
""- Experiments: Statsig""
""- Logging: Kafka""
""""
""Future integrations: ""
""""
""- Analytics engineering learning platform: Trino and Tabular ""
""- Data engineering learning platform: Spark on Databricks and Iceberg ""
""- “leetcode for Spark” platform using Spark and Iceberg as well ""
""- Data engineering mentor matching platform using machine learning with PyTorch I think. Maybe something I can buy instead though ""
""""
""What else should I be building with my startup and team of engineers? ""
""""
""#SoftwareEngineering",,,MEMBER_NETWORK
2023-09-07 19:55:49,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7105639818700754944,"Amazing post by Ryan about being a good tech lead! ""
""""
""#softwareengineering",,,MEMBER_NETWORK
2023-09-07 19:05:30,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7105627156159885312,"Data engineering tooling has different groups of stacks depending on the company. ""
""""
""- the big tech company stack ""
""""
""- compute: Spark ""
""- Orchestration: Airflow (or similar)""
""- data quality: custom built ""
""- serving layer: Druid""
""- storage: Iceberg + S3 ""
""""
""- the mid-sized company stack ""
""""
""- compute: Snowflake/BigQuery ""
""- orchestration: Airflow or Fivetran or Informatica""
""- data quality: Great Expectations and DBT ""
""- serving layer: Tableau extracts or Druid ""
""- storage: Snowflake/BigQuery ""
""""
""- the startup stack ""
""""
""- compute: Postgres ""
""- orchestration: CRON""
""- data quality: skipped ""
""- serving layer: SQL queries in Postgres""
""- storage: Postgres""
""""
""What would you change in these stacks?""
""""
""#dataengineering",,,MEMBER_NETWORK
2023-09-07 02:15:17,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7105372928489263104,"Let’s go!!! More big tech influencers leaving to found their own things!""
""Congrats Cassie!!""
""""
""#datascience",,,MEMBER_NETWORK
2023-09-06 19:42:36,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7105274107633537024,"Data architect is the next step after data engineer on the technical ladder.""
""""
""What big questions should you be able to answer as a data architect?""
""""
""- should our pipelines be streaming or batch?""
""Having a firm understanding of the trade offs of lambda (streaming + batch) versus kappa (streaming only) architecture is a key thing to being a great data architect. ""
""""
""- how should our master data be modeled? ""
""This bucket is complex and has a few competing ideologies between Kimball data modeling, Inmon data modeling and one big table (OBT) data modeling. Each of these ideologies has trade offs that are too long to discuss in this LinkedIn post. ""
""""
""- what data stores should we use for serving our data? ""
""Technology selection is another critical component. Betting everything on Snowflake or Spark is a losing battle. Understanding low latency stores like Druid, Memcached and Redis will serve you well. Also know analytical DBs like CouchDB and DuckDB. ""
""""
""- how do we create processes to ensure data quality across all our pipelines ""
""Processes like spec review, design discussions, and data validation will greatly level up your data. As a data architect you should be flexing your leadership skills to get these adopted across your company. ""
""""
""""
""What other skills should a data architect know?""
""""
""#dataengineering",,,MEMBER_NETWORK
2023-09-05 20:36:04,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7104925171936890880,"I took a refreshing phone break this past week and it has greatly improved my outlook and creativity! ""
""""
""Those 20 minutes on Saturday were to contact my family to let them know I wasn’t dying in the mud at Burning Man. ""
""""
""I’m excited to bring the newfound energy and improved mental health back into my data engineering business!""
""""
""#mentalhealth ""
""#dataengineering",,,MEMBER_NETWORK
2023-08-26 18:05:12,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7101263325937766400,"Things I do before bedtime to calm my anxious mind and get more restful sleep ""
""""
""- yoga""
""The long stretchy poses are great to invite sleepiness and calm. Supported fish pose with a block is incredible for the lower back. Sleeping pigeon and thread the needle are two other great poses to loosen the shoulders and legs. ""
""""
""- no phone a few hours before bed ""
""Blue light simulates sunlight and tricks the brain into thinking it’s day time. Quit messing up your brain into thinking it’s day time at 1 AM by scrolling Instagram Reels. ""
""""
""- a hot shower that ends cold ""
""I start my shower off hot and the last 2 minutes I end it cold. The quick shift and need for my body to warm up helps invite sleep ""
""""
""- avoid taking melatonin too many nights ""
""Take melatonin at most once or twice a week. Don’t interfere with your body’s natural sleep processes.""
""""
""#mentalhealth",,,MEMBER_NETWORK
2023-08-25 22:56:43,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7100974300840529920,"Julio learned a lot from the boot camp!""
""""
""#dataengineering",,,MEMBER_NETWORK
2023-08-25 18:03:04,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7100900403222904833,"Benjamin puts out some of the best data engineering content on YouTube out there!""
""""
""Give him a follow!""
""""
""#dataengineering",,,MEMBER_NETWORK
2023-08-25 11:11:58,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7100796944272293888,"22 lectures, 20 labs, 12 Q&As, 6 speaker sessions with 12 speakers! ""
""""
""The last 6 weeks have been intense but I’m glad I have a treasure trove of content now! ""
""""
""Excited for future iterations of boot camps and building my educational platform! ""
""""
""#dataengineering",,,MEMBER_NETWORK
2023-08-24 18:58:49,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7100552046415675392,"My 136th video on TikTok was the first one to break a million views ""
""My 153rd post on Instagram was the first one to break a million views ""
""""
""Consistency is key! You never know when the algorithm will bless you!""
""""
""#contentcreation",,,MEMBER_NETWORK
2023-08-24 07:04:06,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7100372182345646080,"The 6th speaker series had Bill Inmon and Jitender Aswani and was really 🔥!""
""""
""#dataengineering",,,MEMBER_NETWORK
2023-08-24 00:25:22,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7100271835614744576,The speaker series this week is amazing! #dataengineering ,,,MEMBER_NETWORK
2023-08-23 18:33:20,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7100183242019901441,"My self-paced boot camp teaches much more than just Spark and Flink! ""
""""
""In the combined six-week program, you will:""
""""
""- Learn the tradeoffs between Kimball and One Big Table data modeling. Learn how to leverage complex data types like Arrays and Structs to supercharge your analytics""
""- Create a data pipeline spec that covers quality checks, assumptions, business metrics, and allows stakeholders to give feedback BEFORE you start coding""
""- Build data quality checks into your pipelines using data contracts such as write-audit-publish and write unit and integration tests to catch quality errors before they enter production""
""- Set up experiments using Flask and Statsig to learn about A/B tests and how to collect data in logged-out and logged-in environments""
""- Discover the power of data lake technologies like Apache Iceberg. Proper schema evolution, partitioning, and Parquet file format compression! ""
""- Collaborate with your group on building on-call run books and learn about data pipeline maintenance ""
""- Learn how to prioritize your tasks for impact, identify low-value tasks, and push back when stakeholders ask you to do them""
""- Visualize data in the right way to create compelling stories that executives want to see. Create exploratory dashboards in Tableau that data analysts can use to discover patterns ""
""- Level up your SQL skills by having a four-hour crash course on GROUPING SETS, window functions, and cumulative table design""
""- Listen to 12 industry-leading experts in Q&A format and get their view on how things are changing in this rapid environment! ""
""""
""You'll learn all this and level up your Spark and Flink skills! Over 60 hours of content is available now! ""
""""
""You can learn more at www.dataengineer.io ""
""""
""#dataengineering",,,MEMBER_NETWORK
2023-08-23 17:38:07,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7100169347494473728,"Data engineering is less risky than data science and often has more ROI! ""
""""
""- Data science and data engineering both involve a tremendous amount of data cleaning. The difference here is the outputs from data engineers are modeled, usable data sets for the rest of the company. The outputs from data scientists are inputs to machine learning models and/or experiments. These models and experiments may or may not produce business-impacting results. The visibility from the modeled data from data engineers has long lasting results. ""
""""
""- Data engineering problems involve less ambiguity than data science problems. Gathering and collecting data while documenting sources is easier than trying to get it all integrated into your system. People underestimate the complexity of getting a machine learning model running in production. ""
""""
""These two factors are why data engineers, on average, saw a pay bump in 2023 and data scientists saw a slight decrease in pay!""
""""
""#dataengineering ""
""#datascience",,,MEMBER_NETWORK
2023-08-23 17:23:54,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7100165768671031297,"Amazing post by Arpit!""
""""
""#softwareengineering",,,MEMBER_NETWORK
2023-08-21 17:23:54,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7099440994483388417,I'll be rebranding my data engineer education products from www.eczachly.com to www.dataengineer.io! I'm really excited about this change! #dataengineering ,,,MEMBER_NETWORK
2023-08-21 07:06:14,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7099285552222392320,Li Yin is cooking up some 🔥 with SylphAI (AI&data professional network),,,MEMBER_NETWORK
2023-08-20 16:40:31,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7099067688093700096,"IO is often the biggest component of cloud costs in data engineering! ""
""""
""How do you minimize it?""
""""
""- too much IO is caused by pipelines that read too much data ""
""This can happen when you’re building look back metrics and you aren’t using cumulative table design. These metrics should be built incrementally instead of scanning 30/60/90 days worth of data every day ""
""""
""- too much IO is caused by tables that are bigger than they need to be ""
""This can happen when you aren’t leveraging sorting the right way when writing out parquet tables. You should take advantage of run-length encoding the most you can by sorting by lowest to highest cardinality dimensions. ""
""""
""- too much IO is caused by data models that aren’t robust enough ""
""Duplicate data models with slightly different metric definitions need to be consolidated and collapsed. Double the data and double the pipelines. Your IO will be excessive!""
""""
""#dataengineering",,,MEMBER_NETWORK
2023-08-19 23:29:39,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7098808263873335296,"I set up a really powerful Iceberg + Spark tutorial using Tabular in 2 hours. ""
""Thanks for unblocking me, Jason and Ryan! Y’all are building something special! ""
""#dataengineering",,,MEMBER_NETWORK
2023-08-17 21:14:10,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7098049390417776641,"Data engineering has many ""this or that"" questions""
""""
""- Python or Scala?""
""If you don't know either, start with Python. If you want to transition to the software/data engineer archetype, pick up Scala later. ""
""""
""- Streaming or Batch?""
""A vast majority of data engineering jobs are batch oriented. This will still be true in ten years. Streaming-oriented data engineering jobs pay better since the skillset is more niche and harder to come by. Remember there's a middle ground with things like microbatch. ""
""""
""- Kimball or One Big Table? ""
""Kimball is better at preserving data integrity and for 80+% of the data modeling use cases out there it is going to be the preferred way to model the data. One Big Table has its place though especially if you're trying to minimize shuffle. I've seen some really big performance gains from switching from Kimball to OBT but just because I saw them at Airbnb doesn't mean you'll see the same. ""
""""
""- Snowflake or Databricks?""
""I like Databricks a lot since it has the versatility of Apache Spark. That being said, it's a more technical platform that takes much longer to learn and set up. The amount of time it takes to get value out of Snowflake is very little and that's a very impressive quality of Snowflake.""
""""
""- AWS or GCP or Azure?""
""AWS is the clear leader in market share and I have a slight bias towards using it over GCP or Azure. That being said, there will always be great Azure and GCP data engineering jobs as well! ""
""""
""- Airflow or Mage or Prefect or Dagster?""
""Airflow is the 9000-pound gorilla in this fight that is looking to be dethroned. The challengers have some really great features that are making Airflow look dated. I'm teaching Airflow in my boot camp though since it has the highest adoption by far""
""""
""#dataengineering ",,,MEMBER_NETWORK
2023-08-17 20:47:28,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7098042673013424128,"Great post by Bruno! He's been an amazing student in my boot camp! ""
""""
""#dataengineering ",,,MEMBER_NETWORK
2023-08-15 21:42:22,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7097331711457255424,"Data architecture always revolves around pushes or pulls!""
""""
""The ""pull"" architecture is the most common and includes the following technologies:""
""""
""A batch computing engine such as Apache Spark, BigQuery, Snowflake""
""A job orchestrator such as Airflow, Mage, Prefect, or Dagster""
""A place to put the batch of data such as Apache Iceberg, Delta Lake, Snowflake, Druid""
""An API to query the data on demand such as HTTP or SQL""
""""
""The ""push"" architecture also called the ""real-time"" architecture is substantially different and includes the following technologies:""
""""
""A streaming computing engine such as Apache Flink, Spark Structured Streaming ""
""A set of jobs that run 24/7 to process data as it arrives""
""A queue of events that are processed such as Apache Kafka or RabbitMQ ""
""A place to put the streams of data such as Apache Iceberg or Apache Kafka""
""An API to expose the data in real-time such as Websockets or Kafka consumers""
""""
""#dataengineering""
""#softwareengineering ",,,MEMBER_NETWORK
2023-08-15 21:28:01,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7097328099154870274,"People ask me, ""Why do you love JavaScript so much Zach if data engineers never use it?"" ""
""""
""JavaScript is a more fundamental component to any startup you want to build than SQL is. ""
""""
""Do you want to build a website? Okay, use React""
""Do you want to build a server? Okay, use ExpressJS""
""Do you need a mobile app? Okay, use React Native ""
""Do you need a data exchange format? Okay, use JSON (stands for JavaScript object notation)""
""""
""If you don't have a website, server, or app, how do you start generating data? ""
""""
""Data engineers are hired onto a startup much later because they're only needed after things are bigger and more complex. ""
""""
""And if you want to unblock yourself so you can get to that point where things are bigger and more complex, learn JavaScript!!! ""
""""
""#dataengineering ""
""#softwareengineering ""
""",,,MEMBER_NETWORK
2023-08-15 19:24:28,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7097297007072931840,"Data engineering has a huge community! Especially outside LinkedIn! ""
""""
""Here are some communities you need to join: ""
""""
""- Xinran’s Data Engineer Things""
""Slack channel is here http://join.det.life""
""""
""- Benjamin’s Seattle Data Guy ""
""Discord is here: https://lnkd.in/er6bcJBj""
""""
""- Zach’s EcZachly Inc ""
""Discord is here: https://lnkd.in/e_qtv8w7""
""""
""- Chip’s MLOps community ""
""Discord is here: https://lnkd.in/eqDeG--R ""
""""
""- Li’s SylphAI (AI&data professional network) ""
""Discord is here: https://lnkd.in/ePGFVY5A""
""""
""#dataengineering ""
""#machinelearning",,,MEMBER_NETWORK
2023-08-15 17:49:19,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7097273063678869505,"Great list of things to get started in AI by Li Yin ""
""""
""#dataengineering ""
""#machinelearning",,,MEMBER_NETWORK
2023-08-15 03:42:04,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7097059847388499968,"If you keep buying my boot camps I’ll keep recklessly spending it on neon signs! ""
""""
""#dataengineering",,,MEMBER_NETWORK
2023-08-15 00:19:47,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7097008940655280128,This role looks really great! #dataengineering,,,MEMBER_NETWORK
2023-08-14 20:42:03,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7096954143679926272,"I'm going to be a guest speaker for the first Data Engineer Things Book Club AMA Session on Aug 18!""
""""
""The Data Engineer Things Book Club is currently reading Fundamentals of Data Engineering, by Joe Reis 🤓 and Matthew Housley. It's not too late to join now!""
""""
""Feel free to join the AMA even if you are not reading the book! ""
""""
""(Signup link in the comment)""
""""
""#dataengineering",,,MEMBER_NETWORK
2023-08-14 06:36:55,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7096741461827731456,"Dear JetBrains, ""
""""
""I’d make videos on how to use your IDEs effectively to fight the good fight against VS Code. I’ve been an avid fan of your products since 2013 when I made the switch from Eclipse to IntelliJ. ""
""""
""Nowadays, I use: ""
""""
""DataGrip for SQL dev ""
""WebStorm for web dev ""
""IntelliJ or PyCharm for data engineering ""
""""
""End-to-end I use y’all’s products. ""
""""
""I haven’t touched a Microsoft product for development unless you count LinkedIn, GitHub and SQL Server/Windows ten years ago so I promise I’m not tainted! ""
""""
""#softwareengineering ""
""#dataengineering",,,MEMBER_NETWORK
2023-08-14 00:19:01,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7096646358916337664,"Deborah is having a great time learning in the self-paced version of EcZachly Inc’s second boot camp!""
""""
""You can get it here www.EcZachly.com""
""""
""#dataengineering",,,MEMBER_NETWORK
2023-08-13 23:15:31,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7096630380497158144,"I met Nuseir Yassin, founder of Nas Daily Studios.""
""Turns out he used to be a data engineer too! ""
""""
""#dataengineering",,,MEMBER_NETWORK
2023-08-13 22:19:08,https://www.linkedin.com/feed/update/urn%3Ali%3AugcPost%3A7096616191770693632,"Curious what people think about data engineering interviews ""
""""
""#dataengineering",,,MEMBER_NETWORK
2023-08-13 06:50:02,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7096382371893022720,"Week 5 of 6 of the EcZachly Inc boot camp starts on Monday. For the analytics track, the title is ""KPIs and Experimentation""""
""""
""The analytics track is doing the following things: ""
""""
""Day 1:""
""""
""Lecture:""
""Learning about why data engineering and experimentation are closely connected""
""The different types of metrics and where data engineers should hand off to analytics partners. ""
""Deep dive into how to split up your groups and proper experimentation design ""
""""
""Lab: ""
""Set up a Flask API using Statsig to run a live experiment end-to-end. Talk about the difference between logged-out and logged-in experiments.""
""""
""Day 2:""
""""
""Lecture:""
""Talk about statistical significance and when an experiment should be launched. ""
""Talk about how metrics can be gamed and need counter metrics. ""
""Talk about how experiments can go wrong such as the novelty effect""
""""
""Lab:""
""A product sense lab on how to think like a product manager and have a better business impact with the metrics you define""
""""
""#dataengineering ",,,MEMBER_NETWORK
2023-08-13 01:56:12,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7096308429748637696,"I got experimentation wired up in 90 minutes for my website: www.eczachly.com thanks to Statsig! ""
""""
""I'll be teaching my boot camp students the importance of data engineering, metrics, and experimentation in the coming week of the boot camp. ""
""""
""The active experiment running on my website shows a red signup button 80% of the time and a blue one 20% of the time.""
""""
""The companies that can perform the most experiments in parallel are the companies that are winning!""
""""
""#dataengineering ""
""#analyticsengineering ""
""#datascience ",http://www.eczachly.com,,MEMBER_NETWORK
2023-08-12 18:06:05,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7096190120206241792,"Y’all gotta check this out! ""
""""
""#dataengineering",,,MEMBER_NETWORK
2023-08-11 23:19:31,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7095906609784053761,"People underestimate the consistency needed to be successful. ""
""""
""It took me 250 LinkedIn posts to get to 10k followers. ""
""Another 250 posts and I was at 100k. ""
""""
""90% of podcasts don’t make it past episode 3. ""
""99% of podcasts don’t make it past episode 20. ""
""""
""You’ll see a dramatic increase in your fitness with one month of consistency at the gym. Most people in the US are overweight or obese. ""
""""
""When it gets hard, don’t give up. That’s exactly when you need to double down to break through and see success! ""
""""
""It’s a mindset that will make you feel powerful! ""
""""
""#mentalhealth",,,MEMBER_NETWORK
2023-08-11 16:14:32,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7095799661088706561,"I’m consistently impressed by the engineering caliber and dedication of my boot camp attendees. ""
""""
""If you’re hiring for data engineering roles and want a chance to talk to over 150 highly motivated, talented data engineers. ""
""""
""DM me!""
""""
""#dataengineering",,,MEMBER_NETWORK
2023-08-11 02:06:14,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7095586177394642944,"I'm doing a real-time streaming lab today for my boot camp. Please vote between these tech creators by visiting their pages""
""""
""Zach Wilson https://lnkd.in/gCi_-yRx""
""Sarah Floris https://lnkd.in/gX7cfwhF""
""Lulu https://lnkd.in/g33ggb-5 ""
""""
""You don't have to do anything besides clicking on the links!""
""""
""#dataengineering ",,,MEMBER_NETWORK
2023-08-10 23:24:06,https://www.linkedin.com/feed/update/urn%3Ali%3AugcPost%3A7095545377377964032,,,,MEMBER_NETWORK
2023-08-10 21:58:04,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7095523723507609600,"Programming languages I used in big tech based on how much I liked them:""
""""
""- Kotlin ⭐️⭐️⭐️⭐️⭐️""
""- Scala ⭐️⭐️⭐️⭐️⭐️""
""- SQL ⭐️⭐️⭐️⭐️⭐️""
""- Python ⭐️⭐️⭐️⭐️""
""- GroovyScript ⭐️⭐️⭐️⭐️""
""- TypeScript ⭐️⭐️⭐️⭐️""
""- Bash ⭐️⭐️⭐️""
""- Java ⭐️⭐️⭐️""
""- JavaScript ⭐️⭐️⭐️""
""- Ruby ⭐️""
""""
""#softwareengineering",,,MEMBER_NETWORK
2023-08-10 21:13:00,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7095512383602765825,"Data contracts are such a hot topic right now in data engineering! ""
""""
""Imagine if we could prevent all bad data from leaking into production with them? ""
""""
""This holy grail is actually not possible and can lead to data engineer burnout! ""
""""
""Your data quality efforts need to focus on ROI to make it so your data engineers don't run away from their jobs! ""
""""
""#dataengineering ",https://open.substack.com/pub/eczachly/p/writing-data-to-production-is-a-contract?utm_campaign=post&utm_medium=web,,MEMBER_NETWORK
2023-08-10 21:03:30,https://www.linkedin.com/feed/update/urn%3Ali%3AugcPost%3A7095509990853021696,"I’m going to do an event on September 5th with:""
""""
""Ryan Peterman (staff SWE at Meta)""
""Lee McKeeman (staff SWE at Google) ""
""Rahul Pandey (former staff SWE at Meta) ""
""Carly Taylor (ML manager at Activision)""
""""
""We’ll talk about the various ways people get to staff engineer! It’ll be a wonderful event. Excited to see you there!! ""
""""
""#softwareengineering ""
""#dataengineering",,,MEMBER_NETWORK
2023-08-10 18:03:51,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7095464783272955904,"Why aren’t there many entry-level data engineering roles? ""
""""
""There are a few reasons for this: ""
""""
""- data engineers produce data that is relied on by many people. Relying on someone with zero years of experience to do that is a little riskier. ""
""""
""- data engineering requires a unique blend of communication skills and technical skills, like data science, that makes it harder for juniors to ramp up effectively. ""
""""
""- many companies only need one or two data engineers. So there’s no mentorship/growth path for juniors at these companies. Therefore they prefer to hire senior engineers. ""
""""
""Why else do you think there aren’t many Junior data engineering positions? ""
""""
""#dataengineering",,,MEMBER_NETWORK
2023-08-09 23:50:00,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7095189504067588096,This is covered in depth in EcZachly Inc boot camp!,,,MEMBER_NETWORK
2023-08-09 19:47:01,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7095128355447984128,"India is on the verge of bulldozing the US in tech. It seems far-fetched right now, but just like LLMs, India is on an exponential trajectory.""
""""
""#india",,,MEMBER_NETWORK
2023-08-09 17:06:55,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7095088067748757506,"Don’t lose yourself climbing the ladder! ""
""""
""Remember to go to the beach sometimes. ""
""Remember to go to the mountains sometimes. ""
""Remember to laugh really hard sometimes. ""
""Remember to dance badly sometimes. ""
""Remember to check in with your body sometimes. ""
""Remember to be grateful for this beautiful life!""
""""
""#mentalhealth",,,MEMBER_NETWORK
2023-08-09 15:51:09,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7095069000107241472,Still got 2 1/2 weeks left in EcZachly Inc’s boot camp too!,,,MEMBER_NETWORK
2023-08-09 04:32:37,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7094898241275658240,"Need to find a mentor in the data and AI space? ""
""""
""Follow Data Engineer Things, they offer tons of data engineering mentoring. Founder is Xinran Waibel, Netflix DE ""
""""
""Follow SylphAI, they’re an AI community that mentors and helps you reach your goals. Founder is Li Yin, ex-Meta research scientist""
""""
""Follow Illuminate AI, they do AI mentorship. Founder is Aishwarya Srinivasan, Google data scientist""
""""
""#dataengineering ""
""#machinelearning ""
""#artificialintelligence",,,MEMBER_NETWORK
2023-08-08 18:45:26,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7094750471185182720,"Learning technical things beyond data pipelines will make you a better data engineer!""
""""
""- live servers have high quality requirements!""
""If a server goes down, your business dies. ""
""If a data pipeline is delayed, an analyst is sad. ""
""""
""Learning to deal with higher stakes technical requirements will help you see how to build higher quality data pipelines! ""
""""
""Higher quality meaning: ""
""- tested in CI/CD""
""You should have unit and integration tests for your queries so you don’t push a bad change to your pipeline. ""
""""
""- monitored in production ""
""Is your pipeline telemetry changing? Is skew hurting the performance? Can you make things more efficient? ""
""""
""- documented for other engineers ""
""How do you troubleshoot when things break? Who do you talk to when quality errors arise? ""
""""
""You’d be surprised how much full-stack engineering made me a better data engineer. There aren’t enough data engineers who care about this stuff which leads to the perception that data engineers are less technical than software engineers! ""
""""
""#dataengineering",,,MEMBER_NETWORK
2023-08-08 18:13:19,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7094742389843529728,"There is such a thing as too much data quality! ""
""""
""Everything in engineering, including data quality, comes with trade-offs. ""
""""
""Symptoms of too much data quality: ""
""""
""- noisy on-call ""
""You have checks for every possible anomaly under the sun. The more checks, the more likely they fail. The more tax you pay maintaining the quality of your pipelines. ""
""""
""Every DQ check has a probability of a false positive that takes away from real engineering time. Thinking about the ROI on these checks before implementing will help you strike the right balance. ""
""""
""- slow pipeline design phases ""
""Trying to incorporate every request and constraint into your design document can be taxing. Accepting that data model creation is iterative will help you move faster here. ""
""""
""Don’t cut corners though! Do your best to incorporate as many stakeholder requirements as you can and cut the ones that provide the lowest ROI.""
""""
""""
""These problems are actually pretty rare in industry since 95% of companies index too lightly on data quality. But just like everything, you don’t want to index too heavily in the other direction either!""
""""
""#dataengineering",,,MEMBER_NETWORK
2023-08-07 21:52:11,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7094435079472517120,"Dear LinkedIn, ""
""""
""Can you please put the engagement metrics into your Shares.csv GDPR export file? ""
""""
""Twitter does it. ""
""YouTube does it. ""
""Instagram does it. ""
""TikTok does it. ""
""""
""You’re literally the only platform that doesn’t do it! ""
""""
""#dataengineering",,,MEMBER_NETWORK
2023-08-07 18:42:14,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7094387278663323648,"Things you should never see in production SQL pipelines: ""
""""
""- SELECT * ""
""- RIGHT JOIN""
""- GROUP BY 1,2,3 / ORDER BY 1,2,3""
""- Derived columns without aliases ""
""- Nested subqueries ""
""""
""What would you add? ""
""""
""#dataengineering",,,MEMBER_NETWORK
2023-08-07 15:23:32,https://www.linkedin.com/feed/update/urn%3Ali%3AugcPost%3A7094337272275222528,"Week 4 of EcZachly Inc’s boot camp starts today. The themes are analytical patterns and streaming pipelines! ""
""""
""#dataengineering",,,MEMBER_NETWORK
2023-08-06 16:49:19,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7093996471263391744,"Rules of LinkedIn etiquette:""
""""
""- using more than 3 hashtags looks bad ""
""- don’t ask for a resume review or a referral on the first DM, I don’t know you ""
""- don’t cold sell someone on the first DM, you’ll waste your Inmail credits ""
""- don’t tag more than 3-4 people in a post unless it’s a group event ""
""- if you comment, have it be something positive or meaningful ""
""- entry-level positions require zero years of experience ""
""- treating people like humans instead of job-givers will get you much further ""
""- have a profile picture that’s public, clear and up-to-date ""
""- use your headline to clarify what you do ""
""""
""Any more you’d add? ""
""""
""#linkedin",,,MEMBER_NETWORK
2023-08-06 16:31:30,https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7093991989511147520,"If you’re using the RDD API directly in Spark, you’re doing it wrong!""
""""