My name is Kevin Kho. I am currently working on Fugue, a minimal interface to bring Python, Pandas, and SQL code to Spark, Dask, and Ray. Recently, I was at Prefect as an Open Source Community Engineer where I managed the Slack community and created content. Before working on open-source tooling, I was a data scientist for four years across Paylocity and Itron.
I am currently working as an AI Engineer at Drata, helping apply AI to the compliance industry.
📭 Contact me!
Feel free to reach out to me about anything data-related. I talk to people about big data, data architecture, data engineering, and data careers. I'm always happy to speak at meetups or company meetings about the things I'm working on.
Website: https://kevinkho.com/
Email: kdykho@gmail.com
LinkedIn: https://www.linkedin.com/in/kvnkho
🌎 Location
I am currently based in Chicago and always happy to meet people in person.
📝 Blogs
I mainly write about the things I am working on. Here are some:
- Interoperable Python and SQL in Jupyter Notebooks
- Why Pandas-like Interfaces are Sub-optimal for Distributed Computing
- Using Pandera on Spark for Data Validation through Fugue
- The Simple Guide to Productionizing Data Workflows with Docker
- Introducing Fugue — Reducing PySpark Developer Friction
My Medium profile has all of my articles.
📢 Conference Talks
I've given several talks about Fugue, Prefect, and distributed computing. Here are some:
- SciPy 2022 - Introduction to Workflow Orchestration with Prefect
- PyCon US 2022 - Comparing the Different Ways to Scale Python and Pandas Code
- Databricks Summit 2022 - FugueSQL - The Enhanced SQL Interface for Pandas and Spark DataFrames
- PyCon US 2021 - Large Scale Data Validation with Spark and Dask
- PyData Global 2021 - An Intro to Workflow Management with Prefect
🎤 Podcasts
💙 Community
I am involved in some other things:
- DataKind - I volunteered on two projects helping non-profits with data science and data engineering work
- Orlando Machine Learning and Data Science - I organized or co-organized this Meetup for four years
- Adventurous Analytics - I advise non-profit data science consulting projects, primarily around the Florida foster care system
- Conference Involvement:
  - SciPy 2022 Data Life Cycle Track Co-chair
  - PyData Seattle 2023 Organizing Committee
🤓 Other Interests
- Mechanical Keyboards
- Basketball
- K-pop