Skip to content

Commit

Permalink
Revise FAQs and README (#2909)
Browse files Browse the repository at this point in the history
* Revise FAQs and README

Signed-off-by: Jo Stichbury <jo_stichbury@mckinsey.com>

* Add back the data layers FAQ as I've no idea where else it fits

Signed-off-by: Jo Stichbury <jo_stichbury@mckinsey.com>

* minor changes from review

Signed-off-by: Jo Stichbury <jo_stichbury@mckinsey.com>

---------

Signed-off-by: Jo Stichbury <jo_stichbury@mckinsey.com>
  • Loading branch information
stichbury authored Aug 8, 2023
1 parent 654ede7 commit 6773529
Show file tree
Hide file tree
Showing 9 changed files with 45 additions and 72 deletions.
13 changes: 3 additions & 10 deletions CONTRIBUTING.md
Original file line number Diff line number Diff line change
Expand Up @@ -17,20 +17,13 @@ The Kedro team pledges to foster and maintain a friendly community. We enforce a

You can find the Kedro community on our [Slack organisation](https://slack.kedro.org/), which is where we share news and announcements, and general chat. You're also welcome to post links here to any articles or videos about Kedro that you create, or find, such as how-tos, showcases, demos, blog posts or tutorials.

We also curate a [GitHub repo that lists content created by the Kedro community](https://github.com/kedro-org/awesome-kedro).
We also curate a [GitHub repo that lists content created by the Kedro community](https://github.com/kedro-org/awesome-kedro). If you've made something with Kedro, simply add it to the list with a PR!

## Contribute to the project

There are quite a few ways to contribute to the project, find inspiration from the table below.
There are quite a few ways to contribute to Kedro, sich as answering questions about Kedro to help others, fixing a typo on the documentation, reporting a bug, reviewing pull requests or adding a feature.

|Activity|Description|
|-|-|
|Community Q&A|We encourage you to ask and answer technical questions on [GitHub discussions](https://github.com/kedro-org/kedro/discussions) or [Slack](https://slack.kedro.org/), but the former is often preferable since it will be picked up by search engines.|
|Report bugs and security vulnerabilities |We use [GitHub issues](https://github.com/kedro-org/kedro/issues) to keep track of known bugs and security vulnerabilities. We keep a close eye on them and update them when we have an internal fix in progress. Before you report a new issue, do your best to ensure your problem hasn't already been reported. If it has, just leave a comment on the existing issue, rather than create a new one. <br /> If you have already checked the existing [GitHub issues](https://github.com/kedro-org/kedro/issues) and are still convinced that you have found odd or erroneous behaviour then please file a new one.|
|Propose a new feature|If you have new ideas for Kedro functionality then please open a [GitHub issue](https://github.com/kedro-org/kedro/issues) and describe the feature you would like to see, why you need it, and how it should work.|
|Review pull requests|Check the [Kedro repo to find open pull requests](https://github.com/kedro-org/kedro/pulls) and contribute a review!|
|Contribute a fix or feature|If you're interested in contributing fixes to code or documentation, first read our [guidelines for contributing developers](https://docs.kedro.org/en/stable/contribution/developer_contributor_guidelines.html) for an explanation of how to get set up and the process you'll follow. Once you are ready to contribute, a good place to start is to take a look at the `good first issues` and `help wanted issues` on [GitHub](https://github.com/kedro-org/kedro/issues).|
|Contribute to the documentation|You can help us improve the [Kedro documentation online](https://docs.kedro.org/en/stable/). Send us feedback as a [GitHub issue](https://github.com/kedro-org/kedro/issues) or start a documentation discussion on [GitHub](https://github.com/kedro-org/kedro/discussions).You are also welcome to make a raise a PR with a bug fix or addition to the documentation. First read the guide [Contribute to the Kedro documentation](https://docs.kedro.org/en/stable/contribution/documentation_contributor_guidelines.html).
Take a look at some of our [contribution suggestions on the Kedro GitHub Wiki](https://github.com/kedro-org/kedro/wiki/Contribute-to-Kedro)!


## Join our Technical Steering Committee
Expand Down
64 changes: 10 additions & 54 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,7 @@
[![Conda version](https://img.shields.io/conda/vn/conda-forge/kedro.svg)](https://anaconda.org/conda-forge/kedro)
[![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/kedro-org/kedro/blob/main/LICENSE.md)
[![Slack Organisation](https://img.shields.io/badge/slack-chat-blueviolet.svg?label=Kedro%20Slack&logo=slack)](https://slack.kedro.org)
[![Slack Archive](https://img.shields.io/badge/slack-archive-blueviolet.svg?label=Kedro%20Slack%20)](https://linen-slack.kedro.org/)
![CircleCI - Main Branch](https://img.shields.io/circleci/build/github/kedro-org/kedro/main?label=main)
![Develop Branch Build](https://img.shields.io/circleci/build/github/kedro-org/kedro/develop?label=develop)
[![Documentation](https://readthedocs.org/projects/kedro/badge/?version=stable)](https://docs.kedro.org/)
Expand All @@ -14,7 +15,7 @@

## What is Kedro?

Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and data science pipelines that are reproducible, maintainable, and modular.
Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and data science pipelines that are reproducible, maintainable, and modular. You can find out more at [kedro.org](https://kedro.org).

Kedro is an open-source Python framework hosted by the [LF AI & Data Foundation](https://lfaidata.foundation/).

Expand Down Expand Up @@ -51,12 +52,10 @@ _A pipeline visualisation generated using [Kedro-Viz](https://github.com/kedro-o

The [Kedro documentation](https://docs.kedro.org/en/stable/) first explains [how to install Kedro](https://docs.kedro.org/en/stable/get_started/install.html) and then introduces [key Kedro concepts](https://docs.kedro.org/en/stable/get_started/kedro_concepts.html).

- The first example illustrates the [basics of a Kedro project](https://docs.kedro.org/en/stable/get_started/new_project.html) using the Iris dataset
- You can then review the [spaceflights tutorial](https://docs.kedro.org/en/stable/tutorial/tutorial_template.html) to build a Kedro project for hands-on experience
You can then review the [spaceflights tutorial](https://docs.kedro.org/en/stable/tutorial/spaceflights_tutorial.html) to build a Kedro project for hands-on experience

For new and intermediate Kedro users, there's a comprehensive section on [how to visualise Kedro projects using Kedro-Viz](https://docs.kedro.org/en/stable/visualisation/kedro-viz_visualisation.html) and [how to work with Kedro and Jupyter notebooks](https://docs.kedro.org/en/stable/notebooks_and_ipython/kedro_and_notebooks).
For new and intermediate Kedro users, there's a comprehensive section on [how to visualise Kedro projects using Kedro-Viz](https://docs.kedro.org/en/stable/visualisation/index.html) and [how to work with Kedro and Jupyter notebooks](https://docs.kedro.org/en/stable/notebooks_and_ipython/index.html). We also recommend the [API reference documentation](/kedro) for additional information.

Further documentation is available for more advanced Kedro usage and deployment. We also recommend the [glossary](https://docs.kedro.org/en/stable/resources/glossary.html) and the [API reference documentation](/kedro) for additional information.

## Why does Kedro exist?

Expand All @@ -68,64 +67,21 @@ Kedro is built upon our collective best-practice (and mistakes) trying to delive
- To increase efficiency, because applied concepts like modularity and separation of concerns inspire the creation of
**reusable analytics code**

Find out more about how Kedro can answer your use cases from the [product FAQs on the Kedro website](https://kedro.org/#faq).

## The humans behind Kedro

The [Kedro product team](https://docs.kedro.org/en/stable/contribution/technical_steering_committee.html#kedro-maintainers) and a number of [open source contributors from across the world](https://github.com/kedro-org/kedro/releases) maintain Kedro.

## Can I contribute?

Yes! Want to help build Kedro? Check out our [guide to contributing to Kedro](https://github.com/kedro-org/kedro/blob/main/CONTRIBUTING.md).
Yes! We welcome all kinds of contributions. Check out our [guide to contributing to Kedro](https://github.com/kedro-org/kedro/wiki/Contribute-to-Kedro).

## Where can I learn more?

There is a growing community around Kedro. Have a look at the [Kedro FAQs](https://docs.kedro.org/en/stable/faq/faq.html#how-can-i-find-out-more-about-kedro) to find projects using Kedro and links to articles, podcasts and talks.

## Who likes Kedro?

There are Kedro users across the world, who work at start-ups, major enterprises and academic institutions like [Absa](https://www.absa.co.za/),
[Acensi](https://acensi.eu/page/home),
[Advanced Programming Solutions SL](https://www.linkedin.com/feed/update/urn:li:activity:6863494681372721152/),
[AI Singapore](https://makerspace.aisingapore.org/2020/08/leveraging-kedro-in-100e/),
[AMAI GmbH](https://www.am.ai/),
[Augment Partners](https://www.linkedin.com/posts/augment-partners_kedro-cheat-sheet-by-augment-activity-6858927624631283712-Ivqk),
[AXA UK](https://www.axa.co.uk/),
[Belfius](https://www.linkedin.com/posts/vangansen_mlops-machinelearning-kedro-activity-6772379995953238016-JUmo),
[Beamery](https://medium.com/hacking-talent/production-code-for-data-science-and-our-experience-with-kedro-60bb69934d1f),
[Caterpillar](https://www.caterpillar.com/),
[CRIM](https://www.crim.ca/en/),
[Dendra Systems](https://www.dendra.io/),
[Element AI](https://www.elementai.com/),
[GetInData](https://getindata.com/blog/running-machine-learning-pipelines-kedro-kubeflow-airflow),
[GMO](https://recruit.gmo.jp/engineer/jisedai/engineer/jisedai/engineer/jisedai/engineer/jisedai/engineer/jisedai/blog/kedro_and_mlflow_tracking/),
[Indicium](https://medium.com/indiciumtech/how-to-build-models-as-products-using-mlops-part-2-machine-learning-pipelines-with-kedro-10337c48de92),
[Imperial College London](https://github.com/dssg/barefoot-winnie-public),
[ING](https://www.ing.com),
[Jungle Scout](https://junglescouteng.medium.com/jungle-scout-case-study-kedro-airflow-and-mlflow-use-on-production-code-150d7231d42e),
[Helvetas](https://www.linkedin.com/posts/lionel-trebuchon_mlflow-kedro-ml-ugcPost-6747074322164154368-umKw),
[Leapfrog](https://www.lftechnology.com/blog/ai-pipeline-kedro/),
[McKinsey & Company](https://www.mckinsey.com/alumni/news-and-insights/global-news/firm-news/kedro-from-proprietary-to-open-source),
[Mercado Libre Argentina](https://www.mercadolibre.com.ar),
[Modec](https://www.modec.com/),
[Mosaic Data Science](https://www.youtube.com/watch?v=fCWGevB366g),
[NaranjaX](https://www.youtube.com/watch?v=_0kMmRfltEQ),
[NASA](https://github.com/nasa/ML-airport-taxi-out),
[NHS AI Lab](https://nhsx.github.io/skunkworks/synthetic-data-pipeline),
[Open Data Science LatAm](https://www.odesla.org/),
[Prediqt](https://prediqt.co/),
[QuantumBlack](https://medium.com/quantumblack/introducing-kedro-the-open-source-library-for-production-ready-machine-learning-code-d1c6d26ce2cf),
[ReSpo.Vision](https://neptune.ai/customers/respo-vision),
[Retrieva](https://tech.retrieva.jp/entry/2020/07/28/181414),
[Roche](https://www.roche.com/),
[Sber](https://www.linkedin.com/posts/seleznev-artem_welcome-to-kedros-documentation-kedro-activity-6767523561109385216-woTt),
[Société Générale](https://www.societegenerale.com/en),
[Telkomsel](https://medium.com/life-at-telkomsel/how-we-build-a-production-grade-data-pipeline-7004e56c8c98),
[Universidad Rey Juan Carlos](https://github.com/vchaparro/MasterThesis-wind-power-forecasting/blob/master/thesis.pdf),
[UrbanLogiq](https://urbanlogiq.com/),
[Wildlife Studios](https://wildlifestudios.com),
[WovenLight](https://www.wovenlight.com/) and
[XP](https://youtu.be/wgnGOVNkXqU?t=2210).

Kedro won [Best Technical Tool or Framework for AI](https://awards.ai/the-awards/previous-awards/the-4th-ai-award-winners/) in the 2019 Awards AI competition and a merit award for the 2020 [UK Technical Communication Awards](https://uktcawards.com/announcing-the-award-winners-for-2020/). It is listed on the 2020 [ThoughtWorks Technology Radar](https://www.thoughtworks.com/radar/languages-and-frameworks/kedro) and the 2020 [Data & AI Landscape](https://mattturck.com/data2020/). Kedro has received an [honorable mention in the User Experience category in Fast Company’s 2022 Innovation by Design Awards](https://www.fastcompany.com/90772252/user-experience-innovation-by-design-2022).
There is a growing community around Kedro. We encourage you to ask and answer technical questions on [Slack](https://slack.kedro.org/) and bookmark the [Linen archive of past discussions](https://linen-slack.kedro.org/).

We keep a list of [technical FAQs in the Kedro documentation](https://docs.kedro.org/en/stable/faq/faq.html) and you can find a growing list of blog posts, videos and projects that use Kedro over on the [`awesome-kedro` GitHub repository](https://github.com/kedro-org/awesome-kedro). If you have created anything with Kedro we'd love to include it on the list. Just make a PR to add it!

## How can I cite Kedro?

Expand Down
26 changes: 25 additions & 1 deletion docs/source/faq/faq.md
Original file line number Diff line number Diff line change
@@ -1,4 +1,6 @@
# Frequently asked questions
# FAQs

This is a growing set of technical FAQs. The [product FAQs on the Kedro website](https://kedro.org/#faq) explain how Kedro can answer the typical use cases and requirements of data scientists, data engineers, machine learning engineers and product owners.

## Visualisation

Expand Down Expand Up @@ -46,3 +48,25 @@
* [How do I create a modular pipeline](../nodes_and_pipelines/modular_pipelines.md#how-do-i-create-a-modular-pipeline)?

* [Can I use generator functions in a node](../nodes_and_pipelines/nodes.md#how-to-use-generator-functions-in-a-node)?

## What is data engineering convention?

[Bruce Philp](https://github.com/bruceaphilp) and [Guilherme Braccialli](https://github.com/gbraccialli-qb) are the
brains behind a layered data-engineering convention as a model of managing data. You can find an [in-depth walk through of their convention](https://towardsdatascience.com/the-importance-of-layered-thinking-in-data-engineering-a09f685edc71) as a blog post on Medium.

Refer to the following table below for a high level guide to each layer's purpose

> **Note**:The data layers don’t have to exist locally in the `data` folder within your project, but we recommend that you structure your S3 buckets or other data stores in a similar way.
![data_engineering_convention](../meta/images/data_layers.png)

| Folder in data | Description |
| -------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Raw | Initial start of the pipeline, containing the sourced data model(s) that should never be changed, it forms your single source of truth to work from. These data models are typically un-typed in most cases e.g. csv, but this will vary from case to case |
| Intermediate | Optional data model(s), which are introduced to type your :code:`raw` data model(s), e.g. converting string based values into their current typed representation |
| Primary | Domain specific data model(s) containing cleansed, transformed and wrangled data from either `raw` or `intermediate`, which forms your layer that you input into your feature engineering |
| Feature | Analytics specific data model(s) containing a set of features defined against the `primary` data, which are grouped by feature area of analysis and stored against a common dimension |
| Model input | Analytics specific data model(s) containing all :code:`feature` data against a common dimension and in the case of live projects against an analytics run date to ensure that you track the historical changes of the features over time |
| Models | Stored, serialised pre-trained machine learning models |
| Model output | Analytics specific data model(s) containing the results generated by the model based on the `model input` data |
| Reporting | Reporting data model(s) that are used to combine a set of `primary`, `feature`, `model input` and `model output` data used to drive the dashboard and the views constructed. It encapsulates and removes the need to define any blending or joining of data, improve performance and replacement of presentation layer without having to redefine the data models |
4 changes: 2 additions & 2 deletions docs/source/get_started/install.md
Original file line number Diff line number Diff line change
Expand Up @@ -134,7 +134,7 @@ You should see an ASCII art graphic and the Kedro version number. For example:

![](../meta/images/kedro_graphic.png)

If you do not see the graphic displayed, or have any issues with your installation, check out the [searchable archive of Slack discussions](https://www.linen.dev/s/kedro), or post a new query on the [Slack organisation](https://slack.kedro.org).
If you do not see the graphic displayed, or have any issues with your installation, check out the [searchable archive of Slack discussions](https://linen-slack.kedro.org/), or post a new query on the [Slack organisation](https://slack.kedro.org).


## How to upgrade Kedro
Expand Down Expand Up @@ -187,4 +187,4 @@ pip install kedro
* Installation prerequisites include a virtual environment manager like `conda`, Python 3.7+, and `git`.
* You should install Kedro using `pip install kedro`.

If you encounter any problems as you set up Kedro, ask for help on Kedro's [Slack organisation](https://slack.kedro.org) or review the [searchable archive of Slack discussions](https://www.linen.dev/s/kedro).
If you encounter any problems as you set up Kedro, ask for help on Kedro's [Slack organisation](https://slack.kedro.org) or review the [searchable archive of Slack discussions](https://linen-slack.kedro.org/).
4 changes: 2 additions & 2 deletions docs/source/index.rst
Original file line number Diff line number Diff line change
Expand Up @@ -43,8 +43,8 @@ Welcome to Kedro's documentation!
:target: https://slack.kedro.org
:alt: Kedro's Slack organisation

.. image:: https://img.shields.io/badge/slack-archive-blue.svg?label=Kedro%20Slack%20
:target: https://www.linen.dev/s/kedro
.. image:: https://img.shields.io/badge/slack-archive-blueviolet.svg?label=Kedro%20Slack%20
:target: https://linen-slack.kedro.org/
:alt: Kedro's Slack archive

.. image:: https://img.shields.io/badge/code%20style-black-black.svg
Expand Down
Binary file added docs/source/meta/images/data_layers.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
2 changes: 1 addition & 1 deletion docs/source/resources/index.md
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
# Resources
# FAQs and resources

```{toctree}
:maxdepth: 1
Expand Down
Loading

0 comments on commit 6773529

Please sign in to comment.