Messaging - One Sentence Summary for Kedro #2

Closed
1 of 2 tasks
NeroOkwa opened this issue Dec 6, 2022 · 11 comments
Labels
marketing: website copy Copy creation for website & general PMM copy

Comments

@NeroOkwa

NeroOkwa commented Dec 6, 2022

Context

How would you describe Kedro in one sentence? The answer is key to defining how Kedro is perceived by users, the community, and all stakeholders. This issue is one of several Kedro messaging efforts: #2099, #2094, #72

When you search for Kedro, the result is some version of: "Kedro: An open-source Python framework to create reproducible, maintainable, and modular data science code."

Does this capture the value proposition of Kedro succinctly?

Some other examples:

Why is this important?

This would clearly highlight Kedro’s value proposition in one sentence, increasing awareness and adoption.

Next Steps

  • Research and decide on one sentence metaphor for Kedro
  • Publish on Kedro's website, documentation, press kit, and community
@stichbury
Contributor

@yetudada recently used the following: "Think maintainable datascience code. Think Kedro." That is two sentences, but I really like it.

I think we need a brainstorm on this, although I wouldn't get too hung up on finding just one phrase as I think multiple versions are fine. We probably need a "messaging book" as part of our press kit, which contains all the different pitches for Kedro with different slants for different audiences. So maybe the next steps are

  • Research and decide on how we talk about Kedro: Find our favourite 1 sentence description
  • Create a messaging book that contains the "best of" Kedro descriptions, which can be 1 sentence or more, but are focussed for different target audiences
  • Update our public properties with the messaging
  • Create a set of graphics for use on social media that use the messaging

NeroOkwa changed the title from "Messaging - Metaphor for Kedro" to "Messaging - One Sentence Summary for Kedro" on Dec 6, 2022
@marc-solomon

This looks like a really good initiative!

I'm currently asking this question about Alloy, as part of a wider research initiative.

One question I've found provokes quite profound insights from our users is:

How would you describe Alloy to a [colleague/friend] who hasn’t used it?

NeroOkwa transferred this issue from kedro-org/kedro on Jan 31, 2023
stichbury added the "marketing: website copy" label on Jan 31, 2023
@yetudada

I just thought of two, that I'm just going to drop here before I forget:

  • "A machine-learning engineering workflow"
  • "Clean code for data scientists"

@stichbury
Contributor

ChatGPT came up with this "Our mission is to standardise how data science code is created" while I was playing around recently.

@yetudada

Another one "production data science"

@astrojuanlu
Member

Looking at Django and thinking:

image

On one hand, there's a slogan or catchphrase. "Clean code for data scientists" and "Production data science" and "Think maintainable datascience code" are good slogans. Good for H1 on the website, marketing, etc.

And on the other hand, there's a one sentence summary. "An open source Python framework to create reproducible, maintainable, and modular data science code" is descriptive and succinct, and so I think it is a good summary. A shorter variation: "An open source Python framework for modular data science code". Another idea: "An ecosystem/constellation of Python libraries for agile data science" (ecosystem includes extensions like kedro-viz and others, "agile" might be too overloaded at this point but wanted to throw it anyway).

I think a slogan has to be more evocative and colorful, while a summary or description has to be more conventional.

@yetudada

yetudada commented Mar 6, 2023

I'm building on thoughts dropped by @astrojuanlu. We need to support a few things: a slogan, category and short description for technical and non-technical users.

I would like to see if we can vote on five one-liners per category and get users to vote on them.

Slogan suggestions

Current: Maintainable and modular data science solved

  • A clean code toolbox for machine learning
  • A toolbox for production-ready data science
  • The software engineering toolbox for data scientists
  • Automate machine-learning engineering plumbing
  • The missing link between machine learning and software engineering
  • Machine Learning + Software Engineering = Kedro
  • Think maintainable data science
  • Put some software engineering back into machine-learning code
  • Reduce technical debt in machine-learning code with Kedro
  • Scaffolding for data scientists to create maintainable code
  • Software engineering plumbing for successful machine-learning prototypes
  • Write production-ready machine-learning code from the beginning
  • Make machine-learning code that looks like Lego

Category suggestions

Current: "Orchestrator" (incorrect), "Workflow", "AI Framework & Libraries" and "MLOps". An additional complexity: the word "framework" has been co-opted by tools like Hugging Face, PyTorch and TensorFlow.

  • MLE (machine-learning engineering)
  • MLE frameworks
  • MLOps workflow
  • Pipeline framework
  • ML workflow
  • ML framework
  • ML development framework
  • ML code development tools

One-sentence summary for Kedro (technical audience)

Current: A Python framework for creating reproducible, maintainable and modular data science code.

  • An open source Python framework for modular data science code.
  • An open source Python framework that combines software engineering and machine learning to create code you can trust and maintain.
  • An open source Python toolbox that applies software engineering concepts to machine-learning code so that teams can reduce the technical debt of their machine-learning prototypes.
  • An open source Python toolbox that puts software engineering back into machine-learning code, making it easier to take prototypes into production systems.

One-sentence summary for Kedro (non-technical audience)

Current: A Python framework for creating reproducible, maintainable and modular data science code.

  • Kedro is the skeleton of your ML projects, much like physical scaffolding when constructing a building.
  • While regular machine-learning codebases look more like sandcastles, Kedro encourages users to create castles with Lego. If one part starts crumbling, it's easier to repair and replace. In addition, each lego block follows a uniform format across all Kedro projects - making it easier to reuse. With this metaphor, Kedro projects are more robust against time because of structure, organisation and the design of their Lego blocks.

@stichbury
Contributor

Just a few notes/thoughts/ideas

Slogan suggestions

  • A clean code toolbox for machine learning
    I think this would need to be hyphenated to "A clean-code toolbox for machine learning" to avoid ambiguity
  • Automate machine-learning engineering plumbing
    Very hard to parse. But I can't think of much improvement except "Plumb in the engineering for your machine-learning project. Automatically." 😱
  • Write production-ready machine-learning code from the beginning
    "Production-ready machine-learning code from the start" ? "Production-ready machine-learning code from the outset" ?
  • Make machine-learning code that looks like Lego
    Probably avoid this -- it's likely trademarked

Category suggestions

One-sentence summary for Kedro (technical audience)

  • An open source Python framework that combines software engineering and machine learning to create code you can trust and maintain.

You're writing most of the code still, and trust is 🤷 so maybe a shorter version is "An open source Python framework that combines software engineering and machine learning to create maintainable code"

  • An open source Python toolbox that applies software engineering concepts to machine-learning code so that teams can reduce the technical debt of their machine-learning prototypes.

I'd argue that Kedro doesn't reduce technical debt -- if you have tech debt already, Kedro isn't helpful. It reduces the chances of creating technical debt in future projects.

Couple of suggestions:

  • "Kedro uses software engineering best practices to help you build data science code ready for production".
  • "Our mission is to standardise how data science code is created to help you build maintainable production-ready projects"

With both of the above I've gone for using "you" and "our" to make it less formal.

One-sentence summary for Kedro (non-technical audience)

  • Kedro is the skeleton of your ML projects, much like physical scaffolding when constructing a building.
  • "Kedro is the backbone of your ML projects; it's like the scaffolding used in construction". ?

@astrojuanlu
Member

Given that the directory structure, guardrails, and decisions that Kedro makes on behalf of users seem to be an important principle, do we want to include the word "opinionated" or synonyms? xref kedro-org/kedro#2388

Examples:

Black, the uncompromising code formatter
Black is the uncompromising Python code formatter. By using it, you agree to cede control over minutiae of hand-formatting. In return, Black gives you speed, determinism, and freedom from pycodestyle nagging about formatting.

https://pypi.org/project/black/

Opinionated lightweight ELT pipeline framework
A lightweight opinionated ETL framework, halfway between plain scripts and Apache Airflow

https://pypi.org/project/mara-pipelines/

@yetudada

yetudada commented Mar 9, 2023

I'm ready to go with the following suggestions in the polls after your awesome additions!

Slogans

  • Maintainable and modular data science solved
  • A clean-code toolbox for machine learning
  • A toolbox for production-ready data science
  • The software engineering toolbox for data scientists
  • Production-ready machine-learning code from the start
  • The missing link between machine learning and software engineering
  • Software engineering plumbing for successful machine-learning products

Categories

  • Machine learning engineering framework
  • Machine learning engineering workflow tool
  • MLOps framework
  • MLOps workflow tool
  • ML framework
  • ML workflow tool
  • ML development framework
  • Pipeline framework

What is Kedro for a technical audience?

Kedro is...

  • An opinionated Python framework that combines software engineering and machine learning to create maintainable code
  • An opinionated Python framework for creating reproducible, maintainable and modular data science code.

What is Kedro for a non-technical audience?

Kedro is...

  • An open source Python toolbox that applies software engineering concepts to machine-learning code so that teams can reduce the technical debt of their machine-learning products.
  • An open source Python toolbox that puts software engineering back into machine-learning code, making it easier to take prototypes into production systems.
  • The backbone of your ML products; it's like the scaffolding used in construction.

Poll is live for the team!

@yetudada

yetudada commented Mar 20, 2023

Here are the results.

Results

Slogan

Votes:

  • Maintainable and modular data science solved (22 🗳️)
  • A toolbox for production-ready machine learning (22 🗳️)
  • The software engineering toolbox for data scientists (20 🗳️)

Comments of interest:

  • I'd add 'Production ready data science with ease' and would argue to not use the word 'machine learning' in any case as there are many data wrangling/engineering for analytics tasks/pipelines which are not ML.
  • I won't refer to kedro as a "toolbox" since it is very opinionated: it is not a stack of utilities you can cherry pick.
  • I think kedro is much more capable than for machine learning. It also handles the heavy data prep work that can underlie data workloads that calculate aggregates and apply business logic. So with that, it brings engineering principles to data scientists that might be used to basically MS Excel but in Python. It takes the spaghetti out of those more fundamental parts of the data science hierarchy of needs. Second choice would be the first option, but it assumes that the audience knows that maintainable, modular data science code is a problem. Any academic will tell you their code works fine, until they have to run it on somebody else’s computer, or run sensitivity experiments. Not everyone has the scar tissue awareness of collaborative data science.
  • I do not think that kedro is ready for data scientists to be comfortable. Prototyping is still cumbersome. For productionizing ML we expect more coding, so the second choice is more suitable. The third option is bad both ways: we are mostly suitable for ML engineers. Software engineering doesn't necessarily have a good connotation in the data science world (i.e. people that don't understand ML) and data scientists are not ready for using kedro unless already comfortable with good coding.

Final suggestion:

  • Stick with "Maintainable and modular data science solved" as the slogan, and reuse "A toolbox for production-ready machine learning" in the one-sentence summary of Kedro, reworded as "A toolbox for production-ready data science"

Category

Votes:

  • ML development framework (35 🗳️)
  • Machine learning engineering framework (25 🗳️)
  • MLOps framework (4 🗳️)

Comments of interest:

  • Kedro is definitely focused on "development" but not really on the "ops" side so I'd like this option better. (ML development framework)
  • It is not covering a large part of ML ops. It is one of the building blocks (you still need AWS / airflow etc). ML engineering is better than ML development (job connotation better). (Machine learning engineering framework)
  • It's not an MLOps framework (no monitoring, no deployment), it's a DS/ML development framework. (ML development framework)
  • Kedro is not an MLOps framework; at best, it's part of an MLOps ecosystem. You'd expect an MLOps framework to help you with deployment, feature store, model serving, monitoring, etc.

Final suggestion:

  • "ML development framework" but it might need to be "DS development framework"

Describing Kedro to a technical user

Votes:

  • An opinionated Python framework for creating reproducible, maintainable and modular machine-learning code. (39 🗳️)
  • An opinionated Python framework that applies software engineering principles to machine-learning code. (25 🗳️)

Comments of interest:

  • Replace ML with DS
  • For me setting up a templated package with tests, makefile, pyproject and /src is not the big deal. I was doing that already on my own. But helping me keep code more maintainable due to organizing it in nodes/pipelines and the visualization is the big win for kedro. This is not about software engineering principles.

Final result:

  • "An opinionated Python framework for creating reproducible, maintainable and modular machine-learning code." is the winner; it becomes "An opinionated Python framework for creating reproducible, maintainable and modular data science code."
  • Additionally, we can extend this definition to be "An opinionated Python framework for creating reproducible, maintainable and modular data science code. It reduces technical debt when moving prototypes into production." and this absorbs some of the language in the definition for a non-technical user

Describing Kedro to a non-technical user

Votes:

  • A Python toolbox that puts software engineering back into machine-learning code, making it easier to take prototypes into production systems. (35 🗳️)
  • The backbone of your ML prototypes; it's like the scaffolding used in construction. (16 🗳️)
  • A Python toolbox that applies software engineering concepts to machine-learning code so that teams can reduce the technical debt of their machine-learning prototypes. (13 🗳️)

Comments of interest:

  • I'd think that structure, clarity and reusability of a kedro'd codebase will be much better than without. Replace ML with DS. 'Puts back' sounds somewhat strange. What about 'enforces better coding practices'?
  • I'd definitely emphasize the transition to production if I talk to a non-technical PM. When talking to an engineering manager, I'd more likely emphasize the first sentence about "reducing technical debt"
  • Only change would be that software engineering is not going “back into” machine learning code. I would argue that it was never there in the first place (see kaggle).
  • Between first two: A Python toolbox that applies software engineering concepts to machine-learning code so that teams can reduce technical debt when switching from prototypes to production. The points are: - reduce tech debt - go from prototype to production - Kedro is not helping in prototyping! It actually makes it slower!
  • The point is we want to avoid the gap between prototypes and production. If you only focus on the prototyping/development, I think people will not see the added value over the flexibility of a notebook
  • I would rather target Data Science in general, not only Machine Learning, as there exist many data engineering pipelines that could benefit from kedro without applying ML models
  • not sure a non technical product manager will really figure out technical debt. - why not mention it covers from exploratory to production?

Final result:

  • "A Python toolbox that puts software engineering back into machine-learning code, making it easier to take prototypes into production systems." won but it will be changed to "A Python toolbox that applies software engineering principles to data science code, making it easier to transition from prototype to production."


5 participants