Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Docs/prefect 2 #2725

Closed
wants to merge 0 commits into from
Closed

Conversation

jmalovera10
Copy link
Contributor

@jmalovera10 jmalovera10 commented Jun 26, 2023

NOTE: Kedro datasets are moving from kedro.extras.datasets to a separate kedro-datasets package in
kedro-plugins repository. Any changes to the dataset implementations
should be done by opening a pull request in that repository.

Description

#2431

Development notes

Checklist

  • Read the contributing guidelines
  • Opened this PR as a 'Draft Pull Request' if it is work-in-progress
  • Updated the documentation to reflect the code changes
  • Added a description of this change in the RELEASE.md file
  • Added tests to cover my changes

Copy link
Member

@astrojuanlu astrojuanlu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for this contribution @jmalovera10! 🎉

Rendered docs: https://kedro--2725.org.readthedocs.build/en/2725/deployment/prefect.html#prefect-2-0

The first thought that came to mind is whether we really need to keep the Prefect 1.0 docs around. My vote would be for "no" - and users who want to see the 1.0 guide can always browse older versions of the Kedro docs.

Apart from that, I gave the instructions a very quick test with the spaceflights starter, and this looks very promising (first time I see the Prefect 2.0 UI, really cool!). I got stuck in a couple of places:

  • The guide tells you to create a <project_root>/register_prefect_flow.py, but it doesn't say whether you should run it or not. By naïvely reading the text, it looks like registration happens automatically.
  • To run the script successfully, I had to do PREFECT_API_URL="http://127.0.0.1:4200/api" python register_prefect_flow.py --work_pool_name "default-agent-pool", otherwise I was getting 404 errors.

Finally, when actually running the flow, it's taking ages to load the Kedro context:

image

(4 minutes in and it's still doing it.)

Looking forward to seeing this merged!

@astrojuanlu
Copy link
Member

Hah, funny - the agent terminal was waiting for me to opt-in for the telemetry:

image

Typing N didn't have any effect though. After cancelling the run and uninstalling kedro-telemetry, now things are going as expected:

image

@stichbury
Copy link
Contributor

stichbury commented Jun 26, 2023

Hi @jmalovera10 Thank you so much for this contribution!

I think we should retain the Prefect 1.0 docs but have a short section at the top of the doc that explains when you'd want to use Prefect 1.0 and when you'd use Prefect 2.0 (if there is a binary choice). I am making the assumption that there will be cases where people will explicitly opt for 1.0 but I don't know if that's the case (would there be a reason of compatibility for example?). I'd then have a pair of bullets that are anchors to each of the version sections, as I've suggested in the file.

Other than that, I've no major edits to suggest. There are some stylistic changes we could make but I can make those separately in another PR later.

@noklam
Copy link
Contributor

noklam commented Jun 26, 2023

If we'd like to keep both, can we have a small catalog at the top so it's easy to jump to 1.0 or 2.0 depends on what the user need?

Something like

  • Using Kedro with Prefect 1.0 (On click will jump to #kedro-with-prefect1.0)
  • Using Kedro with Prefect 2.0 (On click will jump to #kedro-with-prefect2.0)

@stichbury
Copy link
Contributor

If we'd like to keep both, can we have a small catalog at the top so it's easy to jump to 1.0 or 2.0 depends on what the user need?

Something like

  • Using Kedro with Prefect 1.0 (On click will jump to #kedro-with-prefect1.0)
  • Using Kedro with Prefect 2.0 (On click will jump to #kedro-with-prefect2.0)

I think we collided in our comments there. I've added just that!

@jmalovera10
Copy link
Contributor Author

jmalovera10 commented Jun 27, 2023

Thanks a lot for this contribution @jmalovera10! 🎉

Rendered docs: https://kedro--2725.org.readthedocs.build/en/2725/deployment/prefect.html#prefect-2-0

The first thought that came to mind is whether we really need to keep the Prefect 1.0 docs around. My vote would be for "no" - and users who want to see the 1.0 guide can always browse older versions of the Kedro docs.

Apart from that, I gave the instructions a very quick test with the spaceflights starter, and this looks very promising (first time I see the Prefect 2.0 UI, really cool!). I got stuck in a couple of places:

  • The guide tells you to create a <project_root>/register_prefect_flow.py, but it doesn't say whether you should run it or not. By naïvely reading the text, it looks like registration happens automatically.
  • To run the script successfully, I had to do PREFECT_API_URL="http://127.0.0.1:4200/api" python register_prefect_flow.py --work_pool_name "default-agent-pool", otherwise I was getting 404 errors.

Finally, when actually running the flow, it's taking ages to load the Kedro context:

(4 minutes in and it's still doing it.)

Looking forward to seeing this merged!

@astrojuanlu thank you for your feedback! 😄

Indeed I can make more explicit that <project_root>/register_prefect_flow.py script should be executed and the PREFECT_API_URL should be set to target our localhost server. Maybe I could add a Setup section to elaborate on some prerequisites? What do you think?

About the delay you experienced, I had a similar issue with my project's pipeline but it got solved by setting up a .telemetry file in the project's root. Maybe I could give a heads up in the docs that you can add a .telemetry file in order to opt-in/out for telemetry.

@jmalovera10
Copy link
Contributor Author

jmalovera10 commented Jun 27, 2023

Hi @jmalovera10 Thank you so much for this contribution!

I think we should retain the Prefect 1.0 docs but have a short section at the top of the doc that explains when you'd want to use Prefect 1.0 and when you'd use Prefect 2.0 (if there is a binary choice). I am making the assumption that there will be cases where people will explicitly opt for 1.0 but I don't know if that's the case (would there be a reason of compatibility for example?). I'd then have a pair of bullets that are anchors to each of the version sections, as I've suggested in the file.

Other than that, I've no major edits to suggest. There are some stylistic changes we could make but I can make those separately in another PR later.

@stichbury and @noklam thank you both for your comments!

Now that you mention it I'm having trouble justifying whether I should keep the Prefect 1.0 section or not. From the blogs I've read, I found that they are freezing legacy Prefect 1.0 Cloud accounts which would make you think that they are encouraging Prefect 2.0 over 1.0. On the web page and community posts they encourage you to opt-in for Prefect 2.0 from the start. From a developer perspective, if I am going to make a brand new deployment with Prefect, unless I have a serious compatibility issue (i.e the Kedro functions used in the deployment script are not supported by my current version), I could benefit more from Prefect 2.0 capabilities and support.

Maybe I am with @astrojuanlu on this one and there should only be the Prefect 2.0 section (and the UI is way cooler 👀). @stichbury what do you think?

I can also ask in the Prefect Slack Channel for more guidance on this if you feel that we need more input to make a decision😸

@astrojuanlu
Copy link
Member

Indeed I can make more explicit that <project_root>/register_prefect_flow.py script should be executed and the PREFECT_API_URL should be set to target our localhost server. Maybe I could add a Setup section to elaborate on some prerequisites? What do you think?

I was wondering whether this should live as a hook, but we don't want to wait until a kedro run to register this flow. Maybe we should do like kedro-airflow and register a new CLI command, something like:

kedro prefect create/register

This is done by declaring a kedro.project-commands entry point:

https://github.com/kedro-org/kedro-plugins/blob/3d0dd7c0d34a394b292919a70fb516d81ccf46c0/kedro-airflow/pyproject.toml#L25-L26

and a Click CLI:

https://github.com/kedro-org/kedro-plugins/blob/3d0dd7c0d34a394b292919a70fb516d81ccf46c0/kedro-airflow/kedro_airflow/plugin.py#L14-L44

Could be a good opportunity to create your first Kedro plugin @jmalovera10 😃 what do you think?

@stichbury
Copy link
Contributor

Maybe I am with @astrojuanlu on this one and there should only be the Prefect 2.0 section (and the UI is way cooler 👀). @stichbury what do you think?

I am happy to remove the v1 docs (maybe have a link to them in the Kedro v0.18.11 docs so anyone who desperately wants them can easily navigate back to find them). Go ahead!

@jmalovera10
Copy link
Contributor Author

Could be a good opportunity to create your first Kedro plugin @jmalovera10 😃 what do you think?

I think it could be an interesting project! Prefect 2.0 is very versatile and has many options to deploy infrastructures, schedules, events, and much more. Having a plugin that helps you make deployments seamlessly on Prefect could be a big improvement on developer experience. Maybe starting with something small like this deployment example could lead to more interesting deployment capabilities in the future. @astrojuanlu, you gave me a lot to think about!

But for now I want to start small and see how it goes (it is my first time contributing to open source) and then I will give it a try to build my first plugin 😄

Copy link
Member

@astrojuanlu astrojuanlu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good to me! Step by step 💪🏽

I think this is ready for a more detailed review

Comment on lines 27 to 29
```text
consent: false
```
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know how I feel telling our users to opt-out telemetry like this 😬 but asking them to write consent: true feels a bit forceful. Maybe we should link somewhere else in the docs that explains what to do with the telemetry? cc @stichbury

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you consent to the telemetry, the rest of the instructions also work, correct?

So as long as we say to add your preferred response true or false then we aren't dictating anything. See my suggestion...

@astrojuanlu
Copy link
Member

cc @mehrzadai @ofir-insait @hugocool what do you think of this approach?

astrojuanlu
astrojuanlu previously approved these changes Jun 28, 2023
@@ -1,20 +1,20 @@
# Prefect

This page explains how to run your Kedro pipeline using [Prefect Core](https://www.prefect.io/products/core/), an open-source workflow management system.
This page explains how to run your Kedro pipeline using [Prefect](https://www.prefect.io/products/core/), an open-source workflow management system.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
This page explains how to run your Kedro pipeline using [Prefect](https://www.prefect.io/products/core/), an open-source workflow management system.
This page explains how to run your Kedro pipeline using [Prefect](https://www.prefect.io/products/core/), an open-source workflow management system.
Kedro supports Prefect version 1.0 and Prefect version 2.0 so you can choose which you prefer to use. ADD ANY EXPLANATION FOR PREFERENCES FOR A VERSION HERE IF RELEVANT.
This page documents how to use Kedro with each version.
* [Prefect version 1.0](#prefect-1-0)
* [Prefect version 2.0](#prefect-2-0)

Comment on lines 27 to 29
```text
consent: false
```
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you consent to the telemetry, the rest of the instructions also work, correct?

So as long as we say to add your preferred response true or false then we aren't dictating anything. See my suggestion...

docs/source/deployment/prefect.md Outdated Show resolved Hide resolved
docs/source/deployment/prefect.md Outdated Show resolved Hide resolved
@stichbury
Copy link
Contributor

I am happy with the updates. I haven't tested end-to-end but the content is sound at the presentation level so I've approved it.


```{important}
If you don't create the `.telemetry` configuration file, the pipeline will get frozen because it will prompt you to opt-in for Kedro Telemetry.
```
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would remove this entire note as I've covered it above. If you think it needs a callout, you could add it around the text I've suggested. I've not done this because it's impossible to format it nicely with GitHub suggestions, and will leave you to decide if it's necessary.

@jmalovera10
Copy link
Contributor Author

jmalovera10 commented Jun 30, 2023

@astrojuanlu @stichbury I need help 😢 I had a problem with an unsigned commit and tried to repair it but ended up closing the PR because my local repository diverged from Kedro's.

@jmalovera10
Copy link
Contributor Author

Will open a new PR referencing this one with the last changes we discussed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants