Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for custom front matter keys in notebook-based posts #14

Closed

Conversation

natikgadzhi
Copy link
Contributor

@natikgadzhi natikgadzhi commented Jan 28, 2020

We discussed this with @hamelsmu on Twitter yesterday — this is a bit more complex than I thought.

Context

When we process a notebook post with nbdev, we use the default nbdev Jinja template to convert it to html (and misteriously use .md extension for the resulting file, but that's another issue). The default template doesn't recognize any custom keys in the front matter, it just uses the keys utilized by nbdev for lib / docs development: title, summary, toc, etc.

Hamel writes what makes perfect sense in the docs: you should be able to use any keys you want and pass those to posts, and then use those in your post templates in Jekyll (which are Liquid templates. Similar, but not the same as Jinja).

Problem is, this currently doesn't work.

Proposal

I've implemented the custom keys support in this PR and I'll use something like that in my blog (fork). The basic idea is to:

  1. Provide a custom Jinja template to convert ìpynbtohtml`.
  2. Ask nbdev to use that template in nb2post.py

The template would iterate through everything that's passed in it as resources and output all the custom keys.

One caveat is that nbdev passes a bunch of internals into the template, and we don't want to put that in the template. I filter those out by default, and if you need them for any reason, you can use them. For example, I left nb_path exposed in case you want to put a link to the notebook on github in the rendered post.

What do you think?

UPD few things to finish this PR and get it merged as discussed with @jph00 and @hamelsmu below:

  • Add Colab and Github buttons support via a snippet that can be {{ included }} in any template later.
  • Clean up the template logic so that it doesn't spit out a bunch of empty lines.
  • Make sure escaping works correctly in case the Notebook post spits out double curly braces.

@natikgadzhi natikgadzhi requested a review from jph00 January 28, 2020 19:53
@natikgadzhi natikgadzhi force-pushed the feature/custom-front-matter branch from e8a6a00 to 71043b9 Compare January 28, 2020 20:09
@hamelsmu
Copy link
Member

hamelsmu commented Jan 28, 2020

@xnutsive I can perhaps offer some suggestions or clarification ( although I am still learning things)

If you look at the default template used by nbdev jekyll-md.tpl you will see that classes are injected to contain code and markdown blocks, which are subsequently used by this repo's custom CSS file main.scss. It seems that is missing from your template, and perhaps may break the styling? ( What happened when you ran a notebook through your template and tested it end-to-end? )

  • A link to the nb-path is certainly appealing. It might be worth to include by default two additional links in the notebook (perhaps badges): (1) See Notebook on GitHub and (2) Launch on Colab

and misteriously use .md extension for the resulting file, but that's another issue

In my experiments, files must be *.md in _posts, even though they contain mostly HTML due to the way blog post templates in Jekyll works.

My humble suggestion is we take your proposal, test it outside this repo end-to-end and use the new flexibility to include additional useful features such as links to collab etc, and open a PR at that point? I'm happy to help with this if appropriate.

Most importantly, we should wait for Jeremy to comment to see how to proceed etc. (It might be the case that more discussion could be useful before writing any more code).

@natikgadzhi
Copy link
Contributor Author

@hamelsmu thank you for reviewing and commenting this! I agree, it's important to get this right and not break things, so testing outside of the repo before this is merged in sounds like a good idea. I'm writing a couple articles with this setup and I'll link them once they're published, should be today and tomorrow, but the more tests the better.

If you look at the default template used by nbdev jekyll-md.tpl you will see that classes are injected to contain code and markdown blocks, which are subsequently used by this repo's custom CSS file main.scss. It seems that is missing from your template, and perhaps may break the styling? ( What happened when you ran a notebook through your template and tested it end-to-end? )

As far as I understand, nbdev uses jekyll.tpl by default in HTMLExporter (which is the default exporter).

The template committed is jekyll.tpl with rewritten front matter. I've also dropped the sidebar key, it seems like it's irrelevant for a blog setup, and it was hardcoded in the template.

In my experiments, files must be *.md in _posts, even though they contain mostly HTML due to the way blog post templates in Jekyll works.

I agree, since markdown parses mostly allow html, it makes sense to convert to html, that doesn't break anything.

I'll post an end to end example later tonight ;-)

@hamelsmu
Copy link
Member

As far as I understand, nbdev uses jekyll.tpl by default in HTMLExporter (which is the default exporter).

You are right 🙇 I was mistaken about this

@natikgadzhi
Copy link
Contributor Author

natikgadzhi commented Jan 29, 2020

Here's an example! https://respawn.io/2020/Hello-world/
(🤦🏼‍♂️ yeah, the blog title is a work in progress. Ditto permalink structure).

Looks like the nb cells, python code, and charts work fine. Surprised by how fast this converts and deploys in the gh action. @hamelsmu, thank you so much for putting this together. 👏

@hamelsmu
Copy link
Member

hamelsmu commented Jan 29, 2020

Some questions (feel free to ignore)

  1. Is it possible to get the "View on GitHub" View on Collab" automatically somehow, since I think you mentioned something about the nb_path being passed?
  2. If you could do this automatically, perhaps you could use badges for colab and something similar for GitHub (I would have to search for that one).

I'm not sure offhand how to automatically inject the links automatically, perhaps could inject this into your template with {{ page.variable }} type of thing in Liquid but I haven't played around with it. Perhaps it is more transparent to do this injection via the Jekyll side rather than inside the notebook conversion. Update: I see you are already doing this here !

Also, I do not deserve credit for fast_template, I only made some small tweaks that automate what is already here before, real thanks goes to Jeremy

@hamelsmu
Copy link
Member

hamelsmu commented Jan 29, 2020

One more thing I should probably mention is I don't really make decisions on what features should / should not be included here. Which is why it could be a good idea to talk to Jeremy first. ( I don't want to give you the wrong impression that I have maintainer rights etc )

@natikgadzhi
Copy link
Contributor Author

One more thing I should probably mention is I don't really make decisions on what features should / should not be included here. Which is why it could be a good idea to talk to Jeremy first. ( I don't want to give you the wrong impression that I have maintainer rights etc )

I understand that, and I appreciate you putting your time and energy in implementing the notebook conversion gh actions, and in reviewing and discussing my tweaks!

If you could do this automatically, perhaps you could use badges for colab and something similar for GitHub (I would have to search for that one).

Why, yes, we could do a snippet in _includes that would show the Colab link as a badge, so that folks can put it anywhere they want on a page and not mess with composing the right URLs (like I did in the layout you linked above). It'd work similar to {{ screenshot }}.

Not sure if {{ colab_badge }}would be something as widely used as {{ image }} though. @jph00, should I add it, or hold off for now?

@jph00
Copy link
Member

jph00 commented Jan 29, 2020

I'm not as familiar with gh pages/actions as Hamel, so I'm not sure I have a huge amount to add. In general, I'm very happy to add features that don't significantly increase the complexity of the project, and (most importantly) don't in any way impact things for people that don't need those features.

Badges for colab and (perhaps) nbviewer and mybinder I think would be pretty useful, if they can be added within the above constraints.

@natikgadzhi
Copy link
Contributor Author

In general, I'm very happy to add features that don't significantly increase the complexity of the project, and (most importantly) don't in any way impact things for people that don't need those features.

I believe in this approach, too. To ensure this addition complies with the rule of thumb:

  • Adds a feature: this PR mostly fixes a bug and makes it so folks will be able to use custom meta fields in their notebooks/pages. It works exactly as Hamel designed it and described it in the documentation.
  • Doesn't add too much complexity: this PR brings the notebook2html template.tpl with 5 lines of custom Jinja template code to implement the bugfix. Side effect: if for any reason nbdev's version of that notebook changes, fast_template will continue to work consistently.
  • Doesn't break the workflow for folks not using this: It shouldn't. Folks who don't use _notebooks are not affected at all, and folks who do use _notebooks should not notice any change in behavior.

Note: ideally, I want to test this with at least one more post myself before we merge this, just to be sure that no metadata is broken.

Badges for colab and (perhaps) nbviewer and mybinder I think would be pretty useful, if they can be added within the above constraints.

Perfect! Thanks for the heads up, I'll take a look and try to implement it and add them to this PR. I'll comment when I'm done, likely by Saturday morning PST.

@hamelsmu
Copy link
Member

@xnutsive I just want to comment that your new template is incredibly helpful! I am already using it in another project and it works great for passing extra front matter, thank you so much for adding this.

Incase it is helpful, this repo comes pre-loaded with Primer.css so for the "View on GitHub Button" can be accomplished like this:

<div class="clearfix">
  <a class="btn btn-sm btn-with-count" href="#url" role="button">
    {% octicon mark-github %}
    <span>View On GitHub</span>
  </a>
</div>

image

Hope that helps

@hamelsmu
Copy link
Member

hamelsmu commented Jan 30, 2020

Hi @xnutsive I've used this template enstively over the past couple of days and have some questions.

  1. Is there a way to get rid of the extra spacing in the front matter? I have tweaked the Jinja to no end, but have not been able to get rid of these spaces, for example, this is what I get. I must admit I have studied the spacing rules of jinja and the minus sign syntax, but perhaps it is eluding me still. Either way, perhaps the extra spaces can be removed somehow?
---
layout: notebooks
parent: Reports
keywords: fastai
summary: Show descriptive statistics of the dataset
title: Descriptive Statistics
nb_path: notebooks/EDA.ipynb





---
  1. I've discovered that you have to escape double curly brances from the generated HTML version of the notebook because there is an edge-case where you could have python code with double curly braces (which bit me, recently) which will cause a liquid error. This is what I do in my template to get around this
<div class="container" id="notebook-container">
{{ "{% raw %}" }}
    {{ super()  }}
{{ "{% endraw %}" }}
</div>

@natikgadzhi
Copy link
Contributor Author

Seeing how quickly fast_template evolves, I'll try to wrap things up here quicker. I've updated the initial message with a todo list for myself.

@hamelsmu, thanks for your comments, for trying it out and testing it!

  1. Re, blank space: I've played with jinja for good 30 minutes and couldn't get rid of it, but it's definitely possible. See these dashes in some template tags, but not others? {%- <-- this thingy -%} — it tells jinja to trim whitespace to the left or to the right of the tag, but how exactly it works for control ìfandfor` tags is still a mystery to me. I agree this doesn't look great, I'll invest more time and try to fix it.

  2. I think it makes sense to add something to mitigate the problem in this PR and merge it together. I understand the problem with escaping you've mentioned, and I'll think more about how to do this while still being sure it's safe for all the scenarios.

And number 3 being add the github and colab buttons as {{ includes }}.

Hope to finish this tomorrow, I'm constrained on time this week a bit.

@jph00
Copy link
Member

jph00 commented Jan 30, 2020 via email

@natikgadzhi natikgadzhi force-pushed the feature/custom-front-matter branch from 71043b9 to ad15f0b Compare February 2, 2020 03:38
@natikgadzhi
Copy link
Contributor Author

natikgadzhi commented Feb 2, 2020

@hamelsmu, heads up! I've addressed the whitespace problem, and added colab and github button snippets. At this point, I think if someone really wants to customize them further (or just use links, not buttons) they'd be able to just look in the code.

As for the escaping problem, it's super interesting. So we have two templating engines here:

  1. Jinja2 (in the python script) parses the notebook, and generates the .md file with the post html.
  2. Then Jekyll Liquid takes this .md and plugs it into the layout.

I've tried to debug it first, and found out that:

  • If I use two curly braces in a Python code cell in the notebook — it works correctly, since nbdev is probably smart enough to escape that.
  • But if I have {{ this is going to explode}} in a Markdown cell — that explodes.
  • Unless I put another two markdown cells around the above with {% raw %} {% endraw %}

But you referred to a problem with a code cell, which makes me think maybe I still miss something.

I can just throw the raw wrapper tag in the template, but I'm not that sure we should do that by default — maybe give users the flexibility, but document this behavior instead.

What do you think if we merge this PR, and I create another issue for raw / endraw with markdown and take on investigating and fixing it?

@hamelsmu
Copy link
Member

hamelsmu commented Feb 2, 2020

@xnutsive 👍 I agree with you it totally makes sense to tackle the escaping issue in a different PR. This LGTM so far 🙇 except for one minor comment below where might be good to double-check

Once you/me open an issue re: escaping I'll drop a sample notebook that explodes in there for us to debug, but for now we can take that conversation outside this PR.

Thanks so much for working on this

@@ -9,6 +9,7 @@
title: Edit _config.yml to set your title!
description: This is where the description of your site will go. You should change it by editing the _config.yml file. It can be as long as you like! Happy blogging... ❤
github_username: jph00
github_repo: blog # used for puslishing links back to your repo in the blog.
Copy link
Member

@hamelsmu hamelsmu Feb 2, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does the addition of this break anything? Lets say if the person who uses this template doesn't name their repo blog ? I think that may conflict with the setup instructions?

The instructions in the blog post tell people to setup the blog so it will run off off root

image

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a new setting, the only thing that would be broken as you pointed out is the github and colab links, unless users actually go and update their _config.yml.

If they use a different repo name, and then use {% github_notebook_link %}, the page would build, but the link would be broken.

I think once someone starts investigating, they'll find the setting in a few minutes (since the snippet users {{site.github_repo}}right after {{site.github_user}}, and it's a common Jekyll config lookup pattern. But ideally, this can be documented, and thus it'll be even easier for users.

One alternative is to do github_repo : %CHANGE_ME_IN_CONFIG_YML% — that would be broken as well, but that would point people to where they should fix it. I think about.md does this, and description: setting by default says something in the nature of This is where the description text is used, change me in config.yml.

Should I change it to something descriptive like that?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see. In this case you might not need the the github_repo variable at all. This template comes pre-loaded with the github-metadata plugin, which allows you to reference the full path as

{{ site.url }}{{ site.baseurl }}

For example, I am doing this for word docs

@jph00
Copy link
Member

jph00 commented Feb 3, 2020

@hamelsmu @xnutsive sent you both invites to https://github.com/fastai/fastpages . Will close this. I'm on a book deadline until after Feb 10 so feel free to do whatever you like in that repo in the meantime, and I'll try to catch up later!

I suggest in fastpages we use remote_theme, and have it point by default at a fork of minima which we can all maintain at fastai/minima. That way we can have things just as we want them.

@hamelsmu could you do a PR to remove all the stuff for _word/_notebook etc in this repo?

@jph00 jph00 closed this Feb 3, 2020
@@ -0,0 +1,4 @@
<a class="btn btn-sm text-gray-dark" href="https://github.com/{{site.github_username}}/{{site.github_repo}}/tree/{{page.branch | default: "master" }}/{{page.nb_path}}" role="button">
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
<a class="btn btn-sm text-gray-dark" href="https://github.com/{{site.github_username}}/{{site.github_repo}}/tree/{{page.branch | default: "master" }}/{{page.nb_path}}" role="button">
<a class="btn btn-sm text-gray-dark" href="{{ site.url }}{{ site.baseurl }}/tree/{{page.branch | default: "master" }}/{{page.nb_path}}" role="button">

I believe this will work per my earlier comment re: the special plugin already built in

@@ -0,0 +1,3 @@
<a class="btn btn-sm text-gray-dark" href="https://colab.research.google.com/github/{{site.github_username}}/{{site.github_repo}}/blob/{{page.branch | default: "master"}}/{{page.nb_path}}" role="button">
Copy link
Member

@hamelsmu hamelsmu Feb 3, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
<a class="btn btn-sm text-gray-dark" href="https://colab.research.google.com/github/{{site.github_username}}/{{site.github_repo}}/blob/{{page.branch | default: "master"}}/{{page.nb_path}}" role="button">
<a class="btn btn-sm text-gray-dark" href="https://colab.research.google.com/github/{{ site.github.repository_nwo }}/blob/{{page.branch | default: "master"}}/{{page.nb_path}}" role="button">

See my comment regarding the plugin, might want to test this works, but i got {{ site.github.repository_nwo } from here

Copy link
Member

@hamelsmu hamelsmu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

left a some suggestions on how you can utilize the plugin in the path construction

@hamelsmu
Copy link
Member

hamelsmu commented Feb 3, 2020

@jph00 Sure I'll send the PR now

@hamelsmu
Copy link
Member

hamelsmu commented Feb 3, 2020

@xnutsive I'm working on the new repo now. I'm going to put the base information in there, and will comment on this thread to get this PR in there!

Or if you prefer I can copy-paste your code, however I would prefer your name shows up on the contributions history. I'll comment again on this thread when I'm done with setup (could take me 2-3 days to complete this step. ) . I'm going to try and set it up a bit differently based on some problems I observed with this template, and add tests etc to facilitate PRs.

Looking forward to working with you on the new repo 🙇

@natikgadzhi
Copy link
Contributor Author

@xnutsive I'm working on the new repo now. I'm going to put the base information in there, and will comment on this thread to get this PR in there!

Noice!

I agree that keeping fast_template simple and to the point, completely documented by Jeremy's four posts, and spinning out a separate repo with more flexible and more advanced mechanisms is a great way to do it, and looking forward to participate! Thank you @jph for inviting me :)

As for this PR — I don't mind to wait for a bit and then port it to the new repo at all. No rush. It's also OK if you want to quickly copy and paste the code as a boilerplate / initial version for the repo, and you can always credit me in the readme if you see fit.

As for other contributions, and plan of work, if there's a google doc already — could you please invite me as well, using nate@respawn.io? Alternatively, if you think it's a good idea, we could start a discussion or a series of topics on forums.fast.ai, in the nbdev section?

I'm very excited to have the opportunity to learn from you, and to build together — you can count on something in the nature of 4-8 hours a week. Feel free to offload any features / chores / issues or bugs you'd want me to focus on.

@hamelsmu
Copy link
Member

hamelsmu commented Feb 4, 2020

Thanks @xnutsive look forward to working with you too, its great to have someone in the community that is excited about this.

OK if you want to quickly copy and paste the code as a boilerplate / initial version for the repo, and you can always credit me in the readme if you see fit.

Would rather have you do a PR if possible b/c I'm lazy and don't want to maintain a README of contributors over time 😄 (I'm still working on building the structure of the repo, I'm currently working on it in a fork )

As for other contributions, and plan of work, if there's a google doc already ... Alternatively, if you think it's a good idea, we could start a discussion or a series of topics on forums.fast.ai, in the nbdev section?

Its a good idea to use the fast ai forums, I'll start a thread there and tag you today. Sorry I was supposed to do this earlier and forgot.

I'm very excited to have the opportunity to learn from you, and to build together — you can count on something in the nature of 4-8 hours a week. Feel free to offload any features / chores / issues or bugs you'd want me to focus on.

I'm here to learn, too. GitHub as allowed me to work on this task for a part of my workday every day :) so hopefully I will be able to help/resolve issues

@jph00
Copy link
Member

jph00 commented Feb 4, 2020 via email

@hamelsmu
Copy link
Member

hamelsmu commented Feb 8, 2020

@xnutsive ok I believe that the new repo is ready for a PR! It took me some time (longer than expected) to setup good CI tools there to try to make testing / maintenance easier. Your PR will be a good chance to test out automation there.

One request can you put your new template in _action_files rather than _notebooks? Many thanks!

@hamelsmu
Copy link
Member

@xnutsive I'm working on integrating this into fastpages now, as it will enable lots of cool things.

I will mention you in an acknowledgment section of the README, as well as tag you for review on the PR. Thank you so much again for this work, it is very helpful

@natikgadzhi
Copy link
Contributor Author

natikgadzhi commented Feb 13, 2020 via email

@hamelsmu
Copy link
Member

@xnutsive please don't worry about it. I have lots of time to work on this, and I am here to help out. I am making the changes to fastpages and will tag you for review, will probably be done by end of day tommorrow. Thank you again

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants