Merge pull request #18 from mwouts/v0.4.0
V0.4.0
mwouts authored Jul 18, 2018
2 parents 1808cbd + 096ea5e commit 31c4203
Showing 11 changed files with 169 additions and 96 deletions.
14 changes: 14 additions & 0 deletions HISTORY.rst
@@ -6,6 +6,20 @@ Release History
dev
+++

0.4.0 (2018-07-18)
+++++++++++++++++++

**Improvements**

- `.py` format for notebooks is lighter and pep8 compliant

**BugFixes**

- Default nbrmd config not added to notebooks (#17)
- `nbrmd_formats` becomes a configurable traits (#16)
- Removed `nbrmd_sourceonly_format` metadata. Source notebook is current notebook
when not `.ipynb`, otherwise the first notebook format in `nbrmd_formats` (not
`.ipynb`) that is found on disk

0.3.0 (2018-07-17)
+++++++++++++++++++
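As an illustration of the source-notebook rule in the 0.4.0 entry above, here is a minimal sketch. The helper name `pick_source_format` and the injected `exists` predicate are hypothetical, introduced only so the rule can be exercised without touching the disk; in the commit itself the equivalent logic sits inline in `_read_notebook`.

```python
def pick_source_format(ext, formats, exists):
    """Return the extension whose file provides the notebook inputs, or None.

    ext     -- extension of the file being opened, e.g. '.ipynb'
    formats -- normalized list of extensions, e.g. ['.ipynb', '.Rmd']
    exists  -- hypothetical predicate: is there a file with that extension?
    """
    # A text notebook (.Rmd, .py, .R) is always its own source.
    if ext != '.ipynb':
        return ext
    # Otherwise take the first non-.ipynb format that is found on disk.
    for fmt in formats:
        if fmt != '.ipynb' and exists(fmt):
            return fmt
    return None


on_disk = {'.ipynb', '.py'}  # pretend only these two files exist
print(pick_source_format('.Rmd', ['.ipynb', '.Rmd'], on_disk.__contains__))
print(pick_source_format('.ipynb', ['.ipynb', '.Rmd', '.py'], on_disk.__contains__))
print(pick_source_format('.ipynb', ['.ipynb', '.Rmd'], on_disk.__contains__))
```

The last call returns `None`: when no text companion exists, the `.ipynb` file is read on its own.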
15 changes: 7 additions & 8 deletions README.md
@@ -7,13 +7,13 @@
[![pyversions](https://img.shields.io/pypi/pyversions/nbrmd.svg)](https://pypi.python.org/pypi/nbrmd)


This is a utility that allows to open and run R markdown notebooks in Jupyter, and save Jupyter notebooks as R markdown.

You will be interested in this if
This is a utility that allows to open and run R markdown notebooks in Jupyter, and save Jupyter notebooks as R markdown. You will be interested in this if
- you want to version your notebooks and occasionally have to merge versions
- you want to use RStudio's advanced rendering of notebooks to PDF, HTML or [HTML slides](https://rmarkdown.rstudio.com/ioslides_presentation_format.html)
- or, you have a collection of markdown or R markdown notebooks and you want to open them in Jupyter

Note that if you prefer to save notebooks as python scripts, this is also possible. In that case, have a look at the [nbsrc](https://github.com/mwouts/nbsrc) package.

## What is R markdown?

R markdown (extension `.Rmd`) is a *source only* format for notebooks.
@@ -79,7 +79,7 @@ You need to choose whever to configure this per notebook, or globally.

The R markdown content manager includes a pre-save hook that will keep up-to date versions of your notebook
under the file extensions specified in the `nbrmd_formats` metadata. Edit the notebook metadata in Jupyter and
append a list for the desired formats, like this:
select the desired formats, like this:
```
{
"kernelspec": {
@@ -89,8 +89,7 @@ append a list for the desired formats, like this:
"language_info": {
(...)
},
"nbrmd_formats": [".ipynb", ".Rmd"],
"nbrmd_sourceonly_format": ".Rmd"
"nbrmd_formats": "ipynb,Rmd"
}
```

@@ -99,15 +98,15 @@ append a list for the desired formats, like this:
If you want every notebook to be saved as both `.Rmd` and `.ipynb` files, then change your jupyter config to
```python
c.NotebookApp.contents_manager_class = 'nbrmd.RmdFileContentsManager'
c.ContentsManager.default_nbrmd_formats = ['.ipynb', '.Rmd']
c.ContentsManager.default_nbrmd_formats = 'ipynb,Rmd'
```

If you prefer to update just `.Rmd`, change the above accordingly (you will
still be able to open regular `.ipynb` notebooks).

## Recommendations for version control

I recommend that you set `nbrmd_formats` to `[".ipynb", ".Rmd"]`, either
I recommend that you set `nbrmd_formats` to `"ipynb,Rmd"`, either
in the default configuration, or in the notebook metadata (see above).

When you save your notebook, two files are generated,
48 changes: 37 additions & 11 deletions nbrmd/cells.py
@@ -40,7 +40,7 @@ def cell_to_text(self,
else:
options = metadata_to_json_options(metadata)
if options != '{}':
lines.append('#+ ' + options)
lines.append('# + ' + options)
lines.extend(source)
else:
lines.extend(self.markdown_escape(
@@ -54,7 +54,7 @@ def cell_to_text(self,
source = ['']
lines.extend(self.markdown_escape(source))

# Two blank lines between consecutive markdown cells
# Two blank lines between consecutive markdown cells in Rmd
if self.ext == '.Rmd' and next_cell \
and next_cell.cell_type == 'markdown':
lines.append('')
@@ -68,7 +68,7 @@ def cell_to_text(self,
_start_code_rmd = re.compile(r"^```\{(.*)\}\s*$")
_start_code_md = re.compile(r"^```(.*)$")
_end_code_md = re.compile(r"^```\s*$")
_option_code_rpy = re.compile(r"^#\+(.*)$")
_option_code_rpy = re.compile(r"^(#|# )\+(.*)$")
_blank = re.compile(r"^\s*$")


@@ -80,11 +80,22 @@ def start_code_rpy(line):
return _option_code_rpy.match(line)


def next_uncommented_is_code(lines):
for line in lines:
if line.startswith('#'):
continue
return not _blank.match(line)

return False


def text_to_cell(self, lines):
if self.start_code(lines[0]):
return self.code_to_cell(lines, parse_opt=True)
elif self.prefix != '' and not lines[0].startswith(self.prefix):
return self.code_to_cell(lines, parse_opt=False)
elif self.ext == '.py' and next_uncommented_is_code(lines):
return self.code_to_cell(lines, parse_opt=False)
else:
return self.markdown_to_cell(lines)

@@ -95,8 +106,8 @@ def parse_code_options(line, ext):
elif ext == '.R':
return rmd_options_to_metadata(_option_code_rpy.findall(line)[0])
else: # .py
return 'python', json_options_to_metadata(_option_code_rpy.findall(
line)[0])
return 'python', json_options_to_metadata(_option_code_rpy.match(
line).group(2))


def code_to_cell(self, lines, parse_opt):
@@ -130,7 +141,7 @@ def code_to_cell(self, lines, parse_opt):
if parse_opt and pos == 0:
continue

if self.prefix != '' and line.startswith(self.prefix):
if self.ext == '.R' and line.startswith(self.prefix):
if prev_blank:
return new_code_cell(
source='\n'.join(lines[parse_opt:(pos - 1)]),
@@ -142,15 +153,30 @@ def code_to_cell(self, lines, parse_opt):
r.metadata['noskipline'] = True
return r, pos

if _blank.match(line):
if prev_blank:
if prev_blank:
if _blank.match(line):
# Two blank lines => end of cell
# Two blank lines at the end == empty code cell
return new_code_cell(
source='\n'.join(lines[parse_opt:(pos - 1)]),
metadata=metadata), min(pos + 1, len(lines) - 1)
prev_blank = True
else:
prev_blank = False

# are all the lines from here to next blank
# escaped with the prefix?
if self.prefix == '#':
found_code = False
for next in lines[pos:]:
if next.startswith('#'):
continue
found_code = not _blank.match(next)
break

if not found_code:
return new_code_cell(
source='\n'.join(lines[parse_opt:(pos - 1)]),
metadata=metadata), pos

prev_blank = _blank.match(line)

# Unterminated cell?
return new_code_cell(
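The change to `_option_code_rpy` above accepts both `#+` and `# +` as the marker for cell options in `.py` scripts. The following standalone sketch reuses that regular expression; `parse_py_options` is a hypothetical name, and `json.loads` stands in for the package's `json_options_to_metadata`, whose body is not part of this diff.

```python
import json
import re

# Same pattern as in the diff: group 1 is the marker, group 2 the options.
OPTION_RE = re.compile(r"^(#|# )\+(.*)$")


def parse_py_options(line):
    """Return the options of a '#+' / '# +' line as a dict, or None."""
    match = OPTION_RE.match(line)
    if match is None:
        return None
    # Stand-in for json_options_to_metadata (assumption): plain JSON parsing.
    return json.loads(match.group(2).strip() or '{}')


print(parse_py_options('# + {"scrolled": true}'))
print(parse_py_options('#+ {"scrolled": true}'))
print(parse_py_options('import pandas as pd'))

# With two groups in the pattern, findall yields tuples, not strings.
print(OPTION_RE.findall('# + {}'))
```

Because the pattern now has two capturing groups, `findall(line)[0]` is a `(marker, options)` tuple rather than the options string, which is presumably why the `.py` branch switched to `match(line).group(2)`.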
60 changes: 33 additions & 27 deletions nbrmd/contentsmanager.py
@@ -1,6 +1,8 @@
import notebook.transutils
from notebook.services.contents.filemanager import FileContentsManager
from tornado.web import HTTPError
from traitlets import Unicode
from traitlets.config import Configurable

import os
import nbrmd
@@ -25,14 +27,21 @@ def _reads(s, as_version, **kwargs):


def check_formats(formats):
if not isinstance(formats, list):
formats = formats.split(',')

formats = [fmt if fmt.startswith('.') else '.' + fmt
for fmt in formats if fmt != '']

allowed = nbrmd.notebook_extensions
if not isinstance(formats, list) or not set(formats).issubset(allowed):
raise TypeError(u"Notebook metadata 'nbrmd_formats' "
u"should be subset of {}".format(str(allowed)))
raise TypeError("Notebook metadata 'nbrmd_formats' "
"should be subset of {}, but was {}"
"".format(str(allowed), str(formats)))
return formats


class RmdFileContentsManager(FileContentsManager):
class RmdFileContentsManager(FileContentsManager, Configurable):
"""
A FileContentsManager Class that reads and stores notebooks to classical
Jupyter notebooks (.ipynb), R Markdown notebooks (.Rmd),
@@ -45,8 +54,11 @@ class RmdFileContentsManager(FileContentsManager):
def all_nb_extensions(self):
return ['.ipynb'] + self.nb_extensions

default_nbrmd_formats = ['.ipynb']
default_nbrmd_sourceonly_format = None
default_nbrmd_formats = Unicode(
u'ipynb',
help='Save notebooks to these file extensions. '
'Can be any of ipynb,Rmd,py,R, comma separated',
config=True)

def _read_notebook(self, os_path, as_version=4,
load_alternative_format=True):
@@ -67,24 +79,23 @@ def _read_notebook(self, os_path, as_version=4,
nbrmd_formats = (nb.metadata.get('nbrmd_formats') or
self.default_nbrmd_formats)

nbrmd_formats = check_formats(nbrmd_formats)

if ext not in nbrmd_formats:
nbrmd_formats.append(ext)

nbrmd_formats = check_formats(nbrmd_formats)

# Source format is taken in metadata, contentsmanager, or is current
# ext, or is first non .ipynb format that is found on disk
source_format = (nb.metadata.get('nbrmd_sourceonly_format') or
self.default_nbrmd_sourceonly_format)

if source_format is None:
if ext != '.ipynb':
source_format = ext
else:
for fmt in nbrmd_formats:
if fmt != '.ipynb' and os.path.isfile(file + fmt):
source_format = fmt
break
# Source format is current ext, or is first non .ipynb format
# that is found on disk
source_format = None
if ext != '.ipynb':
source_format = ext
else:
for fmt in nbrmd_formats:
if fmt != '.ipynb' and os.path.isfile(file + fmt):
source_format = fmt
break

nb_outputs = None
if source_format is not None and ext != source_format:
@@ -106,17 +117,10 @@ def _read_notebook(self, os_path, as_version=4,
as_version=as_version,
load_alternative_format=False)

# We store in the metadata the alternative and sourceonly formats
trusted = self.notary.check_signature(nb)
nb.metadata['nbrmd_formats'] = nbrmd_formats
nb.metadata['nbrmd_sourceonly_format'] = source_format

if nb_outputs is not None:
combine.combine_inputs_with_outputs(nb, nb_outputs)
trusted = self.notary.check_signature(nb_outputs)

if trusted:
self.notary.sign(nb)
if self.notary.check_signature(nb_outputs):
self.notary.sign(nb)

return nb

@@ -127,6 +131,8 @@ def _save_notebook(self, os_path, nb):
formats = (nb.get('metadata', {}).get('nbrmd_formats') or
self.default_nbrmd_formats)

formats = check_formats(formats)

if org_ext not in formats:
formats.append(org_ext)

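To make the new `check_formats` behaviour concrete, here is a self-contained sketch of the same normalization. The name `normalize_formats` is hypothetical, and `ALLOWED` is an assumption standing in for `nbrmd.notebook_extensions`, inferred from the trait help text ("Can be any of ipynb,Rmd,py,R").

```python
# Assumption: stands in for nbrmd.notebook_extensions.
ALLOWED = ['.ipynb', '.Rmd', '.py', '.R']


def normalize_formats(formats):
    """Turn 'ipynb,Rmd' or ['.ipynb', '.Rmd'] into a validated list."""
    # Accept the comma separated string used by the configurable trait.
    if not isinstance(formats, list):
        formats = formats.split(',')

    # Add the leading dot where it is missing and drop empty entries.
    formats = [fmt if fmt.startswith('.') else '.' + fmt
               for fmt in formats if fmt != '']

    if not set(formats).issubset(ALLOWED):
        raise TypeError("Notebook metadata 'nbrmd_formats' "
                        "should be subset of {}, but was {}"
                        .format(ALLOWED, formats))
    return formats


print(normalize_formats('ipynb,Rmd'))
print(normalize_formats(['.py']))
```

Both the old list form and the new string form are accepted, so notebooks whose metadata was written by 0.3.0 should still load under this sketch. Whitespace around the commas is not stripped, so `'ipynb, Rmd'` would be rejected.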
27 changes: 14 additions & 13 deletions nbrmd/header.py
@@ -92,21 +92,22 @@ def header_to_metadata_and_cell(self, lines):
start = 0

for i, line in enumerate(lines):
if not line.startswith(self.prefix):
if i == 0 and line.startswith('#!'):
metadata['executable'] = line[2:]
if i == 0 and line.startswith('#!'):
metadata['executable'] = line[2:]
start = i + 1
continue
if i == 0 or (i == 1 and not _encoding_re.match(lines[0])):
encoding = _encoding_re.match(line)
if encoding:
if encoding.group(1) != 'utf-8':
raise ValueError('Encodings other than utf-8 '
'are not supported')
if line != _utf8_header:
metadata['encoding'] = line
start = i + 1
continue
if i == 0 or (i == 1 and not _encoding_re.match(lines[0])):
encoding = _encoding_re.match(line)
if encoding:
if encoding.group(1) != 'utf-8':
raise ValueError('Encodings other than utf-8 '
'are not supported')
if line != _utf8_header:
metadata['encoding'] = line
start = i + 1
continue

if not line.startswith(self.prefix):
break

line = self.markdown_unescape(line)
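The reordered loop above checks for a shebang and an encoding line before testing the comment prefix, which matters now that the `.py` prefix is a plain `#`. The sketch below isolates that preamble handling. `ENCODING_RE` and `UTF8_HEADER` are assumptions (a PEP 263 style pattern and the conventional utf-8 line), since the definitions of `_encoding_re` and `_utf8_header` are not shown in this diff; `read_script_preamble` is a hypothetical name.

```python
import re

# Assumptions: the real _encoding_re and _utf8_header are not in the diff.
ENCODING_RE = re.compile(r'^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)')
UTF8_HEADER = '# -*- coding: utf-8 -*-'


def read_script_preamble(lines):
    """Return (metadata, start) where start is the first non-preamble line."""
    metadata = {}
    start = 0
    for i, line in enumerate(lines):
        # A shebang is only meaningful on the very first line.
        if i == 0 and line.startswith('#!'):
            metadata['executable'] = line[2:]
            start = i + 1
            continue
        # The encoding may be on line 0, or on line 1 if line 0 had none.
        if i == 0 or (i == 1 and not ENCODING_RE.match(lines[0])):
            encoding = ENCODING_RE.match(line)
            if encoding:
                if encoding.group(1) != 'utf-8':
                    raise ValueError('Encodings other than utf-8 '
                                     'are not supported')
                if line != UTF8_HEADER:
                    metadata['encoding'] = line
                start = i + 1
                continue
        break
    return metadata, start


script = ['#!/usr/bin/env python', '# -*- coding: utf-8 -*-', '# # Title']
print(read_script_preamble(script))
```

With the old ordering, both preamble lines start with `#` and so would have been treated as escaped header content once the prefix changed from `##` to `#`.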
2 changes: 1 addition & 1 deletion nbrmd/nbrmd.py
@@ -35,7 +35,7 @@


def markdown_comment(ext):
return '' if ext == '.Rmd' else "#'" if ext == '.R' else "##"
return '' if ext == '.Rmd' else "#'" if ext == '.R' else "#"


class TextNotebookReader(NotebookReader):
2 changes: 1 addition & 1 deletion setup.py
@@ -3,7 +3,7 @@

setup(
name='nbrmd',
version='0.3.0',
version='0.4.0',
author='Marc Wouts',
author_email='marc.wouts@gmail.com',
description='Jupyter from/to R markdown notebooks',
41 changes: 29 additions & 12 deletions tests/python_notebook_sample.py
@@ -1,20 +1,37 @@
## First
## markdown
## cell
# # Specifications for Jupyter notebooks as python scripts

## # Part A
# ## Markdown (and raw) cells

## Some python code below
# Markdown cells are escaped with a single quote. Two consecutive
# cells are separated with a blank line. Raw cells are not
# distinguished from markdown.

# Pandas
import pandas as pd
# ## Code cells

df = pd.Series({'A': 1, 'B': 2})
# Code cells are separated by one blank line from markdown cells.
# If a code cells follows a comment, then that comment belong to the
# code cell.

## # Part B
# For instance, this is a code cell that starts with a
# code comment, split on multiple lines
1 + 2

## Now we have a python cell
## with metadata in json format, escaped with #+
# Code cells are terminated with either
# - end of file
# - two blank lines if followed by an other code cell
# - one blank line if followed by a markdown cell

# Code cells can have blank lines, but no two consecutive blank lines (that's
# a cell break!). Below we have a cell with multiple instructions:

a = 3

a + 1

# ## Metadata in code cells

# In case a code cell has metadata information, it
# is represented in json format, escaped with '#+' or '# +'

# + {"scrolled": true}
df.plot()
a + 2
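The rule stated in the sample above, that a comment directly followed by code belongs to the code cell, corresponds to the `next_uncommented_is_code` helper added in `nbrmd/cells.py`. Below is a standalone copy of that helper with a few inputs, offered as an illustration of the rule rather than as the full cell parser.

```python
import re

_blank = re.compile(r"^\s*$")


def next_uncommented_is_code(lines):
    """True if the first line that is not a comment is non-blank."""
    for line in lines:
        if line.startswith('#'):
            continue
        # First non-comment line: code if it has content, else a cell break.
        return not _blank.match(line)
    # Only comments until the end of the text: treat as markdown.
    return False


# Comment immediately followed by code: the block is a code cell.
print(next_uncommented_is_code(['# a code comment', '1 + 2']))
# Comment followed by a blank line: the comment is a markdown cell.
print(next_uncommented_is_code(['# Markdown text', '', 'a = 3']))
# Comments only.
print(next_uncommented_is_code(['# trailing markdown']))
```

The three calls print `True`, `False` and `False` in that order.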