This repository was archived by the owner on Sep 3, 2022. It is now read-only.

Commit a3639bd

Merge master/cloudml (#134)
* Add gcs_copy_file() that is missing but is referenced in a couple of places. (#110)
  * Add gcs_copy_file() that is missing but is referenced in a couple of places.
  * Add DataFlow to the pydatalab dependency list.
  * Fix Travis test errors by reimplementing gcs copy.
  * Remove unnecessary shutil import.
* Flake8 configuration: set max line length to 100; ignore E111, E114. (#102)
* Add datalab user agent to CloudML trainer and predictor requests. (#112)
* Update oauth2client to 2.2.0 to satisfy cloudml in Cloud Datalab. (#111)
* Update README.md: added docs link. (#114)
* Generate reST documentation for magic commands. (#113)
  Auto-generate docs for any added magics by searching through the source files for lines with register_line_cell_magic, capturing the names of those magics, calling each inside an IPython kernel with the -h argument, and storing that output in a generated datalab.magics.rst file.
* Fix an issue where %%chart failed with a UDF query. (#116)
  The problem was that the query was submitted to BigQuery without replacing variable values from the user namespace.
  * Fix chart tests by adding an ip.user_ns mock.
  * Fix charting test.
  * Add missing import "mock".
  * Fix chart tests.
* Fix the "%%bigquery schema" issue where the command generated nothing in its output. (#119)
* Add some missing dependencies, remove some unused ones. (#122)
  * Remove scikit-learn and scipy as dependencies.
  * Add more required packages.
  * Add psutil as a dependency.
  * Update package versions.
* Cleanup (#123)
  * Remove unnecessary semicolons.
  * Remove unused imports.
  * Remove an unnecessarily defined variable.
* Fix query_metadata tests. (#128)
* Make the library pip-installable. (#125)
  This PR adds tensorflow and cloudml to setup.py to make the lib pip-installable. They are installed explicitly using pip from inside the setup.py script; although that is not a clean way to do it, it gets around the two issues we have at the moment with these two packages:
  - PyPI has TensorFlow version 0.12, while we need 0.11 for the current version of pydatalab. According to the Cloud ML docs, that version exists as a pip package for three supported platforms.
  - The Cloud ML SDK exists as a pip package, but also not on PyPI, and while we could add it as a dependency link, there is another package on PyPI called cloudml, and pip ends up installing that instead (see #124). There is no apparent way to force pip to install the package from the included link.
* Set command description so it is displayed in --help; argparse's format_help() prints description but not help. (#131)
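As background for the pip-installability item (#125) above, here is a minimal sketch of the "install from inside setup.py" workaround it describes; the wheel URL and version pin are illustrative placeholders, not the actual values from the commit:

    import subprocess
    import sys

    from setuptools import setup

    # Hypothetical wheel URL standing in for the platform-specific
    # TensorFlow 0.11 packages that the Cloud ML docs point to.
    TF_WHEEL = 'https://example.com/wheels/tensorflow-0.11.0-py2-none-any.whl'

    # Install the packages that pip cannot resolve from PyPI before
    # declaring the remaining dependencies.
    subprocess.check_call([sys.executable, '-m', 'pip', 'install', TF_WHEEL])

    setup(
        name='datalab',
        version='0.1.0',
        packages=[],
    )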
1 parent eb46fd9 commit a3639bd

File tree

23 files changed (+151 / -46 lines)

.gitignore

Lines changed: 1 addition & 0 deletions

@@ -7,3 +7,4 @@ MANIFEST
 build
 .coverage
 dist
+datalab.magics.rst

.travis.yml

Lines changed: 4 additions & 2 deletions

@@ -1,12 +1,14 @@
 language: python
 python:
-  - "2.7"
-  - "3.4"
+  - 2.7
+  - 3.5

 before_install:
+  - sudo apt-get install -y python-setuptools
   - npm install -g typescript
   - tsc --module amd --noImplicitAny --outdir datalab/notebook/static datalab/notebook/static/*.ts
   - pip install -U pip
+  - pip install -U setuptools
   - pip install .

 script: python ./tests/main.py

README.md

Lines changed: 2 additions & 0 deletions

@@ -60,3 +60,5 @@ You will also need to set the project ID to use; either set a `PROJECT_ID`
 environment variable to the project name, or call `set_datalab_project_id(name)`
 from within your notebook.

+## Documentation
+You can read the Sphinx generated docs at: [http://googledatalab.github.io/pydatalab/](http://googledatalab.github.io/pydatalab/)

datalab/bigquery/commands/_bigquery.py

Lines changed: 8 additions & 0 deletions

@@ -1009,6 +1009,14 @@ def _repr_html_table_schema(schema):
 _HTML_TEMPLATE = """
 <div class="bqsv" id="%s"></div>
 <script>
+  require.config({
+    map: {
+      '*': {
+        datalab: 'nbextensions/gcpdatalab'
+      }
+    },
+  });
+
   require(['datalab/bigquery', 'datalab/element!%s',
            'datalab/style!/nbextensions/gcpdatalab/bigquery.css'],
     function(bq, dom) {

datalab/mlalpha/_cloud_predictor.py

Lines changed: 1 addition & 0 deletions

@@ -81,6 +81,7 @@ def predict(self, data):

   request = self._api.projects().predict(body={'instances': data},
                                          name=self._full_version_name)
+  request.headers['user-agent'] = 'GoogleCloudDataLab/1.0'
   result = request.execute()
   if 'predictions' not in result:
     raise Exception('Invalid response from service. Cannot find "predictions" in response.')
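The single added line above relies on googleapiclient's HttpRequest objects exposing a mutable headers dict that is sent with the request. A minimal sketch of the pattern (assuming application default credentials; the project id is a placeholder):

    from googleapiclient import discovery

    # Build a CloudML client and tag one request with a custom user agent.
    cloudml = discovery.build('ml', 'v1beta1')
    request = cloudml.projects().jobs().list(parent='projects/my-project')
    request.headers['user-agent'] = 'GoogleCloudDataLab/1.0'  # set before execute()
    response = request.execute()

The same header is set on the job-creation request in datalab/mlalpha/_cloud_runner.py below.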

datalab/mlalpha/_cloud_runner.py

Lines changed: 1 addition & 0 deletions

@@ -86,4 +86,5 @@ def run(self, job_id=None):
                              discoveryServiceUrl=_CLOUDML_DISCOVERY_URL)
   request = cloudml.projects().jobs().create(body=job,
                                              parent='projects/' + context.project_id)
+  request.headers['user-agent'] = 'GoogleCloudDataLab/1.0'
   return request.execute()

datalab/mlalpha/_dataset.py

Lines changed: 1 addition & 1 deletion

@@ -293,7 +293,7 @@ def _scatter3d_plot(self, names, x, y, z, color):
     iplot(fig)

   def _plot_x(self, names, x):
-    self._histogram(names, x);
+    self._histogram(names, x)
     if x != self._target_name:
       self._scatter_plot(names, x, self._target_name, self._target_name)
datalab/mlalpha/commands/_mlalpha.py

Lines changed: 2 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -18,14 +18,13 @@
1818

1919

2020
import collections
21-
import datetime
2221
import fnmatch
2322
import google.cloud.ml
2423
import json
2524
import math
2625
import os
2726
import plotly.graph_objs as go
28-
from plotly.offline import init_notebook_mode, iplot
27+
from plotly.offline import iplot
2928
import urllib
3029
import yaml
3130

@@ -271,7 +270,7 @@ def _train(args, cell):
271270
urllib.urlencode(log_url_query_strings)
272271
html += '<p>Click <a href="%s" target="_blank">here</a> to view cloud log. <br/>' % log_url
273272
html += 'Start TensorBoard by running "%tensorboard start --logdir=&lt;YourLogDir&gt;".</p>'
274-
return IPython.core.display.HTML(html);
273+
return IPython.core.display.HTML(html)
275274
else:
276275
# local training
277276
package_path = None

datalab/stackdriver/monitoring/_query_metadata.py

Lines changed: 1 addition & 1 deletion

@@ -64,7 +64,7 @@ def as_dataframe(self, max_rows=None):
     """
     max_rows = len(self._timeseries_list) if max_rows is None else max_rows
     headers = [{
-        'resource': ts.resource.__dict__, 'metric': ts.metric.__dict__}
+        'resource': ts.resource._asdict(), 'metric': ts.metric._asdict()}
        for ts in self._timeseries_list[:max_rows]]

     if not headers:
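The change works because the time series' resource and metric objects behave like namedtuples: _asdict() is the documented dict conversion, while __dict__ is not guaranteed to exist on namedtuple instances. A minimal illustration:

    from collections import namedtuple

    Resource = namedtuple('Resource', ['type', 'labels'])
    r = Resource(type='gce_instance', labels={'zone': 'us-central1-a'})

    # _asdict() is the supported conversion; __dict__ may be absent because
    # namedtuple classes define __slots__.
    print(r._asdict())  # OrderedDict([('type', 'gce_instance'), ('labels', ...)])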

datalab/storage/_item.py

Lines changed: 0 additions & 1 deletion

@@ -196,7 +196,6 @@ def read_lines(self, max_lines=None):

     max_to_read = self.metadata.size
     bytes_to_read = min(100 * max_lines, self.metadata.size)
-    lines = []
     while True:
       content = self.read_from(byte_count=bytes_to_read)
datalab/utils/__init__.py

Lines changed: 1 addition & 1 deletion

@@ -21,4 +21,4 @@
 from ._lru_cache import LRUCache
 from ._lambda_job import LambdaJob
 from ._utils import print_exception_with_last_stack, get_item, compare_datetimes, \
-    pick_unused_port, is_http_running_on
+    pick_unused_port, is_http_running_on, gcs_copy_file

datalab/utils/_utils.py

Lines changed: 11 additions & 0 deletions

@@ -22,6 +22,7 @@
 import httplib

 import pytz
+import subprocess
 import socket
 import traceback
 import types

@@ -110,3 +111,13 @@ def is_http_running_on(port):
     return True
   except Exception as e:
     return False
+
+
+def gcs_copy_file(source, dest):
+  """ Copy file from source to destination. The paths can be GCS or local.
+
+  Args:
+    source: the source file path.
+    dest: the destination file path.
+  """
+  subprocess.check_call(['gsutil', '-q', 'cp', source, dest])
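A usage sketch for the new helper; either path may be local or a gs:// URL, and gsutil must be installed and authenticated (the bucket name is a placeholder):

    from datalab.utils import gcs_copy_file

    # Delegates to 'gsutil -q cp', so anything gsutil accepts works here.
    gcs_copy_file('results.csv', 'gs://my-bucket/results.csv')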

datalab/utils/commands/_commands.py

Lines changed: 1 addition & 1 deletion

@@ -79,4 +79,4 @@ def subcommand(self, name, help):
     """Creates a parser for a sub-command. """
     if self._subcommands is None:
       self._subcommands = self.add_subparsers(help='commands')
-    return self._subcommands.add_parser(name, help=help)
+    return self._subcommands.add_parser(name, description=help, help=help)
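The rationale from #131: argparse's format_help() on a subparser prints its description, while the help= string only appears in the parent parser's command list, so the same text is passed as both. A small standalone sketch:

    import argparse

    parser = argparse.ArgumentParser(prog='bq')
    subcommands = parser.add_subparsers(help='commands')

    # description= is what the subcommand's own --help prints at the top;
    # help= is what shows up under 'commands' in the parent's help.
    tables = subcommands.add_parser('tables', description='List tables.',
                                    help='List tables.')
    print(tables.format_help())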

datalab/utils/commands/_utils.py

Lines changed: 2 additions & 1 deletion

@@ -207,10 +207,11 @@ def get_data(source, fields='*', env=None, first_row=0, count=-1, schema=None):
     Exception if the request could not be fulfilled.
   """

+  ipy = IPython.get_ipython()
   if env is None:
     env = {}
+  env.update(ipy.user_ns)
   if isinstance(source, basestring):
-    ipy = IPython.get_ipython()
     source = datalab.utils.get_item(ipy.user_ns, source, source)
   if isinstance(source, basestring):
     source = datalab.bigquery.Table(source)
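This is the core of the %%chart/UDF fix (#116): get_data now merges the IPython user namespace into env, so query variable substitution can see values defined in the notebook. A minimal standalone sketch of the idea (only meaningful inside IPython, where get_ipython() is not None):

    import IPython

    def resolve_env(env=None):
      """Merge a caller-supplied env with the IPython user namespace."""
      ipy = IPython.get_ipython()
      env = dict(env or {})
      env.update(ipy.user_ns)  # notebook-defined variables take precedence
      return env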

docs/Makefile

Lines changed: 29 additions & 25 deletions

@@ -55,38 +55,42 @@ help:
 clean:
 	rm -rf $(BUILDDIR)/*

-html:
+pre-build:
+	@echo "Generate reST for magic commands:"
+	ipython gen-magic-rst.ipy
+
+html: pre-build
 	$(SPHINXBUILD) -b html $(ALLSPHINXOPTS) $(BUILDDIR)/html
 	@echo
 	@echo "Build finished. The HTML pages are in $(BUILDDIR)/html."

-dirhtml:
+dirhtml: pre-build
 	$(SPHINXBUILD) -b dirhtml $(ALLSPHINXOPTS) $(BUILDDIR)/dirhtml
 	@echo
 	@echo "Build finished. The HTML pages are in $(BUILDDIR)/dirhtml."

-singlehtml:
+singlehtml: pre-build
 	$(SPHINXBUILD) -b singlehtml $(ALLSPHINXOPTS) $(BUILDDIR)/singlehtml
 	@echo
 	@echo "Build finished. The HTML page is in $(BUILDDIR)/singlehtml."

-pickle:
+pickle: pre-build
 	$(SPHINXBUILD) -b pickle $(ALLSPHINXOPTS) $(BUILDDIR)/pickle
 	@echo
 	@echo "Build finished; now you can process the pickle files."

-json:
+json: pre-build
 	$(SPHINXBUILD) -b json $(ALLSPHINXOPTS) $(BUILDDIR)/json
 	@echo
 	@echo "Build finished; now you can process the JSON files."

-htmlhelp:
+htmlhelp: pre-build
 	$(SPHINXBUILD) -b htmlhelp $(ALLSPHINXOPTS) $(BUILDDIR)/htmlhelp
 	@echo
 	@echo "Build finished; now you can run HTML Help Workshop with the" \
 	      ".hhp project file in $(BUILDDIR)/htmlhelp."

-qthelp:
+qthelp: pre-build
 	$(SPHINXBUILD) -b qthelp $(ALLSPHINXOPTS) $(BUILDDIR)/qthelp
 	@echo
 	@echo "Build finished; now you can run "qcollectiongenerator" with the" \

@@ -95,15 +99,15 @@ qthelp:
 	@echo "To view the help file:"
 	@echo "# assistant -collectionFile $(BUILDDIR)/qthelp/api.qhc"

-applehelp:
+applehelp: pre-build
 	$(SPHINXBUILD) -b applehelp $(ALLSPHINXOPTS) $(BUILDDIR)/applehelp
 	@echo
 	@echo "Build finished. The help book is in $(BUILDDIR)/applehelp."
 	@echo "N.B. You won't be able to view it unless you put it in" \
 	      "~/Library/Documentation/Help or install it in your application" \
 	      "bundle."

-devhelp:
+devhelp: pre-build
 	$(SPHINXBUILD) -b devhelp $(ALLSPHINXOPTS) $(BUILDDIR)/devhelp
 	@echo
 	@echo "Build finished."

@@ -112,85 +116,85 @@ devhelp:
 	@echo "# ln -s $(BUILDDIR)/devhelp $$HOME/.local/share/devhelp/api"
 	@echo "# devhelp"

-epub:
+epub: pre-build
 	$(SPHINXBUILD) -b epub $(ALLSPHINXOPTS) $(BUILDDIR)/epub
 	@echo
 	@echo "Build finished. The epub file is in $(BUILDDIR)/epub."

-latex:
+latex: pre-build
 	$(SPHINXBUILD) -b latex $(ALLSPHINXOPTS) $(BUILDDIR)/latex
 	@echo
 	@echo "Build finished; the LaTeX files are in $(BUILDDIR)/latex."
 	@echo "Run \`make' in that directory to run these through (pdf)latex" \
 	      "(use \`make latexpdf' here to do that automatically)."

-latexpdf:
+latexpdf: pre-build
 	$(SPHINXBUILD) -b latex $(ALLSPHINXOPTS) $(BUILDDIR)/latex
 	@echo "Running LaTeX files through pdflatex..."
 	$(MAKE) -C $(BUILDDIR)/latex all-pdf
 	@echo "pdflatex finished; the PDF files are in $(BUILDDIR)/latex."

-latexpdfja:
+latexpdfja: pre-build
 	$(SPHINXBUILD) -b latex $(ALLSPHINXOPTS) $(BUILDDIR)/latex
 	@echo "Running LaTeX files through platex and dvipdfmx..."
 	$(MAKE) -C $(BUILDDIR)/latex all-pdf-ja
 	@echo "pdflatex finished; the PDF files are in $(BUILDDIR)/latex."

-text:
+text: pre-build
 	$(SPHINXBUILD) -b text $(ALLSPHINXOPTS) $(BUILDDIR)/text
 	@echo
 	@echo "Build finished. The text files are in $(BUILDDIR)/text."

-man:
+man: pre-build
 	$(SPHINXBUILD) -b man $(ALLSPHINXOPTS) $(BUILDDIR)/man
 	@echo
 	@echo "Build finished. The manual pages are in $(BUILDDIR)/man."

-texinfo:
+texinfo: pre-build
 	$(SPHINXBUILD) -b texinfo $(ALLSPHINXOPTS) $(BUILDDIR)/texinfo
 	@echo
 	@echo "Build finished. The Texinfo files are in $(BUILDDIR)/texinfo."
 	@echo "Run \`make' in that directory to run these through makeinfo" \
 	      "(use \`make info' here to do that automatically)."

-info:
+info: pre-build
 	$(SPHINXBUILD) -b texinfo $(ALLSPHINXOPTS) $(BUILDDIR)/texinfo
 	@echo "Running Texinfo files through makeinfo..."
 	make -C $(BUILDDIR)/texinfo info
 	@echo "makeinfo finished; the Info files are in $(BUILDDIR)/texinfo."

-gettext:
+gettext: pre-build
 	$(SPHINXBUILD) -b gettext $(I18NSPHINXOPTS) $(BUILDDIR)/locale
 	@echo
 	@echo "Build finished. The message catalogs are in $(BUILDDIR)/locale."

-changes:
+changes: pre-build
 	$(SPHINXBUILD) -b changes $(ALLSPHINXOPTS) $(BUILDDIR)/changes
 	@echo
 	@echo "The overview file is in $(BUILDDIR)/changes."

-linkcheck:
+linkcheck: pre-build
 	$(SPHINXBUILD) -b linkcheck $(ALLSPHINXOPTS) $(BUILDDIR)/linkcheck
 	@echo
 	@echo "Link check complete; look for any errors in the above output " \
 	      "or in $(BUILDDIR)/linkcheck/output.txt."

-doctest:
+doctest: pre-build
 	$(SPHINXBUILD) -b doctest $(ALLSPHINXOPTS) $(BUILDDIR)/doctest
 	@echo "Testing of doctests in the sources finished, look at the " \
 	      "results in $(BUILDDIR)/doctest/output.txt."

-coverage:
+coverage: pre-build
 	$(SPHINXBUILD) -b coverage $(ALLSPHINXOPTS) $(BUILDDIR)/coverage
 	@echo "Testing of coverage in the sources finished, look at the " \
 	      "results in $(BUILDDIR)/coverage/python.txt."

-xml:
+xml: pre-build
 	$(SPHINXBUILD) -b xml $(ALLSPHINXOPTS) $(BUILDDIR)/xml
 	@echo
 	@echo "Build finished. The XML files are in $(BUILDDIR)/xml."

-pseudoxml:
+pseudoxml: pre-build
 	$(SPHINXBUILD) -b pseudoxml $(ALLSPHINXOPTS) $(BUILDDIR)/pseudoxml
 	@echo
 	@echo "Build finished. The pseudo-XML files are in $(BUILDDIR)/pseudoxml."

@@ -202,7 +206,7 @@ prepublish:
 	cd ../../datalab-docs && git clone https://github.com/GoogleCloudPlatform/datalab.git html && \
 	git checkout gh-pages

-publish:
+publish: pre-build
 	$(SPHINXBUILD) -b html $(ALLSPHINXOPTS) $(BUILDDIR)/html
 	cd ../../datalab-docs/html && git add . && git commit -m "Updated" && git push --force origin gh-pages

docs/README

Lines changed: 4 additions & 5 deletions

@@ -1,9 +1,8 @@
-To use, install the prerequisites:
+To use, install the prerequisites and the pydatalab module:

 pip install sphinx sphinx_rtd_theme sphinxcontrib-napoleon
+pip install .. # from docs directory

+then in the docs directory, do 'make html' (or epub, or text, etc).

-then in the docs directory, do 'make html' (or epub, or pdf, etc).
-
-Output will be in the docs/_build directory.
-
+Output will be in $BUILDDIR, defaulting to ../../datalab-docs.

docs/conf.py

Lines changed: 1 addition & 1 deletion

@@ -145,7 +145,7 @@
 # Add any paths that contain custom static files (such as style sheets) here,
 # relative to this directory. They are copied after the builtin static files,
 # so a file named "default.css" will overwrite the builtin "default.css".
-html_static_path = ['_static']
+#html_static_path = []

 # Add any extra paths that contain custom files (such as robots.txt or
 # .htaccess) here, relative to this directory. These files are copied

docs/gen-magic-rst.ipy

Lines changed: 39 additions & 0 deletions

@@ -0,0 +1,39 @@
+import subprocess, pkgutil, importlib, sys
+from cStringIO import StringIO
+
+# ignore mlalpha and tensorboard for now because of their tensorflow dependency
+# until tensorboard is pip installable and can be listed as a pydatalab dependency
+IGNORED_MAGICS = ['mlalpha', 'tensorboard']
+
+# import submodules
+submodules = [s for _, s, _ in pkgutil.iter_modules(['../datalab'])]
+
+for m in submodules:
+  name = 'datalab.' + m + '.commands'
+  try:
+    importlib.import_module(name)
+  except:
+    sys.stderr.write('WARNING, could not find module ' + name + '. Ignoring..\n')
+
+magic_regex = "find ../datalab -name '*.py' -exec perl -e '$f=join(\"\",<>); print \"$1\n\" if $f=~/register_line_cell_magic\ndef ([^\(]+)/m' {} \;"
+magics = subprocess.check_output(magic_regex, shell=True)
+
+reSTfile = open('datalab.magics.rst', 'w')
+indent = '\n '
+
+reSTfile.write('datalab.magics\n')
+reSTfile.write('=================\n\n')
+
+for m in magics.split():
+  if m in IGNORED_MAGICS:
+    sys.stderr.write('Ignoring magic ' + m + '\n')
+  else:
+    reSTfile.write('.. attribute:: ' + m + '\n')
+    reSTfile.write('.. parsed-literal::\n')
+    # hijack stdout since the ipython kernel call writes to stdout/err directly
+    # and does not return its output
+    tmpStdout, sys.stdout = sys.stdout, StringIO()
+    get_ipython().magic(m + ' -h')
+    resultout = sys.stdout.getvalue().splitlines()
+    sys.stdout = tmpStdout
+    reSTfile.writelines(indent + indent.join(resultout) + '\n\n')
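For reference, each generated entry in datalab.magics.rst looks roughly like this (shape inferred from the writes above; the magic name and the -h usage text are illustrative):

    datalab.magics
    =================

    .. attribute:: bigquery
    .. parsed-literal::
     usage: bigquery [-h] ...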
