This repository was archived by the owner on Sep 3, 2022. It is now read-only.
Needed: DataLab integration with Google BigTable, Google DataProc (Spark) #41
Open
Description
We use Jupyter notebooks to access BigTable data like so:
from google.cloud import bigtable
from google.cloud import happybase

client = bigtable.Client(project=project_id, admin=True)
instance = client.instance(instance_id)
connection = happybase.Connection(instance=instance)
table = connection.table(table_name)
for key, row in table.scan():
    ...  # process each row
(we then convert the scanned rows into Pandas DataFrames)
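For concreteness, a minimal sketch of that conversion step: the HappyBase scan yields row values as raw bytes keyed by b'family:qualifier', so the utf-8 decoding below is an assumption about our data, not something the API guarantees:

import pandas as pd

records = {}
for key, row in table.scan():
    # row maps b'family:qualifier' -> raw bytes; utf-8 decoding is an assumption
    records[key.decode("utf-8")] = {
        col.decode("utf-8"): val.decode("utf-8") for col, val in row.items()
    }

df = pd.DataFrame.from_dict(records, orient="index")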
Regarding DataLab and DataProc integration: Jupyter/Spark integration (http://blog.insightdatalabs.com/jupyter-on-apache-spark-step-by-step/) is well established in data science, so how can we leverage DataLab notebooks over Spark jobs running on DataProc (e.g. stepwise PySpark job definitions, visualising job results)? A sketch of the kind of workflow we mean follows.
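A minimal sketch of that workflow, assuming the notebook kernel runs on (or can reach) the DataProc master node with PySpark available; the GCS path and column name are hypothetical:

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("datalab-dataproc-sketch").getOrCreate()

# Step 1: read input from GCS (path is hypothetical)
df = spark.read.csv("gs://my-bucket/events.csv", header=True, inferSchema=True)

# Step 2: a transformation defined and inspected cell by cell
counts = df.groupBy("event_type").count()

# Step 3: pull a small result back to the notebook for visualisation
counts.toPandas().plot(kind="bar", x="event_type", y="count")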
Also, how do we leverage IPython Parallel (https://ipyparallel.readthedocs.io/en/latest/) and the Jupyter Clusters notebook extension in DataLab? A sketch of the usage pattern we have in mind follows.
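A minimal sketch of the IPython Parallel pattern we would want inside DataLab, assuming a controller and engines are already running and reachable from the kernel (e.g. started with ipcluster start -n 4):

import ipyparallel as ipp

# Connect to the running controller (assumes default connection files)
rc = ipp.Client()
view = rc[:]  # a DirectView over all engines

# Fan a function out across the engines and gather the results
results = view.map_sync(lambda x: x ** 2, range(16))
print(results)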