Description
Let's say I have:
/creds1.json
/creds2.json
dataset1
dataset2
gcloud.datastore
How do I pull down Person:1 from dataset1, retrieve the 'name' property, and write it back to Log:2 in dataset2?
Here's my best guess so far:
from gcloud import credentials
from gcloud import datastore
creds1 = credentials.get_for_service_account_json('/creds1.json')
creds2 = credentials.get_for_service_account_json('/creds2.json')
connection1 = datastore.Connection(credentials=creds1)
connection2 = datastore.Connection(credentials=creds2)
person1_key = datastore.Key('Person', 1, dataset_id='dataset1')
log2_key = datastore.Key('Log', 2, dataset_id='dataset2')
person1 = datastore.get(person1_key, connection=connection1, dataset_id='dataset1')
log2 = datastore.get(log2_key, connection=connection2, dataset_id='dataset2')
log2['data'] = person1['name']
datastore.put(log2, connection=connection2, dataset_id='dataset2')
I only got that by digging through tons of code. It made me sad.
What I want to write:
from gcloud.datastore import get_connection
dataset1 = get_connection(credentials_json='/creds1.json').get_dataset('dataset1')
dataset2 = get_connection(credentials_json='/creds2.json').get_dataset('dataset2')
person1 = dataset1.get('Person', 1)
log2 = dataset2.get('Log', 2)
log2['data'] = person1['name']
log2.put()
It seems that somewhere along the way, we lost the "hierarchy" of high-level concepts (Datastore -> Connection -> Dataset -> Entity) so that things don't seem to know who their "parent" is in the tree on the way up.
This means things like dataset.get() can't work on their own, because each call needs to be provided its connection and credentials. It seems we've tried to overcome this by storing defaults, but that blows up when you have more than one set of credentials...
Maybe I'm totally misunderstanding?
I'm thinking that it'd be cool if we could allow three things:
- datastore. that accepts all the parameters to be absurdly specific (here is the connection, here is the dataset_id, etc.)
- datastore. that has lots of None values as default parameters, and we "go get the default" if you left it as None (i.e., connection=None -> get_default_connection())
- Datastore -> Connection -> Dataset -> Entity drill-down that "pre-fills" these things going up the chain. That is, I can say:
connection = datastore.get_connection(...)
dataset = connection.get_dataset(...)
entity = dataset.get(...)
This means that the datastore module, Connection, and Dataset would all likely have the same methods, just with fewer things you can specify, because some of those fields are already fixed when you "ask for" the next level down (i.e., a Dataset knows its dataset_id, so you don't have an option to provide that).
Example
datastore.py:
def get(key, connection=None, dataset_id=None):
    connection = connection or get_default_connection()
    dataset_id = dataset_id or get_default_dataset_id()
    # Means I can do: datastore.get(Key('Person', 1), dataset_id='dataset1')
dataset.py:
class Dataset(object):
    def __init__(self, dataset_id, connection=None):
        self.dataset_id = dataset_id
        self.connection = connection

    def get(self, key):
        return datastore.get(key, dataset_id=self.dataset_id, connection=self.connection)
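To check that the layering hangs together, here is a rough, self-contained version of the same idea backed by an in-memory dict instead of the real service. All names in it (_STORES, _resolve, the put function, tuple keys) are assumptions for illustration only, not the existing gcloud API:

```python
# Hypothetical in-memory sketch; none of these names are the real gcloud API.
_STORES = {}  # {(credentials_json, dataset_id): {key: entity dict}}
_defaults = {'connection': None, 'dataset_id': None}


def _resolve(connection, dataset_id):
    # "Go get the default" for anything left as None.
    if connection is None:
        connection = _defaults['connection']
    if dataset_id is None:
        dataset_id = _defaults['dataset_id']
    if connection is None or dataset_id is None:
        raise ValueError('no connection/dataset_id given and no default set')
    return connection, dataset_id


def get(key, connection=None, dataset_id=None):
    # Module level: everything can be specified explicitly.
    connection, dataset_id = _resolve(connection, dataset_id)
    store = _STORES.get((connection.credentials_json, dataset_id), {})
    return store.get(key)


def put(key, entity, connection=None, dataset_id=None):
    connection, dataset_id = _resolve(connection, dataset_id)
    store = _STORES.setdefault((connection.credentials_json, dataset_id), {})
    store[key] = dict(entity)


class Connection(object):
    def __init__(self, credentials_json):
        self.credentials_json = credentials_json

    def get_dataset(self, dataset_id):
        # The dataset remembers its parent connection.
        return Dataset(dataset_id, connection=self)


class Dataset(object):
    def __init__(self, dataset_id, connection=None):
        self.dataset_id = dataset_id
        self.connection = connection

    def get(self, key):
        # Same method as the module, with dataset_id and connection pre-filled.
        return get(key, connection=self.connection, dataset_id=self.dataset_id)

    def put(self, key, entity):
        put(key, entity, connection=self.connection, dataset_id=self.dataset_id)


def get_connection(credentials_json):
    return Connection(credentials_json)


dataset1 = get_connection('/creds1.json').get_dataset('dataset1')
dataset2 = get_connection('/creds2.json').get_dataset('dataset2')

dataset1.put(('Person', 1), {'name': 'Alice'})  # seed data for the demo

person1 = dataset1.get(('Person', 1))
log2 = dataset2.get(('Log', 2)) or {}
log2['data'] = person1['name']
dataset2.put(('Log', 2), log2)
```

The two datasets never touch a shared default, so the two sets of credentials can't collide. The module-level get and put still accept explicit parameters or fall back to defaults for the single-credential case.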