-
Notifications
You must be signed in to change notification settings - Fork 91
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
First class Cockroach DB support. #607
Comments
✔️ Historical Issues / Solutions of Note✔️ All issues are resolved in #608 but are here if you need to apply them manually.✔️ 🐛
return f"CURRENT_TIMESTAMP::timestamp + INTERVAL '{interval_string}'"
return "current_timestamp::timestamp" ✔️ 🐛 piccolo/piccolo/engine/postgres.py Line 248 in d010660
#extensions = ["uuid-ossp"] Or add ✔️ 🐛 SET CLUSTER SETTING sql.defaults.experimental_alter_column_type.enabled = true; Same strategy employed by django-cockroach
|
No worries, we'll use this issue instead as it's more detailed. |
I experimented a bit and researched about this.
For Github actions only found an example of how Peewee ORM does it (basically downloading CockroachDB, starting with I was able to run the tests locally using # in columns/base.py
class Default(ABC):
@abstractproperty
def postgres(self) -> str:
pass
@abstractproperty
def sqlite(self) -> str:
pass
@abstractproperty
def cockroach(self) -> str:
pass
# and then in columns/timestamp.py
class TimestampNow(Default):
@property
def postgres(self):
return "current_timestamp"
@property
def sqlite(self):
return "current_timestamp"
@property
def cockroach(self):
return "current_timestamp::timestamp" for the CockroachDB specifics (because without that Postgres tests are failing). The result is that a lot of the tests failed ( Tests run 10 times slower than postgres (even with start-single-node) or I don't know how to fine tune it (I use this command to start single node Sorry for the long comment, I hope this is useful. |
For reference, in my experience, django-cockroach is also a fairly fleshed out postresql ➡️ cockroachdb layer, and may also serve as a good reference point for common solutions.
Yeah I do believe this is to set the correct default value (Ex: |
I think we could do something like this, but Daniel would know better. First create new from __future__ import annotations
import typing as t
from piccolo.engine.postgres import PostgresEngine
class CockroachEngine(PostgresEngine):
def __init__(
self,
config: t.Dict[str, t.Any],
extensions: t.Sequence[str] = (),
) -> None:
self.config = config
self.extensions = extensions
super().__init__(config=config, extensions=extensions)
engine_type = "cockroach"
... and then change Serial column to this class Serial(Column):
"""
An alias to an autoincrementing integer column in Postgres.
"""
@property
def column_type(self):
engine_type = self._meta.engine_type
if engine_type == "postgres":
return "SERIAL"
elif engine_type == "sqlite":
return "INTEGER"
elif engine_type == "cockroach":
return "DEFAULT unique_rowid()"
raise Exception("Unrecognized engine type") And then do the same for UUID column. I have no experience with CockroachDB, but do you know why the tests take so long? I think this is a problem for GH actions. |
@sinisaos Yeah, something like that will work. This needs changing too: Lines 272 to 294 in ed93feb
So the logic for cockroach is:
|
Would bump that to at least 2GB. CockroachDB tends to choke up on anything less. Tests are running fairly quickly here, but I'm currently hanging on |
On the bright side the tests are running here in 2:13
EDIT: Actually, probably running into the "multiple active portals" issue: MagicStack/asyncpg#738 |
@dantownsend You're right. I just gave an example of what we could do. Maybe you should install CockroachDB when you have time and try to run the tests. |
I currently "work around" the |
@dantownsend any ideas? |
@gnat I browsed some issues on cockroachdb (like this cockroachdb/cockroach#40195). I don't have a definite answer, just some ideas:
I need to play around with it a bit more. |
So, some tests are problematic for Cockroach (anything testing sequential keys starting at 1), because even though we can run Cockroach in "compatibility mode", most Cockroach users will never want to do this because it's terrible for performance in Cockroach (the cluster would have to perform a very expensive "full cluster" consensus on INSERT and essentially have a cluster-wide lock 😱) Notably band = Band(name="Pythonistas", popularity=1000, manager=manager)
band.save().run_sync()
self.assertEqual(
Band.select().run_sync(),
[
{
"id": 1,
"name": "Pythonistas",
"manager": 1,
"popularity": 1000,
}
],
) So we have a few options: Add to the current test suite.
Run compatibility mode.
Remove sequential keys from the current assert checks.
Ignore large swaths of tests for Cockroach.
|
Major work in progress, but figured I'd start pushing up progress so people can try things out as we go. |
Yes, that's a tricky problem. We currently have an approach for just running certain tests for certain databases: Lines 33 to 39 in ed93feb
We can create a @for_engines('sqlite', 'postgres')
class Test1(TestCase):
... But as you said, ignoring large numbers of tests isn't ideal.
It's a smart idea. I'm not sure how best to implement it though. Whenever we create tables in test, we could do it via a custom function which set the tables up correctly if using cockroach. class Test1(Table):
def setUp(self):
create_test_tables(Band) I need to think about it a bit more ... |
@for_engines('sqlite', 'postgres')
class Test1(TestCase):
... This is smart. Allows us to pick and choose specific tests per engine and easily add other engines in the future. Would allow us to repurpose the Cockroach tests for other massively distributed databases as well. If you don't mind, what does the implementation of this Here's an example of extending our test suite with Keeps the test suite comprehensive and the division of tests are clear. I've re-worked that test for Cockroach (it just ignores
from unittest import TestCase
from piccolo.table import create_db_tables_sync, drop_db_tables_sync
from tests.example_apps.music.tables import Band, Manager
from tests.base import cockroach_only, no_cockroach
class TestSave(TestCase):
def setUp(self):
create_db_tables_sync(Manager, Band)
def tearDown(self):
drop_db_tables_sync(Manager, Band)
def test_save_new(self):
"""
Make sure that saving a new instance works.
"""
manager = Manager(name="Maz")
query = manager.save()
print(query)
self.assertTrue("INSERT" in query.__str__())
query.run_sync()
names = [i["name"] for i in Manager.select(Manager.name).run_sync()]
self.assertTrue("Maz" in names)
manager.name = "Maz2"
query = manager.save()
print(query)
self.assertTrue("UPDATE" in query.__str__())
query.run_sync()
names = [i["name"] for i in Manager.select(Manager.name).run_sync()]
self.assertTrue("Maz2" in names)
self.assertTrue("Maz" not in names)
@for_engines('sqlite', 'postgres')
def test_save_specific_columns(self):
"""
Make sure that we can save a subset of columns.
"""
manager = Manager(name="Guido")
manager.save().run_sync()
band = Band(name="Pythonistas", popularity=1000, manager=manager)
band.save().run_sync()
self.assertEqual(
Band.select().run_sync(),
[
{
"id": 1,
"name": "Pythonistas",
"manager": 1,
"popularity": 1000,
}
],
)
band.name = "Pythonistas 2"
band.popularity = 2000
band.save(columns=[Band.name]).run_sync()
# Only the name should update, and not the popularity:
self.assertEqual(
Band.select().run_sync(),
[
{
"id": 1,
"name": "Pythonistas 2",
"manager": 1,
"popularity": 1000,
}
],
)
# Also test it using strings to identify columns
band.name = "Pythonistas 3"
band.popularity = 3000
band.save(columns=["popularity"]).run_sync()
# Only the popularity should update, and not the name:
self.assertEqual(
Band.select().run_sync(),
[
{
"id": 1,
"name": "Pythonistas 2",
"manager": 1,
"popularity": 3000,
}
],
)
@for_engines('cockroach')
def test_save_specific_columns(self):
"""
Make sure that we can save a subset of columns.
"""
manager = Manager(name="Guido")
manager.save().run_sync()
band = Band(name="Pythonistas", popularity=1000, manager=manager)
band.save().run_sync()
self.assertEqual(
Band.select().run_sync()[0] | {
"name":"Pythonistas",
"popularity": 1000,
},
Band.select().run_sync()[0]
)
band.name = "Pythonistas 2"
band.popularity = 2000
band.save(columns=[Band.name]).run_sync()
# Only the name should update, and not the popularity:
self.assertEqual(
Band.select().run_sync()[0] | {
"name":"Pythonistas 2",
"popularity": 1000,
},
Band.select().run_sync()[0]
)
# Also test it using strings to identify columns
band.name = "Pythonistas 3"
band.popularity = 3000
band.save(columns=["popularity"]).run_sync()
# Only the popularity should update, and not the name:
self.assertEqual(
Band.select().run_sync()[0] | {
"name":"Pythonistas 2",
"popularity": 3000,
},
Band.select().run_sync()[0]
) |
Another alternative. We can use Direct comparison to the previous for from unittest import TestCase
from piccolo.table import create_db_tables_sync, drop_db_tables_sync
from tests.example_apps.music.tables import Band, Manager
from tests.base import for_engines
class TestSave(TestCase):
def setUp(self):
create_db_tables_sync(Manager, Band)
def tearDown(self):
drop_db_tables_sync(Manager, Band)
def test_save_new(self):
"""
Make sure that saving a new instance works.
"""
manager = Manager(name="Maz")
query = manager.save()
print(query)
self.assertTrue("INSERT" in query.__str__())
query.run_sync()
names = [i["name"] for i in Manager.select(Manager.name).run_sync()]
self.assertTrue("Maz" in names)
manager.name = "Maz2"
query = manager.save()
print(query)
self.assertTrue("UPDATE" in query.__str__())
query.run_sync()
names = [i["name"] for i in Manager.select(Manager.name).run_sync()]
self.assertTrue("Maz2" in names)
self.assertTrue("Maz" not in names)
def test_save_specific_columns(self):
"""
Make sure that we can save a subset of columns.
"""
manager = Manager(name="Guido")
manager.save().run_sync()
band = Band(name="Pythonistas", popularity=1000, manager=manager)
band.save().run_sync()
if for_engines('sqlite', 'postgres'):
self.assertEqual(
Band.select().run_sync(),
[
{
"id": 1,
"name": "Pythonistas",
"manager": 1,
"popularity": 1000,
}
],
)
if for_engines('cockroach'):
self.assertEqual(
Band.select().run_sync()[0] | {
"name":"Pythonistas",
"popularity": 1000,
},
Band.select().run_sync()[0]
)
band.name = "Pythonistas 2"
band.popularity = 2000
band.save(columns=[Band.name]).run_sync()
# Only the name should update, and not the popularity:
if for_engines('sqlite', 'postgres'):
self.assertEqual(
Band.select().run_sync(),
[
{
"id": 1,
"name": "Pythonistas 2",
"manager": 1,
"popularity": 1000,
}
],
)
if for_engines('cockroach'):
self.assertEqual(
Band.select().run_sync()[0] | {
"name":"Pythonistas 2",
"popularity": 1000,
},
Band.select().run_sync()[0]
)
# Also test it using strings to identify columns
band.name = "Pythonistas 3"
band.popularity = 3000
band.save(columns=["popularity"]).run_sync()
# Only the popularity should update, and not the name:
if for_engines('sqlite', 'postgres'):
self.assertEqual(
Band.select().run_sync(),
[
{
"id": 1,
"name": "Pythonistas 2",
"manager": 1,
"popularity": 3000,
}
],
)
if for_engines('cockroach'):
self.assertEqual(
Band.select().run_sync()[0] | {
"name":"Pythonistas 2",
"popularity": 3000,
},
Band.select().run_sync()[0]
) |
Sorry to interrupt, but we may have additional issues until asyncpg is fully supported in Cockroach. I started running individual tests and a few exceptions are currently not supported. For example, tests for array columns throw exception I also haven't found an alternative to the Postgres piccolo/piccolo/columns/readable.py Lines 40 to 42 in ed93feb
There are probably more such situations that we haven't discovered yet. |
Yeah I encountered that the other day as well, thought I'd come back to it as it seems non-essential, maybe @dantownsend can comment. We may be able to just stub it for cockroach for now or put in a placeholder. I don't see many viable alternatives to postgres @knz Any thoughts? |
Instead of We would have to refactor this: piccolo/piccolo/columns/readable.py Line 13 in ed93feb
But I think it's possible. |
If there are lots of problems with The engines in Piccolo aren't super complex, so it's possible. |
I think it looks something like this: ENGINE = engine_finder()
def for_engines(*engine_names: str):
if ENGINE:
current_engine_name = ENGINE.engine_type
if current_engine_name not in engine_names:
def wrapper(func):
return pytest.mark.skip(
f"Not running for {current_engine_name}"
)(func)
return wrapper
else:
def wrapper(func):
return func
return wrapper
else:
raise ValueError("Engine not found") |
We could do that. We would end up with quite a bit of branching in the tests, but in the end it might be the best solution. I'm still hopefully that there's a super clean solution we haven't thought of yet ... 🤔 |
This is what I'm experiencing. Your decorator creates much flatter tests, and the functions themselves are shorter, which is nice. Also I'm realising I'm needing to do Thankfully many tests pass fine unmodified (ideal situation). |
Interesting. On that note, it seems like @dvarrazzo has committed extra time into ensuring a smooth CRDB story in psycopg: psycopg/psycopg#313 |
|
@dantownsend Any advice on how to handle the default type checks tests in Example failed test: def test_bigint_column(self):
self._test_migrations(
table_snapshots=[
[self.table(column)]
for column in [
BigInt(),
BigInt(default=1),
BigInt(default=integer_default),
BigInt(null=True),
BigInt(null=False),
BigInt(index=True),
BigInt(index=False),
]
],
test_function=lambda x: all(
[
x.data_type == "bigint",
x.is_nullable == "NO",
x.column_default == "0",
]
),
) Fails with:
CockroachDB is adding type casting aka Do I really need to write all new tests with the casts included? Or is there a way to override or work around this? Suggestions appreciated. |
@gnat I think this is my preferred solution at the moment too. |
@dantownsend any good ways to get the engine type from the module scope in |
@gnat A column can find out which engine it is using via this: self._meta.engine_type However, this only works after the If you want to warn the user instead against using Lines 264 to 265 in 7754498
Something like this: if db.engine_type =='cockroachdb':
warnings.warn("We recommend using JSONB, not JSON for CockroachDB") Or we just add it to the docs. We could even convert the type from |
Progress: The problem is when piccolo queries On the bright side, I don't think this is actually a functional problem, and should only effect tests- it clears up some confusion. Same thing with all varchar |
that |
Nice one. You're right. Thanks for the catch @knz ! Very helpful. |
Not sure if anyone can help out here but I wanted to leave some progress notes: As mentioned above, Cockroach assumes So yeah while technically Cockroach is capable of storing these types as That said, I totally understand that Cockroach needs to be using Since we cannot "re-map" types in Piccolo currently when using CockroachEngine #607 (comment) I think the best compromise is to simply throw a warning when users try to use Keep in mind https://github.com/cockroachdb/django-cockroachdb actually re-maps the types, so we're not alone here. See: https://github.com/cockroachdb/django-cockroachdb/blob/master/django_cockroachdb/features.py#L52 This means we're gonna need a custom |
It would be super easy if we had some way to re-map these column types (not just switch the |
Another strategy that may help is re-mapping types from inside |
Alrighty gentlemen, the most recent build is ✔️That said, there's a few hacks in there I'm not too proud of, and I feel like this PR is intensive enough that ideally @dantownsend could give this a refinement pass to bring it over the finish line 🏁 I think Piccolo needs a little bit of architectural evolution for this (and future distributed databases) to be added more cleanly, perhaps for 1.0. This may be a good time to lay that groundwork as we get Cockroach in. PS: I could not add the #607 (comment) warnings because |
Btw, been using the M2M support lately. Works fine except for the Even in the PR's current state, Piccolo is a very strong ORM for Cockroach. |
@gnat Yeah, we need to merge it in soon - we can say Piccolo has experimental / beta support for Cockroach. I just need to try it out a bit more to make sure there are no rough edges which would impact the existing Postgres support. |
Time Travel queries ( Example: @dantownsend would be good to get a review for that particular commit: 50aaa8b Also wrapping up hash sharded indexes, but I'm finding that's a tougher nut to crack. |
Added first pass on the docs: 347a487 |
The new And with that, I think we're done here aside from general review @dantownsend |
Hey @dantownsend gonna need some help with mypy. I'm at a total loss as to what to do with these errors. Everything else is green: https://github.com/gnat/piccolo/actions/runs/3098763410/jobs/5017070875 Satisfying the linters has been a pretty frustrating experience. |
@gnat Yeah, we have a lot of linters! Don't worry too much for now. I'll see if I can figure them out while reviewing it. Some of the MyPy errors seem to be due to duplicate test names - |
Yeah, when wrapping tests with engine decorators, I've been trying to keep the actual test names the same so people know it's a variation of the same test.. But the linters do not like that. Example: @engines_only("postgres", "sqlite")
def test_dump_load(self):
pass @engines_only("cockroach")
def test_dump_load(self):
pass Thoughts? @dantownsend |
@gnat Ah, I see. In that situation then |
@dantownsend All 🟢 across the board. I believe we're essentially ready to ship this. Thanks for your help with the linters today! https://github.com/gnat/piccolo/actions/runs/3099551477/jobs/5018820456 🥳 🎉 |
That's great news! We encourage you to add this tool to the list at https://www.cockroachlabs.com/docs/stable/community-tooling.html (click the Contribute button or add a PR at https://github.com/cockroachdb/docs/) |
* Cockroach config for tests. * Added feedback from #607 * New CockroachDB engine. * No more extension=[] ! * Running tests as postgres. * Initial build for CockroachDB * Running tests. * Run workflow. * Naming consistency. Making roughly equivalent to Postgres. * Timestamp bug. * DDL for Cockroach. for_engines decorator. * Progress. * Added database prep for cluster settings. * Tests passing. * Passing migration type checks. * Cockroach ALTER COLUMN TYPE moved outside of transaction until #49351 is solved. * New test helpers. * Fixtures. * Cockroach specific stuff. * Array tests. * Column tests. * JSON tests. * M2M tables. as_list is effected by cockroachdb/cockroach#71908 * Skip test for numeric. * Pool test. * Transaction tests. * Related readable tests. * Save tests. * To dict tests. * JOIN tests. * Output tests. * Raw tests. * Repr tests. * Select tests. * "As string" tests. * Cockroach having trouble with one specific operator. * Skipping until CRDB issue is resolved. Should work though! * Minor updates. * Test dump load. Other CRDB things. * Migration manager tests. * ALTER tests. * SQL Generator tests. * Serialization tests. * Test helpers. * Cockroach Specific stuff. * Numerics all the same for CRDB. * Cockroach specific stuff. * Cleanup. * Skipping timestamp tests. * DDL resolution stuff for Cockroach. * WIP. * Cohesion and special test tables for Cockroach. * WIP * Special table for Cockroach. * Removed hack. * Should pass now. * Removed some debug stuff. * LGTM stuff. * Feedback. * Feedback. * Feedback. * More readable "postgres" or "cockroach. * Added Cockroach overrides for COLUMN_TYPE_MAP, COLUMN_DEFAULT_PARSER * Restore SmallInt functionality for Cockroach. * Restored type() comparison for reflection / generators. * Updated to latest Cockroach release. * Cleanup. * Cleanup. * Build stuff. * Cockroach Serial now uses unique_rowid() as default value always instead of DEFAULT. * Removed first_id() test helper. * Time Travel queries using AS OF SYSTEM TIME * Added documentation. * Small docs fix. * Test update. * Refactored out unnecessary code because we use unique_rowid() as default for Cockroach Serial and BigSerial. * BIGINT instead of BIGSERIAL for Cockroach. * LGTM stuff * LGTM... * Made compatible with older Python versions. * Get Cockroach passing without Python 3.9 stuff. * Removed unused variable. * Fixed test for SQLite. * Re-add Python 3.7 for CRDB + new CRDB version. * Consolidated cockroach_skip() into engines_skip() everywhere. * Re-add SQLite DDL. * Moved back to original engine DDL selector. * Reverted mistake in test suite. * Remove migration that should not be available. * Moving back to postgres_only for specific test. Probably cannot detect engine type at this stage. * postgres_only() to engine_only() * Set sane default for JSONB because we override column_type in JSON for some engines. * Ran isort. * Black being obnoxious, part 1. * Black being obnoxious Part 2. * Flake8 * Black * Flake8 * Flake8 * Flake8 * isort * Flake8 * Flake8 * Added alternate test names for duplicates to remove F811 * Added alternative name. * mypy * mypy * mypy * mypy * mypy * mypy * mypy * mypy * mypy * Cockroach: Now testing latest stable and upcoming release. * Cockroach: Testing only v22.2 * Cockroach version documentation. * Engine detection now based on type because CockroachEngine inherits PostgresEngine. * postgres_only is now engines_only("postgres", "cockroach") because CockroachEngine was always tested as a child class. * Updated to Cockroach 22.2 beta. * isort. * assertAlmostEqual for Decimal test. Docs update. * Linter * Cockroach 22.2 Beta 2. * Added notes about special Column types for Cockroach * version pin xpresso, so integration tests pass * add small section about running cockroach in contributing docs * fix typo Co-authored-by: Daniel Townsend <dan@dantownsend.co.uk>
OK, this has been merged in now. @gnat Do you want to close this issue, or keep it for tracking future TODOs? We should do a PR to the CockroachDB docs as mentioned above. |
Nice one. Done. I'll create/resolve new tickets as things come up. |
Friction-free Cockroach DB support in Piccolo ORM.
PR: #608
We'd like to see Cockroach Labs promote Piccolo as one of, if not the recommended async ORM.
asyncpg
has some support built-in to help us get there.In the mean time I will be adding notes on how to use CockroachDB with Piccolo, and general progress.
Quick local setup for testing
./cockroach start-single-node --insecure --store=node1 --listen-addr=localhost:26257 --http-addr=localhost:8080 --background
./cockroach sql --insecure
create database ...;
use ...;
from here.killall cockroach
to send graceful termination signal.Default account is
root
without password. Fancy stats dashboard athttp://localhost:8080
Both Piccolo Examples and Piccolo Admin are working.
What we still need
Get build/test suite running on Github Actions for Piccolo QA tooling.Some way to identify we're running under Cockroach DB.Added CockroachDB Engine.Get test suite mostly / fully passing.Basic documentationTime Travel queries using AS OF SYSTEM TIMESome way to do Time Travel SELECT queries (SELECT ... AS OF SYSTEM TIME
)https://www.cockroachlabs.com/docs/stable/as-of-system-time.htmlINDEX ... USING HASH
)🐛 Known issues in CockroachDB or Asyncpg
[ ] ARRAY support between asyncpg↔️ cockroach.
piccolo/tests/columns/test_array.py
as_list=True
does not because it uses ARRAYcannot drop
UNIQUE
constraint "..." usingALTER TABLE DROP CONSTRAINT
, useDROP INDEX CASCADE
insteadtest_alter_column_unique()
intest_migrations.py
column ... is of type int[] and thus is not indexable / column ... is of type varchar[] and thus is not indexable
test_array_column_integer()
andtest_array_column_varchar
intest_migration_manager.py
🌔 Future nice to haves
CockroachDB executes
ALTER COLUMN
queries asynchronously.table <...> is currently undergoing a schema change
if a later operation tries to modify the table before the asynchronous query finishes. CockroachDB may fix this in the future.The text was updated successfully, but these errors were encountered: