-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reading and writing to SQL database #158
Comments
@TheCedarPrince since you are the local FunSQL expert, I wanted to ask, does FunSQL have a |
One thing we will need to think about is how to handle writing of attribute data. SQLite, for example, has a rather restricted set of data types it supports (https://www.sqlite.org/datatype3.html). Unless we use julia_to_sql_type(::Type{T}) where {T} = error("type $(T) not supported for SQL conversion")
julia_to_sql_type(::Type{T}) where {T<:Real} = "REAL"
julia_to_sql_type(::Type{T}) where {T<:AbstractString} = "TEXT"
julia_to_sql_type(::Type{T}) where {T<:Integer} = "INTEGER" So any attributes that are unions, or vectors, etc, are out. One way to go about this is to let the user provide, if they want, for specific |
For acsets, we can more or less name the obs, homs, etc anything we want. But in SQLite (and other implementations), there are quite a few reserved words we cannot use for tables (see https://sqlite.org/lang_keywords.html). So by default we should name the table associated with an object the same name, unless the user gives an override. Also, in SQL (in yet another departure of the SQL implementation from the relational model it supposedly "implements"), the order in which columns are specified in |
Apparently SQLite does not allow multiple tables to be generated in a single statement, the code below (silently, of course) only generates the using SQLite
function get_table(db, table)
query = DBInterface.execute(db, "SELECT * FROM $table")
return DataFrames.DataFrame(query)
end
db = SQLite.DB()
DBInterface.execute(db, """
CREATE TABLE artist(
artistid INTEGER PRIMARY KEY,
artistname TEXT
);
CREATE TABLE track(
trackid INTEGER,
trackname TEXT,
trackartist INTEGER,
FOREIGN KEY(trackartist) REFERENCES artist(artistid)
);
""")
get_table(db, "artist")
get_table(db, "track") One thing to do to get around this is to first create each table "individually" and later add the foreign key columns. Of course here there is another problem, in that using SQLite
function get_table(db, table)
query = DBInterface.execute(db, "SELECT * FROM $table")
return DataFrames.DataFrame(query)
end
db = SQLite.DB()
DBInterface.execute(db, """
CREATE TABLE artist(
artistid INTEGER PRIMARY KEY,
artistname TEXT
);
""")
DBInterface.execute(db, """
CREATE TABLE track(
trackid INTEGER,
trackname TEXT
);
""")
get_table(db, "artist")
get_table(db, "track")
DBInterface.execute(db, """
ALTER TABLE track
ADD COLUMN trackartist INTEGER DEFAULT NULL
REFERENCES artist(artistid);
""") |
Hey @slwu89 , here are some out of order thoughts on this!
FunSQL.jl does not have support for fun_sql = From(...) # Some FunSQL DSL statement
catalog = reflect(
db_conn;
schema = "",
dialect=:sqlite
);
sql = render(catalog, fun_sql);
res = execute(db_conn,
"""
INSERT INTO
table
SELECT
*
FROM
($sql) AS foo;
"""
)
Hm. We may need to create a small sanitizer package or routine that checks to make sure no statements are using reserved language per flavors we want to support.
So one thing I am a strong advocate for here is to use DBInterface.jl interfaces to the DBs we are interested in using. I know that most DB packages within Julia support DBInterface interfaces which is great and should help a lot with this.
Is there any particular reason why we are sticking to SQLite? Could we perhaps use something like DuckDB which acts very similarly to PostgreSQL, supports in-memory databases, and should be fit for high performance use? I've encountered some of the roadblocks you were describing in SQLite and it is not a fault of FunSQL here but rather SQLite being strange in this regard. Here's the DuckDB.jl package and info about constraints. This is just my thought on SQLite as we could switch to something else that "just does" what we want for a proof of concept. Or is SQLite on your side a design requirement? |
Hey @TheCedarPrince, thanks!
It's not a hard design requirement. I just started with it since of previous familiarity (though, obviously, not that familiar as I forgot about all these foot guns). It would be nice to support it nontheless given how widely it's used (and for AlgebraicJulia/AlgebraicRelations.jl#33). But yes for a proof of concept it may be better to switch to something more "fully featured". I'll check out DuckDB!
Yes, that makes sense. I haven't used FunSQL yet in my investigations as I think for this first feature addition I'll just be looking at writing an acset to an SQL db of some sort and reading in from a db a schema and an acset. I think I'll mostly rely on DBInterface to talk to whatever backend is used as the proof of concept. |
Here is an example for creating a connection to DuckDB.jl: using DBInterface
using DuckDB
db = DBInterface.connect(DuckDB.DB, ":memory:") Let me know what else you need! |
OH! Forgot to mention, if you want to write this to a file (like What's neat is that starting as of, if I recall, v0.10.X, all versions of |
@TheCedarPrince testing out DuckDB a bit! Unfortunately it also has the using DuckDB
using DataFrames
using Catlab
# create a new in-memory database
con = DBInterface.connect(DuckDB.DB, ":memory:")
julia_to_sql_type(::Type{T}) where {T} = error("type $(T) not supported for SQL conversion")
julia_to_sql_type(::Type{T}) where {T<:Real} = "REAL"
julia_to_sql_type(::Type{T}) where {T<:AbstractString} = "TEXT"
julia_to_sql_type(::Type{T}) where {T<:Symbol} = "TEXT"
julia_to_sql_type(::Type{T}) where {T<:Integer} = "INTEGER"
@present SchWeightedLabeledGraph <: SchLabeledGraph begin
Weight::AttrType
weight::Attr(E,Weight)
end
# to_graphviz(SchWeightedLabeledGraph)
@acset_type WeightedLabeledGraph(SchWeightedLabeledGraph, index=[:src, :tgt]) <: AbstractLabeledGraph
g = erdos_renyi(WeightedLabeledGraph{Symbol,Float64}, 10, 0.25)
g[:, :label] = Symbol.(collect('a':'z')[1:nv(g)])
g[:, :weight] = floor.(rand(ne(g)) .* 100)
# first make tables of each object and its attributes
for o in objects(acset_schema(g))
o_attrs = attrs(acset_schema(g); from=o)
stmt = Vector{String}()
push!(stmt, "CREATE OR REPLACE TABLE $(o)(_id INTEGER PRIMARY KEY")
for (col, _, type) in o_attrs
push!(stmt, ", $(col) $(julia_to_sql_type(subpart_type(g, type)))")
end
push!(stmt, ");")
DBInterface.execute(con, join(stmt))
end
# second add all homs (REFERENCE)
for (h, h_dom, h_codom) in homs(acset_schema(g))
DBInterface.execute(con, """
ALTER TABLE $(h_dom) ADD $(h) INTEGER REFERENCES $(h_codom)(_id);
""")
end To be exactly clear, the issue is that it's possible to have an acset like this, which is quite difficult to instantiate a database for, it seems. @present SillySch(FreeSchema) begin
(X,Y)::Ob
x::Hom(X,Y)
y::Hom(Y,X)
end
@acset_type SillyType(SillySch)
x = @acset SillyType begin
X=2
Y=3
x=[1,2]
y=[1,1,2]
end |
Update:
Unsure where to go from here. |
We would like to read/write from an SQL database, we can rely on the https://github.com/MechanicalRabbit/FunSQL.jl package to help automate the writing of SQL code.
The text was updated successfully, but these errors were encountered: