Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

date type between mssql and mysql not working. #186

Open
makkaba opened this issue Jun 12, 2020 · 13 comments
Open

date type between mssql and mysql not working. #186

makkaba opened this issue Jun 12, 2020 · 13 comments

Comments

@makkaba
Copy link

makkaba commented Jun 12, 2020

env

embulk v0.9.23
java openjdk version "1.8.0_252"

situation

(MSSQL)
nullable date type
->
(MYSQL)
nullable date type

issue

destination date columns are filled with NULL value.
even all the source columns have values.

in:
  type: sqlserver
  host: ***
  user: ***
  password: ***
  database: ***
  query: |-
    SELECT
      [sdate],
   ***
    FROM *** WITH (NOLOCK)
out:
  type: mysql
  host: ***
  user: ***
  password:***
  database: ***
  table: ***
  mode: merge_direct
  options: {useUnicode: true, characterEncoding: UTF-8}
  column_options:
    sdate: {type: date null}
@hiroyuki-sato
Copy link
Member

Hello, @makkaba

  • Could you double-check whether sdata contains NULL value?
  • Does your table has primary or unique constraints?
  • Could you tell us more detail about the problem? (What kind of data did you insert and update?)

I tested merge_direct feature in the following environment.
It seems worked well.

  • OS: macOS 10.15
  • MySQL: 8.0.19
  • Embulk: embulk 0.9.23
  • embulk-output-mysql: 0.8.7
in:
  type: inline
  schema:
    - { name: id, type: long }
    - { name: string_column, type: string }
    - { name: double_column, type: double }
    - { name: boolean_column, type: boolean }
    - { name: sdate, type: timestamp }
  data:

    # First data insertion.
    - { id: 1, string_column: test, double_column: 0.1, boolean_column: true, sdate: '2016-06-14 15:19:05' }
    - { id: 2, string_column: test, double_column: 0.1, boolean_column: true, sdate: '2016-06-14 15:19:05' }
    - { id: 3 }

    # 2nd data insertion
    #- { id: 1, string_column: test, double_column: 0.1, boolean_column: true, sdate: '2016-06-14 15:19:05' }
    #- { id: 2, string_column: test, double_column: 0.1, boolean_column: true }
    #- { id: 3, string_column: test, double_column: 0.1, boolean_column: true, sdate: '2016-06-14 15:19:05' }
#out: {type: stdout}
out:
  type: mysql
  host: localhost
  user: user
  password: password
  database: embulk_test
  table: merge_direct_test
  mode: merge_direct
  options: {useUnicode: true, characterEncoding: UTF-8}
  create_table_constraint: 'primary key(id)'
  column_options:
    purchase: {type: date null}

first insert results.

show fields from merge_direct_test;
+----------------+------------+------+-----+---------+-------+
| Field          | Type       | Null | Key | Default | Extra |
+----------------+------------+------+-----+---------+-------+
| id             | bigint     | NO   | PRI | NULL    |       |
| string_column  | text       | YES  |     | NULL    |       |
| double_column  | double     | YES  |     | NULL    |       |
| boolean_column | tinyint(1) | YES  |     | NULL    |       |
| sdate          | timestamp  | YES  |     | NULL    |       |
+----------------+------------+------+-----+---------+-------+
5 rows in set (0.00 sec)
select * from merge_direct_test;
+----+---------------+---------------+----------------+---------------------+
| id | string_column | double_column | boolean_column | sdate               |
+----+---------------+---------------+----------------+---------------------+
|  1 | test          |           0.1 |              1 | 2016-06-14 15:19:05 |
|  2 | test          |           0.1 |              1 | 2016-06-14 15:19:05 |
|  3 | NULL          |          NULL |           NULL | NULL                |
+----+---------------+---------------+----------------+---------------------+
3 rows in set (0.00 sec)

2nd data insert.

  • I expect update the following
    • id 3 sdata: NULL -> '2016-06-14 15:19:05'
    • id 2 sdata: 2016-06-14 15:19:05 -> NULL
mysql> select * from merge_direct_test;
+----+---------------+---------------+----------------+---------------------+
| id | string_column | double_column | boolean_column | sdate               |
+----+---------------+---------------+----------------+---------------------+
|  1 | test          |           0.1 |              1 | 2016-06-14 15:19:05 |
|  2 | test          |           0.1 |              1 | NULL                |
|  3 | test          |           0.1 |              1 | 2016-06-14 15:19:05 |
+----+---------------+---------------+----------------+---------------------+
3 rows in set (0.00 sec)

@makkaba
Copy link
Author

makkaba commented Jun 15, 2020

@hiroyuki-sato
Thank you for testing!
but something is different.
in my case, the type is not timestamp.
yyyy-MM-dd style"date" type.

@hiroyuki-sato
Copy link
Member

Hello, @makkaba

It worked too. The following test used date column.

test script

First insert

After insert

mysql> show fields from merge_direct_test;
+----------------+------------+------+-----+---------+-------+
| Field          | Type       | Null | Key | Default | Extra |
+----------------+------------+------+-----+---------+-------+
| id             | bigint     | NO   | PRI | NULL    |       |
| string_column  | text       | YES  |     | NULL    |       |
| double_column  | double     | YES  |     | NULL    |       |
| boolean_column | tinyint(1) | YES  |     | NULL    |       |
| sdate          | date       | YES  |     | NULL    |       |
+----------------+------------+------+-----+---------+-------+
5 rows in set (0.00 sec)
mysql> select * from merge_direct_test;
+----+---------------+---------------+----------------+------------+
| id | string_column | double_column | boolean_column | sdate      |
+----+---------------+---------------+----------------+------------+
|  1 | test          |           0.1 |              1 | 2016-06-14 |
|  2 | test          |           0.1 |              1 | 2016-06-14 |
|  3 | NULL          |          NULL |           NULL | NULL       |
+----+---------------+---------------+----------------+------------+
3 rows in set (0.00 sec)

2nd insert

After update.

I expect to update the following

  • id 2 sdata: 2016-06-14 -> NULL
  • id 3 sdata: NULL -> 2016-06-14
mysql> select * from merge_direct_test;
+----+---------------+---------------+----------------+------------+
| id | string_column | double_column | boolean_column | sdate      |
+----+---------------+---------------+----------------+------------+
|  1 | test          |           0.1 |              1 | 2016-06-14 |
|  2 | test          |           0.1 |              1 | NULL       |
|  3 | test          |           0.1 |              1 | 2016-06-14 |
+----+---------------+---------------+----------------+------------+
3 rows in set (0.00 sec)

@makkaba
Copy link
Author

makkaba commented Jun 15, 2020

@hiroyuki-sato
thank you for research.

I have realized something new.

In amazon linux ubuntu, it works properly. (it means date value is inserted)

But in WSL(Window Subsystem Linux) Ubuntu in my local window machine,
all the date values are filled with NULL.

It can be WSL problem itself. or some environment problem
ex) timezone ..

Is there any factor which affects to convert date type?

@hiroyuki-sato
Copy link
Member

Hello, @makkaba

Did you check consistency in both environments?
Does WSL environment have primary or unique consistency?

Have you ever executed my test script in your WSL environment?
If so, could you tell me the result?

@makkaba
Copy link
Author

makkaba commented Jun 16, 2020

@hiroyuki-sato
Consistency in both environments are not confirmed.
i assumed that it causes problem.

and this is testing in WSL below.

test1

i have tested your test script in WSL environment.
(timestamp -> date)

result

sdate field is filled with NULL value.

test2

chaging the type of sdate with 'date'

in:
  type: inline
  schema:
    - { name: id, type: long }
    - { name: string_column, type: string }
    - { name: double_column, type: double }
    - { name: boolean_column, type: boolean }
    - { name: sdate, type: date }
  data:

    # First data insertion.
    - { id: 1, string_column: test, double_column: 0.1, boolean_column: true, sdate: '2016-06-14' }
    - { id: 2, string_column: test, double_column: 0.1, boolean_column: true, sdate: '2016-06-14' }
    - { id: 3 }

result

org.embulk.exec.PartialExecutionException: org.jruby.exceptions.RaiseException: (RuntimeError) Unknown type :date
        at org.embulk.exec.BulkLoader$LoaderState.buildPartialExecuteException(BulkLoader.java:340)
        at org.embulk.exec.BulkLoader.doRun(BulkLoader.java:566)
        at org.embulk.exec.BulkLoader.access$000(BulkLoader.java:35)
        at org.embulk.exec.BulkLoader$1.run(BulkLoader.java:353)
        at org.embulk.exec.BulkLoader$1.run(BulkLoader.java:350)
        at org.embulk.spi.Exec.doWith(Exec.java:22)
        at org.embulk.exec.BulkLoader.run(BulkLoader.java:350)
        at org.embulk.EmbulkEmbed.run(EmbulkEmbed.java:242)
        at org.embulk.EmbulkRunner.runInternal(EmbulkRunner.java:291)
        at org.embulk.EmbulkRunner.run(EmbulkRunner.java:155)
        at org.embulk.cli.EmbulkRun.runSubcommand(EmbulkRun.java:431)
        at org.embulk.cli.EmbulkRun.run(EmbulkRun.java:90)
        at org.embulk.cli.Main.main(Main.java:64)
        Suppressed: java.lang.NullPointerException
                at org.embulk.exec.BulkLoader.doCleanup(BulkLoader.java:463)
                at org.embulk.exec.BulkLoader$3.run(BulkLoader.java:397)
                at org.embulk.exec.BulkLoader$3.run(BulkLoader.java:394)
                at org.embulk.spi.Exec.doWith(Exec.java:22)
                at org.embulk.exec.BulkLoader.cleanup(BulkLoader.java:394)
                at org.embulk.EmbulkEmbed.run(EmbulkEmbed.java:245)
                ... 5 more
Caused by: org.jruby.exceptions.RaiseException: (RuntimeError) Unknown type :date
        at RUBY.block in initialize(uri:classloader:/gems/embulk-0.9.23-java/lib/embulk/schema.rb:46)
        at org.jruby.RubyArray.each(org/jruby/RubyArray.java:1735)
        at RUBY.initialize(uri:classloader:/gems/embulk-0.9.23-java/lib/embulk/schema.rb:20)
        at RUBY.block in transaction(uri:classloader:/gems/embulk-0.9.23-java/lib/embulk/input_plugin.rb:60)
        at RUBY.resume(/home/ubuntu/.embulk/lib/gems/gems/embulk-input-inline-0.1.0/lib/embulk/input/inline.rb:21)
        at RUBY.transaction(/home/ubuntu/.embulk/lib/gems/gems/embulk-input-inline-0.1.0/lib/embulk/input/inline.rb:17)
        at RUBY.transaction(uri:classloader:/gems/embulk-0.9.23-java/lib/embulk/input_plugin.rb:58)

Error: org.jruby.exceptions.RaiseException: (RuntimeError) Unknown type :date

@hiroyuki-sato
Copy link
Member

Hello, @makkaba

embulk-input-inline doesn't support date type.
Embulk doesn't support date type. You need to change that part to timestamp.

You need to use date type in embulk-output-mysql side only.

You can check the input side with embulk preview.

@hito4t
Copy link
Contributor

hito4t commented Jun 16, 2020

Hello, @makkaba
It may be a case-sensitive problem of column names.
If the actual column name is SDATE, embulk-output-mysql will distinguish it from the column sdate, and values in the sdate column will not inserted into the SDATE column.

@hiroyuki-sato
Thank you for the investigation!

@makkaba
Copy link
Author

makkaba commented Jun 17, 2020

hi @hito4t
unfortunately, this is not a case-sensitive problem.
tested with same yaml file, the results were different.
and i have tried only with lowercase "sdate"

in sqlserver : "sdate"
in mysql: "sdate"
in yaml file: "sdate"

@hito4t
Copy link
Contributor

hito4t commented Jun 17, 2020

@makkaba
Thank you for your reply.

I'd like to know the actual column names on MySQL.
Would you show the result of desc target_table on MySQL client?

ex)

mysql> desc merge_test;
+--------+------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+--------+------------+------+-----+---------+-------+
| ID | int(11) | NO | PRI | 0 | |
| VALUE1 | varchar(8) | YES | | NULL | |
| VALUE2 | varchar(8) | YES | | NULL | |
+--------+------------+------+-----+---------+-------+
3 rows in set (0.03 sec)

mysql>

@hiroyuki-sato
Copy link
Member

Hello, @makkaba

The above report is the embulk-input-inline exception. If embulk preview works well, execute embulk run again and paste an exception.

And also would you show the result of show variables like '%lower%'; on MySQL client?

mysql> show variables like '%lower%';
+------------------------+-------+
| Variable_name          | Value |
+------------------------+-------+
| lower_case_file_system | OFF   |
| lower_case_table_names | 0     |
+------------------------+-------+
2 rows in set (0.01 sec)

@makkaba
Copy link
Author

makkaba commented Jun 17, 2020

i appreciate for your investigation.
@hiroyuki-sato
@hito4t

actual column name is
"sdate" in mysql.
(and also "sdate" in sqlserver. )


DESC target_tableXXXX;

sdate | date | YES | MUL |   |  
-- | -- | -- | -- | -- | --
timezone | varchar(128) | NO |   |   |  
order | int(128) | NO |   |   |  

show variables like '%lower%';

lower_case_file_system | OFF
-- | --
lower_case_table_names | 0

i assumed that it is problem of WSL environment variables or something.

@hiroyuki-sato
Copy link
Member

Hello, @makkaba Thank you for your reply.

  • Does target_tableXXXX table have the primary key?
  • Could you also paste embulk run execution?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

3 participants