Commit 1ae8fc4

author
marie
committed
fix v1
1 parent a0e75ee commit 1ae8fc4

11 files changed: 62 additions and 64 deletions

docs/csv/dwh_data/dwh_data.csv

Lines changed: 7 additions & 8 deletions
@@ -1,13 +1,12 @@
 Field,User Guide,ETL Conventions,Datatype,Required,Primary Key,Foreign Key,FK Table
 data_num,Unique identifier, ,bigint(64),Yes,Yes,No," "
-patient_num, , ,bigint(64),`Yes`,No,Yes,dwh_patient
-thesaurus_data_num,"The concept identifier associated with the data record (e.g., diagnosis, procedure, lab test).", ,bigint(64),`Yes`,No,Yes,dwh_thesaurus_data
-thesaurus_code,"The source vocabulary code corresponding to the associated concept (e.g., ICD-10, LOINC, ATC).", ,varchar(40),`Yes`,No,No," "
+patient_num, , ,bigint(64),Yes,No,Yes,dwh_patient
+thesaurus_data_num,"The concept identifier associated with the data record (e.g., diagnosis, procedure, lab test).", ,bigint(64),Yes,No,Yes,dwh_thesaurus_data
+thesaurus_code,"The source vocabulary code corresponding to the associated concept (e.g., ICD-10, LOINC, ATC).", ,varchar(40),Yes,No,No," "
 
-document_date,The date the data was recorded in the source system., ,timestamptz,`Yes`,No,No," "
-start_date,"The start date of the clinical event, observation or drugs.", ,timestamptz,`Yes`,No,No," "
+document_date,The date the data was recorded in the source system., ,timestamptz,Yes,No,No," "
+start_date,"The start date of the clinical event, observation or drugs.", ,timestamptz,Yes,No,No," "
 end_date,"The end date of the clinical event, observation or drugs, if applicable.", ,timestamptz,No,No,No," "
-age_patient,The age of the patient at the time of the data record., ,double(53),No,No,No," "
 
 
 val_numeric,"A numeric value associated with the data record (e.g., lab result, measurement).", ,double(53),No,No,No," "
@@ -30,5 +29,5 @@ document_num,The identifier of the document grouping multiple related data recor
 data_pid,"Optional pseudo-identifier for a data. Mainly included for structural consistency; not required for standard analytical use.",Generated as a hash of *id_data_source* combined with *data_salt*.,varchar(300),No,No,No," "
 data_salt,Optional random salt used in the hash algorithm to generate *data_pid*., ,varchar(300),No,No,No," "
 
-upload_id,ETL Pipeline identifier,Defined at the start of the pipeline as `datetime.now().strftime("%Y%m%d%H%M%S")`,bigint(64),No,No,No," "
-updated_date,Last modification of this record., ,date,Yes,No,No," "
+upload_id,"Identifier of the pipeline integration run, used to differentiate each batch of integrated data.",Defined at the start of the pipeline as `datetime.now().strftime("%Y%m%d%H%M%S")`. <br> For example a batch integrated on 15/09/2025 at 00:00:00 has `upload_id = 20250915000000`.,bigint(64),No,No,No," "
+update_date,Date and time of the record’s last update., ,timestamptz,No,No,No," "
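
Note on `upload_id`: the convention quoted above can be checked with a few lines of Python. This is only a minimal sketch; the helper name `make_upload_id` is hypothetical and not part of the pipeline, and the conversion to `int` is an assumption based on the `bigint(64)` datatype.

```python
from datetime import datetime


def make_upload_id(run_start: datetime) -> int:
    """Format the run start time as YYYYMMDDHHMMSS and return it as an integer.

    Hypothetical helper: the documentation only gives the strftime pattern.
    """
    return int(run_start.strftime("%Y%m%d%H%M%S"))


# The example from the ETL Conventions column: 15/09/2025 at 00:00:00.
example_id = make_upload_id(datetime(2025, 9, 15, 0, 0, 0))
print(example_id)

# In a pipeline the value would be computed once, at the start of the run.
current_id = make_upload_id(datetime.now())
```

The result has 14 digits, so it fits in a 64-bit integer column.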

docs/csv/dwh_document.csv

Lines changed: 5 additions & 6 deletions
@@ -1,12 +1,11 @@
 Field,User Guide,ETL Conventions,Datatype,Required,Primary Key,Foreign Key,FK Table
 document_num,Unique identifier, ,bigint(64),Yes,Yes,No," "
-patient_num, , ,bigint(64),`Yes`,No,Yes,dwh_patient
+patient_num, , ,bigint(64),Yes,No,Yes,dwh_patient
 title, The title of the document. , ,varchar(400),No,No,No," "
-document_date,The date the document was recorded., ,timestamptz,`Yes`,No,No," "
+document_date,The date the document was recorded., ,timestamptz,Yes,No,No," "
 document_type,"The type of the document (e.g. CR, observation, formulaire, ...)", ,varchar(100),No,No,No," "
 author,The person who created or authored the document, ,varchar(200),No,No,No," "
-displayed_text,The content of the document.,Convert to html,text,`Yes`,No,No," "
-age_patient,The age of the patient at the time the document was created., ,integer(32),No,No,No," "
+displayed_text,The content of the document.,Convert to html,text,Yes,No,No," "
 
 stay_num,The visit during which the document was created., ,bigint(64),No,No,Yes,dwh_patient_stay
 department_num,The service associated with the document. , ,bigint(64),No,No,Yes,dwh_thesaurus_department
@@ -19,5 +18,5 @@ id_doc_source,Unique identifier in source software, ,varchar(300),No,No,No," "
 document_pid,"Optional pseudo-identifier for a document. Mainly included for structural consistency; not required for standard analytical use.",Generated as a hash of *id_doc_source* combined with *document_salt*.,varchar(300),No,No,No," "
 document_salt,Optional random salt used in the hash algorithm to generate *document_pid*., ,varchar(300),No,No,No," "
 
-upload_id,ETL Pipeline identifier,Defined at the start of the pipeline as `datetime.now().strftime("%Y%m%d%H%M%S")`,bigint(64),No,No,No," "
-update_date,Last modification of this record., ,date,No,No,No," "
+upload_id,"Identifier of the pipeline integration run, used to differentiate each batch of integrated data.",Defined at the start of the pipeline as `datetime.now().strftime("%Y%m%d%H%M%S")`. <br> For example a batch integrated on 15/09/2025 at 00:00:00 has `upload_id = 20250915000000`.,bigint(64),No,No,No," "
+update_date,Date and time of the record’s last update., ,timestamptz,No,No,No," "

docs/csv/dwh_patient.csv

Lines changed: 9 additions & 9 deletions
@@ -1,23 +1,23 @@
 Field,User Guide,ETL Conventions,Datatype,Required,Primary Key,Foreign Key,FK Table
 patient_num,Unique identifier, ,bigint(64),Yes,Yes,No," "
-lastname, ,*null* in CNIL compliant warehouse,varchar(100),No,No,No," "
+lastname, ,*null* in CNIL compliant warehouse,varchar(120),No,No,No," "
 maiden_name, ,*null* in CNIL compliant warehouse,varchar(120),No,No,No," "
-firstname, ,*null* in CNIL compliant warehouse,varchar(100),No,No,No," "
+firstname, ,*null* in CNIL compliant warehouse,varchar(120),No,No,No," "
 birth_date, ,"If the precise date include day or month is not known or not allowed, January is used as the default month, and the first day of the month the default day",timestamptz,No,No,No," "
-sex,Biological sex at birth,"*F* (female), *M* (male), empty if unknown",varchar(2),No,No,No," "
+sex,Biological sex at birth,"*F* (female), *M* (male), *O* (other), empty if unknown",varchar(2),No,No,No," "
 nss,Social security number,*null* in CNIL compliant warehouse,varchar(20),No,No,No," "
-phone_number, ,*null* in CNIL compliant warehouse,varchar(1000),No,No,No," "
+phone_number, ,*null* in CNIL compliant warehouse,varchar(50),No,No,No," "
 email, ,*null* in CNIL compliant warehouse,varchar(500),No,No,No," "
 residence_address, ,*null* in CNIL compliant warehouse,varchar(1000),No,No,No," "
 residence_country, , ,varchar(100),No,No,No," "
 residence_city, , ,varchar(200),No,No,No," "
 zip_code, , ,varchar(30),No,No,No," "
 birth_country, , ,varchar(100),No,No,No," "
-birth_city, , ,varchar(100),No,No,No," "
-birth_zip_code, , ,varchar(10),No,No,No," "
+birth_city, , ,varchar(200),No,No,No," "
+birth_zip_code, , ,varchar(30),No,No,No," "
 death_code,Vital status,"*null* if alive, *d* if dead",varchar(2),No,No,No," "
 death_date,Date of death,"If the precise date include day or month is not known or not allowed, January is used as the default month, and the first day of the month the default day",timestamptz,No,No,No," "
-is_merged,Use to indicate this patient was merged in another patient,*true* if merged,boolean,No,No,No," "
+is_merged,Indicate this patient was merged in another patient,*true* if merged,boolean,No,No,No," "
 merged_into,Patient this one was merged into, ,bigint(64),No,No,Yes,dwh_patient
 max_date,The last time this patient came to the healthcare facility,Computed from the other table,date,No,No,No," "
 
@@ -26,5 +26,5 @@ instance_id,"Code of the healthcare center, see *hospital_instance* for more inf
 patient_pid,"Pseudo-identifier for the patient. Used in correspondence tables to retrieve the original patient identity if needed.",Generated as a SHA-256 hash of *id_patient_source* combined with *patient_salt*.,varchar(300),No,No,No," "
 patient_salt,Random salt used in the hash algorithm to generate patient_pid., ,varchar(300),No,No,No," "
 
-upload_id,ETL Pipeline identifier,Defined at the start of the pipeline as `datetime.now().strftime("%Y%m%d%H%M%S")`,bigint(64),No,No,No," "
-update_date,Last modification of this record., ,timestamptz,No,No,No," "
+upload_id,"Identifier of the pipeline integration run, used to differentiate each batch of integrated data.",Defined at the start of the pipeline as `datetime.now().strftime("%Y%m%d%H%M%S")`. <br> For example a batch integrated on 15/09/2025 at 00:00:00 has `upload_id = 20250915000000`.,bigint(64),No,No,No," "
+update_date,Date and time of the record’s last update., ,timestamptz,No,No,No," "
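
Note on `patient_pid`: the file says only that the value is a SHA-256 hash of *id_patient_source* combined with *patient_salt*. The sketch below shows one possible reading. The concatenation order, the UTF-8 encoding, the hexadecimal output and the salt length are all assumptions, and both function names are hypothetical.

```python
import hashlib
import secrets


def make_patient_salt(n_bytes: int = 16) -> str:
    """Return a random hexadecimal salt (length is an assumption)."""
    return secrets.token_hex(n_bytes)


def make_patient_pid(id_patient_source: str, patient_salt: str) -> str:
    """Hash the source identifier together with the salt using SHA-256.

    Assumption: "combined with" is read as plain string concatenation,
    identifier first, encoded as UTF-8.
    """
    payload = (id_patient_source + patient_salt).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


salt = make_patient_salt()
pid = make_patient_pid("12345", salt)
```

A SHA-256 hex digest is 64 characters long, which fits within the `varchar(300)` column. Because the salt is stored per patient, the same source identifier yields a different pseudo-identifier under a different salt.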

docs/csv/dwh_patient_ipphist.csv

Lines changed: 4 additions & 4 deletions
@@ -1,10 +1,10 @@
 Field,User Guide,ETL Conventions,Datatype,Required,Primary Key,Foreign Key,FK Table
 ipphist_num,Unique identifier, ,bigint(64),Yes,Yes,No," "
 patient_num, , ,bigint(64),Yes,No,Yes,dwh_patient
-hospital_patient_id,Patient identifier in source data (IPP), ,varchar(100),`Yes`,No,No," "
+hospital_patient_id,Patient identifier in source data (IPP), ,varchar(100),Yes,No,No," "
 origin_patient_id,Use to separate real identifier from technical identifier (for instance pk in data source with no medical meaning) ,Set to *SIH* for real identifier,varchar(40),No,No,No," "
-master_patient_id,Indicate which identifier should be display to users. Each patient should have one and only one master identifier.,*true* if this identifier is the master,boolean,No,No,No," "
+master_patient_id,Indicate which identifier should be display to users. Each patient should have one and only one master identifier.,*true* if this identifier is the master,boolean,Yes,No,No," "
 instance_ipp_id,"Code of the healthcare center, see *hospital_instance* for more informations",Names accross all instance_id fields in other tables should match,varchar(40),No,No,No," "
 ipp_origin_code,Indicate source software for this record., ,varchar(300),No,No,No," "
-upload_id,ETL Pipeline identifier,Defined at the start of the pipeline as `datetime.now().strftime("%Y%m%d%H%M%S")`,bigint(64),No,No,No," "
-update_date,Last modification of this record., ,timestamptz,No,No,No," "
+upload_id,"Identifier of the pipeline integration run, used to differentiate each batch of integrated data.",Defined at the start of the pipeline as `datetime.now().strftime("%Y%m%d%H%M%S")`. <br> For example a batch integrated on 15/09/2025 at 00:00:00 has `upload_id = 20250915000000`.,bigint(64),No,No,No," "
+update_date,Date and time of the record’s last update., ,timestamptz,No,No,No," "

docs/csv/dwh_patient_mvt.csv

Lines changed: 9 additions & 9 deletions
@@ -1,16 +1,16 @@
 Field,User Guide,ETL Conventions,Datatype,Required,Primary Key,Foreign Key,FK Table
 mvt_num,Unique identifier, ,bigint(64),Yes,Yes,No," "
-patient_num, , ,bigint(64),`Yes`,No,Yes,dwh_patient
-stay_num,Use this field to link the movement record to its visit., ,bigint(64),`Yes`,No,Yes,dwh_patient_stay
+patient_num, , ,bigint(64),Yes,No,Yes,dwh_patient
+stay_num,Use this field to link the movement record to its visit., ,bigint(64),Yes,No,Yes,dwh_patient_stay
 
-entry_date,Admission date, ,timestamptz,`Yes`,No,No," "
+entry_date,Admission date, ,timestamptz,Yes,No,No," "
 out_date,Discharge date, ,timestamptz,No,No,No," "
 
-mvt_entry_mode,Indicates where a person was admitted from,"Values are not standardized yet. <br> Example values : *Transfert*, *Domicile*, *Mutation*",varchar(255),No,No,No," "
-mvt_exit_mode,Indicates where a person was discharged to,"Values are not standardized yet. <br> Example values : *Transfert*, *Domicile*, *Mutation*, *Décès*",varchar(300),No,No,No," "
+mvt_entry_mode,Indicates where a person was admitted from,"Values are not standardized yet. <br> Example values : *Transfert*, *Domicile*, *Mutation*",varchar(500),No,No,No," "
+mvt_exit_mode,Indicates where a person was discharged to,"Values are not standardized yet. <br> Example values : *Transfert*, *Domicile*, *Mutation*, *Décès*",varchar(500),No,No,No," "
 
-type_mvt,"Represents the kind of visit that took place (inpatient, outpatient, emergency, etc.)","**TODO** <br> *C* : Consultation<br> *J* : HDJ<br> *U* : Urgence<br> *H* : Hospitalisation<br> *A* : Ambulatoire<br> *S* : Séance<br> *AM* : Ambulatoire MCO<br> *AP* : Ambulatoire PSY",varchar(30),No,No,No," "
-mvt_order,Indicates the sequential order of patient movements within a visit.,"This derived field assigns a numeric ranking to each movement event, allowing reconstruction of the temporal flow of movements during a single visit (e.g., for visualization or analysis).",integer(32),No,No,No," "
+type_mvt,"Represents the kind of visit that took place (inpatient, outpatient, emergency, etc.)","*C* : Consultation<br> *J* : HDJ<br> *U* : Urgence<br> *H* : Hospitalisation<br> *A* : Ambulatoire<br> *S* : Séance<br> *AM* : Ambulatoire MCO<br> *AP* : Ambulatoire PSY<br> *E* : Externes",varchar(30),No,No,No," "
+mvt_order,Indicates the sequential order of patient movements within a visit.,"This derived field assigns a numeric ranking to each movement event based on ascending `mvt_entry_date`, allowing reconstruction of the temporal flow of movements during a single visit (e.g., for visualization or analysis).",integer(32),No,No,No," "
 
 department_num,Service of admission, ,bigint(64),No,No,Yes,dwh_thesaurus_department
 unit_num,Unit of admission, ,bigint(64),No,No,Yes,dwh_thesaurus_unit
@@ -22,5 +22,5 @@ id_mvt_source,Unique identifier in source software, ,varchar(300),No,No,No," "
 mvt_pid,"Optional pseudo-identifier for a movement. Mainly included for structural consistency; not required for standard analytical use.",Generated as a hash of *id_mvt_source* combined with *mvt_salt*.,varchar(300),No,No,No," "
 mvt_salt,Optional random salt used in the hash algorithm to generate *mvt_pid*., ,varchar(300),No,No,No," "
 
-upload_id,ETL Pipeline identifier,Defined at the start of the pipeline as `datetime.now().strftime("%Y%m%d%H%M%S")`,bigint(64),No,No,No," "
-update_date,Last modification of this record., ,timestamptz,No,No,No," "
+upload_id,"Identifier of the pipeline integration run, used to differentiate each batch of integrated data.",Defined at the start of the pipeline as `datetime.now().strftime("%Y%m%d%H%M%S")`. <br> For example a batch integrated on 15/09/2025 at 00:00:00 has `upload_id = 20250915000000`.,bigint(64),No,No,No," "
+update_date,Date and time of the record’s last update., ,timestamptz,No,No,No," "
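
Note on `mvt_order`: the new description ranks movements by ascending entry date within a visit. A minimal sketch of that derivation follows, assuming the ranking starts at 1 and that the sort key is the table's `entry_date` column (the description calls it `mvt_entry_date`). The function name and the dictionary layout are hypothetical, not taken from the pipeline.

```python
from datetime import datetime
from itertools import groupby
from operator import itemgetter


def assign_mvt_order(movements):
    """Return copies of the movements with a 1-based mvt_order per stay.

    Movements are sorted by stay, then by ascending entry date, and ranked
    inside each stay. Starting the rank at 1 is an assumption.
    """
    ordered = sorted(movements, key=itemgetter("stay_num", "entry_date"))
    ranked = []
    for _, stay_movements in groupby(ordered, key=itemgetter("stay_num")):
        for rank, mvt in enumerate(stay_movements, start=1):
            ranked.append({**mvt, "mvt_order": rank})
    return ranked


movements = [
    {"mvt_num": 1, "stay_num": 10, "entry_date": datetime(2025, 9, 15, 14, 0)},
    {"mvt_num": 2, "stay_num": 10, "entry_date": datetime(2025, 9, 15, 8, 0)},
    {"mvt_num": 3, "stay_num": 11, "entry_date": datetime(2025, 9, 16, 9, 0)},
]
ranked = assign_mvt_order(movements)
orders = {m["mvt_num"]: m["mvt_order"] for m in ranked}
```

In this example movement 2 starts earlier than movement 1 in the same stay, so it receives rank 1 and movement 1 receives rank 2.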

docs/csv/dwh_patient_stay.csv

Lines changed: 7 additions & 7 deletions
@@ -1,15 +1,15 @@
 Field,User Guide,ETL Conventions,Datatype,Required,Primary Key,Foreign Key,FK Table
 stay_num,Unique identifier, ,bigint(64),Yes,Yes,No," "
-encounter_num,Encounter number,*null* in CNIL compliant warehouse,varchar(300),Yes,No,No," "
-patient_num, , ,bigint(64),`Yes`,No,Yes,dwh_patient
-entry_date,Admission date, ,timestamptz,`Yes`,No,No," "
+encounter_num,"Unique identifier of the patient visit, as recorded in the source hospital information system",*null* in CNIL compliant warehouse,varchar(300),Yes,No,No," "
+patient_num, , ,bigint(64),Yes,No,Yes,dwh_patient
+entry_date,Admission date, ,timestamptz,Yes,No,No," "
 out_date,Discharge date,*out_date* should be greater or equal than *entry_date*,timestamptz,No,No,No," "
-entry_mode,Indicates where a person was admitted from,"Values are not standardized yet.<br> Example values : *Transfert*, *Domicile*, *Mutation*",varchar(400),No,No,No," "
+entry_mode,Indicates where a person was admitted from,"Values are not standardized yet.<br> Example values : *Transfert*, *Domicile*, *Mutation*",varchar(500),No,No,No," "
 out_mode,Indicates where a person was discharged to after a visit,"Values are not standardized yet.<br> Example values : *Transfert*, *Domicile*, *Mutation*, *Décès*",varchar(500),No,No,No," "
-type_dos,"Represents the kind of visit that took place (inpatient, outpatient, emergency, etc.)","*Consultation*, *HDJ*, *HAD*, *Urgence*, *Hospitalisation*, *Ambulatoire*",varchar(10),No,No,No," "
+type_dos,"Represents the kind of visit that took place (inpatient, outpatient, emergency, etc.)","*Consultation*, *HDJ*, *HAD*, *Urgence*, *Hospitalisation*, *Ambulatoire*, *Externes*",varchar(50),No,No,No," "
 instance_stay_id,"Code of the healthcare center, see *hospital_instance* for more informations", ,varchar(40),No,No,No," "
 stay_origin_code,Indicate source software for this stay, ,varchar(300),No,No,No," "
 stay_pid,Pseudo identifier of the stay. Use in correspondence table to retrieve patient identity when needed., ,varchar(300),No,No,No," "
 stay_salt,Salt for hash algorithm, ,varchar(300),No,No,No," "
-upload_id,ETL Pipeline identifier,Defined at the start of the pipeline as `datetime.now().strftime("%Y%m%d%H%M%S")`,bigint(64),No,No,No," "
-update_date,Last modification of this record., ,timestamptz,No,No,No," "
+upload_id,"Identifier of the pipeline integration run, used to differentiate each batch of integrated data.",Defined at the start of the pipeline as `datetime.now().strftime("%Y%m%d%H%M%S")`. <br> For example a batch integrated on 15/09/2025 at 00:00:00 has `upload_id = 20250915000000`.,bigint(64),No,No,No," "
+update_date,Date and time of the record’s last update., ,timestamptz,No,No,No," "

docs/csv/dwh_thesaurus_data.csv

Lines changed: 6 additions & 6 deletions
@@ -1,13 +1,13 @@
 Field,User Guide,ETL Conventions,Datatype,Required,Primary Key,Foreign Key,FK Table
 thesaurus_data_num,Unique identifier, ,bigint(64),Yes,Yes,No," "
-thesaurus_code,"Identifier of the vocabulary or coding system (e.g. *CIM10*, *CCAM*, *ATC*).", ,varchar(30),`Yes`,No,No," "
-concept_code,"The concept code represents the identifier of the concept in the source vocabulary. Note that concept codes are not unique across vocabularies.", ,varchar(100),`Yes`,No,No," "
-concept_str,Human-readable name or label of the concept., ,varchar(2000),`Yes`,No,No," "
+thesaurus_code,"Identifier of the vocabulary or coding system (e.g. *CIM10*, *CCAM*, *ATC*).", ,varchar(30),Yes,No,No," "
+concept_code,"The concept code represents the identifier of the concept in the source vocabulary. Note that concept codes are not unique across vocabularies.", ,varchar(100),Yes,No,No," "
+concept_str,Human-readable name or label of the concept., ,varchar(2000),Yes,No,No," "
 description,Extended description or definition of the concept., ,varchar(4000),No,No,No," "
 measuring_unit,"Measurement unit for quantitative concepts (e.g., mg, mmHg).", ,varchar(50),No,No,No," "
-value_type,Type of value expected,"*numeric* : quantitative value <br> *text*: free-text value <br> *present* : presence/absence indicator <br> *liste* : enumerated values",varchar(50),`Yes`,No,No," "
+value_type,Type of value expected,"*numeric* : quantitative value <br> *text*: free-text value <br> *present* : presence/absence indicator <br> *liste* : enumerated values",varchar(50),Yes,No,No," "
 list_values,List of values when type is *liste* else *null*,"Values should be split by *;* (e.g. *principal;associé;relié*"),varchar(4000),No,No,No," "
 thesaurus_parent_num,Parent concept in the hierarchy., ,bigint(64),No,No,Yes,dwh_thesaurus_data
 count_data_used,Count of *dwh_data* associated with this concept, ,integer(32),No,No,No," "
-updated_date,Last modification of this record., ,date,Yes,No,No," "
-upload_id,ETL Pipeline identifier,Defined at the start of the pipeline as `datetime.now().strftime("%Y%m%d%H%M%S")`,bigint(64),No,No,No," "
+update_date,Date and time of the record’s last update., ,timestamptz,No,No,No," "
+upload_id,"Identifier of the pipeline integration run, used to differentiate each batch of integrated data.",Defined at the start of the pipeline as `datetime.now().strftime("%Y%m%d%H%M%S")`. <br> For example a batch integrated on 15/09/2025 at 00:00:00 has `upload_id = 20250915000000`.,bigint(64),No,No,No," "
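
Note on `list_values`: the field holds `;`-separated values when `value_type` is *liste* and is *null* otherwise. A consumer could read it roughly as sketched below. The function name is hypothetical, and trimming whitespace and dropping empty items are assumptions rather than documented behaviour.

```python
from typing import Optional


def parse_list_values(value_type: str, list_values: Optional[str]) -> list[str]:
    """Split list_values on ';' when the concept is of type 'liste'.

    Returns an empty list for any other value_type or when the field is null.
    """
    if value_type != "liste" or not list_values:
        return []
    # Assumption: surrounding whitespace and empty items are not meaningful.
    return [item.strip() for item in list_values.split(";") if item.strip()]


# The example given in the ETL Conventions column.
choices = parse_list_values("liste", "principal;associé;relié")
```

With the documented example this yields the three values *principal*, *associé* and *relié*.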
