Rectified file with new data by jittymolmathew92 · Pull Request #4 · D-Jeffrey/gedcom-samples

jittymolmathew92 · 2026-05-01T16:07:14Z

On behalf of GrampsWeb, I researched the existing Queens.ged file, identified several issues, and updated the file according to that analysis. The cleaned file removes all unformatted date data, persons without father and mother data, special characters, empty notes, etc. A detailed report is attached for your reference, and the updated, cleaned file is included in this PR.

Queen_clean_import_report.md

jittymolmathew92 · 2026-05-01T16:19:33Z

@D-Jeffrey Could you please verify the newly cleaned Queen.ged file? It was a different and exciting experience for me, and I am waiting for your verification and comments on it.

D-Jeffrey · 2026-05-03T19:15:44Z

Reviewing. The analysis is interesting and valid.

Any project that imports GED files should be able to handle hand-crafted records—data exported from other systems or maintained by a family member using their own tracking method. A key part of testing is whether the importer fails gracefully and how it reports or handles those failures. I know that GrampsWeb created Notes as part of the import. I’m also aware of the parent issue in this file, and I’ve seen other files with special-character problems as well. My main question is: are you aiming to support real-world test data, or only “clean”/perfect data, in GrampsWeb? The Queen dataset isn’t a great example for demonstrating lineage or family relationships because it includes many unique, disconnected individuals.

For my purposes, a less-than-perfect GED file is preferred. When such records are loaded, how does GrampsWeb respond? Does the user need to correct everything and produce a perfect file before the import will work? For example, I’ve had exports from MyHeritage that incorrectly included HTML tags in the NOTE/CONC fields. My goal was to stress-test my project with complex, problematic data. As I work through these issues, I see several ways we could help users successfully bring their data into the application.

I'm unsure whether to move towards perfect data or hold onto the glitchy data, which is how I found it on the Internet.

DavidMStraub · 2026-05-04T07:13:51Z

Hi @D-Jeffrey, let me provide some more background. In the process of experimenting with new example trees for Gramps (and Gramps Web - it's the same for desktop and web), @jittymolmathew92 imported the Queen.ged file. Gramps (again, same import code for web and desktop) can import it, but lots of dates are recognized as "text only", which makes them readable but not sortable etc.

This PR contains a file that's normalized to GEDCOM 5.5 standard (I know there isn't really a standard 😉) to avoid such problems. Whether this file is useful for this repo or not I'm not sure, totally up to you. For Gramps, it's just the starting point, a lot more needs to be added manually to make it a useful example database (sources, media, etc.).

D-Jeffrey · 2026-05-05T04:54:15Z

I hear what the request is. If you have a tool which does the work to automatically correct the GED file and produce the markdown, that is really interesting (and needs more testing). If you want to take a copy of the queen.ged or other files, that is fine. I did not create most of them, so I look for no credit. I'm not trying to be difficult. I do think copying and recopying records should be done carefully, especially if others are going to access them and use them as their information source. In my own private tree building, I drop brothers and sisters who are not in the generational lineage. I have seen others erroneously merge families together, and it is a nightmare to unwind. When a genealogy line works, it is a great feeling; when the dates or names get merged or miswritten, that great feeling disappears.

My version of the file has 8 fewer lines than the other sources of it on the Internet, here https://duncan.familygenes.ca/tng/members_data/0033ab/gedcom/Queen_Eliz_II.ged and here https://kingscoronation.com/queen-elizabeth-ii-gedcom-download/, because that was the only way I could get my input to work. The other examples in my collection were from SourceForge https://sourceforge.net/projects/godskingsheroes/ which seems to take them from https://famousfamilytrees.blogspot.com/. And I suspect those were assembled from hand-me-down sources.

The PR is for 1957 additions & 14545 deletions. The MD log was very informative on many details and understated others.

The Markdown states in point 4, "Vendor tags retained (not removed)", which is not true. All 2 RIN lines were removed. I agree they add no value, but that is not what was in the log.
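A claim like "vendor tags retained" can be checked mechanically by counting tags per level in the original and the cleaned file and listing what shrank. The following is a minimal sketch, not the tool used for this PR; the function names `tag_counts` and `removed_tags` are hypothetical, and it assumes each line has the plain `LEVEL [@XREF@] TAG [value]` shape.

```python
from collections import Counter


def tag_counts(gedcom_text: str) -> Counter:
    """Count (level, tag) pairs, e.g. (2, "RIN"), in GEDCOM text."""
    counts = Counter()
    for line in gedcom_text.splitlines():
        parts = line.strip().split(maxsplit=3)
        if len(parts) < 2 or not parts[0].isdigit():
            continue  # skip blank or malformed lines instead of failing
        level = int(parts[0])
        # Record lines look like "0 @I206@ INDI": the tag follows the xref id.
        if parts[1].startswith("@") and len(parts) >= 3:
            tag = parts[2]
        else:
            tag = parts[1]
        counts[(level, tag)] += 1
    return counts


def removed_tags(original: str, cleaned: str) -> dict:
    """Map (level, tag) -> (count before, count after) for tags that shrank."""
    before, after = tag_counts(original), tag_counts(cleaned)
    return {
        key: (n, after.get(key, 0))
        for key, n in before.items()
        if after.get(key, 0) < n
    }
```

Running `removed_tags` on the two file contents and comparing the result against each numbered point in the report would show quickly where the log and the diff disagree.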

I'm not a fan of fixing the dates in an arbitrary way:
2 DATE 1030 or 36 -> 2 DATE 1030
or
2 DATE abt. 1066 or 1094 -> 2 DATE ABT 1066

Those kinds of records could be properly corrected with research, but just picking the first date is not something that should be encouraged.

or
2 DATE 935/950 -> 2 DATE (which I assume is not standard)
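An alternative to rewriting such values is to flag them for a human to research. The sketch below reports DATE lines that do not match a deliberately simplified pattern (optional qualifier, optional day and month, a year, plus BET/AND and FROM/TO ranges). The pattern is an assumption for illustration and is narrower than the full GEDCOM date grammar; `flag_dates` is a hypothetical name.

```python
import re

MONTH = r"(?:JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC)"
# A simple date: optional day, optional month, then a 1-4 digit year.
SIMPLE = rf"(?:(?:\d{{1,2}} )?{MONTH} )?\d{{1,4}}"
PATTERNS = [
    re.compile(rf"^(?:(?:ABT|CAL|EST|BEF|AFT) )?{SIMPLE}$"),
    re.compile(rf"^BET {SIMPLE} AND {SIMPLE}$"),
    re.compile(rf"^FROM {SIMPLE}(?: TO {SIMPLE})?$"),
    re.compile(rf"^TO {SIMPLE}$"),
]
DATE_LINE = re.compile(r"^\s*\d+ DATE(?: (.*))?$")


def flag_dates(gedcom_text: str) -> list:
    """Return (line number, value) for DATE lines outside the simplified pattern.

    Nothing is modified; the values are only reported for manual review.
    """
    flagged = []
    for number, line in enumerate(gedcom_text.splitlines(), start=1):
        m = DATE_LINE.match(line)
        if not m:
            continue
        value = (m.group(1) or "").strip()
        if not any(p.match(value) for p in PATTERNS):
            flagged.append((number, value))
    return flagged
```

With this approach `1030 or 36`, `abt. 1066 or 1094`, `935/950` and an empty `2 DATE` would all be listed for review, while `705` and `ABT 1066` pass untouched.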

Whole records were removed:

0 @I206@ INDI
1 RIN MH:I206
1 _UID 22FA49DC-F510-4728-A5D1-8BC383379898
1 _UPD 17 MAR 2013 16:54:57 GMT+9.5
1 SEX F
1 BIRT
2 _UID D43205FC-84DF-4820-85A6-F63CFA051894
2 RIN MH:IF1454
2 DATE 705
1 DEAT
2 _UID 5B869AD0-A572-4754-A2C7-BF3448A2D442
2 RIN MH:IF1455
2 DATE 770
1 FAMS @F94@
1 NOTE Birthdate: 	705
2 CONT <p>Birthplace: 	Oppland, Norway</p>
2 CONT <p>Death: 	Died 770 in Norway</p>

0 @I208@ INDI
1 RIN MH:I208
1 _UID A3DD36C9-672F-4765-9C3D-3DDB109B2FA0
1 _UPD 17 MAR 2013 16:56:25 GMT+9.5
1 SEX F
1 BIRT
2 _UID 5515CBB1-0464-479D-90A6-B8A59504F5A9
2 RIN MH:IF1458
2 DATE 605
1 DEAT
2 _UID D3384CFD-4346-44B8-8D55-9726ACC8C1E5
2 RIN MH:IF1459
1 FAMS @F95@
1 NOTE Nicknames: 	"Svidri Heitson's Wife", "Svidre Heitsons kone"
2 CONT <p>Birthdate: 	circa 605</p>
2 CONT <p>Birthplace: 	Of, , , Norway</p>
2 CONT <p>Death: 	(Date and location )</p>
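The `<p>` residue in the NOTE/CONT lines above is the kind of thing an importer could clean at read time while leaving the source file as found. This is a minimal sketch under that assumption; `strip_html_from_notes` is a hypothetical name, and it works line by line, so a tag split across a CONC boundary would not be caught.

```python
import html
import re

# Matches opening and closing tags such as <p> or </p>.
TAG = re.compile(r"</?[A-Za-z][^<>]*>")
# Captures the "LEVEL TAG " prefix and the text of NOTE/CONT/CONC lines.
NOTE_LINE = re.compile(r"^(\s*\d+ (?:NOTE|CONT|CONC) )(.*)$")


def strip_html_from_notes(gedcom_text: str) -> str:
    """Return a copy of the text with HTML tags removed from note lines only."""
    out = []
    for line in gedcom_text.splitlines():
        m = NOTE_LINE.match(line)
        if m:
            # Remove tags first, then decode entities such as &amp;.
            text = html.unescape(TAG.sub("", m.group(2)))
            line = m.group(1) + text
        out.append(line)
    return "\n".join(out)
```

All other lines, including the DATE and `_UID` lines, pass through unchanged, so the original records stay intact for anyone who wants the glitchy data.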

My decision will be to keep the files as I found them, so that other developers have example files which are not clean and not test-suite-checkbox ready.

As I said, if you want to take the files and clean them in your repo, you are more than welcome to do so, but they will not be representative of what users may be struggling with when using their own data.

Creating a tool to help, using good practices (if that is what you have), or teaching users how to correct situations (not files this large), may be the opportunity.

DavidMStraub · 2026-05-05T06:08:20Z

I actually agree it wouldn't make sense to replace the original file. I just recommended sharing the cleaned-up file in case it is useful to anyone, but a different repo might be a better place to do so.

D-Jeffrey

Remove this update to keep the original file

D-Jeffrey

Keep the addition of Queen_clean.ged (8bd4fc5), but remove the other two (09bfe56 and 021fdae)

D-Jeffrey · 2026-05-09T13:48:41Z

@jittymolmathew92 I'm looking for a cherry-pick of 8bd4fc5, or a drop of 09bfe56 and 021fdae.

If you make those changes, then you and @DavidMStraub will get queen_clean.ged and everyone will be happy.

jittymolmathew92 added 3 commits May 1, 2026 17:21

Rectified file with new data.

8bd4fc5

Updated version of queen.ged

021fdae

Removed same file with different name

09bfe56

D-Jeffrey marked this pull request as ready for review May 3, 2026 19:15

D-Jeffrey marked this pull request as draft May 3, 2026 19:18

D-Jeffrey requested review from D-Jeffrey May 3, 2026 19:19

D-Jeffrey marked this pull request as ready for review May 8, 2026 02:45

D-Jeffrey requested changes May 8, 2026

jittymolmathew92 wants to merge 3 commits into D-Jeffrey:main from jittymolmathew92:main

3 participants
