Duplicate extracted licenses when parsing from RDF #97

Closed
@xavierfigueroav

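The field Document.extracted_licenses contains duplicate ExtractedLicense objects when a document is parsed from an RDF file.

This can be reproduced by running parse_rdf.py. In the output below, each of the four LicenseRef identifiers appears twice under "Document Extracted licenses".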
Input: SPDXRdfExample.rdf
Output:

doc comment: This is a sample spreadsheet
Creators:
        Person: Gary O'Neall
        Tool: SourceAuditor-V1.2
        Organization: Source Auditor Inc.
Document review information:
        Reviewer: Person: Suzanne Reviewer
        Date: 2011-03-13 00:00:00
        Comment: Another example reviewer.
        Reviewer: Person: Joe Reviewer
        Date: 2010-02-10 00:00:00
        Comment: This is just an example.  Some of the non-standard licenses look like they are actually BSD 3 clause licenses
Creation comment: This is an example of an SPDX spreadsheet format
Package Name: SPDX Translator
Package Version: Version 0.9.2
Package Download Location: http://www.spdx.org/tools
Package Homepage: None
Package Checksum: 2fd4e1c67a2d28fced849ee1bb76e7391b93eb12
Package verification code: 4e3211c67a2d28fced849ee1bb76e7391b93feba
Package excluded from verif: SpdxTranslatorSpdx.txt,SpdxTranslatorSpdx.rdf
Package license concluded: LicenseRef-4 AND LicenseRef-2 AND Apache-1.0 AND LicenseRef-3 AND LicenseRef-1 AND Apache-2.0 AND MPL-1.1
Package license declared: MPL-1.1 AND Apache-2.0 AND LicenseRef-3 AND LicenseRef-2 AND LicenseRef-4 AND LicenseRef-1
Package licenses from files:
        LicenseRef-1
        LicenseRef-3
        Apache-1.0
        MPL-1.1
        LicenseRef-4
        LicenseRef-2
        Apache-2.0
Package Copyright text:  Copyright 2010, 2011 Source Auditor Inc.
Package summary: SPDX Translator utility
Package description: This utility translates and SPDX RDF XML document to a spreadsheet, translates a spreadsheet to an SPDX RDF XML document and translates an SPDX RDFa document to an SPDX RDF XML document.
Package Files:
        File name: Jenna-2.6.3/jena-2.6.3-sources.jar
        File type: ARCHIVE
        File Checksum: 3ab4e1c67a2d28fced849ee1bb76e7391b93f125
        File license concluded: LicenseRef-1
        File license info in file: LicenseRef-1
        File artifact of project name: Jena
        File name: src/org/spdx/parser/DOAPProject.java
        File type: SOURCE
        File Checksum: 2fd4e1c67a2d28fced849ee1bb76e7391b93eb12
        File license concluded: Apache-2.0
        File license info in file: Apache-2.0
        File artifact of project name:
Document Extracted licenses:
        Identifier: LicenseRef-4
        Name: None
        Identifier: LicenseRef-2
        Name: None
        Identifier: LicenseRef-3
        Name: CyberNeko License
        Identifier: LicenseRef-1
        Name: None
        Identifier: LicenseRef-3
        Name: CyberNeko License
        Identifier: LicenseRef-2
        Name: None
        Identifier: LicenseRef-4
        Name: None
        Identifier: LicenseRef-1
        Name: None
Annotations:
        Annotator: Person: Jim Reviewer
        Annotation Date: 2012-06-13 00:00:00
        Annotation Comment: This is just an example. Some of the non-standard licenses look like they are actually BSD 3 clause licenses
        Annotation Type: REVIEW
        Annotation SPDX Identifier: https://spdx.org/spdxdocs/spdx-example-444504E0-4F89-41D3-9A0C-0305E82C3301#SPDXRef-45
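
The duplication in the output above can be checked for, and worked around on the caller's side, by grouping the extracted licenses by identifier. The following is only a minimal sketch: `ExtractedLicenseStub`, `duplicated_identifiers` and `dedupe_by_identifier` are hypothetical names made up for illustration and are not part of the library, and the sample list simply mirrors the identifiers and names in the order they are printed above.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class ExtractedLicenseStub:
    """Hypothetical stand-in for ExtractedLicense (assumption: only the
    two fields printed in the output above are modelled)."""
    identifier: str
    name: Optional[str] = None


def duplicated_identifiers(licenses: Iterable[ExtractedLicenseStub]) -> List[str]:
    """Return the identifiers that occur more than once, sorted."""
    counts = Counter(lic.identifier for lic in licenses)
    return sorted(ident for ident, n in counts.items() if n > 1)


def dedupe_by_identifier(licenses: Iterable[ExtractedLicenseStub]) -> List[ExtractedLicenseStub]:
    """Keep the first object seen for each identifier, preserving order."""
    seen = set()
    unique = []
    for lic in licenses:
        if lic.identifier not in seen:
            seen.add(lic.identifier)
            unique.append(lic)
    return unique


# Same identifiers and names, in the same order, as the reported output.
parsed = [
    ExtractedLicenseStub("LicenseRef-4"),
    ExtractedLicenseStub("LicenseRef-2"),
    ExtractedLicenseStub("LicenseRef-3", "CyberNeko License"),
    ExtractedLicenseStub("LicenseRef-1"),
    ExtractedLicenseStub("LicenseRef-3", "CyberNeko License"),
    ExtractedLicenseStub("LicenseRef-2"),
    ExtractedLicenseStub("LicenseRef-4"),
    ExtractedLicenseStub("LicenseRef-1"),
]

if __name__ == "__main__":
    print("Duplicated:", duplicated_identifiers(parsed))
    for lic in dedupe_by_identifier(parsed):
        print(lic.identifier, lic.name)
```

This only hides the symptom on the caller's side. It does not explain why the parser adds each extracted license twice, which would still need to be fixed in the RDF parser itself.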
