Skip to content

Fix #454: Add end-to-end tests for the plugin-related SoftwareMetadata API#456

Open
sdruskat wants to merge 69 commits intorefactor/423-implement-public-apifrom
feature/454-e2e-test-plugin-api
Open

Fix #454: Add end-to-end tests for the plugin-related SoftwareMetadata API#456
sdruskat wants to merge 69 commits intorefactor/423-implement-public-apifrom
feature/454-e2e-test-plugin-api

Conversation

@sdruskat
Copy link
Copy Markdown
Contributor

No description provided.

@sdruskat sdruskat changed the base branch from develop to refactor/data-model November 21, 2025 10:12
@sdruskat sdruskat changed the base branch from refactor/data-model to refactor/423-implement-public-api November 21, 2025 10:17
@sdruskat sdruskat closed this Nov 21, 2025
@sdruskat sdruskat force-pushed the feature/454-e2e-test-plugin-api branch from f45d9b4 to 9be8041 Compare November 21, 2025 10:23
@sdruskat sdruskat deleted the feature/454-e2e-test-plugin-api branch November 21, 2025 10:24
@sdruskat sdruskat restored the feature/454-e2e-test-plugin-api branch November 21, 2025 10:24
@sdruskat sdruskat reopened this Nov 21, 2025
@sdruskat
Copy link
Copy Markdown
Contributor Author

Re-opening after rebase on correct branch.

@SKernchen
Copy link
Copy Markdown
Contributor

SKernchen commented Mar 20, 2026

Findings from user test (tested in poetry environment on command line):
process step:

  • process step fails when one harvester failed in the harvest step
    (maybe process should not look into toml?)

deposit step:
error:

  • 'ld_list' object has no attribute 'startswith'

  • for file add log.info where to find file would be good

@notactuallyfinn
Copy link
Copy Markdown
Collaborator

Could you please provide more information on this error?

'ld_list' object has no attribute 'startswith'

@SKernchen
Copy link
Copy Markdown
Contributor

Could you please provide more information on this error?

'ld_list' object has no attribute 'startswith'

That is the only information provided for this error:
1774021073.908437: hermes.cli: ERROR | An error occurred during execution of deposit (Find details in './hermes.log') 1774021073.910363: hermes.cli: DEBUG | Original exception was: 'ld_list' object has no attribute 'startswith'
I do not know where it is coming from. Everything with startswith seemed fine to me 🤷‍♀️

@notactuallyfinn
Copy link
Copy Markdown
Collaborator

notactuallyfinn commented Mar 23, 2026

Okay, I think I found the error and hopefully fixed it. Please let me know if the commit didn't fix it.
(Can't verify, I never ran into that error.)

@SKernchen
Copy link
Copy Markdown
Contributor

Yes, this error is gone. Now I'm getting
1774263836.359856: hermes.cli: DEBUG | Original exception was: The given license in CodeMeta must be of type str. Licenses of type 'CreativeWork' are not supported.
Though the license is correct.

@SKernchen
Copy link
Copy Markdown
Contributor

Yes, this error is gone. Now I'm getting 1774263836.359856: hermes.cli: DEBUG | Original exception was: The given license in CodeMeta must be of type str. Licenses of type 'CreativeWork' are not supported. Though the license is correct.

Ok so this error comes from invenio.py line 216:

        if not isinstance(license_url, str):
            raise RuntimeError(
                "The given license in CodeMeta must be of type str. "
                "Licenses of type 'CreativeWork' are not supported."
            )

where license_url is in my case {"@id": "{license}"}, so this error triggers

@SKernchen
Copy link
Copy Markdown
Contributor

SKernchen commented Mar 23, 2026

After giving the correct input (using license_url["@id]) it worked, but if your case has different input, we should find the source for this problem.

@notactuallyfinn
Copy link
Copy Markdown
Collaborator

notactuallyfinn commented Mar 23, 2026

According to schema the values for the license property are URLs or CreativeWork objects.
You passed the URL as {"@id": your_url} which is not a value for URLs (that should be {"@value": your_url}, (shouldn't it?), which then gets replaced by your_url by the ld_container). Therefore {"@id": your_url} is handled as a value of type CreativeWork, which is rejected because the conversion is non-trivial.
That is the reason for the new error.
I'll add an exemption for objects of this format {"@id": your_url} for now.
(I'm aware that in tests the format {"@id": your_url} is required. That should/ will be changed.)

@SKernchen
Copy link
Copy Markdown
Contributor

Yeah but why is it passed like that, is the real question right.

"http://schema.org/license": [{"@id": "https://spdx.org/licenses/Apache-2.0"}]

is it in the expanded.json from the curate step

@notactuallyfinn
Copy link
Copy Markdown
Collaborator

notactuallyfinn commented Mar 23, 2026

I don't know why this was (and still is) a requirement for the harvest and process steps.
It worked previously because {"@id": value} was returned by ld_dicts as value, which is not good (the information that this is an id is lost). This was changed so that now an ld_dict with this data_dict ({"@id": value}) is returned.
But that leads to the problem that {"@id": your_url} is no longer returned as a string leading to the error you saw.
I think the options now are to either always try if the id of a CreativeWork object is a valid url for a license or just disallow the use of CreativeWork objects and expect the license to be passed as a value of a URL (i.e. {"@value": your_url}).

@led02
Copy link
Copy Markdown
Member

led02 commented Mar 23, 2026

It worked previously because {"@id": value} was returned by ld_dicts as value, which is not good (the information that this is an id is lost).

This information about the license being an ID is actually contained in the codemeta context: https://github.com/codemeta/codemeta/blob/13bb230d250775026691a102a24b97dea7669bf6/codemeta.jsonld#L44

@SKernchen
Copy link
Copy Markdown
Contributor

Nice now it works:
More points I found:

  • With the postprocess plugin (invenio_rdm) in hermes.toml removes license header and sorts postprocess above deposit (and add "," to every list)
  • harvest: breaks if one plugin is not found and don't run the others (if broken plugin is first)
  • process broke with 'str' object has no attribute 'append' with a codemeta with conflicts in it
  • deposit without --initial broke with Error while executing invenio_rdm: Found two different record ids or conflicting metadata.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants