Skip to content

F/2408 add proxied csv distributions #15

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Feb 22, 2022

Conversation

sonofflynn89
Copy link
Collaborator

Part of https://devtopia.esri.com/dc/hub/issues/2408

The focus of this pr is to add the csv distribution to proxied csv datasets.

While working, I discovered that csv distribution urls for proxied csvs require slugs. Using an id instead of a slug will get a 4xx error due to the server architecture.

To get around this, I refactored the plugin to use the getContentSiteUrls, just like the dcat-us 1.1 plugin.

- Use req.hostname as the base for url properties, not siteModel.item
- Use hub-common's getContentSiteUrls when creating the landingPage property
- Add the csv distribution when the dataset is a proxied csv
@@ -11,6 +11,7 @@
"url": "git@github.com:koopjs/koop-output-dcat-ap-201.git"
},
"devDependencies": {
"@esri/arcgis-rest-feature-layer": "^3.2.1",
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@drspacemanphd again, this is to fix the same problem we're seeing in dcat-us 1.1

@@ -1,4 +1,4 @@
{
"port": "8080",
"arcgisPortal": "https://www.arcgis.com"
"arcgisPortal": "https://qaext.arcgis.com"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you mean to leave this change in here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, nice catch!

Comment on lines +113 to +115
// TODO: We only pass in hostname because some site item urls are out of sync, causing invalid urls for
// landingPage and identifier. If we can resolve the syncing issues, we can omit hostname and just use
// the absolute url we get from getContentSiteUrls()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting

attributes: hubDataset
} as DatasetResource);
const { relative: relativePath } = getContentSiteUrls(content, siteModel);
const landingPage = siteUrl.startsWith('https://') ? siteUrl + relativePath : `https://${siteUrl}${relativePath}`;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the scheme-checking logic?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@andrewctate unfortunately this was the result of a bug from a customer's site. Their site item's URL was wacky and out of sync so Mark had to make this fix in dcat-us. Since we are changing dcat-ap to use slugs like dcat-us, it made sense to port the fix over to this repo in one go.

@@ -33,9 +37,16 @@ export const defaultCalculatedFields = [
'provenance'
];

export function getDcatDataset(hubDataset: any, orgBaseUrl: string, orgTitle: string, siteUrl: string) {
export function getDcatDataset(hubDataset: any, orgBaseUrl: string, orgTitle: string, siteUrl: string, siteModel: IModel) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's fine for now, but just note that 5 function parameters is pretty unwieldy.

Copy link
Collaborator Author

@sonofflynn89 sonofflynn89 Feb 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know, it's awful. I'm hoping we can clean things up when we try and unify logic between the two plugins.


// Required fields from the API
export const defaultRequiredFields = [
'id',
'access', // needed for proxied csv's
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you don't supply a UserSession, which we don't (for now), all results back are public.

Copy link
Collaborator Author

@sonofflynn89 sonofflynn89 Feb 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As expected! The isProxiedCSV helper reads item.access as part of its flow, so I figured it would be cleaner to include it back as part of the api response instead of stabbing in on right before the helper.


// Required fields from the API
export const defaultRequiredFields = [
'id',
'access', // needed for proxied csv's
'size', // needed for proxied csv's
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just as a heads up, the Portal Search API does NOT return size with search results, last time I checked. This technically is a Hub enrichment.

@sonofflynn89 sonofflynn89 merged commit eafbb63 into main Feb 22, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants