Skip to content

Conversation

@AleDore
Copy link
Contributor

@AleDore AleDore commented Jul 9, 2020

Municipalities porting and refactor

@pagopa-github-bot
Copy link
Contributor

pagopa-github-bot commented Jul 9, 2020

Affected stories

  • 🌟 #171683039: Come APP voglio poter accedere al dettaglio delle municipalità

New dependencies added: csv-parse, node-fetch and @types/node-fetch.

csv-parse

Author: David Worms

Description: CSV parsing implementing the Node.js `stream.Transform` API

Homepage: https://csv.js.org/parse/

Createdover 6 years ago
Last Updated2 days ago
LicenseMIT
Maintainers1
Releases99
Direct Dependencies
Keywordscsv, parse, parser, convert, tsv and stream
README

CSV Parser for Node.js

Build Status

Part of the CSV module, this project is a parser converting CSV text input into arrays or objects. It implements the Node.js stream.Transform API. It also provides a simple callback-based API for convenience. It is both extremely easy to use and powerful. It was first released in 2010 and is used against big data sets by a large community.

Documentation

Features

  • Follow the Node.js streaming API
  • Simplicity with the optional callback API
  • Support delimiters, quotes, escape characters and comments
  • Line breaks discovery
  • Support big datasets
  • Complete test coverage and samples for inspiration
  • No external dependencies
  • Work nicely with the csv-generate, stream-transform and csv-stringify packages
  • MIT License

node-fetch

Author: David Frank

Description: A light-weight module that brings window.fetch to node.js

Homepage: https://github.com/bitinn/node-fetch

Createdover 5 years ago
Last Updatedabout 1 month ago
LicenseMIT
Maintainers4
Releases58
Direct Dependencies
Keywordsfetch, http and promise

@types/node-fetch

Author: Unknown

Description: TypeScript definitions for node-fetch

Homepage: http://npmjs.com/package/@types/node-fetch

Createdalmost 4 years ago
Last Updated3 months ago
LicenseMIT
Maintainers1
Releases33
Direct Dependencies@types/node and form-data
README

Installation

npm install --save @types/node-fetch

Summary

This package contains type definitions for node-fetch (https://github.com/bitinn/node-fetch).

Details

Files were exported from https://github.com/DefinitelyTyped/DefinitelyTyped/tree/master/types/node-fetch.

Additional Details

Credits

These definitions were written by Torsten Werner, Niklas Lindgren, Vinay Bedre, Antonio Román, Andrew Leedham, Jason Li, Brandon Wilson, Steve Faulkner, ExE Boss, Alex Savin, Alexis Tyler, and Jakub Kisielewski.

Generated by 🚫 dangerJS

@codecov-commenter
Copy link

codecov-commenter commented Jul 9, 2020

Codecov Report

Merging #59 into master will decrease coverage by 1.31%.
The diff coverage is 89.01%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #59      +/-   ##
==========================================
- Coverage   86.15%   84.83%   -1.32%     
==========================================
  Files          25       31       +6     
  Lines         816     1108     +292     
  Branches       50       64      +14     
==========================================
+ Hits          703      940     +237     
- Misses        112      167      +55     
  Partials        1        1              
Impacted Files Coverage Δ
SendUserDataDownloadMessageActivity/handler.ts 29.72% <ø> (ø)
utils/file.ts 84.00% <84.00%> (ø)
...eleteOrchestrator/GetUserDataProcessing/handler.ts 88.46% <88.46%> (ø)
ExtractUserDataActivity/handler.ts 80.09% <100.00%> (+2.62%) ⬆️
utils/conversions.ts 88.63% <100.00%> (ø)
UserDataDownloadOrchestrator/handler.ts 92.59% <0.00%> (ø)
utils/appinsights.ts 84.21% <0.00%> (ø)
UserDataProcessingTrigger/index.ts 100.00% <0.00%> (ø)
... and 2 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update cdae189...5b65a46. Read the comment docs.

package.json Outdated
"io-ts": "1.8.5",
"italia-ts-commons": "^8.1.0",
"randomstring": "^1.1.5",
"request": "^2.88.2",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

may we use node-fetch instead of request ? it comes from io-functions-commons so we can avoid adding a new dep (moreover, request package is obsolete)

utils/file.ts Outdated
() => getBlobAsText(blobService, containerName, blobName).then(e => e),
toError
).foldTaskEither(
err => fromEither(left(err)),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
err => fromEither(left(err)),
fromLeft,

tryCatch(
() => getBlobAsText(blobService, containerName, blobName).then(e => e),
toError
).foldTaskEither(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why we need this call to foldTaskEither here? it looks like it's mapping identities

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getBlobAsText returns Promise<Either<Error, BlobResult>> so I've used the same code to turn it into a TaskEither -> see

const fromPromiseEither = <L, R>(

utils/file.ts Outdated
).then(e => e),
toError
).foldTaskEither(
err => fromEither(left(err)),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as previous

_.foldL(
() => fromLeft(new Error("Municipalities with catastale is empty")),
__ =>
parseCsv(__, optionMunicipalitiesWithCatastale).foldTaskEither(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

)
.chain(rawFile =>
rawFile.foldL(
() => fromEither(leftE(new Error("abolished municipalities is empty"))),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fromEither(left) === fromLeft()

.chain(result =>
array.sequence(taskEither)(
result
.filter(r => r[9] !== "")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a comment here on what's happening would be useful or better you can isolate this filter into its own method with a meaningful name

),
municipality =>
serializeMunicipalityToJson(blobService, {
codiceCatastale: r[9],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe an object with a map between indexex (integers) and column names (constants) can be useful here

Comment on lines +93 to +100
if (record.length < 13) {
return leftE([
{
context: [],
value: "record has not the right length"
}
]);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this? I think just the validation below is enough (it would return a well-formed validation error indicating that some values are undefined).
So you can keep all the parsing logic in the Municipality decoder

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that this check is trying to avoid column shifting problems, because if a record length is larger than expected, we can assign a wrong value to a property. Moreover we use a positional notation to extract values from csv files.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if a record length is larger than expected

Actually the check is for records shorter than expected, isn't it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but the logic is the same, maybe we can turn this check into record.length !== <expected_record_length>

__ =>
parseCsv(__, optionMunicipalitiesWithCatastale).foldTaskEither(
e => fromEither(leftE(e)),
matrixString => fromEither(rightE(matrixString))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
matrixString => fromEither(rightE(matrixString))
matrixString => taskEither.of(matrixString)

() => fromLeft(new Error("Municipalities with catastale is empty")),
__ =>
parseCsv(__, optionMunicipalitiesWithCatastale).foldTaskEither(
e => fromEither(leftE(e)),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
e => fromEither(leftE(e)),
e => fromLeft(e),

Comment on lines 168 to 182
.foldTaskEither<Error, Map<string, string>>(
err => fromLeft(err),
municipalitiesCatastaleRows =>
fromEither(
rightE(
municipalitiesCatastaleRows.reduce(
(map: Map<string, string>, row) => {
map.set(row[1].toLowerCase(), row[0]);
return map;
},
new Map<string, string>()
)
)
)
);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is foldTaskEither needed? as you are not lifting left type, I think you can use map

    .map(
      municipalitiesCatastaleRows =>
          municipalitiesCatastaleRows.reduce(
            (map: Map<string, string>, row) => {
              map.set(row[1].toLowerCase(), row[0]);
              return map;
            },
            new Map<string, string>()
          )
       );

(to be verified)

Copy link
Contributor Author

@AleDore AleDore Jul 10, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I' ve changed it into a chain

);

const fromAbolishedMunicipalityToSerializableMunicipality = (
abolishedMunicipality: t.TypeOf<typeof AbolishedMunicipality>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
abolishedMunicipality: t.TypeOf<typeof AbolishedMunicipality>,
abolishedMunicipality: AbolishedMunicipality,

Comment on lines 188 to 199
return {
codiceCatastale,
municipality: {
codiceProvincia: "",
codiceRegione: "",
denominazione: abolishedMunicipality.comune,
denominazioneInItaliano: abolishedMunicipality.comune,
denominazioneRegione: "",
siglaProvincia: abolishedMunicipality.provincia
}
} as ISerializableMunicipality;
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm afraid of this cast. The object isn't recognised as ISerializableMunicipality because we're not sure municipality is a Municipality.

Consider decode the municipality field value:

   municipality: Municipality.decode({
      codiceProvincia: "",
      codiceRegione: "",
      denominazione: abolishedMunicipality.comune,
      denominazioneInItaliano: abolishedMunicipality.comune,
      denominazioneRegione: "",
      siglaProvincia: abolishedMunicipality.provincia
    }).getOrElseL(_ => { throw new Error(/* err message */); }

Or even, why not make ISerializableMunicipality a decoder itself?

)
.chain(rawFile =>
rawFile.foldL(
() => fromEither(leftE(new Error("abolished municipalities is empty"))),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
() => fromEither(leftE(new Error("abolished municipalities is empty"))),
() => fromLeft(new Error("abolished municipalities is empty")),

.chain(rawFile =>
rawFile.foldL(
() => fromEither(leftE(new Error("abolished municipalities is empty"))),
json => fromEither(rightE(JSON.parse(json)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
json => fromEither(rightE(JSON.parse(json)))
json => taskEither.of(JSON.parse(json)))

Besides of that, be careful because JSON.parse can throw! Consider use parseJSON

@AleDore AleDore requested review from balanza and gunzip July 10, 2020 13:04
Copy link
Contributor

@gunzip gunzip left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm apart from minor changes

{
"bindings": [
{
"schedule": "0 */15 * * * *",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how many times is this supposed to run ? once a day suffices

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which hour?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've set every day at 01:00 AM

)
.foldTaskEither<Error, void>(
err => {
context.log.error(`Cannot update municipalities| ${err}`);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

may we prefix the error with the name of the function? moreover get rid of the space after '|'

exportCurrentMunicipalities(blobService),
exportForeignMunicipalities(blobService)
)
.foldTaskEither<Error, void>(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it looks like mapLeft() would suffice here (instead of folding)


import { getUpdateMunicipalitiesHandler } from "./handler";

const storageConnectionString = getRequiredStringEnv("StorageConnection");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
const storageConnectionString = getRequiredStringEnv("StorageConnection");
const storageConnectionString = getRequiredStringEnv("AssetsStorageConnection");

denominazioneRegione: "",
siglaProvincia: abolishedMunicipality.provincia
}
}).getOrElseL(_ => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

may we just return the either and skip all the invalid entries with a warning?

abolishedMunArray
.filter(am => municipalityToCatastale.has(am.comune.toLowerCase()))
.map(am =>
fromAbolishedMunicipalityToSerializableMunicipality(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this may throw currently, see comment above

)
.chain(result =>
array.sequence(taskEither)(
filterEmptyCodiceCatastaleRecords(result, 9 as NonNegativeInteger).map(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why "9" ? I guess the index value must be hardcoded into filterEmptyCodiceCatastaleRecords instead of being taken as paramter

Copy link
Contributor Author

@AleDore AleDore Jul 10, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree, fixed

@gunzip
Copy link
Contributor

gunzip commented Jul 10, 2020

why draft?=

@AleDore AleDore marked this pull request as ready for review July 10, 2020 17:00
@AleDore AleDore requested a review from gunzip July 10, 2020 17:00
@gunzip gunzip changed the title [#171683039] - Porting of municipalities exports from io-service-metadata [#171683039] Import and staticize municipalities from csv Jul 13, 2020
blobService: BlobService
) => async (context: Context): Promise<unknown> => {
return sequenceT(taskEither)(
const partialResults = sequenceT(taskEitherSeq)(
Copy link
Contributor Author

@AleDore AleDore Jul 17, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've to refactor this way (but it's not the only way I think), in order to avoid Task maximum number exceeded. It tooks about 20 seconds to write about 8300 blob. All feedbacks are welcome! @gunzip @balanza

@gunzip gunzip added the on-hold label Nov 17, 2020
@sonarqubecloud
Copy link

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 0 Code Smells

No Coverage information No Coverage information
0.0% 0.0% Duplication

@balanza balanza marked this pull request as draft January 10, 2022 16:21
@balanza
Copy link
Contributor

balanza commented Jan 10, 2022

@AleDore I assume this to not be actual anymore

@AleDore
Copy link
Contributor Author

AleDore commented Jan 10, 2022 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants