
JHU generating bogus ".0000" geo id #254

Closed
@krivard


JHU is generating a geo id of ".0000", which is not a valid geo id. It seems to affect some signals but not others. For example, from the ingestion log:

handling  /common/covidcast/receiving/jhu-csse/20200827_county_confirmed_incidence_num.csv
confirmed_incidence_num False
 invalid value for Pandas(geo_id='.0000', val='1287.0', se=nan, sample_size=nan) (geo_id)
exception while inserting rows: 'NoneType' object has no attribute 'geo_value'
archiving as failed - jhu-csse
handling  /common/covidcast/receiving/jhu-csse/20200827_county_confirmed_incidence_prop.csv
confirmed_incidence_prop False
archiving as successful

And the CSV files in question:

# this one is bad
$ head archive/failed/jhu-csse/20200827_county_confirmed_7dav_incidence_num.csv 
geo_id,val,se,sample_size
.0000,544.4285714285714,NA,NA
01001,6.285714285714286,NA,NA
01003,34.57142857142857,NA,NA
01005,-0.7142857142857143,NA,NA
01007,3.0,NA,NA

# but this one is fine
$ zcat archive/successful/jhu-csse/20200827_county_confirmed_7dav_incidence_prop.csv.gz | head
geo_id,val,se,sample_size
01001,11.250808651871852,NA,NA
01003,15.486632220642273,NA,NA
01005,-2.8934850291084593,NA,NA
01007,13.39644547646691,NA,NA
01009,16.55211941242447,NA,NA
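A check along these lines could flag the bad row before a file reaches ingestion. This is only a minimal sketch, not the pipeline's actual validation: it assumes county-level geo ids are always 5-digit codes (as every valid row above is), and `find_invalid_geo_ids` is a hypothetical helper name, not part of the existing codebase.

```python
import csv
import io
import re

# Assumption: a valid county geo id is exactly five ASCII digits, e.g. "01001".
COUNTY_GEO_ID = re.compile(r"[0-9]{5}")


def find_invalid_geo_ids(csv_text):
    """Return (line_number, geo_id) pairs for rows whose geo_id is not 5 digits."""
    reader = csv.DictReader(io.StringIO(csv_text))
    bad = []
    # Line 1 is the header, so data rows start at line 2.
    for line_number, row in enumerate(reader, start=2):
        if not COUNTY_GEO_ID.fullmatch(row["geo_id"]):
            bad.append((line_number, row["geo_id"]))
    return bad


# First rows of the failed file quoted above.
sample = """geo_id,val,se,sample_size
.0000,544.4285714285714,NA,NA
01001,6.285714285714286,NA,NA
01003,34.57142857142857,NA,NA
"""

print(find_invalid_geo_ids(sample))  # [(2, '.0000')]
```

Reading the file through `csv.DictReader` keeps `geo_id` as a string, so leading zeros such as `01001` are compared as written rather than being parsed as numbers.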


Labels: bug (Something isn't working)
