Add agriculture production data #214

Open
ravinepal opened this issue Oct 29, 2017 · 20 comments

Comments

@ravinepal
Member

The data is available here: https://github.com/Code4Nepal/data/tree/master/datasets/agriculture

Source: Ministry of Agriculture Development, Nepal (PDF)

Here's a guide on how to visualize data on NepalMap.

@cliftonmcintosh
Member

Please note that the data sets sometimes include things other than districts, like "C.REGION". Non-district data should be removed when importing.

Please also note that districts that have no values may legitimately be given a value of zero for a data point depending on how confident we are that there really is nothing of that thing in the missing districts. For some data we may not be confident that the missing districts have none of the thing being counted. In those cases, using zero for the missing districts may NOT be appropriate. For example, the tea data has an other category. Obviously, it is incorrect that all the districts that are not listed have zero tea production because the statistics tell us there is tea produced in "other" districts. I am not sure how we should handle this case.
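As a rough illustration of the two points above, here is a minimal Python sketch. The row format, the tiny district list, and the `clean_rows` helper are all hypothetical and are not part of NepalMap; the real import would use the full list of districts. It drops rows that are not districts and lets the caller decide whether absent districts become zero or stay unknown.

```python
# Hypothetical district names for illustration; the real list would be complete.
KNOWN_DISTRICTS = {"Kathmandu", "Lalitpur", "Ilam", "Jhapa"}


def clean_rows(rows, known_districts, zero_fill):
    """Keep only district rows; fill absent districts with 0 or None.

    zero_fill=True  -> we are confident a missing district really has none.
    zero_fill=False -> missing means unknown (for example, when the source
                       has an "other" row that hides some districts).
    """
    values = {}
    for name, value in rows:
        key = name.strip().title()
        if key in known_districts:  # drops "C.REGION", "Others", and so on
            values[key] = value
    default = 0 if zero_fill else None
    return {d: values.get(d, default) for d in sorted(known_districts)}


# Made-up numbers, only to show the behaviour.
rows = [("ILAM", 1500), ("JHAPA", 900), ("C.REGION", 2400), ("Others", 300)]
cleaned = clean_rows(rows, KNOWN_DISTRICTS, zero_fill=False)
```

With `zero_fill=False`, the districts hidden behind "Others" stay as `None` instead of being reported as zero, which seems like the safer default for the tea case.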

@wizofe

wizofe commented Oct 29, 2017

Heya, I am interested in contributing to the project. Could you tell me exactly what help you need? Do you need to create a database, or stream the data through the main program?

I would love to be active in your project : )

@bbulpett

I am attempting a solution for #214 (Add agriculture production data). I will submit a pull request for approval when successful. Thank you.

@cliftonmcintosh
Member

There is currently not an agriculture section in NepalMap. With the first agriculture data integration, the agriculture section will need to be created, much like we have sections on Demographics, Forest and Land Use, Disasters, etc.

@cliftonmcintosh
Member

@wizofe and @bbulpett

There are several data sets for agriculture. Please submit one pull request per data set. Please also consider "claiming" a specific data set so that other people will know what is already being worked on.

@cliftonmcintosh
Member

As I mentioned earlier, some data sets may not be complete because we lack data for some districts. They should not be included without further analysis.
Here is a list of data sets that appear to be complete enough to work on:

Please note these should be considered valid data sets only if they have data for all 75 districts.

Here is a list of data sets that appear to be incomplete and, in my opinion, should not be worked on without further evaluation.

  • Horses and Asses Population -- data is not available for every district, and I believe it is highly unlikely that the missing districts have no horses (or asses)
  • Rabbit population -- missing some districts
  • Tea -- has an "other" category that needs sorting out
  • Water and fish production -- has some incomplete data for some districts
  • Yaks -- zeroes in some districts are likely to be legitimate, but we should take a closer look
  • Coffee -- has an "other" category that needs sorting out
  • Cotton -- only a few districts, and need to verify if all the others have no cotton
  • Jute -- only a few districts, and need to verify if all the others have no jute

Some of these may be valid but we should verify that the lack of data really means that the missing districts have zero of those things.
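One way to make that check repeatable is a small script like the sketch below. It is only an illustration: `review_status` is a hypothetical helper, the three-district `expected` set stands in for the full list of 75, and the row names are made up.

```python
def review_status(row_names, expected_districts):
    """Report which expected districts are missing and whether an
    "other" style row is present, so a person can decide what to do."""
    seen = {name.strip().title() for name in row_names}
    missing = sorted(expected_districts - seen)
    has_other = any(name.startswith("Other") for name in seen)
    return {
        "complete": not missing and not has_other,
        "missing": missing,
        "has_other": has_other,
    }


# Stand-in for the real list of 75 districts.
expected = {"Ilam", "Jhapa", "Panchthar"}

tea = review_status(["ILAM", "JHAPA", "Others"], expected)
full = review_status(["Ilam", "Jhapa", "Panchthar"], expected)
```

A data set would only be treated as ready when `complete` is true; anything else gets a closer look before zeroes are assumed.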

@bbulpett

Thank you @cliftonmcintosh for the explanation. Setting up my dev environment now. Will begin by adding Agriculture section. I will then start work on the Egg production data set.

@wizofe

wizofe commented Oct 29, 2017

@cliftonmcintosh @bbulpett I am going to do the Milk Animals and Milk Production.

@cliftonmcintosh
Member

Thanks, @wizofe and @bbulpett

@cliftonmcintosh
Member

@bbulpett and @wizofe

It is perfectly fine to submit the work in steps. For example, you could submit a PR with just the SQL files for the statistics in your data sets, like this one for forests.

Also please note that your data set may contain more than one data point, and each one would require its own integration into NepalMap. For example, the egg production data set is probably two separate data sets, one for number of laying animals by type (chicken versus duck) and another for eggs laid by type (chicken eggs versus duck eggs). So there would be an egg-laying animal table and an eggs table. The data on milk animals looks similar. It's likely there will be two tables, one for the type of animals, another for the amount of milk.

If you choose one to start with, please choose the actual eggs and the actual milk.
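To show what splitting one source file into two tables could look like, here is a minimal Python sketch. The column names (`laying_hen`, `hen_egg`, and so on), the row layout, and the numbers are assumptions for illustration, not the actual headers in the egg data set.

```python
# Hypothetical mapping from source columns to the "type" stored in each table.
ANIMAL_COLUMNS = {"laying_hen": "hen", "laying_duck": "duck"}
EGG_COLUMNS = {"hen_egg": "hen", "duck_egg": "duck"}


def split_tables(rows):
    """Turn one wide row per district into two long tables:
    (district, type, count) for laying animals and for eggs."""
    animals, eggs = [], []
    for row in rows:
        for column, kind in ANIMAL_COLUMNS.items():
            animals.append((row["district"], kind, row[column]))
        for column, kind in EGG_COLUMNS.items():
            eggs.append((row["district"], kind, row[column]))
    return animals, eggs


# Made-up values, only to show the shape of the output.
rows = [
    {"district": "Ilam", "laying_hen": 1000, "laying_duck": 50,
     "hen_egg": 120, "duck_egg": 4},
]
animals, eggs = split_tables(rows)
```

Each of the two lists could then feed its own SQL file, and the eggs table could be submitted first.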

@cliftonmcintosh
Member

@nikeshbalami

What is the unit for milk in the milk data? Litres?

@nikeshbalami

Hi @cliftonmcintosh, the unit is Mt.

@cliftonmcintosh
Member

cliftonmcintosh commented Oct 30, 2017

@nikeshbalami

What is an "Mt."?

@nikeshbalami

It's a "Metric Ton (Mt.)" @cliftonmcintosh

@cliftonmcintosh
Member

Thanks

@Bezzy1999
Contributor

Hi @cliftonmcintosh I took the liberty of adding the meat production data in #217 since it wasn't claimed by anyone else.

@cliftonmcintosh
Member

@nikeshbalami and @ravinepal

I'm working on the egg data, and it seems like it must be incorrect. The number of hens and ducks is much, much higher than the number of eggs laid. For example, there are over 12 million laying hens but only about 1.3 million hen eggs laid. That means there is only one egg for every ten hens. That seems crazy. There's no way anyone would have ten hens and only expect one egg a year out of those ten hens. I grew up with chickens, and if we were in that situation, we would just kill them all and eat them. Can you help me understand the data? Is it just messed up?

@cliftonmcintosh
Member

cliftonmcintosh commented Jun 16, 2018

@nikeshbalami and @ravinepal
Here is the problem: The egg numbers are in thousands, so "25" means "25,000".

See page 48 of the PDF report

[Image: eggs-by-thousand]
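A quick sanity check of the corrected unit, as a Python sketch. The hen and egg figures are the approximate national totals quoted earlier in this thread; the constant and function names are hypothetical.

```python
EGG_UNIT_MULTIPLIER = 1000  # reported egg figures are in thousands


def eggs_from_reported(reported):
    """Convert a reported egg figure (in thousands) to an actual count."""
    return reported * EGG_UNIT_MULTIPLIER


laying_hens = 12_000_000       # "over 12 million laying hens" (approximate)
reported_hen_eggs = 1_300_000  # "about 1.3 million", before applying the unit

actual_hen_eggs = eggs_from_reported(reported_hen_eggs)
eggs_per_hen = actual_hen_eggs / laying_hens
```

With the multiplier applied, the total is about 1.3 billion eggs, or roughly 108 eggs per hen per year, instead of one egg for every ten hens.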

@nikeshbalami

Thanks @cliftonmcintosh, and so sorry. I forgot to add "Unit" to all the datasets, which created a problem. I will take care of it from now on while scraping data.

@cliftonmcintosh
Member

@nikeshbalami

No worries. Thanks for the response.
