Improve Brigade Project Meta-data representation on StatusBoard #30
This overlaps a bit with #26, which I created to track figuring out some initial details to capture into the index. I've gotten a bit stuck fretting over the schema, but the move to having a version in the branch name should help with that. Maybe the antidote is that for the v1 index we just shove all the details we can into new fields and not worry about redundancy or cohesiveness.

Long term, what I'm worried about is that we don't want too much data logic to end up in tool code. For example, between project lists, GitHub metadata, git analysis, civic.json, and publiccode.yml there might be 6 different ways to determine who the main project maintainer is. Do we want an index that contains all 6, with every tool deciding for itself which one to use?

I've been thinking we should aim for a set of "core" fields to shake out over time, which the index handles filling based on an always-evolving panel of techniques. So we might have a core field for project maintainer that the index makes a best effort to fill for every project, and then we iterate within the index over time on the coverage and quality of that field by continuously adding/tweaking/ranking the various sources we can draw potential values from.

So maybe the play is that for v1 we just keep adding root attributes aggressively with little concern for cohesion (i.e. we add a publiccode key with the entire document if present, and a github key with everything interesting we extract from GitHub), and then v2 is where we take a step back and design. It shouldn't be hard to have the automated infrastructure keep populating both in parallel. Tools built against the v1 index will be tightly coupled with the various underlying data sources and will be left to sort out the redundant fields on their own. Then a second generation of tools could build against a v2 index that provides consolidated & normalized data.

Does that sound like a good approach? Are there any other paths we might consider?
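To make the "panel of techniques" idea concrete, here is a minimal sketch of how an indexer might fill a core maintainer field by trying an ordered list of sources. The field names and nesting (`civic`, `publiccode`, `github`) are hypothetical illustrations, not the actual index schema:

```python
# Hypothetical sketch: fill a "core" maintainer field from an ordered
# panel of sources, preferring the most explicit/curated ones first.
# All key names below are illustrative, not the real index schema.

def resolve_maintainer(project: dict):
    """Best-effort maintainer for a project record, plus which source won."""
    sources = [
        # civic.json-style contact field
        ("civic.json", lambda p: p.get("civic", {}).get("contact")),
        # publiccode.yml-style maintenance contacts list
        ("publiccode.yml",
         lambda p: (p.get("publiccode", {})
                     .get("maintenance", {})
                     .get("contacts") or [None])[0]),
        # fall back to the GitHub repo owner
        ("github", lambda p: p.get("github", {}).get("owner")),
    ]
    for name, extract in sources:
        value = extract(project)
        if value:
            return {"value": value, "source": name}
    return None
```

The point of recording the winning `source` alongside the value is that the ranking can be tweaked inside the index over time without any tool code changing.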
I think your approach sounds good. My intent on this issue is more about coming up with the set of metadata fields we want to promote on the status board as places to improve upon. Even if we don't have a "score", we apply merit by inclusion (and dismiss by omission). This is more a place to say, for instance: given that v1 is going to be tightly coupled to GitHub (mostly), can we identify some objective metrics? E.g.:
I guess to your point, the next layer is somewhat opinionated (maybe we just need to push for one or two of these), e.g.
And finally we may have some suggestions that are maybe a bit more controversial:
So, I am thinking of how to take the information your index is gathering and present it in a useful (à la Google PageSpeed) manner to brigade members, so they can know how to improve the overall index quality. To your point, I think we need to keep the statusboard from taking on too much responsibility as the place where we decide on subjective assessments, but that is just a fine line we are going to have to balance.
The next step on the Status board sub-project is to improve how we report the relative "health" of an indexed project, as well as describe actions to take to change the indexing.
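A PageSpeed-style health report could be sketched as a panel of objective checks, each paired with a concrete action to take when it fails. The check names, project fields, and suggested actions below are all hypothetical illustrations of the shape, not an agreed-upon metric set:

```python
# Hypothetical sketch of a PageSpeed-style project "health" report:
# each objective check carries a suggested action shown when it fails.
# Check names and project fields are illustrative only.

CHECKS = [
    ("has_description",
     lambda p: bool(p.get("description")),
     "Add a one-line description to the GitHub repo"),
    ("has_civic_json",
     lambda p: "civic" in p,
     "Add a civic.json file describing the project"),
    ("has_publiccode",
     lambda p: "publiccode" in p,
     "Add a publiccode.yml file with project metadata"),
]

def health_report(project: dict) -> list:
    """Run every check; failing checks include their suggested action."""
    report = []
    for name, check, action in CHECKS:
        passed = check(project)
        report.append({
            "check": name,
            "passed": passed,
            "action": None if passed else action,
        })
    return report
```

Keeping the checks purely mechanical (field present / field absent) is one way to stay on the objective side of the subjective-assessment line discussed above.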