Har Zindagi

Product Overview

From the needs assessment:

Large volumes of digital data are being collected in the vaccine delivery space. However, such data isn’t being verified for accuracy, and insights from this data aren’t being used to improve vaccine service delivery. The intended recipients of this data (vaccinator supervisors) don’t have the technical skills or bandwidth required to analyze such large volumes of data, and verify it for accuracy.

We propose creating an automatic anomaly detection algorithm that verifies the accuracy of incoming data in near-real time. Features that identify anomalies in the data (heuristics) will be identified from existing data alongside heuristics collected from vaccinator supervisors. The heuristics will be used to identify data that is potentially anomalous. We will then ground-truth these potentially anomalous reports (send independent inspectors). The outcome of our ground-truthing will then be used to adjust our algorithm. We will then do a final test of the algorithm using independent inspectors for both efficacy and cost-effectiveness. Given the nature of our solution, if it is found to be successful it can be scaled within Punjab immediately and we would then seek to scale in other contexts.

It seems that they are building a product that integrates with various data entry systems and detects potentially wrong or anomalous entries.
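The flag, ground-truth, adjust loop described in the needs assessment could be sketched roughly as follows. This is only an illustration, not the team's algorithm: the `Report` fields, the three heuristics, their thresholds, and the weight-update rule are all hypothetical stand-ins for whatever the team derives from existing data and supervisor interviews.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


# Hypothetical report record; field names are illustrative, not from the project.
@dataclass
class Report:
    report_id: str
    doses_given: int
    minutes_on_site: float
    km_from_assigned_site: float


Heuristic = Callable[[Report], bool]

# Hypothetical heuristics of the kind supervisors might suggest (assumed thresholds).
HEURISTICS: Dict[str, Heuristic] = {
    "too_many_doses": lambda r: r.doses_given > 100,
    "visit_too_short": lambda r: r.minutes_on_site < 5,
    "far_from_site": lambda r: r.km_from_assigned_site > 2.0,
}


def anomaly_score(report: Report, weights: Dict[str, float]) -> float:
    """Sum the weights of every heuristic that fires for this report."""
    return sum(weights[name] for name, rule in HEURISTICS.items() if rule(report))


def flag_reports(reports: List[Report], weights: Dict[str, float],
                 threshold: float = 1.0) -> List[Report]:
    """Return the reports whose score reaches the threshold (candidates for inspection)."""
    return [r for r in reports if anomaly_score(r, weights) >= threshold]


def adjust_weights(weights: Dict[str, float], report: Report,
                   confirmed_anomalous: bool, step: float = 0.1) -> Dict[str, float]:
    """Nudge the weights of the heuristics that fired, using the inspector's finding."""
    delta = step if confirmed_anomalous else -step
    return {
        name: max(0.0, w + delta) if HEURISTICS[name](report) else w
        for name, w in weights.items()
    }


weights = {name: 1.0 for name in HEURISTICS}
reports = [
    Report("r1", doses_given=20, minutes_on_site=45, km_from_assigned_site=0.3),
    Report("r2", doses_given=150, minutes_on_site=3, km_from_assigned_site=0.1),
]
flagged = flag_reports(reports, weights)

# Suppose an inspector finds that report r2 was in fact accurate:
# the two heuristics that fired on it are trusted slightly less afterwards.
weights = adjust_weights(weights, reports[1], confirmed_anomalous=False)
```

A weighted rule set like this is only one possible design; a trained classifier could replace the scoring step while keeping the same inspect-and-adjust loop.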

Intro Call

Current situation

This section helps us understand where each team is at the moment.

What is your product elevator pitch? (i.e. describe project in 2-3 sentences)

Create an anomaly detection system that checks all data coming into their home-built data collection systems. In short, a smart validation system.

What milestones are you currently working towards?

Obtaining ground truth data via survey teams. This is very costly, so they used an existing data set to derive a set of heuristics. They are also interviewing vaccinator supervisors to find out what they feel people might lie about.

Looking to get access to real-time data from the Government of Punjab to begin training their algorithm.

How much of your project is currently open source?

None

Any open source challenges that are already on your mind?

They don't have strong preferences about licensing. A FOSS license is fine with them.

Project management can be complex because they don't know GitHub very well.

The data is private and they are unable to release it.

There's an academic risk to putting early stage stuff out.

It takes time and energy to make the code presentable and usable by others.

The code may not be very useful as FOSS because it is so specific to their current use case.

How many people on your team have prior experience with open source development?

None, really.

Project management

This section helps us understand how each team manages their project work and what methods/practices they use.

Do you use a project management method like waterfall or agile?

Not really. Only one student is working on the algorithm. They're hiring a software developer to help.

Do you hold daily or weekly meetings to track progress with the core development team?

Yes, but it's just a team of 3.

Are these meetings or meeting notes recorded anywhere?

Not really. No formal process. They are interested in learning more about these methodologies.

What tools do you currently use for project management (e.g. Trello)?

None.

Testing / code health

This section helps us understand how each team is writing code and if they are in a habit of writing tests for their code. Writing tests is important because this supports other ways to grow a community of people around the project later. Tests enable others to make small changes to a project and feel confident that their change works as expected before they submit a contribution. It also supports using other tools like continuous integration or test gating.
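As an illustration of the kind of test meant here, the sketch below tests a single data check. The function `is_plausible_dose_count` and its limit of 100 doses per day are hypothetical examples, not taken from the team's code. Saved in a file such as `test_checks.py`, functions named `test_*` like these are picked up by a runner such as pytest.

```python
def is_plausible_dose_count(doses_given: int, max_per_day: int = 100) -> bool:
    """Hypothetical check: a daily dose count must lie between 0 and max_per_day."""
    return 0 <= doses_given <= max_per_day


def test_accepts_typical_count():
    assert is_plausible_dose_count(25)


def test_rejects_negative_count():
    assert not is_plausible_dose_count(-1)


def test_limit_is_inclusive():
    assert is_plausible_dose_count(100)


def test_rejects_count_above_limit():
    assert not is_plausible_dose_count(101)


if __name__ == "__main__":
    # Run the tests directly when no test runner is available.
    for test in (test_accepts_typical_count, test_rejects_negative_count,
                 test_limit_is_inclusive, test_rejects_count_above_limit):
        test()
    print("all tests passed")
```

With tests like these in place, a contributor who changes the limit or the comparison can rerun them and see immediately whether the documented behavior still holds.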

Do core developers regularly write unit or functional tests?

No.

About what percentage of your code base is tested?

0%

Did you receive peer reviews on your code base in the last three months by someone outside of your organization?

The professor does some code review of the student's work.

Documentation

This section helps us understand the documentation culture within each cohort. Some teams may do this better than others, and these questions help us evaluate where each team is in terms of writing great documentation.

Do all open source repositories have a README with an explanation of what the project does and why?

There's some documentation submitted to the BMGF along with their milestones.

Do any projects have a written, documented guide that explains how to create a development environment?

What they have so far is just a Jupyter notebook. It is fairly well documented, but there is no external-facing guide.

Miscellaneous

This can be any questions we think are worth asking but don't fit into another category.

What would need to happen for you to focus more on improving the transparency and open source community of your project?

Their biggest fears are about code quality (it needs improvement) and whether there is enough documentation. There's a lot of perfectionism in this group. They want a person to focus on documentation and code quality.

They want a general sense of what an open source strategy would look like: what the end result of engaging with other people around the code and software they've built would be.

They want the pros and cons of what open source means for their specific project.

What does success look like in a world where you have released your project as open source?

If somebody applies their code to other data sets and credits their progress back to Har Zindagi, it would be a big win for the project. They also want some form of credit for helping future work.

They want their code to be used and scaled up in other places.