Repository for a The Data Pub mentoring exercise.
Goal: learn unsupervised clustering and association techniques to discover similarities between colonias and neighborhoods.
TODO: Describe the intent of the project, and in particular the final objective: the final value prediction.
TODO: Describe the data.
Describe the data as it is, without any assumptions, file by file and field by field.
- Characterize numerical columns with histograms and density plots.
- Characterize categorical columns with contingency tables.
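The two characterization steps above can be sketched with the Python standard library alone. This is a minimal illustration, not the project's actual code: the records and the column names (`neighborhood`, `property_type`, `area_m2`) are hypothetical placeholders, since the real data has not been described yet. Density plots are left out because they need a plotting library.

```python
from collections import Counter


def histogram(values, bin_width):
    """Count values per bin; each bin is labelled by its lower edge."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


def contingency_table(rows, col_a, col_b):
    """Cross-tabulate two categorical columns from a list of dict records."""
    table = {}
    for row in rows:
        table.setdefault(row[col_a], Counter())[row[col_b]] += 1
    return {a: dict(counter) for a, counter in table.items()}


# Hypothetical records; the real files and fields are still a TODO.
records = [
    {"neighborhood": "Centro", "property_type": "apartment", "area_m2": 55},
    {"neighborhood": "Centro", "property_type": "house", "area_m2": 120},
    {"neighborhood": "Norte", "property_type": "apartment", "area_m2": 62},
    {"neighborhood": "Norte", "property_type": "apartment", "area_m2": 48},
    {"neighborhood": "Centro", "property_type": "apartment", "area_m2": 71},
]

# Numerical column: bin counts with a bin width of 25.
area_hist = histogram([r["area_m2"] for r in records], bin_width=25)

# Categorical columns: counts of property type per neighborhood.
type_by_neighborhood = contingency_table(records, "neighborhood", "property_type")

print(area_hist)
print(type_by_neighborhood)
```

With real data, a dataframe library would likely be more convenient; the point here is only what each summary contains.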
Try out correlations, covariances and hypothesis tests to search for a suitable research question or hypothesis. Also evaluate whether the data is sufficient and relevant for the intended research question or hypothesis.
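As a rough sketch of this exploratory step, the snippet below computes a sample covariance, a Pearson correlation, and a permutation test of the null hypothesis that two columns are unrelated. The `area` and `price` values are made up for illustration, and the permutation test is only one possible choice of hypothesis test, picked here because it needs nothing beyond the standard library.

```python
import math
import random


def covariance(xs, ys):
    """Sample covariance (n - 1 denominator)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))


def permutation_p_value(xs, ys, n_permutations=2000, seed=0):
    """Two-sided permutation test of H0: xs and ys are not associated."""
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if abs(pearson_r(xs, shuffled)) >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


# Hypothetical columns; replace with real fields once the data is described.
area = [48, 55, 62, 71, 90, 120]
price = [95, 110, 118, 140, 175, 230]

r = pearson_r(area, price)
p_value = permutation_p_value(area, price)
print(f"cov={covariance(area, price):.1f} r={r:.4f} p={p_value:.4f}")
```

A small p-value only says the association is unlikely under random shuffling; whether the data is sufficient and relevant for the question still has to be judged separately.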
Try out linear models or other techniques to see if the research question or hypothesis is worth pursuing.
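A quick feasibility check of this kind could look like the following sketch: an ordinary least squares fit of one predictor, with R² as a rough signal of whether the relationship deserves more work. The `area` and `price` values are hypothetical, and a single-predictor line is an assumption made to keep the example short.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Share of the variance in ys explained by the fitted line."""
    my = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Hypothetical columns; replace with real fields once the data is described.
area = [48, 55, 62, 71, 90, 120]
price = [95, 110, 118, 140, 175, 230]

slope, intercept = fit_line(area, price)
r2 = r_squared(area, price, slope, intercept)
print(f"slope={slope:.3f} intercept={intercept:.3f} R^2={r2:.4f}")
```

A high R² on a handful of points is not evidence of a useful model on its own; it is only a cheap first filter before the modeling stage.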
Carry out modeling, model selection, and model evaluation, then work towards production.