In the final article of this technical series we demonstrate hierarchical linear regression using PyMC3 to compare vehicle NOx emissions for a range of car manufacturers.
Posts in scientific-python
In the second article of this technical series we demonstrate the flexible syntax of PyMC3 with regularized linear modelling of car emissions data and model evaluation.
Bayesian inference bridges the gap between white-box model introspection and black-box predictive performance. This technical series describes some methods using PyMC3, an inferential framework in Python.
Practical data science projects often include an aspect of anonymisation to carefully remove sensitive information prior to analysis; here we demonstrate several complementary techniques and principles.
In this technical article we explain why and how to use Singular Value Decomposition (SVD) for feature reduction: making large datasets more compact whilst preserving information.
Visualising data is important for aiding intuition & good understanding, but high-dimensional datasets can be hard to display. Here we demonstrate techniques to tackle the issue.
Here we demonstrate a standard semi-parametric regression method to create a model of hard drive failures. This model can be tested for accuracy and used for prediction.
We now have a clean, prepared, real-world dataset regarding the failures of thousands of hard drives; let's see what we can learn from a basic survival analysis.
We've reviewed the basic theory of survival analysis and discussed why it's a useful technique; now let's acquire, explore and prepare a real dataset for analysis.
The practising data scientist will be familiar with a wide range of software for scientific programming, data acquisition, storage & carpentry, lightweight application development, and visualisation.