We're delighted to attend and support R in Insurance this year: a leading international conference for researchers and practitioners of actuarial science and financial data analysis.
Science is all about making & testing models to describe the real world, and it's always important to remember that the map is not the territory.
Here we demonstrate a standard semi-parametric regression method to create a model of hard drive failures. This model can be tested for accuracy and used for prediction.
We now have a clean, prepared, real-world dataset on the failures of thousands of hard drives; let's see what we can learn from a basic survival analysis.
We've reviewed the basic theory of survival analysis and discussed why it's a useful technique; now let's acquire, explore and prepare a real dataset for analysis.
In this series of blog posts we'll explain tools & techniques for dealing with time-to-event data, and demonstrate how survival analysis is integral to many business processes.
There are huge opportunities for the life insurance industry to apply new statistical modelling techniques, recalling its original role in helping to advance data analysis.
The practising data scientist will be familiar with a wide range of software for scientific programming, data acquisition, storage & carpentry, lightweight application development, and visualisation.
The term 'data science' has been around for about five years now, attracting many explanations, discussions and occasional breathless over-excitement in the technology and business press.
In the summer of 2013, Jon Sedar and Michael Crawford got chatting in the pub after Hadley Wickham's great lecture at the Dublin R user group.