There is an econometrics component in this term's data science subject. The assignment is due in two weeks, and I expect to get rather busy with work in the coming weeks, so I spent this Sunday reviewing the course content, hoping to get the assignment done as soon as possible.
The assignment is about using R and the models we learnt in the course to run predictions on the datasets provided; an interesting one indeed.
It appears to me that this assignment was derived from the Zillow failure case - a “purely data driven failure”.
Having looked into this case briefly, I came to understand forecasting a little better. For a successful business like Zillow, it shouldn’t be hard to attract brilliant people who can build forecasting models. As a matter of fact, Zillow had many models and access to lots of data. So, using its own models, Zillow tried to buy “undervalued” properties, renovate them, then resell them for a profit, at a large scale. This sounds like a great idea, but it failed. Its capital fell by 36%, and Zillow also had to make about a quarter of its workforce, roughly 2,000 employees, redundant. The housing market turned out to be more volatile than its models had assumed.
At first glance, how hard can it be to predict housing prices, right? Actually, it is extremely hard. The value of a property is measured by one thing and one thing only: the price! But that price is affected by too many factors: from the economy, currency exchange rates, inflation, immigration … to structural damage, noise from the neighbours, maintenance… I am no stranger to forecasting: you always start by making assumptions. When so many factors contribute to something, you have to make many assumptions. An assumption is a guess; yes, you might claim it is an “educated” guess, but it is still a guess. Do the assumptions hold? How long do they stay valid?
And behind each contributing factor there is human interaction and human behaviour, which are more unpredictable than the weather. Forecasts based on historical observations will not reflect the human behaviour of the future.
Forecasting models will give you some idea about the future, but you need to know their limits; and best not to bet your life on them.