By Arindam Banerjee, Professor of Marketing, IIM Ahmedabad
In a typical classroom session of the Analytics course at our business schools,
the discussion often steers towards the efficacy of regression-based models.
Issues such as the “fit” of the model, the interpretation of the parameters of
the predictor variables, and the hurdles in interpretation created by problems
such as multi-collinearity are discussed, and many plausible resolutions are
debated in an animated manner. Very rarely does a perfect solution emerge from
these discussions.
In theory, a model is supposed to be a summarized representation of the real
phenomenon. Hence, the expectation is that models should be a) “complete” in
their explanation of the real-life phenomenon, b) simple enough for users to
appreciate the phenomenon and c) sufficiently reliable to predict future
scenarios perfectly. Most social interactions are, however, far too complex to
be amenable to a simplified yet perfect summary, as is demanded by business
users. In fact, the very simplification process in model building (a necessary
input for understanding and for managerial diagnosis and control) is the reason
the explanation is only partial. Complex social phenomena usually have complex
explanations, and therefore simplifying them for the sake of better
understanding will necessarily create an imperfect model, one that explains
only “part” of the phenomenon and is consequently imperfect in prediction.
On balance, while it is desirable to build models that provide a razor-sharp
explanation of the phenomenon of interest and also predict with reasonable
accuracy, such model-building ventures are idealistic. Most practical
situations impose a compromise: analysts and their stakeholders have to
prioritize, i.e., choose what is more important to them, a) a partial but
palatable explanation of the phenomenon or b) a more reliable prediction, at
least in the near term. Whichever objective is more pertinent to the context
should drive the model-building process.
Models that Explain
Such models are used primarily for the diagnosis of certain social behaviour.
Building them is also referred to as the process of discovering or identifying
the causes of a phenomenon. For instance, in market behaviour, a common query
is to find the extent to which changing prices affect demand. This is usually
done by estimating a price “elasticity” coefficient of demand in a regression
model. Though the problem is relatively simple, the discovery of the price
effect raises many more questions than it answers. Does price-sensitive market
behaviour mean that lowering prices will have a positive impact on demand? Hard
to say, since there may be many other parameters (identified or hidden) that
affect demand, and therefore manipulating one parameter can only provide an
average impact on demand. Therefore, while the price sensitivity of the market
is an important finding from the model, the exact nature of how demand will
play out depends on many other complex interactions among market forces that
the model is rarely able to extract fully. Quite likely, the “fit” of these
diagnostic models is low, an indication that the models are simple but partial
summaries of the market phenomenon. Such models rarely provide confidence in
their ability to predict demand exactly as the market parameters change.
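To make the elasticity discussion concrete, here is a minimal sketch in Python
on synthetic data. Everything in it is an assumption made for illustration: the
"true" elasticity of -1.2, the size of the noise term (which stands in for all
the market forces the model does not observe) and the helper names fit_line and
r_squared. It regresses log demand on log price, so the slope can be read as
the price elasticity, and then reports the fit.

```python
import math
import random


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def r_squared(x, y, intercept, slope):
    """Share of the variance in y that the fitted line accounts for."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


rng = random.Random(0)
TRUE_ELASTICITY = -1.2  # assumed value, for illustration only

# Synthetic market: log demand depends on log price plus a large noise term
# standing in for every driver the model does not observe.
log_price = [math.log(rng.uniform(5, 15)) for _ in range(500)]
log_demand = [6.0 + TRUE_ELASTICITY * lp + rng.gauss(0, 1.0) for lp in log_price]

intercept, elasticity = fit_line(log_price, log_demand)
fit = r_squared(log_price, log_demand, intercept, elasticity)
print(f"estimated elasticity: {elasticity:.2f}, R-squared: {fit:.2f}")
```

Under these assumptions the slope should land in the neighbourhood of the
assumed elasticity while the R-squared stays low: the model recovers the
average price effect but leaves most of the variation in demand unexplained.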
Describing market phenomena requires a high degree of confidence in the
relationship between demand and the pertinent market parameters. Therefore,
identifying the relationship between cause (price) and effect (demand) through
the parameter estimates in a regression model is very important.
Multi-collinearity is a common problem that hampers this process. Most readers
would recognize it as the problem where the estimates shift markedly with small
changes in the data or the model specification, because of the high level of
correlation among the causal parameters (for example, when price changes and
advertising changes happen simultaneously). Therefore, multi-collinearity is a
critical problem that requires a satisficing resolution in diagnostic models.
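The instability caused by multi-collinearity can be sketched with a small
simulation. The set-up below is hypothetical: the coefficients (-1.0 for price,
0.5 for advertising), the correlation levels (0 and 0.98) and the function
names are assumptions chosen for illustration. It repeatedly draws a fresh
sample, fits a two-predictor regression and measures how much the estimated
price coefficient moves from sample to sample.

```python
import random
import statistics


def fit_two_predictors(x1, x2, y):
    """OLS slopes for y = a + b1*x1 + b2*x2 via the centered normal equations."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


def price_coefficient_spread(rho, rng, n_samples=200, n_obs=100):
    """Std. dev. of the estimated price coefficient across repeated samples."""
    estimates = []
    for _ in range(n_samples):
        price = [rng.gauss(0, 1) for _ in range(n_obs)]
        # Advertising moves with price; rho is their (assumed) correlation.
        advertising = [rho * p + (1 - rho ** 2) ** 0.5 * rng.gauss(0, 1)
                       for p in price]
        demand = [-1.0 * p + 0.5 * a + rng.gauss(0, 1)
                  for p, a in zip(price, advertising)]
        b_price, _ = fit_two_predictors(price, advertising, demand)
        estimates.append(b_price)
    return statistics.stdev(estimates)


rng = random.Random(1)
spread_independent = price_coefficient_spread(0.0, rng)
spread_collinear = price_coefficient_spread(0.98, rng)
print(f"spread with uncorrelated drivers: {spread_independent:.2f}")
print(f"spread with correlation of 0.98:  {spread_collinear:.2f}")
```

The data-generating process is identical in both runs apart from the
correlation, so any extra spread in the second figure is attributable to the
collinearity between price and advertising alone.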
Models that Predict
The primary function of these models is to predict the outcome as the causal
parameters change (or to forecast the outcome in the future). In this kind of
model, the objective is to attain an acceptable level of accuracy in the
predicted outcome, i.e., how close the estimated outcome is to the actual
value. In these situations, the “fit” of the model is critical to ensure a
baseline level of confidence in the model output. Most prediction (forecasting)
models are developed using regression algorithms that match the information
contained in the training data as closely as possible using a complex
mathematical formula. The important point is the ability to match the contour
of the data well with a complicated and often inexplicable mathematical
function. Care is taken to avoid “overfitting”, which would reduce the model's
ability to meet quality tests on data beyond the training set.
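The overfitting check described above can be sketched as follows. This is an
illustration under stated assumptions, not a recommended forecasting method: a
k-nearest-neighbour regressor is swapped in as a stand-in for a flexible
prediction algorithm because it needs no libraries, and the sine-shaped signal,
the noise level and the sample sizes are all made up. With k = 1 the model
memorizes the training data; with k = 15 it smooths over neighbouring points.

```python
import math
import random


def knn_predict(train_x, train_y, x, k):
    """Average the outcomes of the k training points closest to x."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))[:k]
    return sum(train_y[i] for i in nearest) / k


def mean_squared_error(train_x, train_y, xs, ys, k):
    """Mean squared prediction error of the k-NN model on the points (xs, ys)."""
    errors = [(knn_predict(train_x, train_y, x, k) - y) ** 2
              for x, y in zip(xs, ys)]
    return sum(errors) / len(errors)


def make_data(rng, n):
    """Synthetic data: a smooth signal plus noise (both assumed)."""
    xs = [rng.uniform(0, 2 * math.pi) for _ in range(n)]
    ys = [math.sin(x) + rng.gauss(0, 0.5) for x in xs]
    return xs, ys


rng = random.Random(2)
train_x, train_y = make_data(rng, 200)
holdout_x, holdout_y = make_data(rng, 200)

results = {}
for k in (1, 15):
    results[k] = {
        "train": mean_squared_error(train_x, train_y, train_x, train_y, k),
        "holdout": mean_squared_error(train_x, train_y, holdout_x, holdout_y, k),
    }
    print(f"k={k:2d}  train MSE={results[k]['train']:.3f}  "
          f"holdout MSE={results[k]['holdout']:.3f}")
```

The k = 1 model fits the training data perfectly, yet that is exactly the model
to distrust: its error on the holdout sample is what matters, and comparing the
two rows shows why the fit to the training data alone is not a sufficient
quality check.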
While the purpose here is not to explain the process of building sophisticated
ML models, the point we would like to emphasize is that prediction models
require a relatively more complicated “summary” of the data, one that often
does not lend itself to easy interpretation. But the purpose of prediction is
served. The quality check for the model is not its diagnostic ability, but
whether it can reliably churn out an accurate estimate of the outcome as
changes happen in the future. Multi-collinearity, a formidable hurdle in
diagnostic models as explained earlier, is of secondary importance here. The
critical tests of a prediction model are a) the “fit” to the training data and
b) the ability to predict well in a validation sample. No wonder sophisticated
neural network algorithms used for prediction are seldom questioned for their
(in)ability to explain the phenomenon that they are supposed to model and
forecast.
Set your Modeling Objective beforehand
In conclusion, while empirical models are meant to provide a suitable summary
of the information contained in the data, they are never perfect in satisfying
the objectives of the user group. Good prediction models often require highly
complex algorithms to meet reasonable quality standards. However, their
complexity does not lend itself to the simple storytelling of the phenomenon
that business leaders appreciate. Conversely, simpler models that provide a
digestible but partial view of the marketplace are rarely good for prediction
purposes. That should be obvious given that most human behaviour (which is
central to business dealings) is fairly complex and does not have a complete
and comprehensible explanation. Therefore, the practical resolution is to
decide on a primary objective, “explain” or “predict”, and develop models to
suit that purpose. Otherwise, we may end up trying to build models that attain
multiple objectives without much success, and that is not useful for most
business enterprises.
Arindam Banerjee is a Professor of Marketing at IIM Ahmedabad. SAGE is the proud publisher of his book “Business Analytics”.
Check out the SAGE textbook here.