In the business world, forecasting models need to be both scientifically valid and ‘face valid.’ The former is necessary, but not sufficient by itself. A model must also be face valid so that decision-makers do not dismiss it immediately as totally irrelevant. If a model is only face valid, however, it might lead to dire consequences even if everyone agrees it is reasonable. It might become a Weapon of Math Destruction (WMD), as discussed in Dr. Cathy O’Neil’s book “Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy.” This column discusses some examples from it, why all forecasters should read it, and why managers should take lessons from it.
Upon reviewing one of the first quantitative models I developed after graduation from grad school, my manager at Arthur D. Little (a consulting firm) gave me a very important modeling lesson. He said that while it was extremely important to show that a model has scientific validity, it was equally important for the model to have ‘face validity’ as well.
What he meant by the first type of validity was that the model needed to be statistically sound and adequately represent the real-world issues it was developed to address. In support of decision-making, the model must behave in accordance with real-world behavior. Face validity, on the other hand, has to do with whether the factors make sense to decision-makers. For cause and effect models, a key issue to address is whether the independent variables causally impact the dependent variables, not just statistically correlate with them. For example, a pricing model that displays demand going up when prices go up would be suspect to a reasonable decision-maker. It would immediately be rejected as unbelievable. That is, it is guilty until proven innocent.
I was recently astounded to find out that many of the Internet-based ‘Big Data’ models in use today only have face validity, and that their developers don’t seem to care about scientific validity. Apparently, they believe that if the independent factors in a model seem like they are related to the dependent factors, then the factors are sufficient. In business modeling, face validity is necessary, but not sufficient. Indeed, models that are only face valid are potentially dangerous to use. I got this insight by reading a book by Dr. Cathy O’Neil, entitled “Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy.” All forecasters and planners would do well to read this important, groundbreaking book.
WHAT IS A WEAPON OF MATH DESTRUCTION?
Dr. O’Neil holds a Ph.D. in Mathematics, and as a former quantitative analyst at a financial hedge fund, she experienced first-hand the damage financial models did leading up to and following the mortgage market meltdown of the past decade. They helped to cause financial institutions to collapse. She wrote the book because she became disillusioned with mathematical models that affect society. As expressed by one description of the book, “A former Wall Street quant sounds an alarm on the mathematical models that pervade modern life—and threaten to rip apart our social fabric.” Her premise is that the vast amount of Big Data on the Internet is being used in ways that are: 1) opaque; 2) unquestioned; and 3) unaccountable. In simple terms: 1) the detailed data used is not transparent to the person impacted by the decision-making it supports; 2) the use of the data is beyond reproach in modelers’ minds; and 3) modelers refuse to defend the model other than to say “it is what it says.” It’s akin to using the “court of public opinion” rather than the law to assess whether someone has committed a crime. In addition, she states that the models used by decision-makers often result in behavior that has “vicious, self reinforcing feedback loops” whereby things get worse for those affected, especially minorities and the poor. While there are many examples of these in the book, below I discuss the three I found most interesting.
U.S. News & World Report (USNWR) was a news magazine founded in 1933. Fifty years later, in 1983, it was a second-rate publication. It decided to start a service to rank colleges and universities, with the intent of helping young people make the first big decision in their lives. This was a game-changer for the publication, because today it markets itself (on its website) as “a multi-platform publisher of news and information, which includes www.usnews.com and annual print and e-book versions of its authoritative rankings of Best Colleges, Best Graduate Schools and Best Hospitals.” The initial college rankings involved weighted factors that journalists (not educators) felt were reasonable, and that could be numerically counted (i.e., were quantifiable). Basically, all were face valid factors, and not necessarily ones that were scientifically proven to be related to ‘educational quality.’ Dr. O’Neil contends the rankings were too successful and that, over time, the basic model became a WMD in the education industry. The rankings started a competitive ‘race to the top’ among universities, whereby they did everything they could to raise their rankings based on these face valid factors. Because costs were not included among the factors, this incentivized colleges to hire expensive faculty, beef up athletic programs, construct luxurious new dorms, and enhance dining menus. Some colleges even resorted to cheating their way to the top by fabricating the numbers they reported to USNWR. Parents and students also spent a lot of money on college admission planners to get into the top-ranked colleges. In addition, some international students even cheated on standardized exams. The author contends that this competition contributed to the exorbitant tuition costs and student loans that we see today. The latter have left many minority and poor students saddled with debt that they likely will never pay back.
A second type of WMD was the crime prediction software used by many police forces struggling to contain costs. The software was used to predict, on an hourly basis, where crimes were most likely to take place. Based on the forecasts, cops would be sent to locations where crimes were most likely to occur. The police forces could focus only on reported violent crimes, and optionally on non-violent ‘nuisance crimes’ (that would go unreported unless a cop saw them). Many opted to include the latter because of the popular so-called “broken-windows policing.” This increased the total number of recorded crime incidents and caused a vicious feedback loop that sent police to the same non-violent neighborhoods, over and over again. Since geography is a highly correlated proxy for race, this led to the questionable arrests of many minorities, including many for resisting arrest, because disgruntled residents were tired of being stopped for no apparent reason. The author argues that this WMD software contributed to the exorbitant numbers of minorities and poor that are currently unfairly incarcerated.
The last example from the book has to do with the credit scoring activities of marketers and others using Internet data. The author contends that the well-known FICO scores, used by credit card and other loan providers, are not a WMD. FICO is regulated and transparent to borrowers. A FICO score is based on the financial history of the borrowers themselves, not on that of others similar to them. However, these scores, while valid for creditworthiness, are often used for hiring, on the wrong assumption that a high score means a harder worker. This often leads to the wealthy getting jobs over poor applicants who really need the jobs. This feedback loop contributes to the rich getting richer, and the poor getting poorer! Meanwhile, ‘e-scores’ developed for marketing purposes include other factors in addition to the FICO ones. The biggest offender is the borrower’s zip code, because average loan-default rates vary significantly by zip code. An e-score based on this is definitely a WMD. It assumes that just because my neighbors default on loans, I have a high chance of defaulting too. Thus, poorer loan applicants may not get loans, or if they do, they are subject to higher interest rates. This increases the chance of poorer loan applicants defaulting because their payments are set too high. This is another vicious feedback loop of self-fulfilling policies that contribute to the rich getting richer at the expense of the poor.
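To make the self-fulfilling loop concrete, here is a minimal toy sketch in Python. It is not taken from the book: the function name, the linear pricing rule, and every number in it (`base_rate`, `risk_markup`, `sensitivity`) are illustrative assumptions, chosen only to show how pricing a loan off a zip code's default rate can push that same default rate upward.

```python
def simulate_zip_feedback(initial_default_rate, base_rate=0.05,
                          risk_markup=0.5, sensitivity=1.5, rounds=5):
    """Track a zip code's default rate when loan pricing depends on that rate.

    All parameters are hypothetical and purely illustrative.
    """
    history = [initial_default_rate]
    default_rate = initial_default_rate
    for _ in range(rounds):
        # The lender prices off the neighborhood's default rate,
        # not off the individual borrower's own financial history.
        interest_rate = base_rate + risk_markup * default_rate
        # Assumption: each point of extra interest adds 'sensitivity'
        # points of default risk on top of the starting default rate.
        default_rate = min(1.0, initial_default_rate
                           + sensitivity * (interest_rate - base_rate))
        history.append(default_rate)
    return history


poorer_zip = simulate_zip_feedback(0.10)
richer_zip = simulate_zip_feedback(0.02)
print([round(r, 3) for r in poorer_zip])
print([round(r, 3) for r in richer_zip])
```

Under these assumed parameters, both default rates rise round after round, and the gap between the two zip codes widens, even though nothing about the individual borrowers has changed.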
ANY WMD MODELS IN BUSINESS?
WMDs, as defined by Dr. O’Neil, are focused on the so-called destruction they can wreak on societies, especially with regard to minorities and the poor. Do we have WMD models in the business world? I would speculate that there aren’t many, because in business the focus is always on attracting and retaining loyal customers and working with the best of suppliers; certainly no harm is ever intended. Some business models appear to be WMDs (i.e., they ‘look-and-feel’ like them); however, at worst they might be models of ‘moderate distortion.’ Below are three examples.
Certainly Gartner’s Top 25 Supply Chains ranking, which started 13 years ago, looks and feels like USNWR’s Best Colleges rankings. Gartner uses six face valid weighted factors in rating supply chains: peer opinions, Gartner opinions, Return on Assets (ROA), inventory turns, revenue growth, and a newly added ‘Corporate Social Responsibility Score.’ While Gartner’s intent was to recognize and call out best supply chain practices with regard to its “Demand Driven Value Network” model, the supply chain community initially took it to be Gartner’s view of the 25 best or excellent supply chains. Like the Best Colleges report, it started a race to the top among managers. They sought more access to Gartner’s analysts, and put more focus on attaining higher scores. As I wrote in an article entitled “Competitive Supply Chains: Excellence” in Supply Chain Management Review magazine (Jul/Aug 2015), the ranking cannot definitively identify the most excellent supply chains. It only includes big companies, is too reliant on opinions based on little knowledge of detailed supply chain operations, and gives too much credit to a supply chain organization for revenue growth (for which it is not generally held responsible within its company). However, this does not make Gartner’s Top 25 a WMD. Its real intent is to stimulate healthy dialogue about what is possible, and it was created by supply chain experts.
It would only hurt a company if a supply chain organization gamed its way to the top without properly vetting the steps required to do so. These would need to be fully transparent to the overall company. The proper judge of whether a supply chain is excellent, or even needed for that matter, is the company itself, not a third party like Gartner.
Starting in late 2004, I wrote a three-part series on Sales and Operations Planning (S&OP) in the Journal of Business Forecasting (JBF). I wrote it because a lot of practitioners were asking for my advice about S&OP. The last column was “Sales and Operations Planning Part III: A Diagnostics Model” (Spring 2005). It provided managers with a four-stage S&OP Process Maturity Model that I felt was needed in support of a resurgence in S&OP, a process that had begun in the mid-1980s. The model was similar to those developed by consultants to assess how developed a company’s processes are, relative to an idealized process. Depending on what stage a company’s process was in, the consultants would recommend a path to be taken to get to the ideal process over time. The S&OP model assessed a company’s stage in terms of: how meetings were conducted; how integrated and extended the processes were; and the extent to which the enabling software technologies were integrated. Since the S&OP process involves collaboration and consensus building among the supply, demand, and financial sides of a company, I basically assumed, without any data supporting that premise, that getting to stage 4 would offer the greatest benefit. Was the model a WMD? I think not, because I believed (as did the industry) that full internal, and even external, collaboration would help companies achieve their financial objectives. So, a strong S&OP process would cause no harm, as long as a company installed a truly collaborative process instead of, for example, an extremely contentious one that turned out to be detrimental to its corporate culture.
In my Summer 2015 JBF column, entitled “Supply-Neutral versus Unconstrained Demand,” I discussed ‘unconstrained’ versus ‘constrained’ demand forecasts, and argued that supply chain planning should be based on truly ‘supply-neutral’ demand forecasts: ones that adequately reflect demand devoid of distortions related to supply surpluses, shortages, and other supply-related factors. Over time, forecasting demand that is not ‘supply-neutral’ might condition customers to demand products based on their availability, rather than on the customer’s true demand needs. I discussed several examples of these distortions that I had seen in my career. For example, many companies forecast customer demand from shipment data. In some cases, shipments are not the same as true demand, such as when supply factors cause customer orders to be filled imperfectly (e.g., not delivered on time, or delivered as split shipments). When this is the case, a shipment forecast is not the best representation of true future customer demand. Thus, while this type of forecasting model is not a WMD, it is a model of moderate distortion, especially if customers are conditioned to routinely accept imperfect order fulfillment. However, a shipment forecasting model can become a WMD when customers get sick and tired of this, and in the long run decide to buy from competitors.
In summary, while there is not much evidence that the business forecasting and planning community develops WMDs, it might develop models of moderate distortion, especially if it relies too heavily on face valid factors without checking for scientific validity. However, if a model leads to decision-making that results in the loss of customers, then in the business world we would regard it as destructive!
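As a small illustration of the shipment-versus-demand distortion discussed above, the Python sketch below compares a naive average forecast built from orders with one built from shipments. The order and supply figures are made up for illustration, the function name is hypothetical, and the sketch assumes that any order quantity that cannot be filled in a period is simply not shipped (no backorders).

```python
from statistics import mean


def shipments_from_orders(orders, available_supply):
    """Shipments are capped by what supply can fill in each period."""
    return [min(order, supply)
            for order, supply in zip(orders, available_supply)]


# Hypothetical four periods of customer orders and available supply.
orders = [100, 120, 110, 130]
available_supply = [100, 90, 110, 95]

shipments = shipments_from_orders(orders, available_supply)

# Naive forecasts: the historical average of each series.
demand_forecast = mean(orders)
shipment_forecast = mean(shipments)

print("shipments:", shipments)
print("forecast from orders:", demand_forecast)
print("forecast from shipments:", shipment_forecast)
```

Whenever supply falls short in some periods, the shipment-based forecast sits below the order-based one, so planning from it would build the past shortages into future supply.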
Re-posted with permission from the author and the IBF - originally appearing in the 2018 Journal of Business Forecasting | www.ibf.org
About Dr. Larry Lapide, Ph.D.
Dr. Larry Lapide is currently a Research Affiliate at the MIT Center for Transportation & Logistics, as well as a Lecturer at the University of Massachusetts: Boston. Dr. Lapide has over 30 years of experience in industry, consulting, business research, and academia. Recently he was the Director of Demand Management at the MIT Center for Transportation & Logistics (CTL). He also managed the launch of MIT's Supply Chain 2020 Project and is responsible for CTL's Strategy Alignment Workshop. He concurrently served as the Research Director for the Demand Management Solutions Group, a consortium of companies that sponsored a multi-year research project focused on developing advanced strategies, principles, and methods to optimally match supply and demand. Dr. Lapide is the recipient of the inaugural Lifetime Achievement in Business Forecasting & Planning Award.
About John Galt Solutions
Since its founding in 1996, John Galt Solutions has built a proven track record of providing affordable, automated forecasting and inventory management services for consumer-driven supply chains. We have an unmatched ability to configure tailored solutions for customers, regardless of size or business challenge, that save both time and money by compressing implementation periods and delivering intelligent information that positively impacts your bottom line.