ABOUT THE ECONOMIC DATA PROBLEM

We have already discussed here that modern economic statistics are, to a significant extent, “unmodelable” by the standards of ML and deep learning, due to low frequency, narrow coverage, methodological inconsistencies, and frequent and large revisions. Today, the GS US economics team released a brief note of a few pages describing where we are now data-wise in economics. The summary page is below, but the key message is even shorter: the economic dashboard today is strikingly similar to that of 1970. Of course, such a situation is hardly acceptable. While GS envisions a slow road of incremental changes to economic data harvesting, such an approach will leave economics far behind modern analytical and predictive advances. Modern data science always starts with data. There is no way around it. That’s why the name of this blog is ‘Data-Economist’, not ‘ML-economist’, for example. And it is exactly the lack of data that makes the few ML and deep learning economic papers that exist of so little practical value. That has to change, and sovereign states and supranational bodies should deliberately start a data revolution in economics. Incremental changes will most likely be too little, too late.

Now, the GS note:

The Big Data Revolution in Economic Statistics: Waiting for Godot… and Government Funding (Hill)
6 May 2018 | 8:54AM EDT

The age of electronic information has fundamentally changed the economy and the ways we consume, communicate, and conduct business. But in terms of the mainstream approach to tracking the economy, it sometimes feels like nothing has changed. To paraphrase Charles de Gaulle: Big Data is the future of macro… and always will be.


The reality, of course, is more nuanced. Statistical agencies have made notable strides in recent decades: nearly 15% of the CPI is now collected online, release lags for several key indicators have been shortened, and new indicators like the Census business formations statistics and the Billion Prices Project will help signal future inflections in the economy.


While such gradualism may seem disappointing in the era of Instagram and instant gratification, the constraints faced by statistical agencies are significant. Budget pressures are acute, with funding for the Bureau of Labor Statistics (BLS) down 12% in real terms over the last decade. On top of funding constraints, the statistical agencies generally lack the legal authority to obtain access to new and valuable datasets—even from other government agencies.


Why does this matter? On average since 2005, estimates of quarterly GDP growth have been revised by 1.0 percentage point, and during the depths of the last recession, the official payroll figures understated the magnitude of annual job loss by over 900k. This underscores the value of new information sources, to the extent they are additive and cost-effective.


In the next 1-3 years, we expect the CPI to expand online data collection and possibly migrate the source data for the new vehicles index from a traditional sample to a major aggregator of automotive industry data. At the Census, the use of credit card data to complement existing surveys is more uncertain but also possible. The BLS is also exploring the creation of a worker-level wage dataset that could ultimately cover as much as 97% of the workforce and could improve the personal income statistics, among other uses. Incorporating Big Data on business formations, new hires, and tax withholdings into the payroll estimates is also possible over the medium term, though it would likely require legislative action.
