Tuesday 15 May 2018

Bank of England - Big Data Guru

Where do you work? At my place of work the cool thing to do is to walk around and say let's get oodles of data on stuff and then analyse it. This methodology is apparently foolproof and will give us the answers to the known universe.


Of course, a little while ago I would sit in the techie meetings and say things like 'Does it scale?' whenever there was a period of silence. Many of the techies now see me as some sort of wise counsel, in the same way I still see them as a naïve group: very able at doing things but truly unable to understand why!


So Big Data is the new thing: the power of computers is unmatched and we will soon all learn it is about the data. Of course, in the City, there are a huge number of quantitative analysts who are very happy with any developments that improve their job prospects.


But data, to me, has two huge issues: accuracy and interpretation. More bad data is still no better than less bad data - perhaps even worse. Interpretation is key - are sales falling because of a declining market or an inefficient sales force - can data tell us this? What if the data says one thing but it turns out later it was the other?


In the Great Financial Crash, data was telling the CEOs of banks that their risk of failure was tiny - then lots of banks went bust, quickly. The data had been spectacularly misinterpreted.


Anyway, the link is to an article where the Bank of England is going to try to use a wider data set to make decisions. This of course is a waste of time: I can quite happily predict what the Bank is going to do with very limited data - it is going to set rates nice and low to keep the zombie economy going and allow the Government to create too much debt. It has been doing this for over a decade with no sign of change - how is a bigger data set going to make this a different decision?


Of course it won't and this is some sham PR exercise, as per usual.

19 comments:

Lord Blagger said...

So here's some questions

State has created too much debt. Currently at 13 trillion for the UK, civil service pensions etc included as well as the borrowing.

1. When does it go tits up?

2. How do you defend yourself against it going tits up?

3. How do you profit from it going tits up?

dearieme said...

Gold.

hovis said...

Big data - no different from anything else in computing - GIGO.
The data problem you hint at is interesting: data will be skewed by what is captured, which is itself a function of pre-existing assumptions about how the world works.

Lots of overblown bullshit about AI too - much of the advance is nothing but brute-force calculation over massive data, with very little intelligence involved. Of course in both cases you cannot expect people not to big themselves up - who would not talk their own book - but that is still no reason to believe the fantasy claims.

Totally agree there will be no rate rises. I cannot see any way this would be politically acceptable; there will be no change without a truly existential crisis. There will, however, be a lot of theatre about normalisation - it's not going to happen.

Lord Blagger - my answers:
(1) We are 10 years in; this is a bit like the house price crash that was mooted from at least 2000 - perhaps another 10 or 20 years? (No basis for this figure, but as good as any other estimate.)

(2) & (3) How to defend and profit - surely depends on how serious you think the crisis gets...

Anomalous Cowshed said...

Big data relies upon statistical concepts, but means and standard deviations are artefacts of the data: the chance of a large number of datapoints being bang average is basically nil, unless you have a highly standardised/commoditised process generating the data in the first place. You can increase the chances of finding average datapoints by massively increasing the dataset, or by improving the precision of your measurements. It seems highly likely that big data systems will trend towards being accurate for a smaller and smaller portion of the data; they'll have a tendency to fail, quite probably catastrophically, for an increasing amount of the set.
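A quick back-of-envelope illustration of the tail problem, using simulated bell-curve data (nothing to do with any real dataset): the mean stays put, but the bigger the sample, the further its most extreme point sits from "bang average" - so the bit of the data a mean-based model describes worst keeps getting worse.

```python
import random

random.seed(0)

# Simulated measurements from a plain normal distribution (mean 0, sd 1).
# The most extreme observation drifts outwards as the sample grows,
# roughly like sqrt(2 * ln(n)).
maxima = []
for n in (100, 10_000, 1_000_000):
    xs = [random.gauss(0, 1) for _ in range(n)]
    maxima.append(max(abs(x) for x in xs))
    print(f"n = {n:>9,}: most extreme point is {maxima[-1]:.1f} sd from the mean")
```

More data doesn't tame the tails; it guarantees you'll see more of them.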

An awful lot of the calls for "more data" seem to flow from not understanding the process generating the data in the first place.

Whatever problems the BoE has, I'm reminded that Haldane, I think, went on a tour of the regions within the last two years, and discovered that regional issues were not showing up in national data. Yup, scale. So some of the BoE's actions, derived from national aggregates, might well have been positively harmful at regional level.
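A made-up illustration of how a national aggregate can hide exactly that (the regions and growth numbers below are invented for the sketch, not real ONS figures):

```python
# Hypothetical year-on-year wage growth by region (%); invented numbers.
regions = {
    "London":     3.0,
    "South East": 2.0,
    "Midlands":   0.5,
    "North East": -1.5,
    "Wales":      -2.0,
}

# A simple unweighted national average (real statistics would weight by
# employment, but the point survives either way).
national = sum(regions.values()) / len(regions)
print(f"National average: {national:+.1f}%")  # looks benign

for name, growth in sorted(regions.items(), key=lambda kv: kv[1]):
    flag = "  <- falling wages, invisible in the average" if growth < 0 else ""
    print(f"  {name:<10} {growth:+.1f}%{flag}")
```

Set policy off the +0.4% headline and you'd tighten into regions where wages are actually falling.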

Still, more data eh?

Charlie said...

"regional issues were not showing up in national data"

This is why proper scientists hate averages.

Anomalous Cowshed said...

Good grief, I'm a proper scientist!

Jan said...

It always amuses me that 0.1% or less is considered to be meaningful in terms of measures of inflation/wage growth/productivity/you name it......one month's data on its own doesn't tell us much especially when much of the data isn't captured anyway eg self-employed in employment stats etc

I've no idea how TPTB set about collecting and analysing the data they do have but I'm sure as hell it can't be as accurate as they make out.

Steven_L said...

Here in the council, I can confirm we're just as interested as ever in collecting and analysing your data. And because we're doing this for the purposes of protecting you all from yourselves we don't even need to ask your permission or tell you we're doing it.

andrew said...

As always, Depends

Ex-GF was a contract PM on a DB migration project for a large insurance co in Swindon. They had a spreadsheet of ~1,000 IFAs that was looked up daily, but not constantly (and not backed up), on a PC.
GF advised Access on a terminal server.
Some guru asked if it scaled.
$large_it_outsourcer that did the IT said 3 dedicated servers (live/test/DR) = £108K + £15K pa for a scalable solution.

On big data: I got ~10 years' worth of FTSE data and was able to fit a function that predicted the next day's movement with ~78% accuracy.
Except it did not work nearly that well on future data.
In large multidimensional data sets it is soooo easy to pick / find / derive 'significant' results that aren't really significant - or are, but with the causes incorrectly identified.
As others have mentioned, you also need to be very careful about the questions being asked in the first place.
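The same trap is easy to reproduce with toy data. A minimal sketch, using a made-up coin-flip 'market' rather than real FTSE prices: memorise the majority next-day move for every 8-day pattern, and in-sample 'accuracy' looks well above chance, while out-of-sample it collapses back to a coin toss.

```python
import random

random.seed(42)

# A pure coin-flip "market": +1 = up day, -1 = down day. By construction
# there is nothing real to learn.
moves = [random.choice([1, -1]) for _ in range(2000)]
train, test = moves[:1000], moves[1000:]

K = 8  # look back 8 days: 2**8 = 256 patterns vs ~1000 training days


def fit(series, k):
    """'Learn' the majority next-day move for each k-day pattern."""
    table = {}
    for i in range(len(series) - k):
        pattern = tuple(series[i:i + k])
        table.setdefault(pattern, []).append(series[i + k])
    return {p: (1 if sum(nxt) >= 0 else -1) for p, nxt in table.items()}


def accuracy(series, k, table):
    """Fraction of next-day moves the table predicts correctly."""
    hits = total = 0
    for i in range(len(series) - k):
        pattern = tuple(series[i:i + k])
        if pattern in table:  # unseen patterns are skipped
            hits += table[pattern] == series[i + k]
            total += 1
    return hits / total


model = fit(train, K)
print(f"in-sample accuracy:     {accuracy(train, K, model):.0%}")
print(f"out-of-sample accuracy: {accuracy(test, K, model):.0%}")
```

With 256 possible patterns and only ~1,000 training days, each pattern is seen a handful of times, so the lookup table is mostly memorising noise - which is exactly what a flexible function fitted to 10 years of prices does too.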

CityUnslicker said...

Andrew - of course; hence The Hitchhiker's Guide to the Galaxy, with the best example of all time: 42 being the answer, but what is the question?

Bill Quango MP said...

Robert McNamara - Harvard, Ford, USAAF - practically invented data analysis. He trained WW2 Army Air Force officers in the business skills of efficiency, organisation and statistical interpretation.
When he was made Secretary of Defense by Kennedy, he attempted to run the entire military on business lines.
And data was the key.

But during Vietnam so much data was collected - rice caches found, people with leftish leanings in former French administrative roles, number of VC rubber-soled shoes discovered per 100 acres of jungle, etc. - that it was mostly useless.

Tons and tons of “hard data” poured into MACV. And little of value came out.

The pacification programme said village region X was 32% pacified. But no one knew what that meant. Was that a win? And MACV said, sure it was, because before it was only 22% pacified.
But 32% only covered the enclosed villages, the regional headquarters, and the land the GIs were brushing through that day.

Everyone cited figures throughout the entire war. This much captured. That much taken. This many killed.
Most of it was wrong data going in. And worse, it was wrongly interpreted coming out.

It’s why the USA was 100% unaware of the Tet offensive when it came. And why, up until then, they thought they were winning quite handsomely.

McNamara was president of the World Bank after being the longest-serving Secretary of Defense in America's history.
He did a better job there.

Anonymous said...

Erm, where is the link??

Anomalous Cowshed said...

Yeah, what did kick this off? Was it the Haldane speech of 30 April? Which is here https://www.bankofengland.co.uk/-/media/boe/files/speech/2018/will-big-data-keep-its-promise-speech-by-andy-haldane.pdf

CityUnslicker said...

Thanks Cowshed, I did forget the link and that is the appropriate one!

Sebastia said...

Data gives a figleaf of rationality to what are basically gut feel matters of judgement. Bad managers love lots of data. Something to blame when you find you have got it wrong.

One of my early mentors told me if you are right 51% of the time you will have a great career. If you are right 49% of the time you won't.

Sebastian Weetabix said...

wtf happened there?? Posted before I finished. Anyhoo, the best thing governments can do is emulate J.J. Cowperthwaite and not collect any data. If you don't collect it you can't attempt to incompetently manage it - and the free market and individual liberty will take care of everything.

Shame it never caught on.

Electro-Kevin said...

Analysis paralysis is the aim.

Lord T said...

Big data is just one of the latest buzzwords engineered to separate customers from their cash as they come to the end of big projects. Same as Cloud and Cybersecurity.

Anonymous said...

OT, but anyone heard from the Skripals lately? You'd think HMG would want to keep the Russia drum a-beating.

(more big transports than usual flying into Fairford US airbase, and RIAT isn't for a couple of months. Is something planned?)