Ian's Blog: March 2013

Thursday 28 March 2013

More Maxims for Privacy

I wrote some maxims for privacy a while back, but I'm still looking for that crystallisation of privacy - the "one slider" or "one sentence" that encompasses the foundations, or the basics, of privacy.

For the most part privacy comes down to data collection and subsequent usage of that data - the rest is just additions to that. At least if we're concentrating on privacy and not the wider scheme of information management of which I believe privacy is just a sub-speciality, albeit a rather important one.

So, when dealing with information:

"If you don't have a use for it,
don't collect it!"

For me this sums it up. It links data collection and usage in a way that clearly says: just collect what you need right now; don't even think about future uses yet. Interestingly, I also think that deliberately restricting data collection right at the start to what you absolutely know you are going to use immediately forces you (the software/system developer) to better focus on the product at hand - "Slow Data" anyone?
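As a rough illustration only, the maxim could look something like this in code: every field that gets stored has to name the use it has today, and anything else is dropped at the door. The field names, the stated uses and the `collect` helper are all made up for this sketch.

```python
# Hypothetical allow-list: each field we keep must name the use it has right now.
FIELDS_IN_USE = {
    "email": "send the order confirmation",
    "postcode": "calculate the delivery cost",
}


def collect(submitted):
    """Keep only the submitted fields that have a declared, current use."""
    return {name: value for name, value in submitted.items() if name in FIELDS_IN_USE}


# Made-up form submission for illustration.
form = {
    "email": "reader@example.com",
    "postcode": "00100",
    "date_of_birth": "1970-01-01",  # no use for it today, so it is never stored
}

stored = collect(form)
print(sorted(stored))
```

Adding a new field then means first writing down what it is for, which is the whole point of the maxim.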

In these agile development days however, it is often argued that we'll develop the usages later and collect everything now. How often does this really happen? And, if you really were agile then you'd construct your system initially to do the minimum it needs to and get that out to the customer for their appraisal. If that goes well (or not), then modify as necessary during later stages in your agile development process.

If you're not agile (which apparently is waterfall, though I'm not sure), then you should have worked out what you need and work from there; which again should be self-limiting on the data collection. Surely in a good, fully worked out design you wouldn't be collecting superfluous things?

So, that's it, the essence of privacy as a single sentence; the rest is just layers pertaining to things such as provenance, data retention, purpose, infrastructure etc - that's what makes good information management a whole discipline in itself, of which privacy is one small, but important part.

Thursday 21 March 2013

Has space exploration become boring?

At the risk of invoking Betteridge's Law of Headlines - obviously space exploration isn't boring - and with Voyager 1 maybe leaving the Solar System, I was wondering what happened to the romance of space exploration.

Does anyone still sit up until the early hours of the morning, as I did when Giotto encountered Halley, amazed at Uranus' bizarre collection of moons, fascinated by the existence of nitrogen geysers on Triton and volcanoes on Io; does anyone (other than scientists working at NASA, ESA, JAXA etc) get overly excited these days at pictures from Mercury, Vesta etc?

Does anyone dream of what the Ice Giants explorer might have found at Uranus, or what creatures might live in the ocean under Europa's icy crust?

I remember (pre internet days) desperately waiting for pictures of Neptune, Triton, Miranda, Titan etc to appear in newspapers, books, news broadcasts. Even back in 1992 there was the joy of connecting to NASA ftp servers to download Voyager and Pioneer pictures of Saturn and its enigmatic, orange cloud-enveloped moon Titan on the only Sun workstation with a colour display the university had, over a slow internet link. Watching in fascination as the picture was displayed line by line, and possibly imagining oneself at JPL watching those raw pictures being received at Earth.

The Register has an article from yesterday on Voyager 1 (yes, still going since its launch in 1977!) which has the paragraph (emphasis mine):

Probably the most-loved survivor of 1970s space optimism, Voyager, has sent back signals indicating that it's left the heliosphere.

Maybe this is it, in the 1970s we were optimistic - there were many missions planned: Pioneers 10 and 11, followed by Voyagers 1 and 2 to complete the Grand Tour of the Solar System; later with the first missions to comets, landers on Venus and Mars.

Maybe science just took center stage for a brief moment only to be replaced with the need for fame and appearing on X-Factor? Maybe a picture of the creme brulee surface of Titan from a small lander piggybacked on a probe that made a multi-billion mile tour via Venus, Earth, the Moon, an asteroid or two, Jupiter and finally to Saturn, just doesn't compete against today's media offerings?

How can you not be amazed by pictures like this - think about what you're looking at and what it took to get those pictures for a moment!

[Image: Wikimedia Commons]

On the other hand a grainy picture of Titan from one of the Voyager probes offered mystery and a challenge to be solved - what is under those clouds? - now we get pictures of sand grains on Mars. Have we accidentally removed the mystery? Or, have we lost the big exciting picture to a mass audience? A third possibility is that science is either not understood, or just can't compete with a crass, exploitative talent show...

Space exploration in any form is exciting...just listing some of the current probes:

Dawn is on its way to Ceres after a successful encounter with Vesta.
Messenger has completed mapping all of Mercury's surface and turned up just one or two (or freaking lots of!) major mysteries
Cassini is still going strong around Saturn
Juno on its way to Jupiter
Venus Express still examining Earth's "twin"
Numerous orbiters around Mars and not forgetting two (yes TWO!) working rovers on the surface
Rosetta is still on its journey to 67P/Churyumov–Gerasimenko.
China's moon probe made a detour to visit a near-Earth asteroid
Hayabusa returning samples from an asteroid
New Horizons still speeds to its all too rapid fly-by of Pluto and its now five moons (incidentally traveling at approx 15km per second or 34000mph)
etc etc...
oh, not forgetting Voyager 1 and the rest...

Now tell me why that isn't exciting? Maybe our media needs to reacquire its love affair with exploration and science and stop feeding minds with talentless shows...

Monday 18 March 2013

Privacy not needed?

I'm told that we have no privacy anymore...none whatsoever...and this is why we don't need to review products, services, applications from a privacy aspect. You can argue the same for security and performance...just add memory and cores OK, or, use a longer encryption key?

But actually this touches on many aspects of what privacy is, and I'm of the opinion that what privacy offers the consumer or user is the ability to actively choose what is collected and to what purposes that data might be put.

Overall, however, this places privacy in a very small subset of information management and this is where we really need to turn our focus. Remember the "good days" when databases would be normalised and we worried about the quality of data? Privacy, for all its faults and detractors, returns us to a point where we need to think about what data we're collecting, how we're storing it and for what purposes it is being used - that's good information management.

Saturday 9 March 2013

What has surgery got to do with information privacy?

I'm addicted to books and reading and Amazon knows this - I'm willing to give up quite a lot of my privacy for good book suggestions. I went to Amazon to find a copy of Atul Gawande's The Checklist Manifesto and ended up buying his other two books as well: Better and Complications. I received them two days ago and I've finished Checklist and Better and am just starting on Complications - compulsive and utterly fascinating reading about Gawande's insights into his work, surgery and medicine in general.

So why is a computer scientist reading this? Simply because we need more discipline and communication in this field. Surgery has cottoned onto this and is following the safety-critical practices of aviation to improve.

Performing audits, especially those which require a deep look inside a system, such as privacy or security audits, is remarkably similar to surgery.

We receive a system for audit, sometimes we get a description and a good idea of what to do, sometimes not. We need to diagnose the system, quite literally probing and performing tests and hoping we don't miss something: an insecurely calculated hash or a hidden transformation of an IP address into a location etc.

We then report back to the system owner with our diagnosis and treatment: hash this, destroy this data, stop collecting x,y and z, add this to the T&C's, add an opt-out, go for a security check etc etc...

We don't always know what we'll find until we open the system up. And like surgery, opening a computer system up is as painful for the patient as it is for the engineer.

Tuesday 5 March 2013

Category Theory for Scientists

Had to quickly make a note about this as I think David Spivak in his paper (book!) Category Theory for Scientists has written a wonderful guide to what category theory is and how it can be used outside of its topological home. I've always held that the manner of thinking required by category theory provides an incredible toolkit for conceptualising and working with all sorts of structures and concepts, so I'm very, very happy to see something like this.

The only other major work in a similar vein is Baez and Stay's deeply mathematical paper Physics, Topology, Logic and Computation: A Rosetta Stone.

Weighting Metrics

I'm reading Richard Feynman's book What Do You Care What Other People Think? [1] - a fascinating account of the things that Feynman did and believed in: the power of science and the experiment (there's even an xkcd cartoon about that).

Feynman worked on the Challenger Commission which investigated why the shuttle Challenger exploded and concluded with the discovery of the O-ring failure in one of the solid rocket boosters. One of the most memorable incidents was Feynman's live O-ring in ice water experiment.

However, after dealing with metrics on various issues recently, one paragraph in the book stood out: Feynman discovers the results of a go or no-go decision on the state of the O-rings under cold conditions. There are four named experts and four answers: 2 x no, 1 x yes, 1 x don't know - which effectively splits the vote 50-50 (for some reason don't know = yes).

However, Feynman points out that the foremost experts on the properties of the O-rings both stated no and one of the four experts was not present at the original meeting. Taking this into account we get the following: 2w x no, 1v x yes, 1u x don't know, where w > v > u. Since w > v and w > u, it follows that 2w > v + u, so simple mathematics returns not a 50-50 split but a split where the no vote would overwhelm (even if only by a microscopic margin) the yes/don't know combined vote.

Suffice to say that the weighting of the inputs into the calculation was critical to getting the right results. This is not to say that finding the weights is easy, but as we see in the case above even a simple ordering would have sufficed.
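The tally can be sketched in a few lines of Python. The votes are keyed by the symbols w, v and u used above; the numeric weights are arbitrary values chosen only for illustration (they are not from the book), and any values respecting w > v > u lead to the same outcome.

```python
def tally(votes, weights):
    """Sum the weight behind each answer.

    votes:   list of (weight_symbol, answer) pairs
    weights: mapping from weight_symbol to a number
    """
    totals = {}
    for symbol, answer in votes:
        totals[answer] = totals.get(answer, 0) + weights[symbol]
    return totals


# The four answers: 2w x no, 1v x yes, 1u x don't know.
votes = [("w", "no"), ("w", "no"), ("v", "yes"), ("u", "don't know")]

# Unweighted: every expert counts as 1, and "don't know" is counted with "yes".
flat = tally(votes, {"w": 1, "v": 1, "u": 1})
go_flat = flat["yes"] + flat["don't know"]

# Weighted: illustrative values only, chosen so that w > v > u.
weighted = tally(votes, {"w": 3, "v": 2, "u": 1})
go_weighted = weighted["yes"] + weighted["don't know"]

print("unweighted  no:", flat["no"], " yes + don't know:", go_flat)
print("weighted    no:", weighted["no"], " yes + don't know:", go_weighted)
```

With equal weights the two sides tie; with any ordering of the weights the no vote comes out ahead.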

The metrics are simple; the relevance and weighting are unfortunately forgotten, and it is these that really tell you what the metrics mean and how to analyse them.

References

[1] Feynman R. (1988) What Do You Care What Other People Think? Penguin Books. ISBN 978-0-141-03088-3