Tweet coverage of the 2016 Bot Summit at the V&A in London

I was at the 2016 Bot Summit in London a couple of weeks ago. I did my best to capture salient points from every talk in a tweet. Here are all of them in order.

OSM Live Edit Screensaver

I’ve been running a live OpenStreetMap edits view as a screensaver for a couple of years now and it never fails to draw the attention of people in the room (whether they know what OSM is or not). The OSM visualization is pretty cool and really comes to life when it is displayed full screen. It is also a great way to see a part of the world you might not have known existed. I used to browse atlases when I was a kid, so this is me indulging in virtual travel again.

Will pointed out to me that I shot a video of it but never wrote up the super basic process behind it, so here goes.

What it looks like:

This must have been the tweet by Thomas that started it all in early 2013.

After I read that I fiddled around a bit with making my own screensaver in Xcode. That seems simple enough, but building stuff on OS X is a bit of a pain if you’re used to iOS and definitely not something you’ll be able to finish in an hour or so. It turns out that there is a far, far easier way.

  1. Install the webviewscreensaver. Thanks to Alastair Tse.
  2. Plug the URL of the live OSM view into the screensaver. This one: http://osmlab.github.io/show-me-the-way/
  3. Enjoy.

Reading more about the social

As of right now I’m frightfully behind on my Latour MOOC. What I have been doing instead is reading old articles from my Instapaper queue. One of those is this interview with Dutch sociologist/philosopher Willem Schinkel in Vrij Nederland. It’s good to read a fresh Dutch thinker who seems to understand things (and who is also in with Latour). Calling Geert Wilders a proto-fascist and the Netherlands a museum are only a couple of the zingers in there.

The disappointing bit came at the end, where he confessed to not having a cell phone out of principle. That is a terrible bit of intellectual laziness, which brings me to this point on Sloterdijk by Adam Greenfield that rings true:

The task before us is to discover, or invent, a politics, a mobility and a conviviality that are both authentic to the circumstances in which we find ourselves and capable of giving full expression to the emancipatory potential that remains latent and unrealized in our networked technologies.

Week 322

So it turns out I’ve fallen immensely behind with the weeknotes over here, but we did start writing them at the new office, which should make up for it somewhat. Those currently live at http://kantberlin.tumblr.com/.

What happened that week was a bunch of work and getting a desk from IKEA to work on at the new place:
Upgraded to a small-ish roller desk

The Möbeltaxi driver took us on an interesting shortcut through the old service tunnels of Tempelhof —I am amazed that Moves tracked it as well as it did— which might be fun to do some urban exploring in at some point:
Secret route through the service tunnels of the old airport. Useful when there is traffic which is more or less always.

Back then we were still drinking some horrible leftover coffee brewed in a two step process:
The morning press and filter

And I had a talk coming up at Bits of Freedom that I sketched out on our brilliant new whiteboard:
I gave the new whiteboard a proper exercising for an upcoming talk

I promised the people future shock and I think I delivered that to some extent.

Don’t release anonymized datasets

There is no such thing as an anonymized dataset. Anybody propagating this idea, even tacitly, is doing a disservice to the informed debate on privacy. Here’s a round-up of some recent cases.

Re:publica

Just today Berlin visualization outfit Open Data City published a visualization of the devices that were connected to their access points during the Re:publica conference earlier this month. The visualization is a neat display of the ebb and flow of people in the various rooms during the event.

It is also a good attempt to change the discourse about data protection in Germany. That discourse tends to be locked into an absolutist stance where ‘nothing is allowed’ without a ton of waivers. Because of that hassle, a lot of things that could be useful never get implemented. A more relaxed approach with case-by-case decisions would be better. In the case of Re:publica there does not seem to be any harm in making this visualization or in releasing the data (I uploaded it to Fusion Tables here).

What I do find a disservice to the general debate is the release of ‘pseudonymized’ data where the device IDs have been processed with a salt and hash. The identifying characteristics have been removed, but the IDs are still linked across sessions, making it possible to tie identities to devices and figure out exactly who was where and when during the conference.
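A toy sketch of why this is a problem (the salt, device addresses and rooms below are made up, and this is not the actual processing that was used): salting and hashing hides the device ID itself, but the same device still maps to the same pseudonym every time, which is exactly what keeps the sessions linkable.

```python
import hashlib

# Hypothetical illustration: pseudonymize device IDs with a fixed salt.
SALT = "republica-2013"  # made-up salt

def pseudonymize(device_id: str) -> str:
    return hashlib.sha256((SALT + device_id).encode()).hexdigest()[:12]

# The same (made-up) device seen in two different rooms:
sessions = [
    ("stage-1", "aa:bb:cc:dd:ee:ff"),
    ("stage-7", "aa:bb:cc:dd:ee:ff"),
]

for room, device in sessions:
    print(room, pseudonymize(device))

# Both lines print the identical pseudonym. Anyone who can tie that
# pseudonym to a person once (say, by knowing where they gave a talk)
# can reconstruct their whereabouts for the rest of the conference.
```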

To state it again: at a professional conference such as Re:publica there would in all likelihood be no harm done if the entire dataset were de-anonymized. The harm lies in the pretense that processing a dataset in this way and then releasing it with the linkage across sessions intact is a good idea.

Which brings me to my next point.

Equens

Yesterday Equens, the Dutch company that processes all payment card transactions, announced a plan to sell these transactions to stores. Transactions would be anonymized but still linked to a single card. This would make it trivial for anybody with a comprehensive secondary dataset (let’s say Albert Heijn or Vodafone) to figure out which real person belongs to which anonymized card. That last fact was not reported in any of the media coverage of the announcement, which is also terrible.
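A hypothetical sketch of that linkage attack (all names, stores, timestamps and amounts below are made up): a retailer only has to match its own identified till receipts against the ‘anonymized’ stream on timestamp and amount to learn which card pseudonym belongs to which customer.

```python
# Made-up "anonymized" transaction stream: pseudonymous card, store, time, amount.
anonymized_stream = [
    {"card": "card-7f3a", "store": "AH-Amsterdam", "time": "2013-05-20T17:03", "amount": 23.15},
    {"card": "card-7f3a", "store": "Pharmacy-Utrecht", "time": "2013-05-21T09:41", "amount": 11.50},
]

# The retailer's own till receipts, tied to a known loyalty-card customer.
loyalty_receipts = [
    {"customer": "J. Jansen", "time": "2013-05-20T17:03", "amount": 23.15},
]

card_to_person = {}
for receipt in loyalty_receipts:
    for tx in anonymized_stream:
        if (tx["time"], tx["amount"]) == (receipt["time"], receipt["amount"]):
            card_to_person[tx["card"]] = receipt["customer"]

# One match is enough: every other transaction on that card, at any store,
# is now attributable to the same person.
print(card_to_person)  # {'card-7f3a': 'J. Jansen'}
```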

After a predictable uproar this plan was iced, but they will keep on testing the waters until they can implement something like this.

Today Foursquare released all real-time checkin data but with suitable anonymization. They publish only the location, a datetime and the gender of the person checking in. That is how this should be done.

License plates

Being in the business of opening data, we at Hack de Overheid had a similar incident where a dataset of license plates was released in which the plates had been MD5’ed without a salt. That made it trivial to find out whether a given license plate was in the dataset.
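A minimal sketch of why an unsalted hash buys you nothing here (the plates below are made up): the space of valid license plates is small enough to enumerate, so anybody can hash a candidate plate and test it against the published digests.

```python
import hashlib

def md5_plate(plate: str) -> str:
    return hashlib.md5(plate.encode()).hexdigest()

# Pretend this is the published dataset: only digests, no plates.
# (Plates here are invented; the real dataset was of course much larger.)
released = {md5_plate(p) for p in ["12-ABC-3", "45-DEF-6"]}

# Anyone holding a plate of interest can check membership directly, and
# because the plate space is small, enumerating all valid plates to
# reverse the whole dataset is cheap too.
candidate = "12-ABC-3"
print(md5_plate(candidate) in released)  # True
```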

The dataset was quickly fixed. Again, this is not a plea against opening data (which is still a good idea in most cases) but a plea for thinking about the things you do.

AOL search data

The arch-example of poorly anonymized search data is of course still the AOL search data leak from back in 2006. That case has been extensively documented, but not extensively learned from.

Memory online is frightfully short, as is the nature of the medium, but that becomes annoying when we want to make progress on something. Maybe it would be better altogether to lose the illusion that progress on anything can be made online.

For the privacy debate it would be good to keep in mind that the increasingly advanced statistical inference available means that almost all anonymization is going to fail. The only way around this is to not store data unless you have to or to accept the consequences when you do.

Who owns the future?

In Conversation: Jaron Lanier and James Bridle On Who Owns the Future? from The School of Life on Vimeo.

I have just watched the above conversation between Jaron Lanier and James Bridle in Conway Hall, organized by the School of Life. The event marked the occasion of Lanier’s new book “Who Owns The Future?” (Guardian review) and the conversation focused on some interesting ideas from it. I will probably not read the book itself, but the things said in the video can be taken on their own, and though they are provocative they do not motivate me to give Lanier any money.

The main issue is that while Lanier signals some interesting problems (he’s not alone: Om Malik just posted this about Data Darwinism), he makes some terrible comparisons and posits solutions that are wholly unconvincing.

Problems

Lanier’s big idea is that those with the biggest computers on the network (and the largest collection of brains to program those computers) are in danger of becoming the rentiers of big data. They will be able to out-compute everybody else and figure out what Gibson called the ‘order flow’ in his Blue Ant trilogy: the best set of actions given the circumstances.

That is an interesting if not exactly novel idea. It serves as a jumping-off point into some outright crazy ideas about intellectual property. Lanier compares the contraction created by the current austerity measures with what is happening in the music industry. This is a ridiculous comparison. Even if it did hold, whatever is happening there is an overdue correction to a situation that was unsustainably overleveraged.

In the same vein he waves around the scarecrow that ‘the economy will shrink’, a notion that will undoubtedly play well with the same audience that is inclined to buy his book. Rhetoric about shrinking economies is almost always a phantom. Economic shrinkage may very well be in our near future, and it does not necessarily need to be a bad thing.

Lanier’s point that people are forced into an informal economy is valid, but it speaks more to the failure of social systems than anything else. The social democratic contract that may be inconceivable to Americans is working quite well in Europe. It may need updating, both for changing demographics and for the digital age, but I don’t think many people here would trade it for what Lanier is peddling. As I mentioned in my data tax post, we don’t have the problem of musicians who can’t pay their medical bills.

Solutions

The proposed solutions are even more problematic (though if you’re so inclined you might term them ‘thought provoking’).

Lanier seems overly influenced by the music industry and by the concept of private copyright. I would assert that the music industry, with its track record, is not something worth emulating. The sky is not falling in the music industry. They are facing a long overdue re-evaluation of their social contract because their carrier of value has lost its excludability. There are still lots of people making music and thriving.

Lanier seems to roughly comprehend how a just society should work (‘For society to be democratic, income needs to be distributed in a way that is roughly a bell curve.’), but at the same time he seems confused about how it should be implemented (‘Socialism needs to be off the table in the information age.’).

The bidirectional reference networks that Lanier proposes that preserve the context and provenance of data sound fantastic. There are however real reasons why we are doing the ‘profoundly dumb thing we are doing’ instead. His network sounds awfully similar to the idea of the semantic web, where everything online will work perfectly if only we would do it The Right Way (which we of course never will).

His suggestion to ‘become as aware as possible of how you fit in other people’s computation schemes’ is a good one. It is the same algorithmic literacy pointed to in the work of Kevin Slavin, Douglas Rushkoff and James Bridle himself.

I’m afraid that Lanier’s rhetoric of a ‘more honest accounting’ will play particularly well in Germany where similar words are already being used to take Google to court. Germany passed a Leistungsschutzrecht (ancillary copyright for publishers) because they figured out that large American companies were making outlandish amounts of money based on the work of large German publishing houses.

The conversation about a fair distribution of wealth in a power-law based networked economy is one we need to have. I doubt, though, that this particular book is a good starting point for it. Lanier’s cultural foundations point us towards a solution that is at best unrealistic and tries to extrapolate the problematic private notion of copyright to society as a whole.

The data tax I wrote about yesterday is an approach from a more public point of view. It would focus more on personal data, and the revenue generated from such a tax would go to government, where it would be subject to democratic controls. Ideas that won’t fly well with Lanier’s Silicon Valley crowd, but maybe that’s all the better.

Taxing data is not crazy

There are some interesting similarities between a recent proposal commissioned by the French government and Jaron Lanier’s new book “Who Owns The Future?”

Both analyses signal the dominance of corporate actors in a big data world and both suggest new methods of taxation as a potential solution to the problem. An article over at Forbes explains the commission’s proposal by Nicolas Colin and makes a lot of sense.

The French report has been received with predictable knee-jerk responses across the tech world. It is true that governments have not been very good at regulating the internet. But not regulating the internet is not a solution. We could hope for representation that is competent when it comes to the digital world.

The companies that create the internet should not cry foul. They have a track record of evading taxes rather than contributing their fair share back to society.

I’ll tackle Lanier’s position in another post. I just watched the conversation he had with James Bridle in Conway Hall and noticed some errors in Lanier’s ideas: they require a fully functional semantic web, they seem overly informed by private copyright practice, and they take a weak government for granted.

How you would enforce such a law is another question entirely, but it can hardly go further off the mark than the way large companies manage to evade taxes right now. It may in fact be a lot fairer to tax data at the point of collection and use.

If you don’t bother to read the article above, I can sum it up in two key points below:

Data is hazardous waste material and as such its production and storage should be discouraged (the CO2 tax was given as an example in the Forbes article). Cory Doctorow compared personal data breaches to nuclear disasters, because the fallout is so tremendously hard to contain and control. Whoever collects large amounts of personal data treats the privacy damage caused by breaches as an externality. As such the storage of such data should be discouraged with a tax.

Data is capital and should be taxed as all capital is. Storage, mining and arbitrage using data can generate revenue for sophisticated market actors (those Lanier describes as having ‘the biggest computer on the network’). Data is a value-adding asset that generates wealth and more data for those who already have it. If we don’t want a situation where a small group of people get richer at the expense of everybody else, we should tax it.

So data is both capital and hazardous. We tax many things with either of those properties so we should definitely tax something that has both.

Hosting on Heroku with functioning MX records

It is not entirely obvious how to host a website on Heroku while also maintaining e-mail delivery. You would think this is a very common situation that would be well documented, but unfortunately it is not.

We got a DNSimple account because that’s how Heroku supports naked domains. DNSimple sets up the ALIAS record for you rather easily, but what it doesn’t do is warn you when you have both MX and CNAME records on the same name. The CNAME record then takes precedence, so your e-mail ends up being routed to proxy.heroku.com. That is undesirable, and something DNSimple should warn against.
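A quick way to spot the conflict is sketched below (assuming `dig` is installed; example.com stands in for your own domain): if a name resolves to a CNAME, any MX records you configured on that same name will be shadowed.

```python
import subprocess

def dig_short(rrtype, name):
    """Return the `dig +short` answers for a record type on a name."""
    out = subprocess.run(
        ["dig", "+short", rrtype, name],
        capture_output=True, text=True, check=True,
    ).stdout
    return [line for line in out.splitlines() if line]

domain = "example.com"  # stand-in: replace with your own domain
cname = dig_short("CNAME", domain)
mx = dig_short("MX", domain)

if cname and mx:
    print(f"{domain} has a CNAME ({cname[0]}) alongside MX records "
          f"({', '.join(mx)}): the CNAME will shadow the mail setup.")
elif cname:
    print(f"{domain} is a CNAME to {cname[0]}; mail for this name will "
          "follow that target, which is probably not what you want.")
else:
    print(f"{domain} looks fine: MX records {mx} and no CNAME.")
```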

What turns out to be the best solution is to set ALIAS records for both your apex domain and your subdomains (as proposed here). That way you no longer need a CNAME record that can interfere with other settings. Heroku’s documentation advises using a CNAME record, so I’m going to ask them whether there are any problems with using an ALIAS for all web routing.

The other option would be to purchase another plan for Zerigo, which seems to be Heroku’s preferred solution for this issue right now. Again, this is rather poorly documented and we would have liked to be informed about it before we chose the DNSimple option.

Update: Heroku replied with the following.

Great question. The ALIAS record, created by DNSimple, is basically a bunch of magic that does a combination of what CNAMEs and A Records do, but does it behind the scenes. You can read more about the ALIAS records here: http://blog.dnsimple.com/zone-apex-naked-domain-alias-that-works/

That said, DNSimple would likely be better equipped to answer a question like this. I don’t see any reason why you couldn’t use ALIAS records in place of CNAMEs. There might be a slight difference in performance between the two, but I’m not certain enough about that to say for sure.

I then asked the same question over at DNSimple on their blog. That comment is awaiting moderation and an answer, but I’ll post it here as soon as it appears.

Watersnake, a simple voting app

My small project during Swhack was to create a Django version of a delegated voting system, partially inspired by Liquid Feedback and the manifold problems that system has. In particular, it is written on such an esoteric stack that it is near impossible to get running without root on a Linux machine, and let’s not even discuss the maintenance. What is even worse is that this makes it nearly impossible for outsiders to join the project and contribute to it significantly.

In this interview about Liquid Democracy you can read quite clearly how the technical mandate drives the direction of the project, something that may not be very desirable if you think of it as a democracy-centric issue rather than a technology-centric one.

So I wanted to see how hard it would be to write something similar in vanilla Django. It’s easy to hate on Django, but you can find tons of people who can work on it in just about every major city, the framework and its documentation are mature, and many parts of the framework can be called excellent.

I thought that putting together something which at its core implements a delegated voting engine should be doable in an afternoon, and it was. What took the most time was playing around with the settings of the test runner, which I hadn’t really used before. So the watersnake app in this project does majority voting on single proposals with support for delegation; a sketch of the core idea follows below. To see it work you have to run the tests, but building this out into a full-fledged (web) app that can be deployed to Heroku with a single command is technically trivial (and also time consuming).
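To illustrate what ‘majority voting with delegation’ boils down to, here is a minimal plain-Python sketch (not the actual watersnake code, and without any of the Django models): voters either vote directly or delegate to another voter, and delegation chains are followed until they reach a direct vote, a cycle, or a dead end.

```python
from typing import Optional

def resolve_vote(voter: str, votes: dict, delegations: dict) -> Optional[bool]:
    """Follow delegations until a direct vote is found; None on cycle/dead end."""
    seen = set()
    while voter not in votes:
        if voter in seen or voter not in delegations:
            return None  # cycle, or nobody along the chain voted
        seen.add(voter)
        voter = delegations[voter]
    return votes[voter]

def tally(voters, votes, delegations) -> bool:
    """Simple majority on a single proposal; unresolved voters abstain."""
    resolved = [resolve_vote(v, votes, delegations) for v in voters]
    yes = sum(1 for r in resolved if r is True)
    no = sum(1 for r in resolved if r is False)
    return yes > no

# Example: carol delegates to bob, who delegates to alice (a yes vote).
voters = ["alice", "bob", "carol", "dave"]
votes = {"alice": True, "dave": False}
delegations = {"bob": "alice", "carol": "bob"}
print(tally(voters, votes, delegations))  # True: 3 yes vs 1 no
```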

This wasn’t a stretch to implement right now because I’m also doing some other projects which border on collaborative writing/decision making/filtering. As always, technology is neither the problem nor the solution, but certain technical systems grant different socio-technical affordances than others. I will probably not work on this unless there is a clear demand, but I thought it would be useful to debunk the idea that building such a system needs to be difficult or complex.

Week 308

Besides the immense amount of work we did over at Hubbub last week, I also spent a lot of time doing various other things, which to be honest sort of amazed me.

Giving this another go with my improved German skills #digiges

Tuesday I went to the Netzpolitische Abend here in c-base, where Janneke Slöetjes of Bits of Freedom was one of the speakers. It was great fun catching up on what they’ve been busy with and on the activist’s life.

And on Saturday Jan Lehnardt and I organized the first Swhack Berlin, a commemorative hackathon to do the things we would normally only talk about. A round-up of what we did is still forthcoming, but everybody is super busy of course. It was a lot of fun and I was pleasantly surprised by the 10+ people who showed up and got busy. We’ll do another one sometime in the near future.