Sunday 21 March 2010

Dev8D reflections and the real world: bits, atoms and collaboration

It's one month on from the Dev8D [10] February 2010 meeting, and now the immediate buzz has faded a little, I'm looking back with a little more perspective on the impact that meeting has had on me.

My plans for using Drupal have been sidelined somewhat by the pressure of day-to-day work, but I did manage to join in an Oxford Drupal Meetup and learned some more about using this platform.  I think I'm getting a better idea of its strengths.

The other topic that has really influenced my subsequent thinking is the RepRap demonstration and talk by Adrian Bowyer.  Prima facie, it's a low-cost 3-D printer.  But it's also far more than that, and the implications of the interplay between open source software, open data and personal manufacturing have been food for thought over the past few weeks.  These thoughts have been complemented by discussions of open data noted on O'Reilly Radar [1], which consider the extent to which the open data movement might learn from open source software, and by recent comments from Eben Moglen [2] about software freedom and cloud computing.  Moglen's talk in part explores the relationship between client/server (or broadcast/consumer) and peer-to-peer architectures: are we surrendering the power to dictate what we do or see to a few large corporations?  (There's a kind of counterpoint here to Jonathan Zittrain's thoughts about generative vs sterile computing platforms, and the extent to which an open, generative platform is precisely what allows malware and other bad things to happen to our information handling systems.)

It seems that affairs are coming full circle: agricultural societies were transformed by the industrial revolution, which institutionalized manufacturing in factories and gave us mass-produced products, which in turn gave us integrated electronics, computers, and personal computers. Personal computers have underpinned a whole new industry of software production, where the low cost of the hardware required puts the means of software production in the hands of individuals.  Meanwhile, the same trends in electronic devices have also given rise to computer networking, the Internet and the World Wide Web.  These trends in computing hardware and networking prepared the ground for open source software, and a whole new parallel economy of software production that is not dominated by institutions. (The Drupal system and its associated community is a good example of this parallel economy: individual traders and small businesses providing high quality online presence to small businesses and "third sector" organizations.)

At Dev8D we saw several examples of how the open software model is starting to impact other areas of economic activity.  Following on from the linked data meetup on the first day (which is where I learned about RDF support in Drupal 7 [11]), there was much talk of doing interesting and useful things with open data, and especially open government and museum data.  But, to my mind even more significantly, we learned about grassroots open source/open data activities that are reaching back into the physical world - the world of atoms, not just bits. Throughout the meeting, workshops taught interfacing and programming Arduino devices [3] to observe and control events in the real world.  Arduino is a low-cost, open source electronic design that interfaces to a computer via USB (or network interfaces), and also connects to various kinds of sensor and control interfaces.  The kinds of capabilities previously available for industrial process control are being made accessible to individuals.  And to top it all, there was RepRap: the replicating rapid prototyper [4].

As already noted, RepRap is at first view a low-cost 3-D printer.  But it is also a 3-D printer that can make about half of its own parts, and the remaining parts are standard commodity hardware and electronic devices.  Furthermore, the RepRap design is completely free and open. Thus, not only does RepRap put a basic manufacturing capability into individual hands, it also democratizes the capability to create more RepRap machines.  The possible economic, social and ecological consequences are quite heady stuff.  It doesn't stop there:  Adrian Bowyer is also looking to develop RepRap to the point that it can manufacture objects from polylactic acid [5], which can be produced by fermentation from appropriate vegetable crops, and even to use RepRap to manufacture the fermentation hardware required.

So maybe we are seeing the capability for high-tech manufacturing becoming an integral part of a new kind of agricultural society?

I think we may be moving towards a world in which software, data and hardware don't have to live in separate compartments, each controlled by its own high priesthood of industrial powerhouses, but where a wide range of needs can be met within smaller communities.  The Dev8D meeting showed a glimpse of these possibilities, and some further investigations over the weeks since have shown some areas where these possibilities are edging towards reality [6] [7] [8].

During the course of these investigations, I came across a striking phrase in a context whose source I have unfortunately lost.  From a premise that a key reason for concentrating manpower in factories and offices was so that expertise could be brought together to achieve some greater goal, and noting Bill Joy's phrase "wherever you work, most of the smart people are somewhere else", it seems that given open communications and open data, we don't have to all come to the same factory, as the really smart person who can solve our problem can be accessible anywhere in the world.  And to engage all this talent?  I think some pointers are in Tim Bray's "Message from the Web" [9].

All this may seem to be a long way from JISC and Dev8D.  But I think that by having the opportunity to meet with and learn from a wide range of other developers, we can become aware of and respond to new ideas that are popping up all around us.  And seeing working demonstrations of real-world interfaces created within the short time available brings home the real possibilities in ways that no mere slideware ever can.

#g.

[1] http://radar.oreilly.com/2010/03/truly-open-data.html
[2] http://www.softwarefreedom.org/news/2010/feb/08/audio-and-video-eben-moglens-talk-freedom-cloud-no/
[3] http://www.arduino.cc/
[4] http://reprap.org/
[5] http://en.wikipedia.org/wiki/Polylactic_acid
[6] http://www.wired.com/magazine/2010/01/ff_newrevolution/all/1
[7] http://webworkerdaily.com/2010/02/10/the-future-of-work-from-bits-to-atoms/
[8] http://www.blueprintmagazine.co.uk/index.php/architecture/the-worlds-first-printed-building/
[9] http://www.tbray.org/ongoing/When/200x/2007/12/12/XBRL-Web
[10] http://dev8d.org/
[11] http://buytaert.net/rdfa-and-drupal

Monday 8 March 2010

Semantic wikis, content management systems and linked data

For some time I've been interested in connections between human readable web pages and machine-processable data on the web, specifically data that can be presented as RDF. Initiatives like RDFa, GRDDL and Microformats offer ways to include both in the same web resource, but don't entirely address the problem of authoring both through a common system or interface.

An early promising approach for addressing this problem was semantic wikis, exemplified by Semantic MediaWiki. For limited purposes, these work really nicely, but marking up data for machine processing often doesn't appear to yield sufficient benefit to justify the additional effort.

I recently attended the 2010 JISC developer meeting, Dev8D (http://dev8d.org/, http://wiki.2010.dev8d.org/w/Main_Page), and one of the topics that lit my fire was Drupal 7. This new release of Drupal (in alpha at the time of writing) brings RDF data into the system's core. I heard that the structure of any content delivered by Drupal can also be exposed as RDF, either as RDFa through the normal web interface, or via a queryable SPARQL endpoint. Wow! All that data accessible as linked data!

So I immediately set about redirecting a nascent project idea, which I'd originally conceived to be based on Semantic MediaWiki, to use Drupal 7 instead. I haven't made a lot of progress, but thinking about the new approach caused me to think about the relative strengths of semantic wikis and "semantic content management systems", which is a description I've just minted to describe systems like Drupal 7.

My original mini-project idea was to collect prose descriptions of technical topics and experts, and use typed links to capture relationships between them, in a way that might conceivably be useful for finding someone of whom to ask a focused technical question. The notion of starting with unconstrained free text was appealing, as relevant structure is not always evident when information is first assembled or recorded. On hearing about RDF support in Drupal 7, I contemplated using it as an alternative way to capture this loosely structured information, anticipating that Drupal's support for linking between nodes would allow the structure to emerge from free-form textual descriptions in much the same way as a semantic wiki.
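The typed-link idea can be sketched in a few lines of Python. This is only an illustration of the general shape, not the design of the mini-project: the link types (`has_expertise_in`, `is_related_to`), the sample names and the helper `experts_for` are all hypothetical, and a real system would more likely hold the links as RDF triples than as Python tuples.

```python
from collections import namedtuple

# A typed link is a (subject, link_type, target) triple, much like an RDF statement.
Link = namedtuple("Link", ["subject", "link_type", "target"])

# Hypothetical sample data: two experts and the topics they know about.
LINKS = [
    Link("Alice", "has_expertise_in", "RDFa"),
    Link("Bob", "has_expertise_in", "SPARQL"),
    Link("RDFa", "is_related_to", "RDF"),
    Link("SPARQL", "is_related_to", "RDF"),
]


def experts_for(topic, links):
    """Return a sorted list of people with expertise in the topic itself,
    or in any topic joined to it by a single is_related_to link."""
    related = {l.subject for l in links
               if l.link_type == "is_related_to" and l.target == topic}
    related |= {l.target for l in links
                if l.link_type == "is_related_to" and l.subject == topic}
    topics = related | {topic}
    return sorted({l.subject for l in links
                   if l.link_type == "has_expertise_in" and l.target in topics})


print(experts_for("RDF", LINKS))
print(experts_for("SPARQL", LINKS))
```

With this sample data, a question about RDF would be routed to both people, while a question about SPARQL would be routed only to Bob. The point is that the link type, not the surrounding prose, carries the machine-usable meaning.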

Since then, I have noticed that a content management system (CMS) imposes a greater degree of uniformity between all object descriptions than a semantic wiki. At heart, the CMS stores information in relational database tables, where each record (row) of a given table follows broadly the same pattern. Of course, different types of record can have very different structures. But, to enter information into such a system, one must first decide what type of object is being described, and this in turn circumscribes the structure of information that can be entered.

In contrast, a semantic wiki element is first and foremost a free text description, within which elements of structure may occasionally be discerned, identified and marked up for semantic analysis. But where there is no such structure, the text may still stand alone, as an unconstrained description free of predetermined structure.

So it seems that semantic wikis and content management systems approach the same goal of combined machine-processable and human-readable information from entirely different directions. The semantic wiki starts from unstructured free-format text, within which structure can be discerned and encoded with suitable ad-hoc effort. The CMS starts from structured data whose individual elements may be unstructured free-form text. The inherently-structured CMS approach makes it easier to capture predetermined structure, while the unstructured wiki approach makes it easier to enter information absent of structure, or whose structure is yet to be determined.
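The two directions can be contrasted in a small Python sketch. The wiki side uses the `[[property::value]]` in-text annotation style of Semantic MediaWiki, handled here by a deliberately simplified regular expression rather than the real parser. The CMS side is a hypothetical record type with fixed fields; the field names and the helper `missing_fields` are assumptions made for illustration and are not Drupal's API.

```python
import re

# Simplified matcher for Semantic MediaWiki style annotations: [[property::value]]
ANNOTATION = re.compile(r"\[\[([^\]:|]+)::([^\]|]+)\]\]")


def extract_annotations(wiki_text):
    """Wiki direction: start from free text, pull out whatever structure was marked up."""
    return [(prop.strip(), value.strip())
            for prop, value in ANNOTATION.findall(wiki_text)]


# Hypothetical CMS content type: the structure is decided before any text is entered.
PERSON_FIELDS = ("name", "expertise", "description")


def missing_fields(record, fields=PERSON_FIELDS):
    """CMS direction: start from a fixed structure, report the fields not yet filled in."""
    return [f for f in fields if f not in record]


marked_up = "Alice has worked with [[has expertise in::RDFa]] for years."
plain = "Alice has worked with RDFa for years."

print(extract_annotations(marked_up))  # one (property, value) pair
print(extract_annotations(plain))      # no structure, but the text still stands alone
print(missing_fields({"name": "Alice", "description": plain}))
```

The unannotated text is still a perfectly valid wiki entry that simply yields no data, whereas the CMS record is incomplete until its type's fields are supplied.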

Is there a point where these two approaches meet, combining the advantages of both? That is, easy entry of unstructured information combined with easy capture or extraction of structure. I don't know, but I have some ideas that might make this possible. Without previously realizing it, I think this is one factor behind my work on Shuffl (http://code.google.com/p/shuffl/), but that project is still a way from realizing this in a usable fashion.

This is a topic that I shall continue to think about.

Hello world

With a first-post title like this, I have to be a software person, right?

This is a place for me to make some noise; along the way, maybe I'll also manage to leave a trace of signal.