Coleman McCormick

Archive of posts with tag 'Open Source'

✦

Waypoint — a Raspberry Pi GPS Tracker

January 13, 2021 • #
✦

Weekend Reading: Software Builders, Scarcity, and Open Source Communities

September 19, 2020 • #

👨‍💻 We Need More Software Builders, Not Just Users

On the announcement of Airtable’s latest round and $2.5b valuation (!), founder Howie Liu put out a great piece on the changes they’re making in pursuit of their vision.

No matter how much technology has shaped our lives, the skills it takes to build software are still only available to a tiny fraction of people. When most of us face a problem that software can answer, we have to work around someone else’s idea of what the solution is. Imagine if more people had the tools to be software builders, not just software users.

📭 Scarcity as an API

Artificial scarcity has been an effective tactic for certain categories of physical products for years. In the world of atoms, scarcity can at least be somewhat believable when demand outstrips supply — perhaps the supply chain couldn’t keep up with the lines around the block. Only recently are we seeing digital goods distributed in similar ways, where the scarcity is truly 100% forced: a line of code makes it so. Apps like Superhuman or Clubhouse generate a large part of their prestige status from this approach.

Dopamine Labs, later called Boundless, provided a behavioral change API. The promise was that almost any app could benefit from gamification and the introduction of variable reward schedules. The goal of the company was to make apps more addictive and hook users. Like Dopamine Labs, a Scarcity API would likely be net-evil. But it could be a big business. What if a new software company could programmatically “drop” new features only to users with sufficient engagement, or online at the time of an event? What if unique styles could be purchased only in a specific window of time?
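
To make that idea concrete, here’s a purely hypothetical sketch of what a “scarcity gate” might look like in code. There is no real Scarcity API, and the names and thresholds below are invented; the logic is just an engagement threshold combined with a time window.

// Hypothetical sketch only: gate a digital "drop" behind an engagement
// threshold and a purchase window, so scarcity is literally a line of code.
interface User {
  id: string;
  engagementScore: number; // e.g. sessions or actions in the last 30 days
}

interface FeatureDrop {
  name: string;
  minEngagement: number; // only sufficiently engaged users qualify
  opensAt: Date;         // window start
  closesAt: Date;        // window end
}

function canUnlock(user: User, drop: FeatureDrop, now: Date = new Date()): boolean {
  const t = now.getTime();
  const inWindow = t >= drop.opensAt.getTime() && t <= drop.closesAt.getTime();
  return inWindow && user.engagementScore >= drop.minEngagement;
}

// A style pack only purchasable during a one-hour event (invented example)
const drop: FeatureDrop = {
  name: "limited-style-pack",
  minEngagement: 50,
  opensAt: new Date("2020-09-19T17:00:00Z"),
  closesAt: new Date("2020-09-19T18:00:00Z"),
};

console.log(canUnlock({ id: "u1", engagementScore: 72 }, drop));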

🥇 The Glory of Achievement

Antonio Garcia-Martinez’s review of Nadia Eghbal’s book on open source, Working in Public.

It’s really about how a virtualized, digital world decoupled from the physical constraints of manufacturing and meatspace politics manages to both pay for and govern itself despite no official legal frameworks nor institutional enforcement mechanisms. Every class of open source project in Eghbal’s typology can be extended to just about every digital community you interact with, across the Internet.

✦

Weekend Reading: Beastie Boys, Links, and Screencasting

May 2, 2020 • #

🎥 Beastie Boys Story

We watched this a couple nights ago. It’s hard to tell how objectively good it was, but I loved the heck out of it as a decades-long fan.

🔗 Linkrot

I’ll have to try out this tool that Tom built for checking links. When I’ve run those SEM tools that check old links, I get sad seeing how many are redirected, 404’d, or dead.

📹 Screencasting Technical Guide

This is an excellent walkthrough on how to make screencasts. I’ve done my own tinkering around with ScreenFlow to make a few things for Fulcrum. It’s something I want to do more of eventually. A good resource for gear + tools, preparation, and editing.

✦

Weekend Reading: Tagging with Turf, Mars Panorama, and Kinds of Easy

March 7, 2020 • #

🗺 turf-tagger

Bryan put together this neat little utility for merging point data with the attributes of containing polygons via spatial join queries. It uses Turf.js to do the geoprocessing in the browser.
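
I haven’t dug into turf-tagger’s internals, but the core of that kind of point-in-polygon spatial join with Turf.js looks roughly like the sketch below: for each point, find a containing polygon and copy its attributes onto the point.

import booleanPointInPolygon from "@turf/boolean-point-in-polygon";
import type { Feature, FeatureCollection, Point, Polygon } from "geojson";

// For each point, find the first polygon that contains it and merge that
// polygon's properties onto the point's properties (a simple spatial join).
function tagPoints(
  points: FeatureCollection<Point>,
  polygons: FeatureCollection<Polygon>
): FeatureCollection<Point> {
  const features = points.features.map((pt: Feature<Point>) => {
    const container = polygons.features.find((poly) =>
      booleanPointInPolygon(pt, poly)
    );
    return {
      ...pt,
      properties: { ...pt.properties, ...(container?.properties ?? {}) },
    };
  });
  return { type: "FeatureCollection", features };
}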

🚀 Mars Curiosity High-Res Panorama

Amazing photography of the Mars surface:

NASA’s Curiosity rover has captured its highest-resolution panorama yet of the Martian surface. Composed of more than 1,000 images taken during the 2019 Thanksgiving holiday and carefully assembled over the ensuing months, the composite contains 1.8 billion pixels of Martian landscape. The rover’s Mast Camera, or Mastcam, used its telephoto lens to produce the panorama; meanwhile, it relied on its medium-angle lens to produce a lower-resolution, nearly 650-million-pixel panorama that includes the rover’s deck and robotic arm.

⚒ Different Kinds of Easy

  1. “Easy” because there’s a delay between benefit and cost.

The cost of exercising is immediate. Exercise hurts while you’re doing it, and the harder the exercise the more the hurt. Investing is different. It has a cost, just like exercising. But its costs can be delayed by years.

Whenever there’s a delay between benefit and cost, the benefits always seem easier than they are. And whenever the benefits seem easier than they are, people take risks they shouldn’t. It’s why there are investing bubbles, but not exercise bubbles.

✦

Weekend Reading: Blot, Hand-Drawn Visualizations, and Megafire Detection

November 9, 2019 • #

📝 Blot.im

Blot is a super-minimal open source blogging system based on plain text files in a folder. It supports markdown, Word docs, images, and HTML — just drag the files into the folder and it generates web pages. I love simple tools like this.

🖋 Handcrafted Visualization: Precision

An interesting post from Robert Simmon of Planet. These examples of visualizations and graphics of physical phenomena (maps, cloud diagrams, drawings of insects, planetary motion charts) were all hand-drawn, in an era when specialized photography and sensing weren’t always options.

A common thread between each of these visualizations is the sheer amount of work that went into each of them. The painstaking effort of transforming a dataset into a graphic by hand grants a perspective on the data that may be hindered by a computer intermediary. It’s not a guarantee of accurate interpretation (see Chapplesmith’s flawed conclusions), but it forces an intimate examination of the evidence. Something that’s worth remembering in this age of machine learning and button-press visualization.

I especially love that Apollo mission “lunar trajectory” map.

🔥 The Satellites Hunting for Megafires

Descartes Labs built a wildfire detection algorithm and tool that leans on thermal data from NOAA’s GOES weather satellites to detect fires by temperature:

While the pair of GOES satellites provides us with a dependable source of imagery, we still needed to figure out how to identify and detect fires within the images themselves. We started simple: wildfires are hot. They are also hotter than anything around them, and hotter than at any point in the recent past. Crucially, we also know that wildfires start small and are pretty rare for a given location, so our strategy is to model what the earth looks like in the absence of a wildfire, and compare it to the situation that the pair of GOES satellites presents to us. Put another way, our wildfire detector is essentially looking for thermal anomalies.
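
The quoted description maps pretty directly to code. Here’s a heavily simplified sketch of that kind of thermal-anomaly check; the thresholds are made up for illustration and this is nothing like Descartes Labs’ actual model:

// Flag a pixel as a possible fire when its brightness temperature is well
// above both the modeled "no fire" baseline for that location and the mean
// of its neighbors right now. The 10 K and 5 K thresholds are invented.
function isThermalAnomaly(
  observedK: number,     // current brightness temperature (Kelvin)
  baselineK: number,     // modeled temperature assuming no fire
  neighborMeanK: number  // mean temperature of surrounding pixels
): boolean {
  const hotterThanBaseline = observedK - baselineK > 10;
  const hotterThanNeighbors = observedK - neighborMeanK > 5;
  return hotterThanBaseline && hotterThanNeighbors;
}

console.log(isThermalAnomaly(330, 305, 308)); // true: hot relative to both
console.log(isThermalAnomaly(306, 305, 306)); // false: normal variation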

✦

Search the Archives

September 5, 2019 • #

Since I’ve been posting here so frequently, it’s gotten challenging to scroll through the archive to find links to things I wrote about before. Last night I worked on implementing a simple site search page that searches the title, text, and tags of posts to find relevant content. This is a short post on how I put that together.

I use Jekyll to manage the site content and generation, with all of my posts written as markdown files with some custom front-matter to handle things like tagging, search-friendliness, and some other things used in templating the site. There are also a couple of other custom things built using Jekyll’s “collections” feature for other content types, like my Books section.

I ran across a library called Lunr that provides client-side search over an index generated when your site builds. It’s small and simple, outputting the index as a JSON document that can then be combed through from a search to return data from posts or other content types. This was exactly what I wanted: something lightweight that works with the Jekyll and GitHub hosting I already use, without having to change anything, add a third-party indexing service, or resort to clunky Google Site Search. I wanted something native to my own site.
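
The client-side piece ends up being only a few lines. Here’s a rough sketch, assuming the Jekyll build writes out a search.json array of posts with url, title, content, and tags fields (the field names are my assumption, not necessarily what this site uses):

import lunr from "lunr";

interface PostDoc {
  url: string;
  title: string;
  content: string;
  tags: string;
}

// Fetch the JSON index document generated at build time, build a Lunr index
// from it, and return a search function that runs entirely in the browser.
async function buildSearch(): Promise<(query: string) => PostDoc[]> {
  const docs: PostDoc[] = await (await fetch("/search.json")).json();
  const byUrl = new Map(docs.map((d) => [d.url, d] as const));

  const idx = lunr(function () {
    this.ref("url");
    this.field("title");
    this.field("content");
    this.field("tags");
    docs.forEach((d) => this.add(d));
  });

  return (query) => idx.search(query).map((r) => byUrl.get(r.ref)!);
}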

The best implementation I found that matched what I wanted was from Katy DeCorah. Using her version as a starter, I customized it to fit my site and to index and return the specific data I wanted to appear in search results. The outcome looks nice and is certainly simple to use. Right now it only supports searching the post archives, but that’s good enough for now. I’m still exploring ways to browse my archives by other dimensions like tags, but I want to do that in a way that’s useful and also as lightweight as possible.

Head to the /search page and check it out.

✦

Weekend Reading: tracejson, Euclid, and Designing at Scale

August 24, 2019 • #

🛰 tracejson

An extension to the GeoJSON format for storing GPS track data, including timestamps. GPX has been long in the tooth for a while now, but it works and is supported by everything. This approach could have some legs if application developers are interested in a new, more efficient way of doing things. I know I’d like to explore it for Fulcrum’s GPS-based video capability. Right now we do GPX and our own basic JSON format for storing the geo and elapsed-time data to match up video frames with location. This could be a better way.
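
I haven’t pinned down the exact schema tracejson settled on, so this is purely illustrative of the general idea rather than the actual spec: a standard GeoJSON LineString paired with one timestamp per coordinate, so each vertex (or video frame) can be matched to a moment in time.

// Illustrative only, not the real tracejson schema: a GeoJSON LineString
// with a parallel array of ISO-8601 timestamps, one per coordinate.
const track = {
  type: "Feature",
  geometry: {
    type: "LineString",
    coordinates: [
      [-82.46, 27.95],
      [-82.45, 27.96],
      [-82.44, 27.96],
    ],
  },
  properties: {
    times: [
      "2019-08-24T14:00:00Z",
      "2019-08-24T14:00:05Z",
      "2019-08-24T14:00:10Z",
    ],
  },
} as const;

// Each coordinate can be matched back to a timestamp by index
const secondFix = {
  coordinate: track.geometry.coordinates[1],
  time: track.properties.times[1],
};
console.log(secondFix);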

🔷 Byrne’s Euclid

This is a gorgeous web recreation of Oliver Byrne’s take on Euclid’s Elements. A true work of art.

🎨 Design Tooling at Scale

This post from the Dropbox design team dives into how a large team with a complex product created a design system for a consistent language. It goes into how they organize the stack of design elements and structures using Figma for collaboration.

✦

Weekend Reading: Terrain Mesh, Designing on a Deadline, and Bookshelves

August 17, 2019 • #

🏔 MARTINI: Real-Time RTIN Terrain Mesh

Some cool work from Vladimir Agafonkin on a library for RTIN mesh generation, with an interactive notebook to experiment with it on Observable:

An RTIN mesh consists of only right-angle triangles, which makes it less precise than Delaunay-based TIN meshes, requiring more triangles to approximate the same surface. But RTIN has two significant advantages:

  1. The algorithm generates a hierarchy of all approximations of varying precisions — after running it once, you can quickly retrieve a mesh for any given level of detail.
  2. It’s very fast, making it viable for client-side meshing from raster terrain tiles. Surprisingly, I haven’t found any prior attempts to do it in the browser.
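
If I’m reading the project right, usage looks something like the sketch below (treat the API details as an approximation and check the library’s README): you pay the algorithm cost once per tile, then pull meshes at whatever precision you need.

import Martini from "@mapbox/martini";

// A 257x257 grid of elevations decoded from a raster terrain tile (the grid
// is one larger than 256 in each dimension). Filling it is left out here.
const gridSize = 257;
const terrain = new Float32Array(gridSize * gridSize);

const martini = new Martini(gridSize);    // set up the RTIN hierarchy
const tile = martini.createTile(terrain); // run the algorithm once per tile

// After that single pass, meshes at any level of detail are cheap to extract
const coarse = tile.getMesh(50); // max error of 50 units: fewer triangles
const fine = tile.getMesh(5);    // max error of 5 units: more triangles

console.log(coarse.triangles.length / 3, "triangles (coarse)");
console.log(fine.triangles.length / 3, "triangles (fine)");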

👨🏽‍🎨 Design on a Deadline: How Notion Pulled Itself Back from the Brink of Failure

This is an interesting piece on the Figma blog about Notion and their design process in getting the v1 off the ground a few years ago. I’ve been using Notion for a while and can attest to the craftsmanship in design and user experience. All the effort put in and iterated on really shows in how fluid the whole app feels.

📚 Patrick Collison’s Bookshelf

I’m always a sucker for a curated list of reading recommendations. This one’s from Stripe founder Patrick Collison, who seems to share a lot of my interests and curiosities.

✦

Discovering QGIS

May 29, 2019 • #

This week we’ve had Kurt Menke (of Bird’s Eye View GIS) in the office providing a guided training workshop for QGIS, the canonical open source GIS suite.

It’s been a great first two days covering a wide range of topics from his book titled Discovering QGIS 3.

The team attending the workshop is a diverse group with varied backgrounds. Most are GIS professionals using this as a means to get a comprehensive overview of the basics of “what’s in the box” on QGIS. All of the GIS folks have the requisite background using Esri tools throughout their training, but some of us that have been playing in the FOSS4G space for longer have been exposed to and used QGIS for years for getting work done. We’ve also got a half dozen folks in the session from our dev team that know their way around Ruby and Python, but don’t have any formal GIS training in their background. This is a great way to get folks exposure to the core principles and technology in the GIS professional’s toolkit.

Kurt’s course is an excellent overview that covers the ins and outs of using QGIS for geoprocessing and analysis, and touches on lots of the essentials of GIS (the discipline) along the way. All of your basics are in there — clips / unions / intersects and other geoprocesses, data management, editing, attribute calculations (with some advanced expression-based stuff), joins and relates, and a deep dive on all of the powerful symbology and labeling engines built into QGIS these days1.

The last segment of the workshop is going to cover movement data with the Time Manager extension and some other visualization techniques.

  1. Hat tip to Niall Dawson of North Road Geographics (as well as the rest of the contributor community) for all of the amazing development that’s gone into the 3.x release of QGIS! ↩

✦

FOSS4G North America 2019

April 11, 2019 • #

Next week Joe and I will be out in San Diego for FOSS4G-NA 2019. This’ll be my first one since I think 2012. There’s always an excellent turnout and strong base of good folks to catch up with. This year they’ve put together a B2B and Government Theme day to kick it off, which to my knowledge is a new thing for an event typically focused on the eponymous free, open source, and community-driven projects.

FOSS4G-NA 2019

I thumbed through the agenda to pick out some topics I’m interested in catching this year:

  • Open source for utilities and telecom
  • OpenStreetMap and WikiData
  • Open source in higher education
  • PDAL
  • OpenDroneMap
  • “Digital twin” technology for infrastructure

✦

Weekend Reading: Calculator, SaaS Metrics, and System Shock

March 9, 2019 • #

💻 Open Sourcing Windows Calculator

Seems silly, but this kind of thing is great for the open source movement. There’s still an enormous amount of tech out there built at big companies that creates little competitive or legal risk by being open. Non-core tools and libraries (meaning not core to the business differentiation) are perfect candidates to be open to the community. Check it on GitHub.

📊 The Metrics Every SaaS Company Should Be Tracking

An Inside Intercom interview with investor David Skok, the king of SaaS metric measurement. His blog has some of the best reference material for measuring your SaaS performance on the things that matter. This deck goes through many of the most important figures and techniques like CAC:LTV, negative churn, and cohort analysis.

🎮 Shockolate — System Shock Open Source

A cross-platform port of one of the all-time great PC games, System Shock1. I don’t play many games anymore, but when I get the itch, I only seem to be attracted to the classics.

  1. Astute readers and System Shock fans will recognize a certain AI computer in this website’s favicon. ↩

✦

Weekend Reading: Fulcrum in Santa Barbara, Point Clouds, Building Footprints

February 2, 2019 • #

👨🏽‍🚒 Santa Barbara County Evac with Fulcrum Community

Our friends over at the Santa Barbara County Sheriff have been using a deployment of Fulcrum Community over the last month to log and track evacuations for flooding and debris flow risk throughout the county. They’ve deployed over 100 volunteers so far to go door-to-door and help residents evacuate safely. In their initial pilot they visited 1,500 residents. With this platform the County can monitor progress in real-time and maximize their resources to the areas that need the most attention.

“This app not only tremendously increased the accountability of our door-to-door notifications but also gave us real-time tracking on the progress of our teams. We believe it also reduced the time it has historically taken to complete such evacuation notices.”

This is exactly what we’re building Community to do: to help enable groups to collaborate and share field information rapidly for coordination, publish information to the public, and gather quantities of data through citizens and volunteers they couldn’t get on their own.

☁️ USGS 3DEP LiDAR Point Clouds Dataset

From Howard Butler is this amazing public dataset of LiDAR data from the USGS 3D Elevation Program. There’s an interactive version here where you can browse what’s available. Using this WebGL-based viewer you can even pan and zoom around in the point clouds. More info here in the open on GitHub.

🏢 US Building Footprints

Microsoft published this dataset of computer-generated building footprints, 125 million in all. Pretty incredible considering how much labor it’d take to produce with manual digitizing.

✦

Weekend Reading: Shanghai, Basecamp, and DocuSaurus

January 26, 2019 • #

🇨🇳 195-Gigapixel Photo of Shanghai

Shot from the Oriental Pearl Tower, the picture shows enormous levels of detail composited from 8,700 source photos. Imagine this capability available commercially from microsatellite platforms. Seems like an inevitability.

🏕 How Basecamp Runs its Business

I, like many, have admired Basecamp for a long time in how they run things, particularly Ryan Singer’s work on product design. This talk is largely about how they build product and work as an organized team.

📄 Docusaurus

This is an open source framework for building documentation sites, built with React. We’re currently looking at this for revamping some of our docs and it looks great. We’ll be able to build the docs locally and deploy with GitHub Pages like always, but it’ll replace the cumbersome stuff we’ve currently got in Jekyll (which is also great, but requires a lot of legwork for documentation sites).

✦

Fulcrum Desktop

January 4, 2019 • #

A frequent desire for Fulcrum customers is to maintain a local copy of the data they collect with our platform, in their database system of choice. With our export tool, it’s simple to pull out extracts in formats like CSV, shapefile, SQLite, and even PostGIS or GeoPackage. What this doesn’t allow, though, is an automatable way to keep a local version of the data on your own server. You’d have to extract data manually on some schedule and append new stuff to existing tables you’ve already got.

A while back we built and released a tool called Fulcrum Desktop, with the goal of alleviating this problem. It’s an open source command line utility that harnesses our API to synchronize content from your Fulcrum account into a local database. It supports PostgreSQL (with PostGIS), Microsoft SQL Server, and even GeoPackage.

Other than the primary advantage of providing a way to clone your data to your own system, one of the cool things you can do with Desktop is easily make your data available to your GIS users in a tool like QGIS. It also has a plugin architecture to support other cool things like:

  • Media management — syncing photos, videos, audio, signatures
  • S3 — storing media files in your own Amazon S3 bucket
  • Reports — Generating PDF reports

If you have the Fulcrum Developer Pack with your account, you have access to all of the APIs, so you need that to get Desktop set up (though it is available on the free trial tier).

We’ve also built another utility called fulcrum-sync that makes it easy to set up Desktop using Docker. This is great for version management, syncing data for multiple organizations, and overall simplifying dependencies and local library management. With Docker “containerizing” the installation, you don’t have to worry about conflicting libraries or fiddle with your local setup. All of the FD installation is segmented to its own container. This utility also makes it easier to install and manage FD plugins.

✦

Kindle Highlights

December 14, 2018 • #

I started making this tool a long time back to extract highlighted excerpts from Kindle books. This predated the cool support for this that Goodreads has now, but I still would like to spend some time getting back to this little side project.

Eric Farkas has another tool that looks like it does this, as well, so that’s worth checking out as a possible replacement. What I really want is my own private archive of the data, not really my own custom extraction tool. The gem I was using for mine might’ve been the same one, or does something similar reading from Amazon’s API. It’s nice because it outputs the data in JSON, so then it can be easily parsed apart into yaml or Markdown to use elsewhere. Each excerpt looks like this:

{
  "asin": "B005H0O8KQ",
  "customerId": "A28I9D90ISXNT6",
  "embeddedId": "CR!CJ3JV6W1D918FDT8WZTVP0GG6CNN:86C04A71",
  "endLocation": 72905,
  "highlight": "Springs like these are the source of vein-type ore deposits. It's the same story that I told you about the hydrothermal transport of gold. When rainwater gets down into hot rock, it brings up what it happens to find there—silver, tungsten, copper, gold. An ore-deposit map and a hot-springs map will look much the same. Seismic waves move slowly through hot rock.",
  "howLongAgo": "2 months ago",
  "startLocation": 72539,
  "timestamp": 1446421339000
}
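
For example, turning an array of those excerpt objects into a Markdown digest is only a few lines. A sketch, assuming a highlights.json file holding an array shaped like the excerpt above (the filename and fields are my assumptions):

import { readFileSync, writeFileSync } from "node:fs";

// Shape based on the sample excerpt above; other fields are ignored here.
interface Highlight {
  asin: string;
  highlight: string;
  startLocation: number;
  timestamp: number;
}

const highlights: Highlight[] = JSON.parse(
  readFileSync("highlights.json", "utf8")
);

// One blockquote per excerpt, annotated with its Kindle location and date.
const markdown = highlights
  .map((h) => {
    const date = new Date(h.timestamp).toISOString().slice(0, 10);
    return `> ${h.highlight}\n>\n> (location ${h.startLocation}, ${date})`;
  })
  .join("\n\n");

writeFileSync("highlights.md", markdown);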

When I get a chance, I’ll spend some time tinkering and see if I can pull excerpts for other books I’ve read since.

✦

OpenDroneMap

October 24, 2018 • #

Since I got the Mavic last year, I haven’t had many opportunities to do mapping with it. I’ve put together a few experimental flights to play with DroneDeploy and our Fulcrum extension, but outside of that I’ve mostly done photography and video stuff.

OpenDroneMap came on the scene a couple years ago as a toolkit for processing drone imagery. I’ve been following it loosely through the Twittersphere since. Most of my image processing has been done with DroneDeploy, since we’d been working with them on some integration between our platforms, but I was curious to take a look once I saw the progress on ODM. Specifically, what caught my attention was WebODM, a web-based interface to the ODM processing backend — intriguing because it’d reduce friction in generating mosaics and point clouds with sensible defaults and a clean, simple map interface to browse resulting datasets.

OpenDroneMap aerial

The WebODM setup process was remarkably smooth, using Docker to stand up the stack automatically. The only prerequisites you need to get started are git, Python, and pip, all of which I already had. With only these three commands, I had the whole stack set up and ready to process:

git clone https://github.com/OpenDroneMap/WebODM --config core.autocrlf=input --depth 1
cd WebODM
./webodm.sh start

Pretty slick for such a complex web of dependencies under the hood, and a great web interface in front of it all.

Using a set of 94 images from a test flight over a job site in Manatee County, I experimented first with the defaults to see what it’d output on its own. I did have a bit of overlap on the images, maybe 40% or so (which you need for generating quality 3D). I had to up the RAM available to Docker and reboot everything to get it to process properly, I think because my image set is pushing 100 files.

ODM processing results

That project with the default settings took about 30 minutes. It generates the mosaicked orthophoto (TIF, PNG, and even MBTiles), surface model, and point cloud. Here’s a short clip of what the results look like:

This is why open source is so interesting. The team behind the project has put together great documentation and resources to help users get it running on all platforms, including running everything on your own cloud server infrastructure with extended processing resources. I see OpenAerialMap integration was just added, so I’ll have to check that out next.

✦

Bringing Geographic Data Into the Open with OpenStreetMap

September 9, 2013 • #

This is an essay I wrote that was published in the OpenForum Academy’s “Thoughts on Open Innovation” book in early summer 2013. Shane Coughlan invited me to contribute on open innovation in geographic data, so I wrote this piece on OpenStreetMap and its implications for community-building, citizen engagement, and transparency in mapping. Enjoy.

OpenStreetMap

With the growth of the open data movement, governments and data publishers are looking to enhance citizen participation. OpenStreetMap, the wiki of world maps, is an exemplary model for how to build community and engagement around map data. Lessons can be learned from the OSM model, but there are many places where OpenStreetMap might be the place for geodata to take on a life of its own.

The open data movement has grown in leaps and bounds over the last decade. With the expansion of the Internet, and spurred on by things like Wikipedia, SourceForge, and Creative Commons licenses, there’s an ever-growing expectation that information be free. Some governments are rushing to meet this demand, and have become accustomed to making data open to citizens: policy documents, tax records, parcel databases, and the like. Granted, the prevalence of open information policies is far from universal, but the rate of growth of government open data is only increasing. In the world of commercial business, the encyclopedia industry has been obliterated by the success of Wikipedia, thanks to the world’s subject matter experts having an open knowledge platform. And GitHub’s meteoric growth over the last couple of years is challenging how software companies view open source, convincing many to open source their code to leverage the power of software communities. Openness and collaborative technologies are on an unceasing forward march.

In the context of geographic data, producers struggle to understand the benefits of openness, and how to achieve the same successes enjoyed by other open source initiatives within the geospatial realm. When assessing the risk-reward of making data open, it’s easy to identify reasons to keep it private (How is security handled? What about updates? Liability issues?), and difficult to quantify potential gains. As with open sourcing software, it takes a mental shift on the part of the owner to redefine the notion of “ownership” of the data. In the open source software world, proprietors of a project can often be thought of more as “stewards” than owners. They aren’t looking to secure the exclusive rights to the access and usage of a piece of code for themselves, but merely to guide the direction of its development in a way that suits project objectives. Map data published through online portals is great, and is the first step to openness. But this still leaves an air gap between the data provider and the community. Closing this engagement loop is key to bringing open geodata to the same level of genuine growth and engagement that’s been achieved by Wikipedia.

An innovative new approach to open geographic data is taking place today with the OpenStreetMap project. OpenStreetMap is an effort to build a free and open map of the entire world, created from user contributions – to do for maps what Wikipedia has done for the encyclopedia. Anyone can log in and edit the map – everything from business locations and street names to bus networks, address data, and routing information. It began with the simple notion that if I map my street and you map your street, and we share our data, both of us have a better map. Since its founding in 2004 by Steve Coast, the project has reached over 1 million registered users (nearly doubling in the last year), with tens of thousands of edits every day. Hundreds of gigabytes of data now reside in the OpenStreetMap database, all open and freely available. Commercial companies like MapQuest, Foursquare, MapBox, Flickr, and others are using OpenStreetMap data as the mapping provider for their platforms and services. Wikipedia is even using OpenStreetMap as the map source in their mobile app, as well as for many maps within wiki articles.

What OpenStreetMap is bringing to the table that other open data initiatives have struggled with is the ability to incorporate user contribution, and even more importantly, to invite engagement and a sense of co-ownership on the part of the contributor. With OpenStreetMap, no individual party is responsible for the data, everyone is. In the Wikipedia ecosystem, active editors tend to act as shepherds or monitors of articles to which they’ve heavily contributed. OpenStreetMap creates this same sense of responsibility for editors based on geography. If an active user maps his or her entire neighborhood, the feeling of ownership is greater, and the user is more likely to keep it up to date and accurate.

Open sources of map data are not new. Government departments from countries around the world have made their maps available for free for years, dating back to paper maps in libraries – certainly a great thing from a policy perspective that these organizations place value on transparency and availability of information. The US Census Bureau publishes a dataset of boundaries, roads, and address info in the public domain (TIGER). The UK’s Ordnance Survey has published a catalog of open geospatial data through their website. GeoNames.org houses a database of almost ten million geolocated place names. There are countless others, ranging from small, city-scale databases to entire country map layers. Many of these open datasets have even made their way into OpenStreetMap in the form of imports, in which the OSM community occasionally imports baseline data for large areas based on pre-existing data available under a compatible license. In fact, much of the street data present in the United States was imported several years ago from the aforementioned US Census TIGER dataset.

Open geodata sources are phenomenal for transparency and communication, but still lack the living, breathing nature of Wikipedia articles and GitHub repositories. “Crowdsourcing” has become the buzzword with public agencies looking to invite this type of engagement in mapping projects, to widely varying degrees of success. Feedback loops with providers of open datasets typically consist of “report an issue” style funnels, lacking the ability for direct interaction from the end user. By allowing the end user to become the creator, it instills a sense of ownership and responsibility for quality. As a contributor, I’m left to wonder about my change request. “Did they even see my report that the data is out of date in this location? When will it be updated or fixed?” The arduous task of building a free map of the entire globe wouldn’t even be possible without inviting the consumer back in to create and modify the data themselves.

Enabling this combination of contribution and engagement for OpenStreetMap is an impressive stack of technology that powers the system, all driven by a mesh of interacting open source software projects under the hood. This suite of tools that drives the database, makes it editable, tracks changes, and publishes extracted datasets for easy consumption is produced by a small army of volunteer software developers collaborating to power the OpenStreetMap engine. While building this software stack is not the primary objective of OSM, it’s this that makes becoming a “mapper” possible. There are numerous editing tools available to contributors, ranging from the very simple for making small corrections, to the power tools for mass editing by experts. This narrowing of the technical gap between data and user allows the novice to make meaningful contribution and feel rewarded for taking part. Wikipedia would not be much today without the simplicity of clicking a single “edit” button. There’s room for much improvement here for OpenStreetMap, as with most collaboration-driven projects, and month-by-month the developer community narrows this technical gap with improvements to contributor tools.

In many ways, the roadblocks to adoption of open models for creating and distributing geodata aren’t ones of policy, but of technology and implementation. Even with ostensibly “open data” available through a government website, data portals are historically bad at giving citizens the tools to get their hands around that data. In the geodata publishing space, the variety of themes, file sizes, and different data formats combine to complicate the process of making the data conveniently available to users. What good is a database I’m theoretically allowed to have a copy of when it’s in hundreds of pieces scattered over a dozen servers? “Permission” and “accessibility” are different things, and both critical aspects to successful open initiatives. A logical extension of opening data is opening access to that data. If transparency, accountability, and usability are primary drivers for opening up maps and data, lowering the bar for access is critical to make those a reality.

A great example of the power of the engagement feedback loop with OpenStreetMap is the Humanitarian OpenStreetMap Team’s (HOT) work over the past few years. HOT kicked off in 2009 to coordinate the resources resident in the OpenStreetMap community and apply them to assist with humanitarian aid projects. Working both remotely and on the ground, HOT undertook its first large-scale effort mapping in response to the Haiti earthquake in early 2010. Since then, HOT has grown its contributor base into the hundreds, and has connected with dozens of governments and NGOs worldwide—such as UNOCHA, UNOSAT, and the World Bank—to promote open data, sharing, transparency, and collaboration to assist in the response to humanitarian crises. To see the value of their work, you need look no further than the many examples showing OpenStreetMap data for the city of Port-au-Prince, Haiti before and after the earthquake. In recent months, HOT has activated to help with open mapping initiatives in Indonesia, Senegal, Congo, Somalia, Pakistan, Mali, Syria, and others.

One of the most exciting things about HOT, aside from the fantastic work they’ve facilitated in the last few years, is that it provides a tangible example for why engagement is such a critical component to organic growth of open data initiatives. The OpenStreetMap contributor base, which now numbers in the hundreds of thousands, can be mobilized for volunteer contribution to map places where that information is lacking, and where it has a direct effect on the capabilities of aid organizations working in the field. With a traditional, top-down managed open data effort, the response time would be too long to make immediate use of the data in crisis.

Another unspoken benefit to the OpenStreetMap model for accepting contributions from a crowd is the fact that hyperlocal map data benefits most from local knowledge. There’s a strong desire for this sort of local reporting on facts and features on the ground all over the world, and the structure of OpenStreetMap and its user community suits this quite naturally. Mappers tend to map things nearby, things they know. Whether it’s a mapper in a rural part of the western United States, a resort town in Mexico, or a flood-prone region in Northern India, there’s always a consumer for local information, and oftentimes from those for whom it’s prohibitively expensive to acquire. In addition to the expertise of local residents contributing to the quality of available data, we also get local perspective, which can be interesting in its own right. This can be particularly essential in humanitarian crises, as there’s a tendency for users to map things that they perceive as higher in importance to the local community.

Of course OpenStreetMap isn’t a panacea to all geospatial data needs. There are many requirements for mapping, data issue reporting, and opening of information where the data is best suited to more centralized control. Data for things like electric utilities, telecommunications, traffic routing, and the like, while sometimes publishable to a wide audience, still have service dependencies that require centralized, authoritative management. Even with data that requires consolidated control by a government agency or department, though, the principles of engagement and short feedback loops present in the OpenStreetMap model could still be applied, at least in part. Regardless of the model, getting the most out of an open access data project requires an ability for a contributor to see the effect of their contribution, whether it’s an edit to a Wikipedia page, or correcting a one way street on a map.

With geodata, openness and accessibility enable a level of conversation and direct interaction between publishers and contributors that has never been possible with traditional unilateral data sharing methods. OpenStreetMap provides a mature and real-world example of why engagement is often that missing link in the success of open initiatives.

The complete book is available as a free PDF download, or you can buy a print copy here.

✦

Terra

September 7, 2013 • #

Inspired by a couple of others, I released a micro project of mine called Terra, to provide a fast way to run several geospatial tools on your computer.

Terra

Because I work with a variety of GIS datasets, I end up writing lots of scripts and small automation utilities to manipulate, convert, and merge data, in tons of different formats. Working with geo data at scale like this challenges the non-software developer to get comfortable with basic scripting and programming. I’ve learned a ton in the last couple years about Unix environments, and the community of open source geo tools for working with data in ways that can be pipelined or automated. Fundamental knowledge about bash, Python, or Ruby quickly becomes critical to saving yourself countless hours of repetitive, slow data processing.

GDAL

The renaissance tool of choice for all sorts of data munging is GDAL, and the resident command line suites of GDAL and OGR. The GDAL and OGR programs (for raster and vector data, respectively) are super powerful out of the box, once you understand the somewhat obtuse and involved syntax for sending data between datasources, and the myriad translation parameters. But they get extra powerful as multitools for all types of data when you can read from, and sometimes write to, proprietary data formats like Esri geodatabases, ECW files, MrSID raster images, GeoPDFs, SpatiaLite, and others. Many of these formats, though, require you to build the toolchain from source on your own, including the associated client libraries, and this process can be a giant pain, particularly for anyone who doesn’t want to learn the nuances of make and binary building1. The primary driver for building Terra was to have a simple, clean, consistent environment with a working base set of geotools. It gives you a prebuilt configuration that you can have up and running in minutes.

Terra uses Vagrant for provisioning virtual machines, and Chef, an automation tool, for batching up the setup and maintaining its configuration. Vagrant is really a wrapper around VirtualBox VMs, and uses base Linux images to give you a clean starting point for each deployment. It’s amazing for running dev environments. It supports both Chef and Puppet, two configuration management tools for automating software installation. I used Chef since I like writing Ruby, and created recipes to bootstrap the installs.

This all started because I got sick of setting up custom GDAL builds on desktop systems. Next on the list for this mini project is to provision installs of some other open geo apps, like TileMill and CartoDB, to run locally. Try it out on your computer; all you need is VirtualBox and Vagrant installed, and setup takes a few simple commands. Check it out on GitHub, follow the README to get yourself set up, and post an issue if you’d like to see other functionality included.

  1. Don’t mean to bag on Makefiles, they’re fantastic↩

✦

Creating New Contributors to OpenStreetMap

January 15, 2013 • #

I wrote a blog post last week about the first few months of usage of Pushpin, the mobile app we built for editing OpenStreetMap data.

As I mentioned in the post, I’m fascinated and excited by how many brand new OpenStreetMap users we’re creating, and how many who never edited before are taking an interest in making contributions. This has been an historic problem for the OpenStreetMap project for years now: How do you convince a casually-interested person to invest the time to learn how to contribute themselves?

There are two primary hurdles I’ve always seen for why “interested users” don’t make contributions, one technical and one more philosophical:

  1. Editing map data is somewhat complicated, and the documentation and tools don’t help many users to climb over this hump.
  2. It’s hard to answer the question: “Why should I edit this map? What am I editing, and who benefits from the information?”

To the first point, this is an issue largely of time and effort on the part of the volunteer-led developer community behind OpenStreetMap. GIS data is fundamentally complex, much more so than Wikipedia’s content, the primary analog to which OpenStreetMap is often compared—“Wikipedia for maps”. It’s an apt comparison only on a conceptual level, but when it comes time to build an editor for the information within each system, the demands of OpenStreetMap data take the complexity to another level. As I said, the community is constantly chewing on this issue, and making amazing progress on a new web-based editor. In building Pushpin, we spent a long time making sure that the user didn’t need to know anything about the complex OpenStreetMap tagging system in order to make edits. We picked apart the wiki and taginfo to abstract the common tags into simple picklists, which prevents both the need to type lots of info, and the need to know that amenity=place_of_worship is the proper tag for a church or mosque.
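
A tiny illustration of that abstraction (not Pushpin’s actual code, just the idea of mapping friendly picklist labels to the OSM tags they represent):

// Map human-friendly picklist labels to the OSM tags they stand for, so a
// contributor never has to know the tagging scheme. Example labels only,
// not Pushpin's real list.
const picklist: Record<string, Record<string, string>> = {
  "Place of Worship": { amenity: "place_of_worship" },
  "Cafe": { amenity: "cafe" },
  "Supermarket": { shop: "supermarket" },
  "Playground": { leisure: "playground" },
};

function tagsForSelection(label: string): Record<string, string> | undefined {
  return picklist[label];
}

console.log(tagsForSelection("Place of Worship")); // { amenity: "place_of_worship" }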

As for answering the “why”, that’s a little more complicated. People contribute to community projects for a host of reasons, so it’s a challenge to nail down how this should be communicated about OSM. There are stray bits around that tell the story pretty succinctly, but the problem lies in centralizing that core message. The LearnOSM site does a good job of explaining to a non-expert what the benefits are of becoming part of the contributor community, but it feels like the story needs to be told somewhere closer to the main homepage. Alex Barth recently proposed an excellent idea to the OpenStreetMap mailing list, a “contributors mark” that can be used within OSM-based services to convey the value of free and open map data. This is an excellent idea that addresses a couple of needs. For one it communicates what the project actually is, rather than just sending the unsuspecting user to a page about ODbL, and it also gives a general sense of how the data is used by real people.

In order for those one million user accounts to turn into one million contributors, we need to do a better job at conveying the meaning of the project and the value it provides to OpenStreetMap’s thousands of data consumers.

✦

Symbola

July 13, 2012 • #

If you’ve ever done much involving symbol sets in mapping (especially web mapping), you know about the nightmare of managing 700 separate PNG files, with different, duplicate versions for slightly different size variations and colors. Even with a small set of a dozen symbols, if you want 3 sizes and 5 colors for each, you’re looking at 180 distinct PNG marker symbols to keep track of. Ugh. SVG format simplifies this in certain ways, but isn’t as universally supported or easy to work with as simple GIFs or PNGs.

With TileMill, I’ve frequently wanted to use marker symbolizers in maps, but I often avoid it because it takes a bit of tweaking and configuration to get all the files in the right place and build out all the styles marker-by-marker, and changing colors isn’t that easy.

Symbola

To address this issue and make it easier to get dynamic markers on your maps, Zac built Symbola, an icon font constructed by embedding SVG graphics into TTF and OTF files. It uses the open source FontForge library and a Python script to grab the set of SVG graphics, convert them to glyphs, and assign them to Unicode characters. There is a process involved to get the Unicode attributes into your data, but if you use PostGIS as much as I do, this is well worth doing. There are instructions in the README on how to insert specific characters into your PostGIS data tables using SQL. It’s a clever way to manage symbology at the database level, rather than creating duplication all over your style files with hundreds of iconography-specific style definitions. Check it out on GitHub; you can even customize it and add your own markers.

✦

WhereCampTB

February 22, 2012 • #

My talk from Ignite Spatial at WhereCampTB, about the OSM Tampa Bay meetup group. Check out the slides in better detail here.

It was a fun event a couple weeks ago — great participation from folks in all sorts of industries involved in mapping or using GIS tools.

✦

WhereCampDC

June 23, 2011 • #

We just returned from a fantastic weekend up in DC - first at the Ignite Spatial event on Friday night, then the WhereCampDC unconference on Saturday. Being the first event of its kind that I’ve attended (with the “barcamp” unconference session format), I thought I’d write up some thoughts and impressions from an amazing 2-day trip.

Ignite Spatial

This was also my first experience hearing talks in the ignite format—20 slides, 15 seconds each, 5 minutes. A fantastic format to break people out of the habit of simply reading their slides off a screen. Held at Grosvenor Auditorium at National Geographic Society headquarters, the series was well-run, prompted and emcee’d by Nathaniel Kelso, who did a bang-up job pulling together logistics for both days. Our own Tony Quartararo gave his first talk in the ignite format: Homophily and the “geoherd”, where he posited that if the theory of homophily (love of being alike) applies to the spread of human attitudes and behaviors, then it can also spread our community’s interest in geography and technology onto other social circles of people who haven’t yet been addicted to using Foursquare, editing OpenStreetMap, or contributing to open source projects. Being aware of our “three degrees of influence” can help us to spread our collective interest in geospatial technology to those that may not even be aware of such things. Vizzuality’s Javier de la Torre presented his work on the OldWeather project, a social and community-driven effort to derive century-old historical weather data by having members transcribe Royal Navy captains’ logbooks—a clever solution to acquiring loads of data about wind, water temperature, sea conditions, and shipboard events from 100 years ago. He even showed off some stunning visualizations of the data. Definitely a crowd favorite. Sophia Parafina declared that “WMS is Dead” (I agree!), Mapbox founder Eric Gundersen showed off making gorgeous maps with their TileMill map design studio, and GeoIQ’s Andrew Turner demonstrated the many, many ways he bent geodata to his will to find the perfect DC house.

WhereCamp unconference

The unconference was held at the Washington Post office, which has a nice setup for the format and the attendance that showed up (200 people!). This was my first experience with the user-generated conference format, and I got far more out of it than other, more formal conferences. It starts with the attendees proposing talks and discussions and scheduling them out in separate rooms throughout the day; then everyone breaks up into groups to drill down on whatever topics they find useful. I attend a lot of conferences with high-level discussion about GIS and the mapping community, so in this particular crowd I was more interested in deep-diving on some technical discussions of the open source stuff we’ve been using a lot of lately.

Unconference board

After meeting most of the guys from Development Seed, I knew I wanted to sit in on Tom MacWright’s talk about Wax, their Javascript toolkit to extend functionality for Modest Maps, which makes it super easy to publish maps on the web. What they’re doing with Wax will be the future of web mapping for a lot of people. Really the only open source alternative to the commercial Google Maps API at this stage is OpenLayers, which can be overly featureful, heavy, and slow for most developers who just want some simple maps on the web. Dane Springmeyer proposed a discussion around “Mapnik Visioning”, wherein we went around the room discussing the future of our favorite renderer of beautiful map tiles. Mapnik is a critical low-level platform component for generating tiles from custom data, a foundational piece of the open source web mapping puzzle, and it was refreshing to see such technical, in-depth discussion for where to go next with the Mapnik project. Takeaway: node.js and node-mapnik bindings are going to be the future of the platform. AJ Ashton spun up a discussion about TileMill, the map tile design studio that Mapbox has constructed to help cartographers make beautiful maps easily with open standards and their own custom data. TileMill has definitely added a huge capability for us to style up and distribute maps of our own data. The stack of tools that TileMill provides allows designers to create great cartography for map data quickly, and to export as a tileset for viewing on the web or mobile. TileMill has firmly planted itself in our arsenal as something we’ll continue to use for a long time, a fantastic tool for designers.

✦