Coleman McCormick

Archive of posts with tag 'Data'

✦

Rolling Windows for Goal Tracking

April 26, 2020 • #

Since the beginning of 2019 I’ve been tracking ongoing goals using a Google Sheet I made, where I can enter each activity day by day and generate a rollup showing how I’m tracking on each goal throughout the course of the year.

Andy Matuschak put it well in this post where he talked about his system for habit-building. A calendar week isn’t great for tracking overall progress because it’s artificially constrained.

Let’s take my current goal of running 650 miles this year. That averages out to 12.47 miles per week to hit the number. With something like running, pacing out the progress is critical: you can’t procrastinate and stack progress at the end of the month or quarter to “catch up,” at least not healthily. And you also want the progress report to give you a sense of “how have I been doing?”

If you look at a calendar week (like Monday to Sunday), you could have one week where you overshoot the goal, say a race week or just one where you got in high mileage, followed by one with more rest days. A purely week-oriented method would give the sense that you were off-target during the rest week, and way over during the intense one.

As Andy’s post says, moving windows help to “make every day doable.” Putting things off doesn’t threaten your progress, as long as you don’t put them off too far.

My method for doing this on my run tracker shows me how much I’ve run in the past 7 days, juxtaposed with the 7-day target if I’m “on plan.” I need to average 1.78 miles per day to stay on track, so this formula tells me how I’m doing over the last 7 days:

Last 7    7-Day Target
13.51     12.47

Here’s how I calculate this in the spreadsheet. I track each run in a separate row, with a miles attribute for each one. The formula for “Last 7” looks like this:

SUMIFS(miles,date,">"&TODAY()-7)

miles and date stand for the columns holding those values. I reference each whole column with notation like Running!B:B. SUMIFS takes the full series as input and sums only the rows that satisfy the criterion in the last argument, here dates within the past 7 days.

Because I’m currently tracking about 13 miles behind goal pace for the year, I need to make sure I keep this rolling figure just above the 7-day target line in order to close the gap back to level.
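The "behind goal pace" number is just the year-to-date mileage minus what an even pace would have covered by today. A rough Python sketch follows, assuming an even daily pace across the calendar year. The function name and the 195-mile year-to-date figure are hypothetical stand-ins, not values from my sheet.

```python
from datetime import date

def pace_gap(actual_miles, goal_miles, today):
    """Return actual minus expected mileage to date; negative means behind pace."""
    year_start = date(today.year, 1, 1)
    year_end = date(today.year, 12, 31)
    days_in_year = (year_end - year_start).days + 1
    days_elapsed = (today - year_start).days + 1  # count today as a full day
    expected = goal_miles * days_elapsed / days_in_year
    return actual_miles - expected

# 195.0 is a made-up year-to-date total used only for illustration
gap = pace_gap(actual_miles=195.0, goal_miles=650, today=date(2020, 4, 26))
print(f"Gap versus goal pace: {gap:+.1f} miles")
```

A negative gap means the rolling 7-day figure has to sit above the 7-day target for a while before the two lines meet again.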

This is working better overall to give a picture of the current state for me. It also works well for other things with daily targets like skill practice, book pages for reading, learning a language or instrument, really anything you can quantify with time or scalar goals.

✦

Weekend Reading: COVID Edition

April 25, 2020 • #

⚗️ COVID and Forced Experiments

Benedict Evans looks at what could return to normal after coronavirus, and what else might have accelerated change that was already happening.

“Every time we get a new kind of tool, we start by making the new thing fit the existing ways that we work, but then, over time, we change the work to fit the new tool. You’re used to making your metrics dashboard in PowerPoint, and then the cloud comes along and you can make it in Google Docs and everyone always has the latest version. But one day, you realise that the dashboard could be generated automatically and be a live webpage, and no-one needs to make those slides at all. Today, sometimes doing the meeting as a video call is a poor substitute for human interaction, but sometimes it’s like putting the slides in the cloud.”

📈 COVID-19: What’s wrong with the models?

One continually aggravating thing about all of the data, models, projections, and analyses about COVID-19 is how little anyone cares to retroactively analyze prior predictions. Over the last two months the predictions have been all over the map, and as time marches on and many prove wrong while some prove right, there’s no analysis of which assumptions turned out to be untrue and caused the wide divergence between projection and reality.

Peter Attia calls out here something rarely acknowledged about why projections are wicked:

“Projections only matter if you can hold conditions constant from the moment of your prediction, and even then, it’s not clear if projections and models matter much at all if they are not based on actual, real-world data. In the case of this pandemic, conditions have changed dramatically (e.g., aggressive social distancing), while our data inputs remain guesswork at best.”

💉 The Pandemic Isn’t a Black Swan but a Portent of a More Fragile Global System

Nassim Taleb, making his way into the New Yorker.

✦

Weekend Reading: Chess, COVID Tracking, and Note Types

March 21, 2020 • #

♟ Chess

Tom MacWright on chess: reduce distraction, increase concentration.

Once you have concentration, you realize that there’s another layer: rigor. It’s checking the timer, checking for threats, checking for any of a litany of potential mistakes you might be about to make, a smorgasbord of straightforward opportunities you might miss. Simple rules are easy to forget when you’re feeling the rush of an advantage. But they never become less important.

Might start giving chess a try just to see how I do. Haven’t played in years, but I’m curious.

🧪 The COVID Tracking Project

The best resource I’ve run across for aggregated data on COVID cases. The data is pulled from state-level public health authorities; the project just provides a cleaned-up version of it. There’s even an API to pull data.

✍🏼 Taxonomy of Note Types

Andy Matuschak’s notes on taking notes. This is from his public notebook, like reading someone thinking out loud (or on a screen at least).

✦

Library 2.0

March 6, 2020 • #

Since I began tracking my reading habits a year and a half ago, I’ve been able to keep up with it regularly. It lives in a Google Sheet and allows me to log dates I started and finished books, attributes about them, ratings, links, and more.

I spent some time with Airtable importing and cleaning up the data so I could have a richer version with the ability to view, edit, and add to the library from my phone. Airtable has the ability to create Views (similar to what we do with Views in Fulcrum), which are essentially saved queries with specific formats; for example, a view remembers hidden columns, sort order, and grouping. I’ve got two main views: one for my “Current Library” (books I’ve read or am currently reading) and another “To Read” list with ones I’ve added for future reading. This lets me keep them all in a single table with a category for status, but still view my archive without seeing the hundreds on the reading list.

Library 2.0 in Airtable

The data entry interface for adding new records isn’t that great (not as good at this as Fulcrum), but it is certainly better than Google Sheets for this.

My ‘Current’ Library (left) and Reading List (right)

Airtable also supports Zapier for automations, so I could potentially send the data entered to other services if I want to.

Check out the data here:

✦

Weekend Reading: Landgrid, Quantified Self, and Tesla Teardown

February 22, 2020 • #

🏘 Landgrid

This is a product from Loveland Technologies, with a cohesive dataset of parcel boundaries provided as an API for application builders.

More on their parcel data and how they do it here.

🤳🏽 My Quantified Self Setup

My goal tracking efforts pale in comparison to what Julian Lehr is doing. I might give Airtable a try for mine, also. I’ve been in Google Sheets since mine’s pretty basic, but Airtable might make it more mobile-friendly for editing.

🚗 Tesla teardown finds electronics 6 years ahead of Toyota and VW

What stands out most is Tesla’s integrated central control unit, or “full self-driving computer.” Also known as Hardware 3, this little piece of tech is the company’s biggest weapon in the burgeoning EV market. It could end the auto industry supply chain as we know it.

One stunned engineer from a major Japanese automaker examined the computer and declared, “We cannot do it.”

✦

Books and Microdata

January 27, 2020 • #

Tom posted a while back about his book review section, and adding schema.org microdata to those pages for book review-related data. The promise of these schema standards is to provide a semantic markup framework for unstructured text content, so things like recipes, movies, and products can conform to an attribute standard for (theoretically) better indexing and search.

Referencing his implementation, I went through my library templates and added schema attributes on the relevant properties I publish. I don’t know what value those’ll have, but I’m a supporter of the open web and bottom-up adoption of formats for data structures. I remember Microformats from way back in the early Web 2.0 days. They didn’t seem to catch on, but Google has over time rolled out support for JSON-LD (linked data) to feed those tasty machine-readable formats to the spider, for easier surfacing of useful content in search.

Here’s a snapshot of some of the data on an individual /book page:

<div class="book" itemprop="itemReviewed" itemscope itemtype="http://schema.org/Book">
  <h1 itemprop="name">The Quiet American</h1>
  <h2>by <span class="author" itemprop="author">Graham Greene</span></h2>
  <p class="book-meta">Published: <span itemprop="datePublished"></span></p>
</div>

It’s pretty straightforward to add markup for title, author, completion date, ISBN, and other things. It’s also neat that the Book type “belongs to” the CreativeWork type, so it can carry those properties as well.

One other thing I included here was a section to backlink to others’ posted book reviews on their personal sites. After Tom tweeted yesterday about doing this on his site, I decided I’d backlink to his, too. If you maintain a reading log and want to continue the viral spread of semantic indie blog cross-referencing, let me know. I’d be happy to link to others.

Next I want to try adding the appropriate JSON-LD tags for other parts of the site and see how that all works.
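As a rough idea of what the JSON-LD version could look like, here is a small Python sketch that builds a schema.org Book object and wraps it in a script tag. The helper name is hypothetical and this is not the template code from this site; the property names (name, author, isbn) are standard schema.org vocabulary.

```python
import json

def book_jsonld(title, author, isbn=None):
    """Build a schema.org Book description as a plain dict (illustrative helper)."""
    data = {
        "@context": "https://schema.org",
        "@type": "Book",
        "name": title,
        "author": {"@type": "Person", "name": author},
    }
    if isbn:
        data["isbn"] = isbn  # only emitted when a value is supplied
    return data

doc = book_jsonld("The Quiet American", "Graham Greene")

# JSON-LD is embedded in a page inside a script tag with this MIME type
script_tag = (
    '<script type="application/ld+json">\n'
    + json.dumps(doc, indent=2)
    + "\n</script>"
)
print(script_tag)
```

Unlike microdata, this lives in one block separate from the visible markup, so the page template itself doesn't need extra attributes.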

✦

Reading Metrics

January 9, 2020 • #

Since I began tracking my books in a spreadsheet in 2018, I’ve built up a bunch of data I can now look at on my reading habits.

One thing I took a stab at was a “duration chart” that could show the reading patterns over time, based on when I started and finished each book.

Book reading durations

Using this stacked bar chart style, you can see which books I stalled out on and put down for long periods. Not a judgment on those books’ respective merits, more of a criticism of my dodgy reading habits. The Federalist had probably a full 6-month fallow period where I forgot about it.
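One common way to get this kind of chart from a stacked bar is to plot two series per book: an invisible offset (days from the earliest start date) and a visible duration. The Python sketch below computes both and prints a crude text version. It assumes that layout rather than describing exactly how my sheet is built, and the book rows are invented.

```python
from datetime import date

# Hypothetical rows as (title, started, finished); not my actual reading log
books = [
    ("Book A", date(2019, 1, 5), date(2019, 1, 20)),
    ("Book B", date(2019, 1, 10), date(2019, 7, 15)),
    ("Book C", date(2019, 3, 1), date(2019, 3, 12)),
]

def duration_rows(rows):
    """Return (title, offset_days, length_days) for a stacked-bar layout."""
    first_start = min(started for _, started, _ in rows)
    result = []
    for title, started, finished in rows:
        offset = (started - first_start).days    # the hidden "spacer" series
        length = (finished - started).days + 1   # the visible bar, inclusive
        result.append((title, offset, length))
    return result

# Text rendering: one character per week, at least one mark per book
for title, offset, length in duration_rows(books):
    print(f"{title:8} {' ' * (offset // 7)}{'#' * max(1, length // 7)}")
```

Long bars like the one for "Book B" are where a stalled book shows up.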

Some other fun-but-meaningless statistics:

Total Pages

18,983 [1]

Type

  • Fiction: 24%
  • Nonfiction: 76%

Formats

  • Audiobook: 57%
  • Kindle: 13%
  • Real Book: 30%

Authors I read more than one from

  • Nassim Taleb
  • Cixin Liu
  • HP Lovecraft
  • Jonathan Haidt

Oldest book

Top 5 genres (from tags)

  1. History
  2. Philosophy
  3. Psychology
  4. Science
  5. Business

[1] Page counts are, of course, a very rough estimate. But it’s fun to see the quantity and calculate pages per day on average (about 50!).

✦

Weekend Reading: Kipchoge's 2 Hours, Future Ballparks, and the World in Data

October 12, 2019 • #

🏃🏾‍♂️ Eliud Kipchoge Breaks 2-Hour Marathon Barrier

An amazing feat:

On a misty Saturday morning in Vienna, on a course specially chosen for speed, in an athletic spectacle of historic proportions, Eliud Kipchoge of Kenya ran 26.2 miles in a once-inconceivable time of 1 hour 59 minutes 40 seconds.

⚾️ What the Future American Ballpark Should Look Like

An architect’s manifesto on how teams can rethink the design of baseball stadiums:

Fans want to feel that the club has bought into them, and a bolder model of fan engagement could give them a real stake in the club’s success. One of the most promising recent trends in North American sports is the way soccer clubs are emulating their European counterparts by developing dedicated supporters’ groups. These independent organizations drive enthusiasm and energy in the ballpark, and make sure seats stay filled.

Instead of just acknowledging and tolerating the supporter group model, we’re going to encourage and codify it in the park’s architecture by giving over control of entire sections of the ballpark to fans. Rather than design the seating sections and concourse as a finished product, we’ll offer it up as a framework for fan-driven organizations to introduce their own visions.

📰 Does the News Reflect What We Die From?

An analysis of how the media over-represents rare causes of death and barely covers the most common ones.

✦

Weekend Reading: Observable Edition

September 7, 2019 • #

This week’s links are all interactive notebooks on Observable. Their Explore section always highlights interesting things people are creating. A great learning tool for playing with data and code to see how it works.

⌨️ The Enigma Machine

Easily the most impressive interactive notebook I’ve ever seen. This one from Tom shows the electromechanical pathways of the German Enigma machine at work: enter a character and see how the rotors and circuits encrypt text.

🚲 A Bicycle Drivetrain Analyzer

Another great example of the power of interactive programs. This one lets you compute bicycle chainring gear ratios by speed setting. You can add multiple cassettes and chainrings to compare:

Bicycle drivetrain analysis

🌍 Mapping the Mediterranean

Have to include a map example. Here the author brings in DEM data then styles and generates it all in code with GDAL for data manipulation and D3 for graphics.

✦

Watts vs. Speed

August 4, 2019 • #

After a long ride today, I was looking at the stats on Strava and wondering how wattage calculations work to determine power. Strava has a built-in estimate it uses for your power rating if you don’t have a power meter on your bike. From looking into it, their calculations look sophisticated enough to estimate power fairly closely, unless you’re riding in really extreme conditions:

The power produced while riding is made up of several components:

  • Power produced to overcome the rolling resistance of forward motion.
  • Power produced to overcome wind resistance.
  • Power produced to overcome the pull of gravity (in the case of climbing hills).
  • Power produced to accelerate from one speed to another.

The total power produced, P(total), is the sum of all four power components.

P(total) = P(rolling resistance) + P(wind) + P(gravity) + P(acceleration)
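To get a feel for how the four terms add up, here is a Python sketch using textbook physics for each component. This is not Strava's actual formula, and the default rolling coefficient, drag area, air density, and the rider-plus-bike mass are all assumed values chosen for illustration.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2

def estimate_power(speed, mass, grade=0.0, accel=0.0, headwind=0.0,
                   crr=0.005, cda=0.35, rho=1.225):
    """Estimate cycling power in watts from the four components.

    speed and headwind are in m/s, mass (rider plus bike) in kg, grade is
    rise over run, accel is in m/s^2. crr, cda, and rho are assumed values.
    """
    angle = math.atan(grade)
    air_speed = speed + headwind
    components = {
        "rolling": crr * mass * G * math.cos(angle) * speed,
        # Drag force follows air speed squared; the sign keeps tailwinds helpful
        "wind": 0.5 * rho * cda * air_speed * abs(air_speed) * speed,
        "gravity": mass * G * math.sin(angle) * speed,
        "acceleration": mass * accel * speed,
    }
    components["total"] = sum(components.values())
    return components

# Steady riding at 7 m/s (roughly 15.7 mph) on flat ground, then on a 3% grade
flat = estimate_power(speed=7.0, mass=85.0)
climb = estimate_power(speed=7.0, mass=85.0, grade=0.03)
print(round(flat["total"]), round(climb["total"]))
```

Because the drag term grows with the square of air speed, a headwind in this model costs more than an equal tailwind gives back, so wind error may only partly cancel over a loop.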

It looks like the biggest source of error would be the environmentals, particularly wind resistance and elevation change (if the GPS elevation data is poor). My ride today shows an average of 103 watts for the 1 hour 20 minute ride. Since it’s almost totally flat and there was only a little wind today, it should be pretty accurate. Seems to me that wind-induced error would sort of cancel itself out on circuitous routes like this one: for every segment of headwind, you get another with tailwinds.

I also found this bike calculator that takes various inputs and adjusts the resulting speed and watts accordingly.

✦

Fulcrum as a Personal Database

July 29, 2019 • #

I use Fulcrum all the time for collecting data around hobbies of mine. Sometimes it’s for fun or interests, sometimes for mapping side projects, or even just for testing the product as we develop new features.

Here are a few of the key everyday apps I use for personal tracking. I’m always tinkering around with other things as we expand the product, but I’ve been using each of these pretty consistently for years.

Gas Mileage

Of course there are apps out there devoted to this task, but I like the idea of having my own raw data input for this. Piping this to a spreadsheet lets me run some calculations on it to see MPG, total spend, and total miles driven over time.

Gas mileage tracker
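The calculations on the exported data are simple enough to sketch in a few lines of Python. This assumes each record holds an odometer reading, gallons added, and cost, and that the tank is filled completely at every stop. Those field choices and the sample numbers are illustrative, not my actual app schema.

```python
# Hypothetical fill-up records as (odometer_miles, gallons, total_cost_usd)
fillups = [
    (42_000, 11.2, 28.50),
    (42_310, 10.0, 25.90),
    (42_640, 11.0, 27.75),
]

def mileage_summary(rows):
    """Compute miles driven, average MPG, and total spend from fill-up rows."""
    rows = sorted(rows)  # order by odometer reading
    miles = rows[-1][0] - rows[0][0]
    # Fuel bought at the first stop covers miles after it, and fuel at each
    # later stop replaces what was just burned, so skip the first purchase
    gallons = sum(g for _, g, _ in rows[1:])
    spend = sum(cost for _, _, cost in rows)
    return {"miles": miles, "mpg": miles / gallons, "spend": spend}

summary = mileage_summary(fillups)
print(f"{summary['miles']} miles, {summary['mpg']:.1f} MPG, ${summary['spend']:.2f} spent")
```

If fill-ups are only partial, the per-stop MPG gets noisy, but the long-run average over many stops still converges.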

Maps Collection

I’m a collector of paper maps, and some time back I built out a tracker in Fulcrum to inventory my collection. One day I plan to add some other details to this for year, publisher, and the like, but it works for now as a basic inventory of what I’ve got.

Maps database

Workouts

I’ve been lax this year with the routine, but I’d built out a log for tracking my workout sessions at the gym, mostly to track doing the “Runner 360” workout. It works great and provides a way to build some charts with progress on efforts over time.

Home Inventory

In order to have a reliable log of all of the expensive stuff in my house, I created this so that there’s some prayer of having a tight evidence log of what I own if there’s ever a flood, hurricane, or fire (or even theft) that requires a homeowners insurance claim. I figured it can’t hurt to have photographic evidence of what’s in the house if it came to needing to prove it.

Home inventory

Football Clubs

This one is more of an experiment in using Fulcrum (and its API) as a cloud-based PostGIS database. I created a simple schema for each team, league, and stadium location. I had this idea to use these coordinates for generating a poster of stadiums from satellite images. One day I might have time for that, but there’s also an open database you can download of all the locations as GeoJSON.

Football clubs map

There are a few others I’m testing out in “R&D” mode right now. Always on the hunt for new and interesting things I can make Fulcrum do. It’s a true power tool for data entry and data management.

✦

Weekend Reading: The Next Mapping Company, Apple on Pros, and iPadOS Workflow

June 15, 2019 • #

🗺 (Who will be) America’s Next Big Mapping Company?

Paul Ramsey considers who might be in the best position to challenge Google as the next mapping company:

Someone is going to take another run at Google, they have to. My prediction is that it will be AWS, either through acquisition (Esri? Mapbox?) or just building from scratch. There is no doubt Amazon already has some spatial smarts, since they have to solve huge logistical problems in moving goods around for the retail side, problems that require spatial quality data to solve. And there is no doubt that they do not want to let Google continue to leverage Maps against them in Cloud sales. They need a “good enough” response to help keep AWS customers on the reservation.

Because of mapping’s criticality to so many other technologies, any player that is likely to compete with Google needs to be a platform: something that undergirds and powers technology as a business model. Apple is kinda like that, but nowhere near as similar to an electric utility as AWS is.

👨🏽‍💻 Apple is Listening

With the release of the amazing new Mac Pro and other things announced at WWDC, it’s clear that Apple recognizes its failings in delivering for its historically important professional customers. Marco Arment addresses this well here across the Mac Pro, updates to macOS, iPadOS, and the changes that could be around the corner for the MacBook Pro.

📱 iPadOS: Initial Thoughts, Observations, and Ideas on the Future of Working on an iPad

I’m excited to get iPadOS installed and back to my iPad workflow. This is a good comprehensive overview from Shawn Blanc, someone who has done most of his work on an iPad for a long time.

✦

Weekend Reading: Data Moats, China, and Distributed Work

May 25, 2019 • #

🏰 The Empty Promise of Data Moats

In the era of every company trying to play in machine learning and AI technology, I thought this was a refreshing perspective on data as a defensible element of a competitive moat. There’s some good stuff here in clarifying the distinction between network effects and scale effects:

But for enterprise startups, which is where we focus, we now wonder if there’s practical evidence of data network effects at all. Moreover, we suspect that even the more straightforward data scale effect has limited value as a defensive strategy for many companies. This isn’t just an academic question: It has important implications for where founders invest their time and resources. If you’re a startup that assumes the data you’re collecting equals a durable moat, then you might underinvest in the other areas that actually do increase the defensibility of your business long term (verticalization, go-to-market dominance, post-sales account control, the winning brand, etc).

Companies should perhaps be less enamored of the “shiny object” of derivative data and AI, and instead invest in execution in areas challenging for all businesses.

🇨🇳 China, Leverage, and Values

An insightful piece this week from Ben Thompson on the current state of the trade standoff between the US and China, and the blocking of Chinese behemoths like Huawei and ZTE. The restrictions on Huawei will mean some major shifts in trade dynamics for advanced components, chip designs, and importantly, software like Android:

The reality is that China is still relatively far behind when it comes to the manufacture of most advanced components, and very far behind when it comes to both advanced processing chips and also the equipment that goes into designing and fabricating them. Yes, Huawei has its own system-on-a-chip, but it is a relatively bog-standard ARM design that even then relies heavily on U.S. software. China may very well be committed to becoming technologically independent, but that is an effort that will take years.

The piece references this article from Bloomberg, an excellent read on the state of affairs here.

⌨️ The Distributed Workplace

I continue to be interested in where the world is headed with remote work. Here InVision’s Mark Frein looks back at what traits make for effective distributed companies, starting with a history of past experiences of remote collaboration, from music production to gaming to startups. As he points out, you can have healthy or harmful cultures in both local and distributed companies:

Distributed workplaces will not be an “answer” to workplace woes. There will be dreary and sad distributed workplaces and engaged and alive ones, all due to the cultural experience of those virtual communities. The key to unlocking great distributed work is, quite simply, the key to unlocking great human relationships: struggling together in positive ways, learning together, playing together, experiencing together, creating together, being emotional together, and solving problems together. We’ve actually been experimenting with all these forms of life remote for at least 20 years at massive scales.

✦

Weekend Reading: How We Collect Data, Mapping the Camp Fire, and Earth's Great Unconformity

January 5, 2019 • #

🗺 How We Get Data Collected in the Field Ready for Use

My colleagues Bill Dollins and Todd Pollard (the core of our data team) wrote this post detailing how we go from original ground-based data collection in Fulcrum through a data processing pipeline to deliver product to customers. A combination of PostGIS, Python tools, FME, Amazon RDS, and other custom QA tools gets us from raw content to finished, analyst-ready GEOINT products.

🔥 Mapping the Camp Fire with Drones

The 518 coordinated flights operation, by 16 Northern California emergency responder agencies, is one of the biggest drone responses to a disaster scene in the nation’s history. The 16 UAV teams were led by Alameda County Sheriff’s Office. Stockton Police, Contra Costa County Sheriff’s Office & Menlo Park Fire Protection District had the most team members present, with Union City Police, Hayward Police and Stanislaus County Sheriff’s Office providing units as well. San Francisco Police oversaw airspace mitigation. In addition to the mapping flights, over 160 full 360-degrees and interactive panoramas were created with the help of Hangar, as well as geo-referenced video was shot along major roads in Paradise through Survae.

An impressive effort by response agencies in California to respond to this tragic disaster and assess the damage.

🌍 Earth is Missing a Huge Part of its Crust

An article on the Great Unconformity in the geologic record and its potential cause:

The Grand Canyon is a gigantic geological library, with rocky layers that tell much of the story of Earth’s history. Curiously though, a sizeable layer representing anywhere from 250 million years to 1.2 billion years is missing.

The likely culprit was a theoretical planetwide glaciated period known as the “Snowball Earth.”

✦

Fulcrum Desktop

January 4, 2019 • #

A frequent desire of Fulcrum customers is to maintain a local version of the data they collect with our platform, in their database system of choice. With our export tool, it’s simple to pull out extracts in formats like CSV, shapefile, SQLite, and even PostGIS or GeoPackage. What this doesn’t allow, though, is an automatable way to keep a local version of data on your own server. You’d have to extract data manually on some schedule and append new stuff to existing tables you’ve already got.

A while back we built and released a tool called Fulcrum Desktop, with the goal of alleviating this problem. It’s an open source command line utility that harnesses our API to synchronize content from your Fulcrum account into a local database. It supports PostgreSQL (with PostGIS), Microsoft SQL Server, and even GeoPackage.
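To illustrate why a sync beats manual extract-and-append, here is a generic Python sketch of an idempotent upsert keyed on record ID. It is not Fulcrum Desktop's implementation and makes no Fulcrum API calls; the table layout, field names, and hardcoded records are all hypothetical, with SQLite standing in for the local database.

```python
import sqlite3

def sync_records(conn, records):
    """Insert new records and overwrite changed ones, keyed by record id."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS records ("
        "id TEXT PRIMARY KEY, updated_at TEXT, payload TEXT)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO records (id, updated_at, payload) VALUES (?, ?, ?)",
        [(r["id"], r["updated_at"], r["payload"]) for r in records],
    )
    conn.commit()

conn = sqlite3.connect(":memory:")

# First pull: two records (stand-ins for whatever an API would return)
sync_records(conn, [
    {"id": "a1", "updated_at": "2019-01-01", "payload": "hydrant, needs paint"},
    {"id": "b2", "updated_at": "2019-01-01", "payload": "valve, ok"},
])

# Second pull: one edited record and one new record; nothing is duplicated
sync_records(conn, [
    {"id": "a1", "updated_at": "2019-01-03", "payload": "hydrant, repainted"},
    {"id": "c3", "updated_at": "2019-01-03", "payload": "meter, ok"},
])

count = conn.execute("SELECT COUNT(*) FROM records").fetchone()[0]
a1_payload = conn.execute(
    "SELECT payload FROM records WHERE id = 'a1'"
).fetchone()[0]
print(count, a1_payload)
```

Running the same pull twice leaves the table unchanged, which is the property that makes a scheduled sync safe to automate.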

Other than the primary advantage of providing a way to clone your data to your own system, one of the cool things you can do with Desktop is easily make your data available to your GIS users in a tool like QGIS. It also has a plugin architecture to support other cool things like:

  • Media management: syncing photos, videos, audio, signatures
  • S3: storing media files in your own Amazon S3 bucket
  • Reports: generating PDF reports

Desktop relies on API access, which comes with the Fulcrum Developer Pack, so you need that on your account to get Desktop set up (though it is also available on the free trial tier).

We’ve also built another utility called fulcrum-sync that makes it easy to set up Desktop using Docker. This is great for version management, syncing data for multiple organizations, and overall simplifying dependencies and local library management. With Docker “containerizing” the installation, you don’t have to worry about conflicting libraries or fiddle with your local setup. The entire Fulcrum Desktop (FD) installation is isolated in its own container. This utility also makes it easier to install and manage FD plugins.

✦

The Library Database

October 29, 2018 • #

I’ve been an avid user of Goodreads for tracking books for the last ten years. Tom MacWright wrote a post and a script utility last year to export and format items from Goodreads into pages that could work in a Jekyll site, like his and this one. On my profile I track more than just what I’m reading; I also log start and finish dates, ratings, reviews, and more. Getting a feed somewhere on the website would certainly be cool (I have a branch now with this in progress). On my way to getting that working, I took the Goodreads export format and put it in a Google Sheet, then edited a good bit to build out a richer dataset that I can keep adding to over time. I added fields for format, whether a book is part of a series, and a URL to get to the book’s listing.

Books database

I have some ideas on some simple analyses to do on this data. Once I get the feed publishing my reading log inline with the blog posts, I’ll work on some experiments with visualizations that could be done with this dataset.

✦

Weekly Links: LiDAR, WannaCry, and OSM Imagery

May 18, 2017 • #

🗺 LiDAR Data for DC Available as an AWS Public Dataset

LiDAR point cloud data for Washington, DC, is available for anyone to use on Amazon Simple Storage Service (Amazon S3). This dataset, managed by the District of Columbia’s Office of the Chief Technology Officer (OCTO), with the direction of OCTO’s Geographic Information System (GIS) program, contains tiled point cloud data for the entire District along with associated metadata.

This is a great move by the District to make high value open data available.

🖥 WannaCry and the Power of Business Models

Ben Thompson breaks down the blame game of the latest zero-day attack on Windows systems. This article makes a great case for the business model being to blame rather than Microsoft, their customers, the government, or someone else. A SaaS business model naturally aligns incentives for everyone:

I am, of course, describing Software-as-a-service, and that category’s emergence, along with cloud computing generally (both easier to secure and with massive incentives to be secure), is the single biggest reason to be optimistic that WannaCry is the dying gasp of a bad business model (although it will take a very long time to get out of all the sunk costs and assumptions that fully-depreciated assets are “free”). In the long run, there is little reason for the typical enterprise or government to run any software locally, or store any files on individual devices. Everything should be located in a cloud, both files and apps, accessed through a browser that is continually updated, and paid for with a subscription. This puts the incentives in all the right places: users are paying for security and utility simultaneously, and vendors are motivated to earn it.

🛰 DigitalGlobe Satellite Imagery Launch for OpenStreetMap

DG is opening up access to imagery for tracing in OpenStreetMap, giving the project a powerful new resource for more basemap data. Especially cool for HOTOSM projects:

Over the past few months, we have been working with several of our partners that share the common goal of improving OpenStreetMap. To that end, they have generously funded the launch of a global imagery service powered by DigitalGlobe Maps API. This will open more data and imagery to aid OSM editing. OSM contributors will see a new DigitalGlobe imagery source, in addition to imagery provided by our partners, Bing and Mapbox.

📷 Updating Google Maps with Deep Learning

If you’re in the mapping space, seeing any of this R&D that Google is doing is mind-boggling.

✦

Weekly Links: OSM on AWS, Fulcrum Editor, & Real-time Drone Maps

April 21, 2017 • #

Querying OpenStreetMap with Amazon Athena 🗺

Using Amazon’s Athena service, you can now query OpenStreetMap data right from an interactive console. No need to use the complicated OSM API; this is pure SQL. I’ve taken a stab at building out a replica OSM database before and it’s a beast. The dataset now clocks in at 56 GB zipped. This post from Seth Fitzsimmons gives a great overview of what you can do with it:

Working with “the planet” (as the data archives are referred to) can be unwieldy. Because it contains data spanning the entire world, the size of a single archive is on the order of 50 GB. The format is bespoke and extremely specific to OSM. The data is incredibly rich, interesting, and useful, but the size, format, and tooling can often make it very difficult to even start the process of asking complex questions.

Heavy users of OSM data typically download the raw data and import it into their own systems, tailored for their individual use cases, such as map rendering, driving directions, or general analysis. Now that OSM data is available in the Apache ORC format on Amazon S3, it’s possible to query the data using Athena without even downloading it.

Introducing the New Fulcrum Editor 🔺

Personal plug here: this is something that’s been in the works for months. We just launched Editor, the completely overhauled data editing toolset in Fulcrum. I can’t wait for the follow-up post to explain the nuts and bolts of how this is put together. The power and flexibility are truly amazing.

Real-time Drone Mapping with FieldScanner 🚁

The team at DroneDeploy just launched the first live aerial imagery product for drones. Pilots can now fly imagery and get a live, processed, mosaicked result right on a tablet immediately when their mission is completed. This is truly next level stuff for the burgeoning drone market:

The poor connectivity and slow internet speeds that have long posed a challenge for mapping in remote areas don’t hamper Fieldscanner. Designed for use in the fields, Fieldscanner can operate entirely offline, with no need for cellular or data coverage. Fieldscanner uses DroneDeploy’s existing automatic flight planning for DJI drones and adds local processing on the drone and mobile device to create a low-resolution Fieldscan as the drone is flying, instead of requiring you to process imagery into a map at a computer after the flight.

✦