Living in Brooklyn, we're constantly surrounded by grand old buildings, each with a rich history of its own. Since 2010, Suzanne Spellen, writing as Montrose Morris in homage to her favorite 19th-century Brooklyn architect, has documented the stories behind the facades of Brooklyn's most (and sometimes least) beautiful buildings in near-daily contributions to the Brownstoner blog. The combined effect of her now more than 1,000 articles is a rich portrait of Brooklyn's past and present, as told through the stories of the borough's buildings. I created Brooklyn, one (thousand) building(s) at a time to help make Spellen's entire catalog more accessible to curious borough dwellers who want to know more about the buildings they walk past each day. This project was conceptualized and designed as part of the Foundations of Spatial Thinking course offered through the SAVI Certificate of GIS and Design program at the Pratt Institute.
This map is built around the Building of the Day series from Brownstoner, as well as geolocation services through NYC's Geoclient API, building footprints from NYC's Open Data Portal, and building information from NYC City Planning's MapPLUTO.
I used the Scrapy open source web scraping framework for Python to extract the street addresses and blog post URLs for the nearly 100 pages of building histories located at www.brownstoner.com/blog/category/botd/. Using Scrapy, this first involved creating the project and declaring the items to be scraped:
I then created the spider, which is used to declare which websites to crawl and to define how the sites should be parsed using XPath selectors to return the required information (street address and blog post URL):
After setting up the items and spider files, it was simply a matter of running the spider and outputting the results as comma-separated values:
Via Scrapy, I generated a table containing the street address of each profiled building and the URL of its accompanying profile. Through some simple data manipulation, I transformed the addresses into "building" + "street name" + "borough" format, which enabled me to geocode each address using NYC's Geoclient API.
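The address reshaping step could be sketched in Python as follows. This is an illustration under assumptions rather than the method actually used: the function name `split_address` is hypothetical, the regular expression only handles addresses that begin with a house number, and the dictionary keys are chosen to mirror the parameter names of Geoclient's address endpoint.

```python
import re


def split_address(raw, borough="Brooklyn"):
    """Split '123 Example Street' into house number, street name, and borough.

    Returns None when the text does not start with a house number
    (intersections, named buildings), so those rows can be fixed by hand.
    """
    # House number: digits, optional letter suffix (12A), optional range (123-125).
    match = re.match(r"^\s*(\d+[A-Za-z]?(?:-\d+)?)\s+(.+?)\s*$", raw)
    if match is None:
        return None
    house_number, street = match.groups()
    return {"houseNumber": house_number, "street": street, "borough": borough}


print(split_address("123 Example Street"))
print(split_address("Grand Army Plaza"))
```

Returning `None` for unparseable rows, rather than guessing, keeps bad addresses out of the geocoding step.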
The Geoclient API allows users to send geolocation requests to the City's mainframe Geosupport system. It returns information such as the borough, block, and lot (BBL) number, the building identification number (BIN), the community district, and 154 other data items that are mostly unavailable through other geocoding methods. For this project, I was interested in returning the BBL and BIN, and accomplished this via a relatively simple Google Spreadsheet, an example of which can be found here, that used Trevor Lohrbeer's ImportJSON script to call the API and return the requested items.
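For readers who would rather script this step than use a spreadsheet, the same lookup can be sketched in Python. This swaps in a different technique from the ImportJSON approach described above. The endpoint URL, the `app_id`/`app_key` credentials, and the response field names `bbl` and `buildingIdentificationNumber` reflect my understanding of the v1 Geoclient API and should be treated as assumptions; check them against the current documentation before relying on them.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed v1 address endpoint; the service may have moved or changed since.
GEOCLIENT_URL = "https://api.cityofnewyork.us/geoclient/v1/address.json"


def build_request_url(house_number, street, borough, app_id, app_key):
    """Build the query URL for a single address lookup."""
    params = {
        "houseNumber": house_number,
        "street": street,
        "borough": borough,
        "app_id": app_id,
        "app_key": app_key,
    }
    return GEOCLIENT_URL + "?" + urlencode(params)


def extract_ids(payload):
    """Pull (BBL, BIN) out of a parsed response; missing values become None."""
    address = payload.get("address", {})
    return address.get("bbl"), address.get("buildingIdentificationNumber")


def geocode(house_number, street, borough, app_id, app_key):
    """Call the API for one address and return its (BBL, BIN) pair."""
    url = build_request_url(house_number, street, borough, app_id, app_key)
    with urlopen(url) as response:
        return extract_ids(json.load(response))
```

Keeping URL construction and response parsing in separate functions makes both easy to check without hitting the network.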
First, I joined City Planning's MapPLUTO dataset to my geocoded data using the BBL number in order to capture the year of building construction and information about historical preservation status. I then performed another join using the BIN to a shapefile containing building footprints made available by DoITT's GIS division. That data provided me with the information necessary to visualize the profiled buildings of the day according to their footprints.
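The two joins can be pictured with a small, dependency-free Python sketch. It is illustrative only: the helper `left_join` is hypothetical, the sample rows are placeholders rather than real records, and the column names (`BBL`, `BIN`, `YearBuilt`, `HistDist`) are assumed to match the MapPLUTO and building footprint attribute tables.

```python
def left_join(rows, lookup_rows, key, fields):
    """Copy `fields` from the lookup row sharing `key` onto each row.

    Rows without a match are kept, with the joined fields set to None.
    Keys are compared as strings, since BBL/BIN may be stored as numbers.
    """
    index = {str(r[key]): r for r in lookup_rows}
    joined = []
    for row in rows:
        match = index.get(str(row[key]), {})
        joined.append({**row, **{f: match.get(f) for f in fields}})
    return joined


# Placeholder rows for illustration only; these are not real records.
geocoded = [{"address": "123 Example Street", "BBL": "3000000001", "BIN": "3000001"}]
pluto = [{"BBL": 3000000001, "YearBuilt": 1900, "HistDist": None}]
footprints = [{"BIN": 3000001, "geometry": "POLYGON(...)"}]

# First join on BBL for lot-level attributes, then on BIN for the footprint.
with_pluto = left_join(geocoded, pluto, "BBL", ["YearBuilt", "HistDist"])
with_shapes = left_join(with_pluto, footprints, "BIN", ["geometry"])
print(with_shapes[0])
```

The BBL identifies a tax lot while the BIN identifies a single building, which is why the lot-level attributes and the footprint geometry need separate joins.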
For example, in the above I set the color of water features to a slate blue, the color of most landscape, roadway, and transit elements to various shades of grey, and the color of parks to a grey-green, while turning off the visualization of all other points of interest.
The map uses the Google Street View Image API to return a small image of the profiled building upon clicking on the building footprint.
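A request to the Street View Image API is just a URL with a few query parameters, so the image for a given building can be sketched as below. The function name `street_view_image_url`, the default image size, and the use of a street address (rather than coordinates) for `location` are assumptions for illustration; the map itself may construct the request differently.

```python
from urllib.parse import urlencode

STREETVIEW_URL = "https://maps.googleapis.com/maps/api/streetview"


def street_view_image_url(address, api_key, width=300, height=200):
    """Return the URL of a static Street View image for an address."""
    params = {
        "size": f"{width}x{height}",  # image size in pixels, WIDTHxHEIGHT
        "location": address,          # an address string or "lat,lng"
        "key": api_key,
    }
    return STREETVIEW_URL + "?" + urlencode(params)


print(street_view_image_url("123 Example Street, Brooklyn, NY", "YOUR_API_KEY"))
```

The resulting URL can be used directly as the `src` of an image shown when a footprint is clicked.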