Lack of open data makes for clunky democracy

Living in the city of Chester I sometimes think the old Deva Victrix (79AD) city is still in the air.

A six-storey luxury student block is being planned at the end of my road. Before deciding whether to support or object to this development, I wanted to know if it was based on real need.

Chester already has lots of student housing, both multiple occupancy homes and newer student blocks, the latter being more expensive.

Like other university cities Chester has a superb community of students that bring energy and life to the city.  The University of Chester is an important asset to the city that helps power growth.

Student blocks are being developed by third-parties and sold to investors looking for returns at the expense of students/taxpayers.  I wanted to know if the development was speculative or for a real need.

After reaching out to my local authority I was shocked to discover there was no open data available on this controversial subject.  

We live in a data rich society. It’s way past time we had this data in my opinion. To answer my question, I really needed some level of ‘occupancy’ data.

It didn’t take me long to find it.

This data was stuck on a poorly formatted website. Available to all, but not exactly formatted for easy access. No API. I can’t really blame the third party running the website; they are doing the important job of matching students with housing after all. Not sure what happened to the semantic web.

With Python on my Pi I wanted to see if it was possible to automate the scraping, processing and presentation. A micro-BI project carried out over a couple of Sunday mornings.

Scraping
One of the popular Python packages for scraping data is Beautiful Soup.  Developed by a librarian in the New York library!
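
For readers who haven't used it, here is a minimal sketch of what parsing with Beautiful Soup looks like. It is not the code from the repo: the HTML snippet, the class names and the parse_listings helper are all made up for illustration, and the real site's markup will be different.

```python
from bs4 import BeautifulSoup

# Made-up HTML standing in for a listings page; the real markup will differ.
SAMPLE_HTML = """
<div class="listing">
  <span class="address">1 Example Street</span>
  <span class="status">Available</span>
</div>
<div class="listing">
  <span class="address">2 Example Road</span>
  <span class="status">Let</span>
</div>
"""


def parse_listings(html):
    """Return a list of (address, status) tuples found in the page."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for listing in soup.find_all("div", class_="listing"):
        address = listing.find("span", class_="address").get_text(strip=True)
        status = listing.find("span", class_="status").get_text(strip=True)
        rows.append((address, status))
    return rows


listings = parse_listings(SAMPLE_HTML)
print(listings)
```

On a live page the HTML would be fetched first (for example with urllib or requests) and passed to the same function.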

Storing
It’s great to see SQL Server come back to Linux.   Would have liked to have used it but I needed something much lighter for my Pi.   I was keen to store this data with schema rather than dump it into JSON.
SQLite is so popular but I’ve never had a use for it. It’s on most of our phones, as mobile developers use it for local data persistence. Thought I’d give it a try. Can’t be hard, SQL is a standard after all.
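
To give a feel for how light SQLite is from Python, here is a small sketch using the sqlite3 module from the standard library. The table name, columns, dates and rows are assumptions for illustration only, not the schema used in the repo.

```python
import sqlite3

# In-memory database for the example; pass a file path such as "housing.db" to persist.
conn = sqlite3.connect(":memory:")

# Hypothetical schema: one row per property per scrape date.
conn.execute(
    """
    CREATE TABLE IF NOT EXISTS listing (
        scraped_on TEXT NOT NULL,
        address    TEXT NOT NULL,
        status     TEXT NOT NULL,
        PRIMARY KEY (scraped_on, address)
    )
    """
)

# Made-up sample rows.
rows = [
    ("2000-01-02", "1 Example Street", "Available"),
    ("2000-01-02", "2 Example Road", "Let"),
]

# Parameterised inserts; INSERT OR REPLACE means re-running a scrape for the
# same day overwrites rows instead of failing on the primary key.
with conn:
    conn.executemany("INSERT OR REPLACE INTO listing VALUES (?, ?, ?)", rows)

available = conn.execute(
    "SELECT COUNT(*) FROM listing WHERE status = ?", ("Available",)
).fetchone()[0]
print(available)
```

Having a primary key on date and address is one way to keep a history of scrapes, which is what you need to say anything about occupancy over time.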

Presentation
I would have loved to have used Power BI. I didn’t bother looking for a Pi data gateway for SQLite on Linux.

In R we have excellent visualisation libraries such as ggplot2. I wondered what was available in Python. Bokeh seemed powerful at first glance. It’s not at V1 yet but I expect it will become the Python ggplot2 in time.
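
As a rough idea of what a Bokeh chart involves, here is a short sketch of a bar chart that is turned into HTML fragments for embedding in a web page. The area names and counts are invented sample data, and this is written against a recent Bokeh release rather than copied from Dash.py.

```python
from bokeh.embed import components
from bokeh.plotting import figure

# Made-up counts of available properties per area.
areas = ["Area A", "Area B", "Area C"]
available = [12, 7, 3]

# A categorical x-axis is created by passing the category names as x_range.
plot = figure(x_range=areas, title="Available student properties (sample data)")
plot.vbar(x=areas, top=available, width=0.8)
plot.yaxis.axis_label = "Properties"

# components() returns a <script> block and a <div> placeholder that can be
# dropped into any HTML template (the page also needs to load BokehJS).
script, div = components(plot)
```

The script and div strings are what a web framework would pass into its page template.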

Hosting
Again Power BI allows for sharing – it would have been ridiculously easy to create a Power BI Dashboard. But for the same reasons as above it was off limits.

Flask is a micro-web framework that seemed right up my street.  Are you seeing a trend here? Writing this code early on Sunday I had very little time.
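
To show how little Flask needs, here is a minimal sketch of an app with a single route. The route and the page content are placeholders, not the app from the web folder.

```python
from flask import Flask

app = Flask(__name__)


@app.route("/")
def index():
    # Placeholder page; a real app would render a template containing the chart.
    return "<h1>Student housing availability</h1>"


# Exercise the route with Flask's built-in test client, without starting a server.
response = app.test_client().get("/")
print(response.status_code)
```

Calling app.run() would start Flask's development server, which is fine for a Pi on a home network but is not meant for production use.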

The code is on GitHub. Proc.py contains the scraping/db work.

You’ll find the Bokeh code in Dash.py

The Flask server is set up in the web folder. Note this code is not ready for production purposes, although it’s been running on my Pi without fail. If I were to take this off the Pi I would probably use Gunicorn & Nginx to serve the app on an Azure VM.

Incidentally, I did this development on my Windows machine using VS Code with Python Tools installed. It provides a very refreshing debugging/linting experience. My only gripe was that I’d like to be able to see Pandas DataFrames more easily when debugging. Seeing the stack is fun but too deep in many cases. I suppose I’m spoilt by RStudio.
I was surprised that I didn’t need to modify the code (other than changing the working path) to get it running on the Pi. This applied to the full stack. Oh how times are changing.

Conclusion

The web data does indeed show there are empty properties spread across Chester.
This data should be available for society.  We shouldn’t need hacking skills to get the data in my opinion.

After I joined the local community action group I found that local authorities across the country have very disparate data services. It seems the improvements we’ve seen from central government haven’t filtered down to our cash-strapped local authorities yet.

About Lee Hawthorn

Data Professional