Converting Garmin GPX files to Pandas and CSV
When I started running with an Apple Watch, I learned about the excellent HealthFit app which, in addition to syncing to a large number of run tracking sites like Strava and Runalyze, can export .gpx files to iCloud storage.
So now I had .gpx files of my runs, and as a data scientist I wanted to analyze them. I started with gpxpy. But as quickly becomes apparent from its documentation, the goal of gpxpy is not to produce row/column data but to simplify looping over the XML elements. I also could not get it to recognize trackpoint extensions, which contain information such as heart rate.
That led me to write gpxcsv, starting with the lxml package. I wanted several features:
- A command line entry point
- Intelligent handling of any data in a trackpoint, without a pre-defined list of columns
- An easy way to get the data into pandas in an interactive session
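To make the second feature concrete, here is a minimal sketch of discovering columns from the trackpoints themselves. It is not the gpxcsv code: it uses the standard library's xml.etree.ElementTree in place of lxml so that it runs without extra packages, and the names SAMPLE_GPX, local_name, and trackpoints_to_rows are made up for this illustration.

```python
import xml.etree.ElementTree as ET

# A tiny hand-written GPX document (hypothetical data, for illustration only).
SAMPLE_GPX = """<gpx version="1.1" creator="example"
     xmlns="http://www.topografix.com/GPX/1/1"
     xmlns:gpxtpx="http://www.garmin.com/xmlschemas/TrackPointExtension/v1">
  <trk><trkseg>
    <trkpt lat="40.0000" lon="-75.0000">
      <ele>10.0</ele>
      <time>2021-01-01T12:00:00Z</time>
      <extensions>
        <gpxtpx:TrackPointExtension>
          <gpxtpx:hr>120</gpxtpx:hr>
        </gpxtpx:TrackPointExtension>
      </extensions>
    </trkpt>
    <trkpt lat="40.0010" lon="-75.0000">
      <ele>11.0</ele>
      <time>2021-01-01T12:00:30Z</time>
    </trkpt>
  </trkseg></trk>
</gpx>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def trackpoints_to_rows(gpx_text):
    """Return one dict per <trkpt>; the keys come from whatever the file contains."""
    root = ET.fromstring(gpx_text)
    rows = []
    for elem in root.iter():
        if local_name(elem.tag) != "trkpt":
            continue
        row = dict(elem.attrib)  # lat and lon are attributes of <trkpt>
        for child in elem.iter():
            # Skip the trackpoint itself and any container elements;
            # only leaf elements (ele, time, hr, ...) become columns.
            if child is elem or len(child) > 0:
                continue
            text = (child.text or "").strip()
            if text:
                row[local_name(child.tag)] = text
        rows.append(row)
    return rows


rows = trackpoints_to_rows(SAMPLE_GPX)
```

All values come back as strings, and a trackpoint without a heart rate simply lacks the hr key. Passing rows to pandas.DataFrame gives one column per key seen, with missing values filled as NaN; writing the frame out with to_csv covers the CSV side.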
Of course, once I had this module to get the data into row/column form, I kept going. I wrote a gpxrun module, which uses pandas to compute the GPS-based distance of a run, and then a Dash app to analyze and convert GPX files and perhaps collect statistics on the discrepancy between the GPS-derived distance and the device/pedometer distance.
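One common way to get a GPS-based distance is to sum the great-circle (haversine) distance between consecutive trackpoints. The sketch below assumes that approach and a spherical Earth; it is plain Python rather than pandas, it ignores elevation, and the names EARTH_RADIUS_M, haversine_m, and track_distance_m are hypothetical, not taken from gpxrun.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters (spherical approximation)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def track_distance_m(points):
    """Sum the distances between consecutive (lat, lon) pairs."""
    return sum(haversine_m(*p, *q) for p, q in zip(points, points[1:]))


# Three points along the same meridian, half a degree of latitude apart.
total = track_distance_m([(0.0, 0.0), (0.5, 0.0), (1.0, 0.0)])
```

One degree of latitude is roughly 111 km, so total comes out near that figure. With real trackpoints, the lat and lon strings read from the file need to be converted to floats first, and GPS noise between closely spaced points tends to inflate the sum, which is one source of the discrepancy against the device distance.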
You can see the Dash app screenshot below.
