BOOM! I'm back at it. I've made some minor changes to my code for accessing the STRAVA API, and I am now using a different strategy to store stream data outside the SQL database. I have been working on generating my outcome measure: maximum power for a given time interval. To do this, I generated max effort curves (STRAVA calls them power curves) for all activities. I then selected several critical power intervals for each activity: 1-, 5-, 10-, 20-, and 60-minute efforts. Since most training is periodized, I decided to use a 6-week window and look for the greatest gain in performance over any 6-week period in one season, January to September. It took me a while to code a rolling window over an irregular time series. I found a solution, but it is neither pretty nor efficient; if you have advice, it is welcome. To determine the 6-week period with the greatest gain in performance, I ran a linear regression on each window and selected the window with the largest slope. Here are some examples:
10 Minute power
20 Minute power
More to come!
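For anyone curious how the max effort and rolling-window steps might look, here is a minimal sketch. It is not my production code. It assumes power is sampled once per second, that each activity has already been reduced to one (date, best watts) pair for a given interval, and that a window needs at least three activities before a slope is worth fitting. The function names are made up for illustration.

```python
from datetime import date, timedelta


def max_mean_power(watts, window_s):
    """Best average power over any contiguous run of `window_s` samples.

    Assumes one sample per second, so 600 samples is a 10-minute effort.
    """
    if window_s <= 0 or len(watts) < window_s:
        return None
    running = sum(watts[:window_s])
    best = running
    # Slide the window one sample at a time, updating the running sum.
    for i in range(window_s, len(watts)):
        running += watts[i] - watts[i - window_s]
        best = max(best, running)
    return best / window_s


def ols_slope(xs, ys):
    """Ordinary least squares slope of ys against xs, or None if xs are all equal."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        return None
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def best_window(efforts, weeks=6, min_points=3):
    """Find the window with the steepest gain in watts per day.

    `efforts` is a list of (date, watts) pairs at irregular dates.
    Each activity date is tried as a window start. Returns
    (start_date, slope) or None if no window has enough activities.
    """
    efforts = sorted(efforts)
    span = timedelta(weeks=weeks)
    best = None
    end = 0
    for start in range(len(efforts)):
        start_day = efforts[start][0]
        # Advance the right edge; it never moves backwards because dates are sorted.
        while end < len(efforts) and efforts[end][0] < start_day + span:
            end += 1
        window = efforts[start:end]
        if len(window) < min_points:
            continue
        xs = [(day - start_day).days for day, _ in window]
        ys = [w for _, w in window]
        slope = ols_slope(xs, ys)
        if slope is not None and (best is None or slope > best[1]):
            best = (start_day, slope)
    return best
```

Because both edges of the window only move forward, the scan visits each activity a constant number of times before the regression, which avoids rescanning the whole season for every start date.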
I am now actively recruiting athletes to share their STRAVA workouts. I would like to recruit around 2,000 athletes, which corresponds to an estimated 500,000 cycling activities. Once we reach that number, we can begin some analyses. I will try to update the analytics page with a more detailed explanation of the methods we are planning to use.
There is a rather extensive Python library for the STRAVA API that is easy to use. Check out the library for more information. The developer is very responsive to inquiries regarding his work. I wrote two scripts. The first downloads all of the athlete's activities and clubs, and records the last time my application downloaded the athlete's data. The second downloads the specific details of each activity, which STRAVA calls a STREAM. Unfortunately, the Python library does not support this task, so it became laborious to code. The STRAVA API documentation is good but can be misleading; check it out. The work was not complicated, but the level of detail ate up my time. Regardless, the biggest issue was sorting through all the different stream combinations that users collect. I am sure little bugs will pop up from time to time.
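To give a feel for the stream combination problem, here is a rough sketch of one way to flatten whatever streams an activity happens to have into a fixed set of columns. It assumes the API response has already been parsed into a list of objects that each carry a "type" and a "data" array; that shape, the function name, and the default list of wanted streams are assumptions for illustration, not a description of my actual script.

```python
def normalize_streams(stream_list, wanted=("time", "watts", "heartrate", "cadence")):
    """Return a dict with one equal-length column per wanted stream type.

    Streams the athlete did not record (for example, no power meter)
    become columns of None, so every activity has the same layout.
    """
    # Index the streams that are actually present by their type name.
    by_type = {s["type"]: s["data"] for s in stream_list}
    length = max((len(data) for data in by_type.values()), default=0)

    table = {}
    for key in wanted:
        data = by_type.get(key)
        if data is None:
            table[key] = [None] * length
        else:
            # Pad short streams so all columns line up row for row.
            table[key] = list(data) + [None] * (length - len(data))
    return table
```

With a fixed layout like this, an activity recorded with only time and power and one recorded with every sensor can be stored and queried the same way.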
We published the site!! I think it looks great, but of course I have odd tastes. As it stands, the only data we are collecting is the demographic information sent to us during the authentication (token exchange) process. This is great, but more work is needed to move additional data from STRAVA to our warehouse. I will now spend some time reaching out to fellow athletes and will hopefully gain their trust, so that we can incorporate their training experiences and work toward developing efficient training protocols.
After some pain, the database and PHP scripts have been completed, and I am now ready to publish the web portal. I will now focus on the Python scripts that will collect user data once users have given us permission. This can be tricky, as STRAVA limits the number of calls we can make to the API.
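One common way to stay under a call limit is a sliding-window limiter that blocks before each request once the quota for the current period is used up. The sketch below is only an illustration of that idea; the class name is hypothetical, and the quota numbers you pass in should come from the limits STRAVA assigns to your application.

```python
import time
from collections import deque


class RateLimiter:
    """Allow at most `max_calls` calls in any `period_s` second window."""

    def __init__(self, max_calls, period_s, clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period_s = period_s
        self._clock = clock      # injectable so the limiter can be tested
        self._sleep = sleep
        self._stamps = deque()   # timestamps of recent calls, oldest first

    def _drop_expired(self, now):
        # Forget calls that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.period_s:
            self._stamps.popleft()

    def wait(self):
        """Block until one more call is allowed, then record it."""
        now = self._clock()
        self._drop_expired(now)
        if len(self._stamps) >= self.max_calls:
            # Sleep until the oldest recorded call leaves the window.
            self._sleep(self.period_s - (now - self._stamps[0]))
            now = self._clock()
            self._drop_expired(now)
        self._stamps.append(now)
```

Calling `limiter.wait()` immediately before every API request keeps the download scripts inside the quota without having to count calls by hand.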
The initial portal design has been completed. The next step is the application itself. We will focus on database design and on Python scripts to extract user data into our warehouse.
Department of Biomedical Informatics
Columbia University Medical Center
Attn: Greg Hruby
622 West 168th St. VC5
New York, NY 10032