[FEATURE] Add data from Google Location History #6

Open
MeurillonGuillaume opened this issue Jul 31, 2020 · 3 comments
@MeurillonGuillaume

Feature request: Google Location history

Is your feature request related to a problem? Please describe.

No it's not.

Describe the solution you'd like

As discussed during the Demo Day and in meetings, it would be really nice to support Google Location History as a way for contributors to open-source their own Google data, so that it can be put to real-world use with their permission.

I have done some research, and the Google Location History files are very useful for the bikedataproject. The data dump comes as a highly compressed .zip file (22.6MB for 453MB of JSON), which would let users upload large amounts of data in a very short time.

Playing around with Go a little further, I discovered that the main JSON file is just one large file (split up per 1GB according to Google) containing, for each point, the location, a timestamp, the predicted activity types and their confidence as a percentage. By looping over this data, we can extract all points where the activity ON_BICYCLE has a confidence of at least 50% (this threshold can be tweaked) and create trips from these points.
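To make the filtering step concrete, here is a minimal sketch in Go. It assumes the export is an object with a `locations` array whose entries carry `timestampMs`, `latitudeE7`, `longitudeE7` and a nested `activity` list; those field names are my reading of the Takeout format and should be checked against a real dump. `Point` and `CyclingPoints` are hypothetical names, not existing project code.

```go
package main

import (
	"encoding/json"
	"strconv"
	"time"
)

// Field names below are assumptions about the Takeout export layout.
type activityGuess struct {
	Type       string `json:"type"`
	Confidence int    `json:"confidence"`
}

type activityRecord struct {
	Activity []activityGuess `json:"activity"`
}

type location struct {
	TimestampMs string           `json:"timestampMs"`
	LatitudeE7  int64            `json:"latitudeE7"`
	LongitudeE7 int64            `json:"longitudeE7"`
	Activity    []activityRecord `json:"activity"`
}

type history struct {
	Locations []location `json:"locations"`
}

// Point is a simplified sample that was classified as cycling.
type Point struct {
	Time     time.Time
	Lat, Lon float64
}

// isCycling reports whether any activity guess attached to loc is
// ON_BICYCLE with a confidence of at least minConfidence percent.
func isCycling(loc location, minConfidence int) bool {
	for _, rec := range loc.Activity {
		for _, g := range rec.Activity {
			if g.Type == "ON_BICYCLE" && g.Confidence >= minConfidence {
				return true
			}
		}
	}
	return false
}

// CyclingPoints decodes a whole history document held in memory and
// returns only the samples that pass the confidence threshold.
func CyclingPoints(data []byte, minConfidence int) ([]Point, error) {
	var h history
	if err := json.Unmarshal(data, &h); err != nil {
		return nil, err
	}
	var pts []Point
	for _, loc := range h.Locations {
		if !isCycling(loc, minConfidence) {
			continue
		}
		ms, err := strconv.ParseInt(loc.TimestampMs, 10, 64)
		if err != nil {
			continue // skip samples with an unparsable timestamp
		}
		pts = append(pts, Point{
			Time: time.Unix(0, ms*int64(time.Millisecond)),
			Lat:  float64(loc.LatitudeE7) / 1e7,
			Lon:  float64(loc.LongitudeE7) / 1e7,
		})
	}
	return pts, nil
}
```

Calling `CyclingPoints(data, 50)` would apply the 50% threshold mentioned above. Note that this version loads the entire file into memory.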

Some days contain wrongly classified activities, but this is often just 1-5 points, and such trips can be filtered out when they do not contain enough points. The Google Location service (dump) contains data points at intervals ranging from 1 second to 5 minutes, depending on the movement etc. This makes the trips not super accurate, but more a rough estimate of the trajectory.
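A rough sketch of how the trip building and the "too few points" filter could look. The idea of starting a new trip whenever the time gap between two consecutive points exceeds some limit is my assumption, as are the names `Point` and `SplitTrips`; the gap and the minimum point count would need tuning on real data.

```go
package main

import "time"

// Point is a simplified cycling sample (hypothetical type for this sketch).
type Point struct {
	Time     time.Time
	Lat, Lon float64
}

// SplitTrips groups points into trips. It assumes pts is sorted by time.
// A new trip starts whenever the gap to the previous point exceeds maxGap,
// and trips with fewer than minPoints points are dropped as likely
// misclassifications.
func SplitTrips(pts []Point, maxGap time.Duration, minPoints int) [][]Point {
	var trips [][]Point
	var cur []Point

	// flush stores the current trip if it is long enough, then resets it.
	flush := func() {
		if len(cur) >= minPoints {
			trips = append(trips, cur)
		}
		cur = nil
	}

	for _, p := range pts {
		if len(cur) > 0 && p.Time.Sub(cur[len(cur)-1].Time) > maxGap {
			flush()
		}
		cur = append(cur, p)
	}
	flush()
	return trips
}
```

For example, `SplitTrips(pts, 5*time.Minute, 6)` would treat any pause longer than 5 minutes as a trip boundary and discard trips of 5 points or fewer.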

We could add another file upload option for users. There are two ways forward:

  1. Let the user create their data dump and upload the entire .zip file. This would be convenient for the user, as the upload will be very fast. A file of 1-100MB can be uploaded, and the bikedataproject service can then unzip it in the backend and process the extracted files. The downside of this method is that we will need a lot of validation against malicious or fake files.
  2. Let the user create their data dump, extract the Locationhistory.json file, and upload just that. The upside is that it gets much easier for us: less validation for malicious files, no unzipping procedure, etc. The downside is that this method takes much longer for the user: there is a manual extraction step and upload times increase drastically (500MB vs 22MB upload as an example).

My preference is option 1, though this is open for discussion.
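For option 1, part of the validation could be done while unzipping. The sketch below uses Go's standard `archive/zip` package to find the history file by its base name and refuses entries that exceed a size limit, both by the declared size and by the number of bytes actually read. `ExtractHistory`, the error values and `makeZip` are hypothetical names for illustration; the exact file name inside the archive is an assumption and is therefore passed in as a parameter.

```go
package main

import (
	"archive/zip"
	"bytes"
	"errors"
	"fmt"
	"io"
	"path"
	"strings"
)

var (
	ErrNotFound = errors.New("history file not found in archive")
	ErrTooLarge = errors.New("history file exceeds size limit")
)

// ExtractHistory returns the contents of the archive entry whose base name
// matches wantBase (case-insensitive), or an error if the archive is invalid,
// the entry is missing, or it is larger than maxSize bytes.
func ExtractHistory(archive []byte, wantBase string, maxSize int64) ([]byte, error) {
	zr, err := zip.NewReader(bytes.NewReader(archive), int64(len(archive)))
	if err != nil {
		return nil, fmt.Errorf("not a valid zip file: %w", err)
	}
	for _, f := range zr.File {
		if !strings.EqualFold(path.Base(f.Name), wantBase) {
			continue
		}
		// Reject early based on the size declared in the archive.
		if f.UncompressedSize64 > uint64(maxSize) {
			return nil, ErrTooLarge
		}
		rc, err := f.Open()
		if err != nil {
			return nil, err
		}
		defer rc.Close()
		// Declared sizes can be wrong, so also cap the bytes actually read.
		body, err := io.ReadAll(io.LimitReader(rc, maxSize+1))
		if err != nil {
			return nil, err
		}
		if int64(len(body)) > maxSize {
			return nil, ErrTooLarge
		}
		return body, nil
	}
	return nil, ErrNotFound
}

// makeZip builds a one-file archive in memory, only to exercise the sketch.
// Errors are ignored because writes to a bytes.Buffer do not fail.
func makeZip(name, content string) []byte {
	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)
	w, _ := zw.Create(name)
	w.Write([]byte(content))
	zw.Close()
	return buf.Bytes()
}
```

This only covers the "is it a sane archive" part. Checking that the JSON inside is a plausible location history (and not fake data) would still need separate validation.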

Describe alternatives you've considered (if applicable)

Not applicable.

Additional context

  1. Location history can be retrieved through https://takeout.google.com/settings/takeout
@Driesvanransbeeck

Thanks for looking into this @MeurillonGuillaume! It would be amazing to add it to the Bike Data Project platform.

@xivk

xivk commented Aug 11, 2020

This is pretty awesome! I would also really like to see this feature added.

@Driesvanransbeeck and I will discuss what we can still do before the launch. I'm not sure it will be feasible to get this done by then, though.

@MeurillonGuillaume
Author

MeurillonGuillaume commented Aug 11, 2020

@xivk One thing to keep in mind if this gets implemented is that the service might consume more memory during processing, because these location history files can get rather large (up to 2GB).

These files can't really be processed in batches, because everything sits in one large JSON array. At least, I don't see how I could split them into batches 😄
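One possible way to reduce the memory use, offered as an untested idea rather than a confirmed fix: Go's `encoding/json` `Decoder` can walk a document token by token, so the elements of the array could be decoded one at a time instead of unmarshalling the whole file. The sketch below assumes the file is an object with a top-level `locations` array and the field names shown; `sample`, `StreamLocations` and `countCycling` are hypothetical names.

```go
package main

import (
	"encoding/json"
	"errors"
	"io"
	"strings"
)

// sample mirrors one array element; field names are assumptions.
type sample struct {
	TimestampMs string `json:"timestampMs"`
	LatitudeE7  int64  `json:"latitudeE7"`
	LongitudeE7 int64  `json:"longitudeE7"`
	Activity    []struct {
		Activity []struct {
			Type       string `json:"type"`
			Confidence int    `json:"confidence"`
		} `json:"activity"`
	} `json:"activity"`
}

// expectDelim consumes the next token and checks it is the given delimiter.
func expectDelim(dec *json.Decoder, want json.Delim) error {
	tok, err := dec.Token()
	if err != nil {
		return err
	}
	if d, ok := tok.(json.Delim); !ok || d != want {
		return errors.New("unexpected JSON structure")
	}
	return nil
}

// StreamLocations calls fn for each element of the "locations" array,
// holding only one element in memory at a time.
func StreamLocations(r io.Reader, fn func(sample) error) error {
	dec := json.NewDecoder(r)
	if err := expectDelim(dec, '{'); err != nil {
		return err
	}
	for dec.More() {
		keyTok, err := dec.Token()
		if err != nil {
			return err
		}
		if key, _ := keyTok.(string); key != "locations" {
			// Skip the value of any other top-level key.
			var skip json.RawMessage
			if err := dec.Decode(&skip); err != nil {
				return err
			}
			continue
		}
		if err := expectDelim(dec, '['); err != nil {
			return err
		}
		for dec.More() {
			var s sample
			if err := dec.Decode(&s); err != nil {
				return err
			}
			if err := fn(s); err != nil {
				return err
			}
		}
		if err := expectDelim(dec, ']'); err != nil {
			return err
		}
	}
	return nil
}

// countCycling is a small usage example: it counts the samples that have an
// ON_BICYCLE guess with at least minConfidence percent.
func countCycling(doc string, minConfidence int) (int, error) {
	n := 0
	err := StreamLocations(strings.NewReader(doc), func(s sample) error {
		for _, rec := range s.Activity {
			for _, g := range rec.Activity {
				if g.Type == "ON_BICYCLE" && g.Confidence >= minConfidence {
					n++
					return nil
				}
			}
		}
		return nil
	})
	return n, err
}
```

In a real service the reader would be the uploaded file or the opened zip entry rather than a string, and the callback would feed the trip builder instead of counting.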

@MeurillonGuillaume MeurillonGuillaume transferred this issue from bikedataproject/go-strava-daemon Aug 14, 2020

5 participants