Feature request: Google Location history
Is your feature request related to a problem? Please describe.
No it's not.
Describe the solution you'd like
As discussed during the Demo Day and in meetings, it would be really nice to implement Google Location History support as a way for contributors to open-source their own Google data, so that, with their permission, it can be put to real-world use.
I have done some research, and the Google Location History files are very useful for the bikedataproject. The data dump comes as a highly compressed .zip file (22.6MB for 453MB of JSON), which means users can upload large amounts of data in a very short time.
Playing around with Go a little further, I discovered that the main JSON file is just one large file (split per 1GB according to Google) containing locations, timestamps, predicted activity types, and their confidence as a percentage. By looping over this data, we can extract all points where the activity ON_BICYCLE has a confidence of at least 50% (tweakable) and create trips based on these points.
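To make the loop above concrete, here is a minimal Go sketch. The struct field names (`locations`, `latitudeE7`, `activity`, `confidence`, etc.) follow my reading of one Takeout dump and may differ between export versions, and `extractBikePoints`/`bikeConfidence` are hypothetical helper names, not anything from the project codebase:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// activityGuess is one predicted activity with its confidence percentage.
type activityGuess struct {
	Type       string `json:"type"`
	Confidence int    `json:"confidence"`
}

type activityRecord struct {
	Activity []activityGuess `json:"activity"`
}

// location mirrors (assumed) fields of one Location History datapoint.
type location struct {
	TimestampMs string           `json:"timestampMs"`
	LatitudeE7  int64            `json:"latitudeE7"`
	LongitudeE7 int64            `json:"longitudeE7"`
	Activity    []activityRecord `json:"activity"`
}

type history struct {
	Locations []location `json:"locations"`
}

// bikeConfidence returns the highest ON_BICYCLE confidence attached to a
// point, or 0 if cycling was not among the predicted activities.
func bikeConfidence(l location) int {
	best := 0
	for _, rec := range l.Activity {
		for _, g := range rec.Activity {
			if g.Type == "ON_BICYCLE" && g.Confidence > best {
				best = g.Confidence
			}
		}
	}
	return best
}

// extractBikePoints keeps the points whose ON_BICYCLE confidence meets
// the (tweakable) threshold, e.g. 50.
func extractBikePoints(raw []byte, threshold int) ([]location, error) {
	var h history
	if err := json.Unmarshal(raw, &h); err != nil {
		return nil, err
	}
	var out []location
	for _, l := range h.Locations {
		if bikeConfidence(l) >= threshold {
			out = append(out, l)
		}
	}
	return out, nil
}

func main() {
	sample := []byte(`{"locations":[
	 {"timestampMs":"1500000000000","latitudeE7":511234567,"longitudeE7":44321000,
	  "activity":[{"activity":[{"type":"ON_BICYCLE","confidence":77}]}]},
	 {"timestampMs":"1500000060000","latitudeE7":511234999,"longitudeE7":44322000,
	  "activity":[{"activity":[{"type":"STILL","confidence":90}]}]}]}`)
	pts, err := extractBikePoints(sample, 50)
	if err != nil {
		panic(err)
	}
	fmt.Println("bike points:", len(pts)) // only the first sample point qualifies
}
```

Note this unmarshals the whole dump at once, which is fine for a sketch but not for a 2GB file (see the memory comment below in the thread).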
Some days contain wrongly classified activities, but this is often just 1-5 points and can be filtered out when a trip does not contain enough points. The Google Location History dump contains datapoints at intervals ranging from one second to five minutes, depending on movement etc. This makes the trips not super accurate, but rather a rough estimate of the trajectory.
We could add another file upload option for users. There are two ways forward:
1. Let the user create their data dump and upload the entire .zip file. This is beneficial for the user, as the upload goes very fast: a file between 1-100MB can be uploaded, and the bikedataproject service can then unzip the file in the backend and process the delivered files. The downside of this method is that we will need a lot of validation against malicious or fake files.
2. Let the user create their data dump, extract the Locationhistory.json file themselves, and upload just that. The upside is that it gets much easier for us: less validation for malicious files, no unzipping procedure, etc. The downside is that this method takes much longer for the user: manual extraction plus upload times increase drastically (500MB vs 22MB upload, for example).
My preference is option 1, though this can be discussed.
Describe alternatives you've considered (if applicable)
Not applicable.
Additional context
@xivk One thing to keep in mind if this is being implemented is that the service might consume more memory during processing, because these location history files can get rather large (up to 2GB).
These files can't really be processed in batches because it's all one large JSON array. I don't see how I could split them into batches, at least 😄