First, make sure we load some useful libraries (and, of course,
mpathsenser itself).
library(tidyr)
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(ggplot2)
library(mpathsenser)
The data for this vignette is contained in the extdata folder. However,
on some systems this folder may be read-only, and it is generally good
practice not to modify package folders (to avoid changing or breaking
the package). We therefore first copy the data to a temporary directory
(as defined by the environment variable TMPDIR, TMP, or TEMP), which is
freshly created each time R starts up and cleaned up when the session
ends.
# Get the temp folder
tempdir <- tempdir()
# Get a handle to the data files
path <- system.file("extdata", "example", package = "mpathsenser")
# Get a list of all the files that are to be copied
copy_list <- list.files(path, "carp-data", full.names = TRUE)
# Copy all data
file.copy(
from = copy_list,
to = tempdir,
overwrite = TRUE,
copy.mode = FALSE
)
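The copy step can be tried out without the package data. The following is a minimal sketch in base R, assuming two throwaway directories and made-up file names (`carp-data-a.json`, `carp-data-b.json`, `notes.txt`) that are purely illustrative. It shows that `file.copy()` returns one logical value per file, which makes it easy to confirm that everything arrived.

```r
# Two throwaway directories standing in for the package folder and the
# working directory (both hypothetical, created only for this sketch)
src <- tempfile("source-")
dest <- tempfile("dest-")
dir.create(src)
dir.create(dest)

# Made-up files: two that match the "carp-data" pattern and one that does not
writeLines("[]", file.path(src, "carp-data-a.json"))
writeLines("[]", file.path(src, "carp-data-b.json"))
writeLines("ignore me", file.path(src, "notes.txt"))

# Select and copy only the matching files, as in the vignette
copy_list <- list.files(src, pattern = "carp-data", full.names = TRUE)
copied <- file.copy(
  from = copy_list,
  to = dest,
  overwrite = TRUE,
  copy.mode = FALSE
)

# file.copy() returns one logical per file; stop if any copy failed
stopifnot(all(copied))
list.files(dest)
```

Checking the return value is worthwhile because `file.copy()` does not raise an error when a file could not be copied.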
The extdata folder contains several .zip files as well as some JSON
files. The data for your own study will likely look the same, only with
many more files. Note that all of these data files came directly from
m-Path Sense (i.e. no pre-processing has been applied yet).
The data from m-Path Sense originates in the following way: The
application continuously collects all kinds of data in the background
(e.g. accelerometer data). Once collected, the data goes through several
stages where, for example, it is pre-processed (as already happens with
data from the light sensor) or anonymised upon request. Finally, the
data is written to a JSON file, which is really just a text file with a
specific format. When new data comes in (whether from the same sensor
or not), the next line is written to the JSON file, and so on, until the
file has reached a certain size (5MB by default). The JSON file is then
zipped to reduce its size and subsequently transferred to a server.
Once transferred, the data is deleted from the participant’s phone, both
to save space and to prevent data leakage.
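The writing scheme described above can be imitated in a few lines of base R. This is only a rough simulation under stated assumptions, not the app's actual code: the helper `write_record()`, the record fields, the file naming, and the exact placement of commas and brackets are all hypothetical, and the size limit is shrunk to 100 bytes so that the rotation is visible.

```r
# Hypothetical helper: append one record to the current file and move on to
# a new file once the size limit is reached. Not the app's implementation.
write_record <- function(record, dir, state, max_bytes = 5 * 1024^2) {
  file <- file.path(dir, sprintf("data-%03d.json", state$index))
  if (!file.exists(file)) {
    cat("[\n", file = file)                  # assumed: a file opens with "["
    cat(record, file = file, append = TRUE)
  } else {
    cat(",\n", record, file = file, append = TRUE, sep = "")
  }
  if (file.size(file) >= max_bytes) {
    cat("\n]", file = file, append = TRUE)   # assumed: a full file closes with "]"
    state$index <- state$index + 1L
  }
  state
}

out_dir <- tempfile("rotation-demo-")
dir.create(out_dir)

# Ten made-up light sensor records
records <- sprintf('{"sensor":"light","value":%d}', 1:10)

state <- list(index = 1L)
for (r in records) {
  state <- write_record(r, out_dir, state, max_bytes = 100)
}

list.files(out_dir)
```

In this simulation the file that is still being written never receives its closing bracket, which is the situation a file is left in if writing stops abruptly.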
Thus, a first step is to unzip these files to extract their JSON
contents. If you feel more comfortable unzipping with your favourite zip
program, you can do so; just make sure all files end up in the same
directory (including the non-zipped JSON files).
unzip_data(path = tempdir)
#> Unzipped 37 files.
In m-Path Sense, data is written to JSON files as it comes in. Every
such file starts with [ and ends with ]. If the app is killed, JSON
files are not properly closed and hence cannot be read by JSON parsers.
So, we must first test whether all files are in a valid JSON format and
fix those that are not.
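To see why an unclosed file trips up a parser, here is a small sketch using `jsonlite::validate()`. It assumes the jsonlite package is installed; the two JSON strings are made-up stand-ins for file contents, and the repair shown (dropping a trailing comma and appending the closing bracket) is a naive illustration rather than a description of what `fix_jsons` does internally.

```r
library(jsonlite)

# Made-up contents of a properly closed file and of one cut off mid-write
closed <- '[{"sensor":"light","value":12},{"sensor":"light","value":15}]'
unclosed <- '[{"sensor":"light","value":12},{"sensor":"light","value":15},'

# validate() reports whether a string is syntactically valid JSON
closed_ok <- as.logical(validate(closed))
unclosed_ok <- as.logical(validate(unclosed))

# Naive repair (illustration only): drop a trailing comma, then close the array
repaired <- paste0(sub(",[[:space:]]*$", "", unclosed), "]")
repaired_ok <- as.logical(validate(repaired))

c(closed = closed_ok, unclosed = unclosed_ok, repaired = repaired_ok)
```

Real files can break in other ways (for example in the middle of a record), so a simple fix like this one is not sufficient in general.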
While you can also call fix_jsons directly, it is generally safer (and
faster) to first run test_jsons to get an idea of how many files need
fixing.
# Note that test_jsons returns the full path names
to_fix <- test_jsons(tempdir)
#> Warning: There were issues in some files
print(to_fix)
#> [1] "C:\\Users\\u0134047\\AppData\\Local\\Temp\\RtmpSIm1ZH\\1_example_carp-data-2022-06-14-09-18-41-055229Z.json"
#> [2] "C:\\Users\\u0134047\\AppData\\Local\\Temp\\RtmpSIm1ZH\\1_example_carp-data-2022-06-14-09-38-53-504884Z.json"
#> [3] "C:\\Users\\u0134047\\AppData\\Local\\Temp\\RtmpSIm1ZH\\1_example_carp-data-2022-06-14-09-55-14-202021Z.json"
#> [4] "C:\\Users\\u0134047\\AppData\\Local\\Temp\\RtmpSIm1ZH\\1_example_carp-data-2022-06-14-11-48-39-822128Z.json"
#> [5] "C:\\Users\\u0134047\\AppData\\Local\\Temp\\RtmpSIm1ZH\\1_example_carp-data-2022-06-14-12-46-41-739139Z.json"
#> [6] "C:\\Users\\u0134047\\AppData\\Local\\Temp\\RtmpSIm1ZH\\1_example_carp-data-2022-06-14-12-51-10-826674Z.json"
#> [7] "C:\\Users\\u0134047\\AppData\\Local\\Temp\\RtmpSIm1ZH\\1_example_carp-data-2022-06-14-13-24-42-818906Z.json"
#> [8] "C:\\Users\\u0134047\\AppData\\Local\\Temp\\RtmpSIm1ZH\\1_example_carp-data-2022-06-14-17-47-29-568210Z.json"
#> [9] "C:\\Users\\u0134047\\AppData\\Local\\Temp\\RtmpSIm1ZH\\1_example_carp-data-2022-06-14-20-35-31-622759Z.json"
#> [10] "C:\\Users\\u0134047\\AppData\\Local\\Temp\\RtmpSIm1ZH\\1_example_carp-data-2022-06-14-23-47-14-992568Z.json"
#> [11] "C:\\Users\\u0134047\\AppData\\Local\\Temp\\RtmpSIm1ZH\\1_example_carp-data-2022-06-15-06-59-54-808885Z.json"
#> [12] "C:\\Users\\u0134047\\AppData\\Local\\Temp\\RtmpSIm1ZH\\1_example_carp-data-2022-06-15-08-03-14-431352Z.json"
fix_jsons(path = NULL, to_fix)
#> Fixed 12 files
# Create a new database
db <- create_db(tempdir, "getstarted.db")
# Import the data
import(
path = tempdir,
db = db,
batch_size = 12
)
#> All files were successfully written to the database.
sensors <- c(
  "Accelerometer", "Activity", "AppUsage", "Bluetooth", "Calendar",
  "Connectivity", "Device", "Gyroscope", "InstalledApps", "Light",
  "Location", "Memory", "Pedometer", "Screen", "Weather", "Wifi"
)
coverage(
db = db,
participant_id = "2784",
sensor = sensors,
relative = FALSE
)
#> # A tibble: 384 × 3
#> hour measure coverage
#> <dbl> <fct> <dbl>
#> 1 0 Accelerometer 0
#> 2 1 Accelerometer 0
#> 3 2 Accelerometer 0
#> 4 3 Accelerometer 0
#> 5 4 Accelerometer 0
#> 6 5 Accelerometer 0
#> 7 6 Accelerometer 0
#> 8 7 Accelerometer 1528
#> 9 8 Accelerometer 540
#> 10 9 Accelerometer 6496
#> # … with 374 more rows
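A table like this is easier to read as a heat map. The following is a minimal ggplot2 sketch that uses only the ten rows printed above, typed in by hand as a plain data frame (`cov_head` is a hypothetical name). It assumes that `hour` is the hour of the day and that, with `relative = FALSE`, the `coverage` column holds a number of measurements. With the full result, the same plotting code would give one row of tiles per sensor.

```r
library(ggplot2)

# The ten rows shown in the output above, entered manually for illustration
cov_head <- data.frame(
  hour = 0:9,
  measure = factor("Accelerometer"),
  coverage = c(0, 0, 0, 0, 0, 0, 0, 1528, 540, 6496)
)

# One tile per hour and sensor, shaded and labelled by the coverage value
p <- ggplot(cov_head, aes(x = hour, y = measure, fill = coverage)) +
  geom_tile(colour = "white") +
  geom_text(aes(label = coverage), size = 3) +
  scale_x_continuous(breaks = 0:9) +
  labs(x = "Hour of day", y = NULL, fill = "Measurements")

print(p)
```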
Finally, remember to close the database once you’re done working with it.
close_db(db)