Reading
Overview
Downloaded information from Wildlife Insights image projects is divided into four different files:
- cameras.csv: metadata about the devices (cameras) used in the project.
- deployments.csv: metadata about the placement of a camera, including start date, end date, coordinates and other camera settings.
- images.csv: contains data about each individual image, including species identifications and timestamp.
- projects.csv: metadata about project methodology and objectives.
Reading functions let you conveniently read these files (all at once or independently) as pandas dataframes so you can use them as inputs to other wiutils functions.
Here is a quick overview of the different reading functions and their descriptions:
| Function | Description |
|---|---|
| load_demo | Loads the cameras, deployments, images and projects tables from a demo dataset. |
| read_bundle | Reads the cameras, deployments, images and projects tables from a specific Wildlife Insights project bundle. |
| read_cameras | Reads the cameras table from a specific Wildlife Insights project bundle. |
| read_deployments | Reads the deployments table from a specific Wildlife Insights project bundle. |
| read_images | Reads the images table from a specific Wildlife Insights project bundle. |
| read_projects | Reads the projects table from a specific Wildlife Insights project bundle. |
Reading a bundle
A bundle is a zip file downloaded from Wildlife Insights containing the four different csv files described earlier. wiutils offers a convenient function to read these four files as pandas dataframes at once: read_bundle. You can either read the files directly from the downloaded zip file or from a folder where you extracted the contents.
Note
This function always returns four items (dataframes) in the same order: cameras, deployments, images and projects.
Here is how you can read the files from a bundle:
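# Reading directly from the downloaded zip file
import wiutils
cameras, deployments, images, projects = wiutils.read_bundle("path/to/bundle.zip")
# Reading from a folder where the contents were extracted
import wiutils
cameras, deployments, images, projects = wiutils.read_bundle("path/to/folder")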
Notice that the only thing that changes is the path you are passing as the argument to the read_bundle function: when reading from a zip file, you need to specify the absolute or relative path to the file (including the .zip extension); when reading from a folder, you need to specify the absolute or relative path of the folder.
Note
If you are reading the files from the folder you extracted the contents to, make sure there are no nested folders; you have to specify the path to the folder that directly contains the four csv files.
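If you are unsure which extracted folder to pass, a small helper can locate it for you. The following is a minimal sketch that uses only the Python standard library; find_bundle_folder is a hypothetical helper (it is not part of wiutils), and it assumes the four files keep the names listed in the overview.

```python
from pathlib import Path

# File names listed in the overview of this page.
BUNDLE_FILES = {"cameras.csv", "deployments.csv", "images.csv", "projects.csv"}


def find_bundle_folder(root):
    """Return the first folder at or below `root` that directly contains
    the four csv files, or None if no such folder exists.

    Hypothetical helper for illustration; not part of wiutils.
    """
    root = Path(root)
    # Check the root itself first, then every nested folder in sorted order.
    candidates = [root] + sorted(p for p in root.rglob("*") if p.is_dir())
    for folder in candidates:
        names = {p.name for p in folder.iterdir() if p.is_file()}
        if BUNDLE_FILES <= names:
            return folder
    return None
```

The returned path (if any) is the one you would then pass to read_bundle or to any of the functions that read individual files.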
Reading individual files
You can also read individual files from a bundle as pandas dataframes using one of the following functions:
- read_cameras()
- read_deployments()
- read_images()
- read_projects()
Note
When reading individual files (and also a bundle), the deployments and images dataframes will automatically have datetime types (instead of strings) for the relevant columns (start_date and end_date for deployments and timestamp for images).
Two reasons to read individual files instead of reading all the files at once might be:
- You don't need to read the four files. For example, you only need the deployments.csv file to plot the date ranges. Instead of reading all the files (which can take some time depending on the size of the images.csv file), you just read the deployments.csv file as a dataframe.
- You need more control over how the files are read as dataframes. For example, you might want to specify the rows or columns that are going to be read or the data types for each column.
Note
Just like the read_bundle function, the four functions to read individual files work either on a bundle zip file directly or on a folder where its contents were extracted.
For example, if you want to read just the deployments.csv file, you can use the read_deployments function:
import wiutils
deployments = wiutils.read_deployments("path/to/bundle.zip")
With these functions, you can pass any keyword argument accepted by the pandas.read_csv function in order to have more control over how the files are read.
For example, you might want to just read the first 25000 rows of the images.csv file to test some functions before reading all the images:
import wiutils
images = wiutils.read_images("path/to/folder", nrows=25000)  # Note that nrows is an argument accepted by pandas.read_csv
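To get a feel for what such keyword arguments do before applying them to a large file, you can try them on pandas.read_csv directly. The sketch below is only an illustration: it uses a tiny in-memory CSV that stands in for images.csv (the three rows reuse values from the demo output shown later on this page, and only four columns are included), and it combines nrows with two other pandas.read_csv arguments, usecols and dtype.

```python
import io

import pandas as pd

# Toy CSV standing in for images.csv; it has only a handful of columns and rows.
csv_text = """project_id,deployment_id,timestamp,license
2003123,CTCAJ013743,2014-12-08 07:46:00,CC-BY-NC
2003123,CTCAJ013743,2014-11-22 05:58:38,CC-BY-NC
2003123,CTCAJ193741,2014-11-26 05:13:00,CC-BY-NC
"""

# Keyword arguments accepted by pandas.read_csv.
read_csv_kwargs = {
    "nrows": 2,  # read only the first two data rows
    "usecols": ["project_id", "deployment_id", "timestamp"],  # skip the rest
    "dtype": {"project_id": str},  # keep the identifier as text
}

sample = pd.read_csv(io.StringIO(csv_text), **read_csv_kwargs)
print(sample)
```

Based on the description above, the same dictionary could then be unpacked into one of the reading functions, for example wiutils.read_images("path/to/folder", **read_csv_kwargs).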
Loading demo data
If you don't have a bundle file handy or just want to test wiutils functions using a smaller dataset, we provide two demo datasets that you can load as dataframes:
- cajambre
- cristales
Simply pass one of these two names to the load_demo function in order to get the four respective dataframes (i.e. cameras, deployments, images and projects):
import wiutils
cameras, deployments, images, projects = wiutils.load_demo("cajambre")
We can then inspect one of our dataframes:
>>> images
project_id deployment_id ... cv_confidence license
0 2003123 CTCAJ013743 ... NaN CC-BY-NC
1 2003123 CTCAJ013743 ... NaN CC-BY-NC
2 2003123 CTCAJ013743 ... NaN CC-BY-NC
3 2003123 CTCAJ013743 ... NaN CC-BY-NC
4 2003123 CTCAJ013743 ... NaN CC-BY-NC
... ... ... ... ...
5248 2003123 CTCAJ193741 ... NaN CC-BY-NC
5249 2003123 CTCAJ193741 ... NaN CC-BY-NC
5250 2003123 CTCAJ193741 ... NaN CC-BY-NC
5251 2003123 CTCAJ193741 ... NaN CC-BY-NC
5252 2003123 CTCAJ193741 ... NaN CC-BY-NC
[5253 rows x 26 columns]
>>> images["timestamp"]
0 2014-12-08 07:46:00
1 2014-11-22 05:58:38
2 2014-12-08 07:46:02
3 2014-11-22 05:58:22
4 2014-10-23 12:31:30
...
5248 2014-11-26 05:13:00
5249 2014-11-17 22:09:58
5250 2014-11-23 04:42:34
5251 2014-11-11 21:26:54
5252 2014-11-11 13:26:34
Name: timestamp, Length: 5253, dtype: datetime64[ns]
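Because the timestamp column has a datetime type, you can filter and do date arithmetic on it without any extra parsing. The following is a minimal sketch of that idea; to keep it runnable without the demo dataset, it builds a small series by hand from the first five timestamps shown above instead of using the images dataframe.

```python
import pandas as pd

# First five timestamps from the output above, rebuilt by hand for illustration.
timestamps = pd.Series(
    pd.to_datetime(
        [
            "2014-12-08 07:46:00",
            "2014-11-22 05:58:38",
            "2014-12-08 07:46:02",
            "2014-11-22 05:58:22",
            "2014-10-23 12:31:30",
        ]
    ),
    name="timestamp",
)

# Keep only the records taken in November using the .dt accessor.
november = timestamps[timestamps.dt.month == 11]

# Subtracting two datetimes gives a Timedelta covering the sampled period.
span = timestamps.max() - timestamps.min()

print(november)
print(span)
```

With the real data, the same expressions would apply to images["timestamp"].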