bandicoot is an open-source Python toolbox to analyze mobile phone metadata. For more information, see: http://cpg.doc.ic.ac.uk/bandicoot/
The source code of the notebook is available as demo.ipynb and a plain
Python version as demo.py. You can download them from our repository on GitHub at https://github.com/computationalprivacy/bandicoot/tree/master/demo

# Records for the user 'ego'
!head -n 5 data/ego.csv
# GPS locations of cell towers
!head -n 5 data/antennas.csv
import bandicoot as bc
U = bc.read_csv('ego', 'data/', 'data/antennas.csv')
Export and serve an interactive visualization using:
bc.visualization.run(U)
or export only using:
bc.visualization.export(U, 'my-viz-path')
import os
# Export next to the notebook, in a 'viz' folder of the current working directory
viz_path = os.path.join(os.getcwd(), 'viz')
bc.visualization.export(U, viz_path);
from IPython.display import IFrame
IFrame("/files/viz/index.html", "100%", 700)
Using bandicoot, compute aggregated indicators from bc.individual and bc.spatial:
bc.individual.percent_initiated_conversations(U)
bc.spatial.number_of_antennas(U)
bc.spatial.radius_of_gyration(U)
The signature of the active_days indicator is:
bc.individual.active_days(user, groupby='week', interaction='callandtext', summary='default', split_week=False, split_day=False, filter_empty=True, datatype=None)
What does that mean?
Weekly aggregation
By default, _bandicoot_ computes the indicators on a weekly basis and returns the average (mean) over all the weeks available and its standard deviation (std) in a nested dictionary.
bc.individual.active_days(U)
The groupby keyword controls the aggregation:
- groupby='week' to divide by week (the default),
- groupby='month' to divide by month,
- groupby=None to aggregate all values.
bc.individual.active_days(U, groupby='week')
bc.individual.active_days(U, groupby='month')
bc.individual.active_days(U, groupby=None)
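To make the three groupby modes concrete, here is a minimal pure-Python sketch of the idea. It is not bandicoot's implementation: the function names `group_key` and `active_days_sketch` are hypothetical, buckets are assumed to follow ISO weeks and calendar months, and the standard deviation is assumed to be the population one.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean, pstdev


def group_key(ts, groupby):
    """Return the bucket a timestamp falls into (hypothetical helper)."""
    if groupby == 'week':
        iso = ts.isocalendar()
        return (iso[0], iso[1])   # ISO year and ISO week number (assumption)
    if groupby == 'month':
        return (ts.year, ts.month)
    return None                   # groupby=None: one bucket for everything


def active_days_sketch(timestamps, groupby='week'):
    """Count distinct active days per bucket, then summarize across buckets."""
    buckets = defaultdict(set)
    for ts in timestamps:
        buckets[group_key(ts, groupby)].add(ts.date())
    counts = [len(days) for days in buckets.values()]
    if not counts:
        return None
    if groupby is None:
        return counts[0]          # a single value, no averaging needed
    return {'mean': mean(counts), 'std': pstdev(counts)}


# Made-up timestamps: two active days in one week, one in the next.
timestamps = [
    datetime(2014, 3, 3, 9, 0),
    datetime(2014, 3, 3, 18, 0),
    datetime(2014, 3, 5, 12, 0),
    datetime(2014, 3, 10, 8, 0),
]

print(active_days_sketch(timestamps, groupby='week'))   # {'mean': 1.5, 'std': 0.5}
print(active_days_sketch(timestamps, groupby=None))     # 3
```

With groupby='month' all four records fall into a single bucket, so the mean is the count itself and the standard deviation is zero.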
Some indicators, such as active_days, return a single number. Others, such as call_duration, return a distribution.
The summary keyword can take three values:
- summary='default' to return the mean and standard deviation,
- summary='extended', for the second type of indicator, to return the mean, sem, median, skewness and std of the distribution,
- summary=None to return the full distribution.
bc.individual.call_duration(U)
bc.individual.call_duration(U, summary='extended')
bc.individual.call_duration(U, summary=None)
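The effect of the summary keyword on a distribution can be sketched in plain Python. This is an illustration only, not the library's code: `summarize` is a hypothetical name, and the formulas chosen here (population standard deviation, sem as std divided by the square root of n, skewness as the third central moment over std cubed) are assumptions.

```python
from math import sqrt
from statistics import mean, median, pstdev


def summarize(values, summary='default'):
    """Reduce a distribution according to the chosen summary mode."""
    values = list(values)
    if summary is None:
        return values                       # full distribution, untouched
    mu = mean(values)
    sigma = pstdev(values)                  # population std (assumption)
    stats = {'mean': mu, 'std': sigma}
    if summary == 'extended':
        n = len(values)
        third_moment = sum((v - mu) ** 3 for v in values) / n
        stats['sem'] = sigma / sqrt(n)
        stats['median'] = median(values)
        # Skewness is undefined for a constant distribution.
        stats['skewness'] = third_moment / sigma ** 3 if sigma > 0 else None
    return stats


durations = [10, 20, 30, 40]                # made-up call durations in seconds
print(summarize(durations))
print(summarize(durations, summary='extended'))
print(summarize(durations, summary=None))   # [10, 20, 30, 40]
```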
- split_week divides records by 'all week', 'weekday', and 'weekend'.
- split_day divides records by 'all day', 'day', and 'night'.
bc.individual.active_days(U, split_week=True, split_day=True)
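As a rough illustration of what the two split options do, the sketch below partitions timestamps into nested groups. The helper names are hypothetical, the keys 'allweek' and 'allday' are assumed spellings, and the night window (19:00 to 07:00) is an assumption rather than a documented value.

```python
from datetime import datetime, time

NIGHT_START = time(19, 0)   # assumed start of the night
NIGHT_END = time(7, 0)      # assumed end of the night


def week_part(ts):
    """Saturday and Sunday count as weekend."""
    return 'weekend' if ts.weekday() >= 5 else 'weekday'


def day_part(ts):
    """The night wraps around midnight."""
    t = ts.time()
    return 'night' if (t >= NIGHT_START or t < NIGHT_END) else 'day'


def split_records(timestamps, split_week=False, split_day=False):
    """Return {week_part: {day_part: [timestamps]}} for the enabled splits."""
    week_keys = ['allweek'] + (['weekday', 'weekend'] if split_week else [])
    day_keys = ['allday'] + (['day', 'night'] if split_day else [])
    groups = {w: {d: [] for d in day_keys} for w in week_keys}
    for ts in timestamps:
        for w in week_keys:
            if w != 'allweek' and w != week_part(ts):
                continue
            for d in day_keys:
                if d != 'allday' and d != day_part(ts):
                    continue
                groups[w][d].append(ts)
    return groups


timestamps = [
    datetime(2014, 3, 3, 9, 0),    # Monday morning
    datetime(2014, 3, 8, 23, 0),   # Saturday night
    datetime(2014, 3, 9, 3, 0),    # Sunday, before dawn
]
groups = split_records(timestamps, split_week=True, split_day=True)
print(len(groups['weekend']['night']))   # 2
```

Every record always lands in the 'allweek' and 'allday' groups, so enabling a split adds breakdowns without removing the overall value.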
The function bc.utils.all automatically computes all indicators for a single user.
You can use the same keywords to group by week/month/all time range, or return extended statistics.
features = bc.utils.all(U, groupby=None)
features
bandicoot supports exports in CSV and JSON format. Both the to_csv and to_json functions take either a single feature dictionary or a list of dictionaries (for multiple users).
bc.to_csv(features, 'demo_export_user.csv')
bc.to_json(features, 'demo_export_user.json')
!head demo_export_user.csv
!head -n 15 demo_export_user.json
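A CSV file is flat while the feature dictionary is nested, so the nested keys have to be collapsed into column names at export time. The sketch below shows one way to do that; the `flatten` helper and the `__` separator are assumptions for illustration, and the example values are made up.

```python
def flatten(nested, parent_key='', sep='__'):
    """Collapse a nested dictionary into one level with joined keys."""
    flat = {}
    for key, value in nested.items():
        full_key = parent_key + sep + key if parent_key else key
        if isinstance(value, dict):
            # Recurse, carrying the path built so far as a prefix.
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


# Made-up feature dictionary with the same nesting style as the indicators.
example = {
    'name': 'ego',
    'active_days': {'allweek': {'allday': {'callandtext': 5}}},
}
print(flatten(example))
# {'name': 'ego', 'active_days__allweek__allday__callandtext': 5}
```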
You can easily develop your own indicator using the @grouping decorator. You only need to write a function that takes a list of records as input and returns an integer or a list of integers (for a distribution). The @grouping decorator wraps the function and calls it for each weekly group of records.
from bandicoot.helper.group import grouping
@grouping(interaction='call')
def shortest_call(records):
    # Durations of the calls in the current group of records
    in_durations = (r.call_duration for r in records)
    return min(in_durations)
shortest_call(U)
shortest_call(U, split_day=True)
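To show what such a decorator does conceptually, here is a simplified stand-in written from scratch. It is not the @grouping implementation: the `weekly` decorator and the `Record` tuple are hypothetical, it only groups by ISO week, and it supports none of the groupby, summary or split options.

```python
from collections import defaultdict, namedtuple
from datetime import datetime
from functools import wraps
from statistics import mean, pstdev

# Hypothetical minimal record type for this sketch.
Record = namedtuple('Record', ['datetime', 'interaction', 'call_duration'])


def weekly(interaction=None):
    """Filter by interaction type, group by ISO week, apply func per group."""
    def decorator(func):
        @wraps(func)
        def wrapper(records):
            weeks = defaultdict(list)
            for r in records:
                if interaction is None or r.interaction == interaction:
                    iso = r.datetime.isocalendar()
                    weeks[(iso[0], iso[1])].append(r)
            values = [func(group) for group in weeks.values()]
            if not values:
                return None
            return {'mean': mean(values), 'std': pstdev(values)}
        return wrapper
    return decorator


@weekly(interaction='call')
def shortest_call_sketch(records):
    return min(r.call_duration for r in records)


records = [
    Record(datetime(2014, 3, 3, 9, 0), 'call', 30),
    Record(datetime(2014, 3, 4, 9, 0), 'call', 120),
    Record(datetime(2014, 3, 5, 9, 0), 'text', None),   # ignored: not a call
    Record(datetime(2014, 3, 10, 9, 0), 'call', 60),
]
print(shortest_call_sketch(records))   # {'mean': 45, 'std': 15.0}
```

The indicator function itself stays trivial; filtering, grouping and summarizing all live in the wrapper, which is the design the @grouping decorator makes available to custom indicators.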