read_csv

bandicoot.io.read_csv(user_id, records_path, antennas_path=None, attributes_path=None, recharges_path=None, network=False, duration_format='seconds', describe=True, warnings=True, errors=False, drop_duplicates=False)

Load user records from a CSV file.

Parameters:
user_id : str

ID of the user (filename)

records_path : str

Path of the directory containing all the user files.

antennas_path : str, optional

Path of the CSV file containing (place_id, latitude, longitude) values. This allows antennas to be mapped to their locations.

recharges_path : str, optional

Path of the directory containing the recharge files (CSV files with datetime, amount, balance, retailer_id columns).

network : bool, optional

If network is True, bandicoot loads the network of the user’s correspondents from the same path. Defaults to False.

duration_format : str, default is ‘seconds’

Allows reading records whose call duration is given in a format other than seconds. Options are ‘seconds’ or any time format string such as ‘%H:%M:%S’, ‘%M%S’, etc.

describe : boolean

If describe is True, it will print a description of the loaded user to the standard output.

errors : boolean

If errors is True, returns a tuple (user, errors), where user is the user object and errors are the records which could not be loaded.

drop_duplicates : boolean

If drop_duplicates is True, removes duplicated records (same correspondents, direction, date and time). Defaults to False.
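
To make the drop_duplicates rule concrete, here is a minimal sketch of de-duplication on the key (correspondent, direction, date and time). The Record type, its field names, and the choice to keep the first occurrence are assumptions made for illustration; this is not bandicoot's own implementation.

```python
from collections import namedtuple

# Hypothetical record type for illustration only; not bandicoot's Record class.
Record = namedtuple(
    "Record", ["correspondent_id", "direction", "datetime", "call_duration"]
)


def drop_duplicate_records(records):
    """Keep the first record seen for each (correspondent, direction, datetime)."""
    seen = set()
    unique = []
    for record in records:
        key = (record.correspondent_id, record.direction, record.datetime)
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


records = [
    Record("A", "out", "2014-03-02 07:13:30", 120),
    Record("A", "out", "2014-03-02 07:13:30", 120),  # same key as the line above
    Record("B", "in", "2014-03-02 09:00:00", 45),
]
unique_records = drop_duplicate_records(records)
print(len(unique_records))  # 2
```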

Notes

  • Values in the CSV files can be single- or double-quoted if needed.
  • Empty cells are filled with None. For example, if the column call_duration is empty for one record, its value will be None. Other values such as "N/A", "None", or "null" are treated as text.
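
The empty-cell rule in the notes above can be pictured with the standard csv module. This is only a sketch of the described behaviour, not bandicoot's loader; the column names and sample rows are illustrative assumptions.

```python
import csv
import io

# Illustrative data; the column names are assumptions made for this sketch.
SAMPLE = (
    "interaction,direction,correspondent_id,datetime,call_duration\n"
    "call,in,A,2014-03-02 07:13:30,\n"
    "call,out,B,2014-03-02 09:00:00,N/A\n"
)


def load_rows(text):
    """Read CSV rows, mapping empty cells to None and leaving other text as-is."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {key: (value if value != "" else None) for key, value in row.items()}
        for row in reader
    ]


rows = load_rows(SAMPLE)
print(rows[0]["call_duration"])  # None: the cell was empty
print(rows[1]["call_duration"])  # N/A: kept as text, not converted to None
```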

Examples

>>> user = bandicoot.read_csv('sample_records', '.')
>>> print(len(user.records))
10
>>> user = bandicoot.read_csv('sample_records', 'samples', 'sample_places.csv')
>>> print(len(user.antennas))
5
>>> user = bandicoot.read_csv('sample_records', '.', None, 'sample_attributes.csv')
>>> print(user.attributes['age'])
25
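
As a further illustration of the duration_format parameter, the sketch below shows one way a duration string could be converted to seconds with datetime.strptime. The helper parse_duration is hypothetical and is not part of bandicoot's API; it also assumes durations shorter than 24 hours, since ‘%H’ only accepts hours from 0 to 23.

```python
from datetime import datetime


def parse_duration(value, duration_format="seconds"):
    """Return a call duration in seconds, or None for an empty value.

    Hypothetical helper for illustration; not part of bandicoot.
    """
    if value is None or value == "":
        return None
    if duration_format == "seconds":
        return int(value)
    # strptime fills in a default date, so only the time fields are used here.
    parsed = datetime.strptime(value, duration_format)
    return parsed.hour * 3600 + parsed.minute * 60 + parsed.second


print(parse_duration("90"))                    # 90
print(parse_duration("00:01:30", "%H:%M:%S"))  # 90
print(parse_duration("0230", "%M%S"))          # 150
```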