Reading data from files

Import Pandas

The following line of code from assignment 3 imports the pandas library, which provides the functions needed to read a CSV file.

import pandas as pd

Reading the file into a pandas DataFrame

Read the data from subject 1 into a DataFrame using the following line of code. The file is tab-delimited, so sep='\t' tells read_csv to split columns on tabs rather than commas:

df = pd.read_csv('s01/s01.txt', sep='\t')
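To see what sep='\t' does without needing the subject files, here is a minimal sketch that reads a small tab-delimited string instead of a file. The sample text and the name df_demo are made up for illustration and are not part of the assignment data.

```python
import io

import pandas as pd

# Hypothetical stand-in for a tab-delimited subject file (not real data)
text = "id\ttrial\trt\n1\t1\t0.349\n1\t2\t0.371\n"

# io.StringIO lets read_csv treat the string like an open file
df_demo = pd.read_csv(io.StringIO(text), sep='\t')

print(df_demo.shape)          # (rows, columns)
print(list(df_demo.columns))  # column names taken from the first line
```

The first line of the text becomes the column names, and every later line becomes one row.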

Displaying the DataFrame

Subject 1's data is now held in the DataFrame df. A portion of df is shown below.

df.tail(5)


	id	year	month	day	hour	minute	gender	age	handedness	wait	block	trial	target_location	target	flankers	rt	response	error	pre_target_response	ITI_response	target_on_error
187	1	2015	5	22	11	30	m	25	r	1.627	5	28	left	white	congruent	0.349	white	True	False	False	0.024
188	1	2015	5	22	11	30	m	25	r	1.627	5	29	right	white	congruent	0.371	white	True	False	False	0.023
189	1	2015	5	22	11	30	m	25	r	1.627	5	30	up	black	incongruent	0.549	black	True	False	False	0.023
190	1	2015	5	22	11	30	m	25	r	1.627	5	31	left	white	neutral	0.463	white	True	False	False	0.023
191	1	2015	5	22	11	30	m	25	r	1.627	5	32	right	black	neutral	0.430	black	True	False	False	0.023

The code above shows the last 5 rows of df. Because 5 is the default, calling df.tail() with no argument would show the same 5 rows; to see the last 10 rows we would use df.tail(10).
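As a quick check of how tail() chooses the number of rows, here is a small sketch using a made-up 12-row DataFrame. The names df_demo, default_tail and last_ten are for illustration only.

```python
import pandas as pd

# Hypothetical 12-row DataFrame standing in for a subject's trials
df_demo = pd.DataFrame({'trial': range(1, 13)})

# With no argument, tail() returns the last 5 rows
default_tail = df_demo.tail()

# Passing a number returns that many rows from the end
last_ten = df_demo.tail(10)

print(len(default_tail))
print(len(last_ten))
```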

Reading Multiple Files

Import and use glob

import glob
sub_files = glob.glob('s??/s??.txt')

Here, I import Python's glob module and use it to list every subject's .txt file. Each subject has a directory and a file whose names start with s followed by a two-digit ID number. In a glob pattern, each ? matches exactly one character, so I used '??' in place of the ID number to match every subject's directory and its corresponding .txt file.
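To illustrate how the '??' pattern behaves, here is a minimal sketch that builds a throwaway directory tree and globs it. The directory names s100 and notes, and the helper list_subject_files, are hypothetical and only there to show what does not match.

```python
import glob
import os
import tempfile


def list_subject_files(root):
    """Return sorted paths under root that match s??/s??.txt."""
    pattern = os.path.join(root, 's??', 's??.txt')
    return sorted(glob.glob(pattern))


with tempfile.TemporaryDirectory() as root:
    # Create empty files: two that fit the naming scheme and two that do not
    for name in ['s01', 's02', 's100', 'notes']:
        os.makedirs(os.path.join(root, name))
        open(os.path.join(root, name, name + '.txt'), 'w').close()

    # Keep only the file names so the result does not depend on the temp path
    matches = [os.path.basename(p) for p in list_subject_files(root)]

print(matches)
```

Only s01.txt and s02.txt are listed, because s100 has three characters after the s and notes does not fit the pattern at all. Note that ? matches any single character, not only digits, so a directory such as sab would also match.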

Reading the .txt files

To read each participant's file, I used a list comprehension, which places a for loop inside the list expression. The code loops over sub_files and adds each subject's data to a list called sub_data. This produces a list of individual DataFrames, one for each subject. To put all of the subjects' data into one DataFrame, I used pd.concat() to concatenate the DataFrames in that list.

sub_data = [pd.read_csv(file, sep='\t') for file in sub_files]
df = pd.concat(sub_data)
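The same read-then-concatenate pattern can be sketched on in-memory text, which also shows what happens to the row labels. The dictionary raw_files and its contents are made up for illustration and stand in for the real subject files.

```python
import io

import pandas as pd

# Hypothetical tab-delimited contents for two subjects (not real data)
raw_files = {
    's01.txt': "id\ttrial\trt\n1\t1\t0.349\n1\t2\t0.371\n",
    's02.txt': "id\ttrial\trt\n2\t1\t0.409\n2\t2\t0.429\n",
}

# One DataFrame per subject, built with a list comprehension
frames = [pd.read_csv(io.StringIO(text), sep='\t') for text in raw_files.values()]

# Stack the DataFrames on top of each other
combined = pd.concat(frames)

# Each DataFrame keeps its own 0-based row labels, so labels repeat
print(combined.index.tolist())  # [0, 1, 0, 1]
```

The repeated labels are why the index needs to be reset before rows can be identified uniquely.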

The complete DataFrame

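Next, I used df.reset_index() to give every trial a unique index number, and then printed a random sample of 8 rows from df. Because reset_index() keeps the old row labels by default, they appear in the output as a new column named index.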

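import glob
# List every subject file; each ? matches exactly one character
sub_files = glob.glob('s??/s??.txt')
# Read each tab-delimited file into its own DataFrame
sub_data = [pd.read_csv(file, sep='\t') for file in sub_files]
# Stack the per-subject DataFrames into one
df = pd.concat(sub_data)
# Give rows unique labels; the old labels are kept in an 'index' column
df = df.reset_index()
# Show 8 randomly chosen rows
df.sample(8)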


	index	id	year	month	day	hour	minute	gender	age	handedness	wait	block	trial	target_location	target	flankers	rt	response	error	pre_target_response	ITI_response	target_on_error
313	121	2	2015	5	25	14	36	f	21	r	12.508	3	26	right	white	congruent	0.409	white	True	False	False	0.024
365	173	2	2015	5	25	14	36	f	21	r	3.096	5	14	left	black	neutral	0.429	black	True	False	False	0.024
334	142	2	2015	5	25	14	36	f	21	r	2.156	4	15	down	black	congruent	0.460	black	True	False	False	0.024
235	43	2	2015	5	25	14	36	f	21	r	4.677	1	12	left	white	incongruent	0.451	black	False	False	False	0.024
525	141	1	2015	5	22	11	30	m	25	r	1.599	4	14	down	white	neutral	0.818	black	False	False	True	0.023
498	114	1	2015	5	22	11	30	m	25	r	1.392	3	19	down	black	congruent	0.551	black	True	False	False	0.023
409	25	1	2015	5	22	11	30	m	25	r	3.240	practice	26	up	white	congruent	0.425	white	True	False	False	0.023
351	159	2	2015	5	25	14	36	f	21	r	2.156	4	32	left	black	incongruent	0.728	white	False	False	False	0.024
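The index column in the output above holds the old per-file row labels. As a small sketch of the alternatives, assuming those old labels are not needed, reset_index(drop=True) discards them, and pd.concat(..., ignore_index=True) avoids creating them in the first place. The DataFrames first and second are made up for illustration.

```python
import pandas as pd

# Hypothetical per-subject DataFrames (not real data)
first = pd.DataFrame({'id': [1, 1], 'rt': [0.349, 0.371]})
second = pd.DataFrame({'id': [2, 2], 'rt': [0.409, 0.429]})

combined = pd.concat([first, second])

# Default: old labels are kept as a new 'index' column
kept = combined.reset_index()

# drop=True: old labels are discarded
dropped = combined.reset_index(drop=True)

# ignore_index=True: concat assigns fresh labels directly
direct = pd.concat([first, second], ignore_index=True)

print(list(kept.columns))
print(list(dropped.columns))
print(direct.index.tolist())
```

Which form to use depends on whether the within-file row number is worth keeping for later analysis.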
