Question:
How can I read a .csv file with no headers when I only want a subset of the columns (say the 4th and 7th out of a total of 20 columns), using pandas? I cannot seem to get usecols to work.
Answer #1:
To read a csv that has no header row and keep only certain columns, pass header=None and usecols=[3,6] (zero-based positions) for the 4th and 7th columns:
df = pd.read_csv(file_path, header=None, usecols=[3,6])
See the docs
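As a quick way to check the call above without a file on disk, here is a minimal sketch. The in-memory text (2 rows, 20 columns) is made up for illustration and stands in for file_path.

```python
import io
import pandas as pd

# Hypothetical sample data: 2 rows, 20 columns, no header row.
rows = [",".join(str(r * 100 + c) for c in range(20)) for r in range(2)]
csv_text = "\n".join(rows)

# Keep only the 4th and 7th columns (zero-based positions 3 and 6).
df = pd.read_csv(io.StringIO(csv_text), header=None, usecols=[3, 6])

# The kept columns are labelled by their original positions.
print(df.columns.tolist())  # [3, 6]
print(df)
```

Note that the resulting column labels are 3 and 6, not 0 and 1.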
Answer #2:
The previous answers are correct, but in my opinion adding the names parameter makes the solution complete, and it should be the recommended way, especially when the csv has no headers.
Solution
Use the usecols and names parameters:
df = pd.read_csv(file_path, usecols=[3,6], names=['colA', 'colB'])
Additional reading
Alternatively, add header=None to make it explicit to readers that the csv has no headers (both calls produce the same result):
df = pd.read_csv(file_path, usecols=[3,6], names=['colA', 'colB'], header=None)
This lets you retrieve your data by
# with `names` parameter
df['colA']
df['colB']
instead of
# without `names` parameter (columns keep their original positions as labels)
df[3]
df[6]
Explanation
According to the read_csv documentation, when names is passed explicitly, header behaves like None instead of 0, so header=None can be omitted whenever names is given.
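A small sketch to confirm this behaviour. The sample text (2 rows, 7 columns) is hypothetical; colA and colB are the names used in the answer above.

```python
import io
import pandas as pd

# Hypothetical sample data: 2 rows, 7 columns, no header row.
csv_text = "1,2,3,4,5,6,7\n8,9,10,11,12,13,14\n"

# names given, header omitted
implicit = pd.read_csv(io.StringIO(csv_text), usecols=[3, 6], names=["colA", "colB"])

# names given, header=None stated explicitly
explicit = pd.read_csv(
    io.StringIO(csv_text), usecols=[3, 6], names=["colA", "colB"], header=None
)

# The first line is treated as data in both cases, so both frames match.
print(implicit.equals(explicit))
print(implicit["colA"].tolist(), implicit["colB"].tolist())
```

If the first line had been consumed as a header, each frame would contain only one row; here both contain two.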
Answer #3:
Make sure you pass header=None and add usecols=[3,6] for the 4th and 7th columns.