pandas detects only one column when loading a CSV


I downloaded my historical price series as a CSV file, which has 5 columns:

  

['Date', 'Price', 'Open', 'High', 'Low']

I do the following:

import pandas as pd
df = pd.read_csv("C:/Users/Lazardi/Desktop/GFG.BA.csv", header=0,index_col=False)

print(df)
                     Date;Price;Open;High;Low
0     01/06/2018;106.400;107.100;107.900;104.500
1     31/05/2018;105.800;106.000;107.000;103.500
2     30/05/2018;104.000;103.300;107.000;103.300
3     29/05/2018;102.700;103.650;106.450;100.050
4     28/05/2018;104.000;107.700;108.500;103.000
5     27/05/2018;107.600;107.60...0;111.750;106.700
6     26/05/2018;108.400;109.900;110.700;108.100
7     25/05/2018;111.800;115.000;115.000;110.300
8     24/05/2018;115.000;115.000;115.500;114.000
9     23/05/2018;114.000;117.000;117.000;114.000
10    22/05/2018;115.400;113.950;116.950;112.350
.................................................
4303          20/08/2006;1.532;1.542;1.551;1.523
4304          19/08/2006;1.523;1.542;1.542;1.523
4305          18/08/2006;1.597;1.606;1.615;1.587
4306          17/08/2006;1.606;1.615;1.615;1.597
4307          16/08/2006;1.615;1.615;1.615;1.578
4308          15/08/2006;1.615;1.642;1.652;1.615

[4309 rows x 1 columns]

df.info() 

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4309 entries, 0 to 4308
Data columns (total 1 columns):
Date;Price;Open;High;Low    4309 non-null object
dtypes: object(1)
memory usage: 33.7+ KB

Why does it say "Data columns (total 1 columns)"?

I check the columns again:

>>> df.columns

Index(['Date;Price;Open;High;Low'], dtype='object')

and when I try to view the columns individually:

>>> df[2:10] 

it shows them all as if they were a single column.

What can I do?

    
asked by Napoleon Ricaurte 02.06.2018 at 19:11

1 answer


The problem is that the default column separator used by pandas.read_csv is the comma (,), not the semicolon (;). That is why each whole row is read as a single column.

You just have to use the sep parameter to indicate the correct separator. In addition, if you want the Date column parsed as dates, you must request it with parse_dates, and since the dates are in dd/mm/yyyy format you must also pass dayfirst=True.
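To see why dayfirst matters, here is a minimal sketch using pd.to_datetime on the first date of the sample data. The variable names are only illustrative. Any date whose day is 12 or lower can be read either way, so the order has to be stated explicitly:

```python
import pandas as pd

# First date in the sample data; it is ambiguous because the day is <= 12
ambiguous = "01/06/2018"

# Without dayfirst, pandas reads the string month-first (mm/dd/yyyy)
month_first = pd.to_datetime(ambiguous)

# With dayfirst=True, the same string is read as dd/mm/yyyy
day_first = pd.to_datetime(ambiguous, dayfirst=True)

print(month_first.date())  # 2018-01-06
print(day_first.date())    # 2018-06-01
```

Without dayfirst=True, 1 June would silently become 6 January.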

from io import StringIO
import pandas as pd

# This is only to emulate a CSV file
csv = StringIO('''
Date;Price;Open;High;Low
01/06/2018;106.400;107.100;107.900;104.500
31/05/2018;105.800;106.000;107.000;103.500
30/05/2018;104.000;103.300;107.000;103.300
29/05/2018;102.700;103.650;106.450;100.050
28/05/2018;104.000;107.700;108.500;103.000
27/05/2018;107.600;107.600;111.750;106.700
26/05/2018;108.400;109.900;110.700;108.100
25/05/2018;111.800;115.000;115.000;110.300
24/05/2018;115.000;115.000;115.500;114.000
23/05/2018;114.000;117.000;117.000;114.000
22/05/2018;115.400;113.950;116.950;112.350
''')


df = pd.read_csv(csv, sep=";", parse_dates=["Date"], dayfirst=True)
>>> df
         Date  Price    Open    High     Low
0  2018-06-01  106.4  107.10  107.90  104.50
1  2018-05-31  105.8  106.00  107.00  103.50
2  2018-05-30  104.0  103.30  107.00  103.30
3  2018-05-29  102.7  103.65  106.45  100.05
4  2018-05-28  104.0  107.70  108.50  103.00
5  2018-05-27  107.6  107.60  111.75  106.70
6  2018-05-26  108.4  109.90  110.70  108.10
7  2018-05-25  111.8  115.00  115.00  110.30
8  2018-05-24  115.0  115.00  115.50  114.00
9  2018-05-23  114.0  117.00  117.00  114.00
10 2018-05-22  115.4  113.95  116.95  112.35

>>> df.Date
0    2018-06-01
1    2018-05-31
2    2018-05-30
3    2018-05-29
4    2018-05-28
5    2018-05-27
6    2018-05-26
7    2018-05-25
8    2018-05-24
9    2018-05-23
10   2018-05-22
Name: Date, dtype: datetime64[ns]
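
If you do not know the separator in advance, a possible alternative (not part of the solution above, just a sketch) is to let pandas guess it. With sep=None and engine="python", read_csv uses Python's csv.Sniffer on the first valid row to detect the delimiter. The two-row sample below is taken from the data in the question:

```python
from io import StringIO
import pandas as pd

# Small in-memory sample standing in for the real file
sample = StringIO(
    "Date;Price;Open;High;Low\n"
    "01/06/2018;106.400;107.100;107.900;104.500\n"
    "31/05/2018;105.800;106.000;107.000;103.500\n"
)

# sep=None requires the Python engine, which sniffs the delimiter
df = pd.read_csv(sample, sep=None, engine="python")

print(list(df.columns))
print(df.shape)
```

The Python engine is slower than the default C engine, so passing sep=";" explicitly is still preferable when you know the format.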
    
answered by 02.06.2018 at 19:34