Error "Single positional indexer is out of bounds" when reading csv


When I run the following code:

from pandas import Series
from matplotlib import pyplot
from statsmodels.tsa.seasonal import seasonal_decompose
series = Series.from_csv('GOOG.csv', header=0)
result = seasonal_decompose(series, model='multiplicative')
result.plot()
pyplot.show()

I always get the same error:

Single positional indexer is out of bounds

The CSV file contains data like this:

Date        Close
2013-10-14  435.224274
2013-10-15  438.155212
2013-10-16  446.113464
2013-10-17  441.523315
2013-10-18  502.437134
...

I produced the CSV above by filtering the two columns that interest me out of another file. The CSV that was originally used has this structure:

Date, Open, High, Low, Close, Adj Close, Volume
  2012-05-18,42.049999,45.000000,38.000000,38.230000,38.230000,573576400
  ...

Of the columns indicated in the header, I am only interested in Date and Close, which are the data I want to use to draw the graphs.

How could I solve it?

    
asked by Carla on 03.12.2017 at 18:56

1 answer


Series.from_csv is marked as deprecated in favor of the much more flexible pandas.read_csv. To obtain a Series, simply pass the parameter squeeze with the value True, as long as the CSV consists of a single data column (plus the index), or explicitly specify that only those two columns be read if there are more.
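For example, a minimal sketch with the two-column GOOG.csv from the question (squeeze=True is valid in the pandas version listed at the end of this answer; in pandas 2.0 and later it was removed, and you would call .squeeze("columns") on the resulting DataFrame instead):

import pandas as pd

# The first column becomes the index; the single remaining column is "squeezed" into a Series
series = pd.read_csv('GOOG.csv', header=0, index_col=0, squeeze=True)
print(type(series))  # <class 'pandas.core.series.Series'>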

To read only some of the CSV's columns (in this case two), just pass a list to the usecols argument.
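For example, with the seven-column CSV shown below, a sketch like this keeps only those two columns (if the header had stray spaces after the commas, adding skipinitialspace=True would also be needed so the names match):

import pandas as pd

# Ignore Open, High, Low, Adj Close and Volume; load just Date and Close
df = pd.read_csv('datos.csv', usecols=['Date', 'Close'])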

We will start from the following CSV (made up to match the structure of your original CSV shown above):

Date, Open, High, Low, Close, Adj Close, Volume
  2012-05-18,42.049999,45.000000,38.000000,38.230000,38.230000,573576400
  2012-05-19,42.049999,45.000000,38.000000,39.560000,38.230000,573576400
  2012-05-20,42.049999,45.000000,38.000000,40.120000,38.230000,573576400
  2012-05-21,42.049999,45.000000,38.000000,42.230000,38.230000,573576400
  2012-05-22,42.049999,45.000000,38.000000,45.856000,38.230000,573576400
  2012-05-23,42.049999,45.000000,38.000000,44.230000,38.230000,573576400
  2012-05-24,42.049999,45.000000,38.000000,46.230000,38.230000,573576400
  2012-05-25,42.049999,45.000000,38.000000,48.820000,38.230000,573576400
  2012-05-26,42.049999,45.000000,38.000000,50.590000,38.230000,573576400
  2012-05-27,42.049999,45.000000,38.000000,42.450000,38.230000,573576400
  2012-05-28,42.049999,45.000000,38.000000,55.450000,38.230000,573576400
  2012-05-29,42.049999,45.000000,38.000000,47.120000,38.230000,573576400
  2012-05-30,42.049999,45.000000,38.000000,58.690000,38.230000,573576400

The error you see is raised by NumPy and is generally due to bad parsing of the CSV that leaves a column missing or with extra fields, causing illegal indexing into the NumPy array used internally to store the data. With a well-formed CSV this can happen when the separator used in the file is not specified correctly; in that case it is enough to use the sep argument.
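For example, a hypothetical sketch (the ';'-separated file name is made up): reading with the wrong separator crams each row into a single column, while specifying sep recovers all of them:

import pandas as pd

# Wrong separator: pandas sees one single column per row
df_bad = pd.read_csv('datos_semicolon.csv')
print(df_bad.shape[1])  # 1

# Right separator: the seven columns come back
df_ok = pd.read_csv('datos_semicolon.csv', sep=';')
print(df_ok.shape[1])  # 7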

On the other hand, we must parse the date column properly, since otherwise it will be read as Python strings (dtype object) and statsmodels.tsa.seasonal.seasonal_decompose will throw an exception, as it expects a time series. To avoid problems it is advisable to pass a list to the parse_dates argument with the columns that contain dates and should be parsed to datetime. If your dates are European style (day first, like 19/8/04), you must indicate it with the dayfirst argument.
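For example, a sketch assuming European-style dates in the first column (the file name is made up):

import pandas as pd

# dayfirst=True tells pandas that 19/8/04 means 19 August 2004, not month 19
df = pd.read_csv('datos_eu.csv', parse_dates=[0], dayfirst=True, index_col=0)
print(df.index.dtype)  # datetime64[ns] once parsing succeeds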

With all this, the code should be (remember to change the CSV path for your case):

import pandas as pd
from matplotlib import pyplot
from statsmodels.tsa.seasonal import seasonal_decompose
series = pd.read_csv('datos.csv', squeeze=True, header=0, parse_dates=[0],
                     index_col=[0], sep=',', usecols=["Date", "Close"])

result = seasonal_decompose(series, model='multiplicative')
result.plot()
pyplot.show()
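As an optional sanity check before decomposing, you can confirm that read_csv returned a Series with a DatetimeIndex:

print(type(series))        # <class 'pandas.core.series.Series'>
print(series.index.dtype)  # datetime64[ns]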

Running the full script produces a figure with the observed series and its trend, seasonal, and residual components.

Code tested in:

Python      3.6.3
Matplotlib  2.1.0
NumPy       1.13.3
Pandas      0.21.0
Statsmodels 0.8.0
    
answered on 03.12.2017 at 21:48