You can use the glob module to filter the files. It is advisable to use iterators instead of creating intermediate lists, dictionaries, or DataFrames that you will not need again.
One option would be:
import glob
import os
import pandas as pd
ruta = ''  # Path to the directory that contains the CSV files
archivos_csv = glob.iglob(os.path.join(ruta, "*.csv"))
dataframes = (pd.read_csv(csv) for csv in archivos_csv)
df = pd.concat(dataframes, axis=1)
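To see the approach above end to end, here is a minimal sketch that can be run without any existing data. The temporary directory and the two sample files (with columns a and b) are assumptions made up for the illustration; sorted is added only so the column order is predictable.

```python
import glob
import os
import tempfile

import pandas as pd

# Hypothetical sample data: two small CSV files in a temporary directory.
ruta = tempfile.mkdtemp()
pd.DataFrame({"a": [1, 2]}).to_csv(os.path.join(ruta, "1.csv"), index=False)
pd.DataFrame({"b": [3, 4]}).to_csv(os.path.join(ruta, "2.csv"), index=False)

# sorted() consumes the iterator and makes the file order deterministic.
archivos_csv = sorted(glob.iglob(os.path.join(ruta, "*.csv")))

# Generator expression: no intermediate list of DataFrames is kept in a variable.
dataframes = (pd.read_csv(csv) for csv in archivos_csv)

# axis=1 places the contents of each file side by side, as extra columns.
df = pd.concat(dataframes, axis=1)

print(df.columns.tolist())  # ['a', 'b']
print(df.shape)             # (2, 2)
```

With axis=1 the files are joined column-wise; if each file holds more rows of the same table, axis=0 (the default) would stack them instead.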
Note: You should not use dict as a variable name. Doing so shadows the built-in dict class, and you may end up with unexpected results. If you need a similar name, use dict_ instead.
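As a small illustration of the kind of unexpected result this can cause, the sketch below shadows dict inside a function. The function names build_shadowed and build_safe are hypothetical and exist only for this example.

```python
def build_shadowed(pairs):
    dict = {}  # shadows the built-in dict inside this function
    try:
        # Intended to call the built-in class, but dict is now a plain {} instance.
        return dict(pairs)
    except TypeError as error:
        return "TypeError: {}".format(error)


def build_safe(pairs):
    dict_ = {}  # trailing underscore avoids the name clash
    dict_.update(dict(pairs))  # the built-in dict is still available
    return dict_


resultado_mal = build_shadowed([("a", 1)])
resultado_ok = build_safe([("a", 1)])

print(resultado_mal)  # TypeError: 'dict' object is not callable
print(resultado_ok)   # {'a': 1}
```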
Edit:
If you want to print the names of the files, you can replace glob.iglob (which returns an iterator) with glob.glob (which returns a list):
ruta = ''
archivos_csv = glob.glob(os.path.join(ruta, "*.csv"))
print("csv files found:")
print(*(os.path.basename(path) for path in archivos_csv), sep="\n")
# In Python 2.x, replace the two lines above with:
# print "csv files found:"
# for nombre in (os.path.basename(path) for path in archivos_csv):
#     print nombre
Output:
csv files found:
2.csv
1.csv
3.csv
Or print a list with the names of the files directly:
ruta = ''
archivos_csv = glob.glob(os.path.join(ruta, "*.csv"))
print([os.path.basename(path) for path in archivos_csv])
Output:
['2.csv', '1.csv', '3.csv']
Note: glob does not return the files in any particular order. If you want to open the files in a specific order, you can apply sorted to the output of glob.glob or glob.iglob (or call list.sort on the list returned by glob.glob):
archivos_csv = sorted(glob.iglob(os.path.join(ruta, "*.csv")))
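One caveat worth sketching: a plain sorted compares the paths as strings, so 10.csv would come before 2.csv. Assuming the file names are purely numeric, as in the example output above, a key function can sort them by value. The paths and the helper clave_numerica below are hypothetical, made up for this illustration.

```python
import os

# Hypothetical paths; in practice this would be the output of glob.glob.
archivos_csv = ["datos/10.csv", "datos/2.csv", "datos/1.csv"]


def clave_numerica(path):
    # "datos/10.csv" -> "10.csv" -> "10" -> 10
    nombre = os.path.splitext(os.path.basename(path))[0]
    return int(nombre)


orden_alfabetico = sorted(archivos_csv)
orden_numerico = sorted(archivos_csv, key=clave_numerica)

print(orden_alfabetico)  # ['datos/1.csv', 'datos/10.csv', 'datos/2.csv']
print(orden_numerico)    # ['datos/1.csv', 'datos/2.csv', 'datos/10.csv']
```

This key raises ValueError for any file whose name is not an integer, so it only fits directories where that naming convention holds.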