How to find out which folders, and how many, are inside another folder

I want to analyze the items inside folders. From what I have read, I need the glob library for this. By googling, I have learned how to find out which files, or how many, have a specific extension within a folder.

import glob

jpg = glob.glob('carpeta/*.jpg')
total_jpg = len(jpg)

print('Hay ',total_jpg, ' ficheros jpg')
print('Los nombres son:',jpg)

But what if I wanted to know how many folders there are, or their names? Since folders have no extension, I do not see how to do it.

I know of this post and I think the answer is in there, but I cannot find it: there are a lot of options and it is not clear to me which one does what I want.

    
asked by NEA 23.12.2018 at 19:56

1 answer

I do not think glob is the best option for analyzing a folder's contents. A more useful tool (in my opinion) can be built on os.listdir(), which returns a list with the names of all the files and folders in a directory (the current one if no path is given), so you can easily classify them by extension:

import os
from collections import defaultdict

clasificados = defaultdict(list)
for nombre in os.listdir():
  if "." in nombre:
    extension = nombre.split(".")[-1].lower()
  else:
    extension = ""
  clasificados[extension].append(nombre)

for k, v in clasificados.items():
  print("Extensión '{}': {} ficheros".format(k, len(v)))

Explanation

We build a dictionary whose keys are the file extensions and whose values are lists of the files with that extension. The extension is obtained by splitting the name on the dot with split() (if there is one), taking the last piece, and converting it to lowercase; if there is no dot, an empty extension is used. Once everything is classified in that dictionary, we can iterate over it to show how many items there are for each extension. In my case, running it in my "Downloads" folder prints something like this:

Extensión 'pdf': 27 ficheros
Extensión 'zip': 1 ficheros
Extensión 'xlsx': 1 ficheros
Extensión 'png': 5 ficheros
Extensión 'py': 1 ficheros
Extensión 'gif': 3 ficheros
Extensión '': 2 ficheros
etc...
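
As a possible variation on the classification above, the standard library's os.path.splitext() can extract the extension instead of split("."). The sketch below is only an illustration: the helper names extension_de and clasificar and the sample file names are hypothetical, not part of the original answer. One difference worth noting is that splitext() treats a name with only a leading dot, such as .bashrc, as having no extension, whereas split(".")[-1] would report "bashrc".

```python
import os
from collections import defaultdict


def extension_de(nombre):
    """Return the lowercase extension of a file name, without the dot."""
    # os.path.splitext("informe.PDF") returns ("informe", ".PDF")
    _, ext = os.path.splitext(nombre)
    return ext[1:].lower()


def clasificar(nombres):
    """Group the given names by extension (hypothetical helper)."""
    clasificados = defaultdict(list)
    for nombre in nombres:
        clasificados[extension_de(nombre)].append(nombre)
    return clasificados


# Made-up names, used only to show the behaviour
ejemplo = ["informe.PDF", "datos.tar.gz", "LEEME", ".bashrc"]
resultado = clasificar(ejemplo)
for k, v in resultado.items():
    print("Extensión '{}': {}".format(k, v))
```

To classify a real folder, the same function could be fed with os.listdir('carpeta').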

Detect directories

As you may have noticed, the approach above cannot detect directories either, since a name alone does not tell you whether something is a directory. But the os module provides other ways to iterate over the contents of a folder.

With os.scandir(), instead of file names we get objects of type os.DirEntry, which expose additional information about each entry. One of their methods is .is_dir(), which returns True if the entry is a directory; another is .is_file(), which returns True if it is a regular file. The .name attribute gives the entry's name, in case we want to look at its extension.

With this, we can classify by extension only the entries that are really files, and count the directories separately:

import os
from collections import defaultdict

clasificados = defaultdict(list)
directorios = []

for elemento in os.scandir():
  nombre = elemento.name
  if elemento.is_dir():
    directorios.append(nombre)
  else:
    if "." in nombre:
      extension = nombre.split(".")[-1].lower()
    else:
      extension = ""
    clasificados[extension].append(nombre)

print("{} directorios".format(len(directorios)))
for d in directorios:
  print("- {}".format(d))
print("{} ficheros".format(sum(len(caso) for caso in clasificados.values())))
for k, v in clasificados.items():
  print("- '{}': {} ficheros".format(k, len(v)))

And the output is something like:

2 directorios
- Safari
- tmp
52 ficheros
- 'pdf': 27 ficheros
- 'zip': 1 ficheros
- 'xlsx': 1 ficheros
- 'png': 5 ficheros
- 'py': 1 ficheros
- 'gif': 3 ficheros
...etc
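
If only the folders matter, as in the original question, the same idea can be reduced to a small function that takes the path of the folder to inspect. This is a minimal sketch rather than a definitive solution: the name listar_carpetas is hypothetical, and the temporary folder with its "fotos", "docs" and "nota.txt" entries exists only so the example runs anywhere. Note that is_dir() follows symbolic links by default, so a link to a directory is counted as a directory.

```python
import os
import tempfile


def listar_carpetas(ruta):
    """Return the sorted names of the directories directly inside ruta."""
    # The with-block closes the scandir iterator as soon as we are done
    with os.scandir(ruta) as elementos:
        return sorted(e.name for e in elementos if e.is_dir())


# Demo on a throwaway folder with two subfolders and one file
with tempfile.TemporaryDirectory() as carpeta:
    os.mkdir(os.path.join(carpeta, "fotos"))
    os.mkdir(os.path.join(carpeta, "docs"))
    open(os.path.join(carpeta, "nota.txt"), "w").close()

    carpetas = listar_carpetas(carpeta)
    print("Hay", len(carpetas), "carpetas")
    print("Los nombres son:", carpetas)
```

For a real case, listar_carpetas('carpeta') would play the same role as glob.glob('carpeta/*.jpg') does for jpg files in the question.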
    
answered by 23.12.2018 / 20:21