Managing files in Django is quite easy: you can add them to a model with only one line (for a review of Django models, you can consult our article on managing web data frames), and the framework will do everything for you: validation, uploading, type checking. Even serving them requires very little effort.
However, there is one thing that Django no longer does as of version 1.3: automatically delete the files of a model when the instance is deleted.
There are good reasons for this decision: in some cases the behavior caused data loss (for example, when a transaction was rolled back or when several records referenced the same file), and deleting files can be slow enough to hurt the response time of our application. In addition, with cloud storage such as Google's or Amazon's, leftover files are rarely a problem. But of course, sometimes we use a VPS and space is limited.
There are several ways to get rid of those unpleasant unused files.
One is to create a custom management command like this:
    import os

    from django.apps import apps
    from django.conf import settings
    from django.core.management.base import BaseCommand
    from django.db.models import FileField


    class Command(BaseCommand):
        help = (
            "Deletes all media files from the MEDIA_ROOT directory that are no "
            "longer referenced by any model from INSTALLED_APPS"
        )

        def handle(self, *args, **options):
            physical_files = set()
            db_files = set()

            # Get all files referenced in the database
            for model in apps.get_models():
                for f_ in model._meta.fields:
                    if not isinstance(f_, FileField):
                        continue
                    # Query one field at a time: values_list(flat=True)
                    # accepts only a single field name
                    files = (
                        model._default_manager
                        .exclude(**{'{}__isnull'.format(f_.name): True})
                        .exclude(**{'{}__exact'.format(f_.name): ''})
                        .values_list(f_.name, flat=True)
                        .distinct()
                    )
                    db_files.update(files)

            # MEDIA_ROOT defaults to an empty string, so test for truthiness
            media_root = getattr(settings, 'MEDIA_ROOT', None)
            if not media_root:
                return

            # Get all files from MEDIA_ROOT, recursively
            for relative_root, dirs, files in os.walk(media_root):
                for file_ in files:
                    # Path relative to MEDIA_ROOT, so it can be compared to the
                    # values from the db (no leading "./" for top-level files)
                    relative_file = os.path.relpath(
                        os.path.join(relative_root, file_), media_root
                    )
                    physical_files.add(relative_file.replace(os.sep, '/'))

            # Compute the difference and delete those files
            for file_ in physical_files - db_files:
                os.remove(os.path.join(media_root, file_))

            # Bottom-up: delete all empty folders
            for relative_root, dirs, files in os.walk(media_root, topdown=False):
                for dir_ in dirs:
                    dir_path = os.path.join(relative_root, dir_)
                    if not os.listdir(dir_path):
                        os.rmdir(dir_path)
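Because the command deletes files irreversibly, it can be worth previewing what it would remove before running it. The following is a minimal sketch of the same set-difference idea using only the standard library. The names `find_orphans` and `demo` are hypothetical (they are not part of Django), and `referenced` stands in for the file names collected from the database.

```python
import os
import tempfile


def find_orphans(media_root, referenced):
    """Return relative paths under media_root that are not in `referenced`.

    `referenced` is assumed to hold names relative to media_root with
    forward slashes, which is how FileField values are usually stored.
    """
    on_disk = set()
    for root, _dirs, files in os.walk(media_root):
        for name in files:
            rel = os.path.relpath(os.path.join(root, name), media_root)
            # Normalize separators so paths compare equal to database values
            on_disk.add(rel.replace(os.sep, "/"))
    return on_disk - set(referenced)


def demo():
    """Build a throwaway media tree and list what would be deleted."""
    with tempfile.TemporaryDirectory() as media_root:
        os.makedirs(os.path.join(media_root, "docs"))
        for rel in ("docs/kept.pdf", "docs/stale.pdf", "orphan.txt"):
            with open(os.path.join(media_root, *rel.split("/")), "w") as fh:
                fh.write("x")
        # Only docs/kept.pdf is "referenced by the database" in this demo
        return sorted(find_orphans(media_root, {"docs/kept.pdf"}))


if __name__ == "__main__":
    for path in demo():
        print("would delete:", path)
```

Printing the candidates first and deleting only after a manual check is a cheap safeguard, especially on a server where MEDIA_ROOT may also contain files that no model knows about.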
Another option, somewhat more complex to implement, is to use signals.
To remove the files when the model instance they belong to is deleted, we can use the post_delete signal, which fires only after the instance has been deleted from the database. Keep in mind that if the deletion happens inside a transaction that is later rolled back, the file will be gone anyway. The code for this part is quite simple:
    from django.db.models import FileField
    from django.db.models.signals import post_delete
    from django.dispatch import receiver

    LOCAL_APPS = [
        'my_app1',
        'my_app2',
        '...'
    ]


    def delete_files(files_list):
        for file_ in files_list:
            # Empty file fields are falsy, so they are skipped here
            if file_ and hasattr(file_, 'storage'):
                # Delete by storage name rather than by filesystem path, so this
                # also works with remote storages (e.g. when using
                # django-storages), which do not implement .path
                file_.storage.delete(file_.name)


    @receiver(post_delete)
    def handle_files_on_delete(sender, instance, **kwargs):
        # presumably you want this behavior only for your apps, in which case
        # you will have to specify them
        if sender._meta.app_label in LOCAL_APPS:
            delete_files([
                getattr(instance, field_.name, None)
                for field_ in sender._meta.fields
                if isinstance(field_, FileField)
            ])
This last solution is more elegant but has one drawback: it does not account for multiple references to the same file (although, unless you are modifying your database directly, this should never happen).
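One way to guard against the shared-reference case is to check whether anything else still points at the file before deleting it. The sketch below is deliberately framework-agnostic. `delete_if_unreferenced`, `is_referenced`, and `InMemoryStorage` are hypothetical names, and `storage` is assumed to be any object with a `delete(name)` method, as Django storage backends provide. In a real project, `is_referenced` would presumably wrap a queryset check such as `Model.objects.filter(field=name).exists()`.

```python
def delete_if_unreferenced(name, storage, is_referenced):
    """Delete `name` from `storage` unless `is_referenced(name)` is true.

    Returns True if the file was deleted, False if it was kept.
    """
    if not name or is_referenced(name):
        return False
    storage.delete(name)
    return True


class InMemoryStorage:
    """Stand-in for a storage backend, for demonstration only."""

    def __init__(self, names):
        self.names = set(names)

    def delete(self, name):
        self.names.discard(name)


def demo():
    storage = InMemoryStorage({"a.txt", "shared.txt"})
    still_used = {"shared.txt"}  # names that other rows still point at
    results = [
        delete_if_unreferenced(name, storage, still_used.__contains__)
        for name in ("a.txt", "shared.txt")
    ]
    return results, sorted(storage.names)


if __name__ == "__main__":
    print(demo())
```

The extra lookup costs one query per file field on every delete, so whether it is worth it depends on how likely shared references are in your data.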