
How Do I Speed Up Iteration Of Large Datasets In Django

I have a query set of approximately 1500 records from a Django ORM query. I have used the select_related() and only() methods to make sure the query is tight. I have also used co…

Solution 1:

A QuerySet can get pretty heavy when it's full of model objects. In similar situations, I've used the .values() method on the queryset to pull back only the fields I need as a list of dictionaries, which can be much faster to iterate over. http://docs.djangoproject.com/en/1.3/ref/models/querysets/#values-list
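A minimal sketch of the difference, assuming a model named SomeModel with title and created fields (the names are made up for illustration):

# Full model instances: every row is hydrated into a SomeModel object
heavy = SomeModel.objects.only('title', 'created')

# Plain dictionaries: no model instantiation, cheaper to build and iterate
light = SomeModel.objects.values('title', 'created')

for row in light:
    print(row['title'])  # row is a dict, e.g. {'title': ..., 'created': ...}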


Solution 2:

1500 records is far from being a large dataset, and seven seconds is really too much. There is probably some problem in your models. You can easily check it by fetching the values() query (as Brandon says) and then explicitly constructing the 1500 objects yourself by iterating over the dictionaries. Just convert the ValuesQuerySet into a list before the construction, so the database round trip is factored out of the timing.
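One way to run that check, sketched with a hypothetical SomeModel (the model name and timing code are not from the question):

import time

rows = list(SomeModel.objects.values())        # list() forces the query, so the db cost ends here
start = time.time()
objects = [SomeModel(**row) for row in rows]   # rebuild the ~1500 instances in pure Python
print('construction took %.2fs' % (time.time() - start))

If the construction step alone accounts for most of the seven seconds, the slowness is in the model layer rather than the database.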


Solution 3:

How are you iterating over each item:

items = SomeModel.objects.all()

Regular for loop on each

for item in items:
    print(item)

Or using the QuerySet iterator

for item in items.iterator():
    print(item)

According to the docs, iterator() can improve performance because it streams the results instead of caching the whole queryset. A similar idea applies when looping over very large Python dictionaries: on Python 2 it's best to use iteritems().
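For the dictionary case mentioned above, a minimal Python 2 sketch (on Python 3, items() is already lazy and iteritems() no longer exists):

big_dict = dict((i, i * i) for i in xrange(1000000))

# iteritems() yields key/value pairs lazily instead of building a full list of tuples
for key, value in big_dict.iteritems():
    pass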


Solution 4:

Does your model's Meta declaration tell it to "order by" a field that is stored off in some other related table? If so, your attempt to iterate might be triggering 1,500 queries as Django runs off and grabs that field for each item, and then sorts them. Showing us your code would help us unravel the problem!
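A hedged illustration of the pattern being described, with made-up model and field names; one common fix is to join the related table up front with select_related(), or to override the default ordering for this query:

from django.db import models

class Category(models.Model):
    name = models.CharField(max_length=100)

class Item(models.Model):
    category = models.ForeignKey(Category, on_delete=models.CASCADE)

    class Meta:
        ordering = ['category__name']  # default ordering reaches into a related table

# Either join the related table in the same query...
items = Item.objects.select_related('category')
# ...or override the default ordering so the related table isn't needed at all
items = Item.objects.order_by('id')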

