Question :
I have a model that represents paintings I present on my site. On the main webpage I’d like to show some of them: newest, one that was not visited for most time, most popular one and a random one.
I’m using Django 1.0.2.
While first 3 of them are easy to pull using django models, last one (random) causes me some trouble. I can ofc code it in my view, to something like this:
number_of_records = models.Painting.objects.count()
random_index = int(random.random()*number_of_records)+1
random_paint = models.Painting.get(pk = random_index)
It doesn’t look like something I’d like to have in my view tho – this is entirely part of database abstraction and should be in the model. Also, here I need to take care of removed records (then number of all records won’t cover me all the possible key values) and probably lots of other things.
Any other options how I can do it, preferably somehow inside the model abstraction?
Answer #1:
Using order_by('?')
will kill the db server on the second day in production. A better way is something like what is described in Getting a random row from a relational database.
from django.db.models.aggregates import Count
from random import randint
class PaintingManager(models.Manager):
def random(self):
count = self.aggregate(count=Count('id'))['count']
random_index = randint(0, count - 1)
return self.all()[random_index]
Answer #3:
The solutions with order_by(‘?’)[:N] are extremely slow even for medium-sized tables if you use MySQL (don’t know about other databases).
order_by('?')[:N]
will be translated to SELECT ... FROM ... WHERE ... ORDER BY RAND() LIMIT N
query.
It means that for every row in table the RAND() function will be executed, then the whole table will be sorted according to value of this function and then first N records will be returned. If your tables are small, this is fine. But in most cases this is a very slow query.
I wrote simple function that works even if id’s have holes (some rows where deleted):
def get_random_item(model, max_id=None):
if max_id is None:
max_id = model.objects.aggregate(Max('id')).values()[0]
min_id = math.ceil(max_id*random.random())
return model.objects.filter(id__gte=min_id)[0]
It is faster than order_by(‘?’) in almost all cases.
Answer #4:
Here’s a simple solution:
from random import randint
count = Model.objects.count()
random_object = Model.objects.all()[randint(0, count - 1)] #single random object
Answer #5:
You could create a manager on your model to do this sort of thing. To first understand what a manager is, the Painting.objects
method is a manager that contains all()
, filter()
, get()
, etc. Creating your own manager allows you to pre-filter results and have all these same methods, as well as your own custom methods, work on the results.
EDIT: I modified my code to reflect the order_by['?']
method. Note that the manager returns an unlimited number of random models. Because of this I’ve included a bit of usage code to show how to get just a single model.
from django.db import models
class RandomManager(models.Manager):
def get_query_set(self):
return super(RandomManager, self).get_query_set().order_by('?')
class Painting(models.Model):
title = models.CharField(max_length=100)
author = models.CharField(max_length=50)
objects = models.Manager() # The default manager.
randoms = RandomManager() # The random-specific manager.
Usage
random_painting = Painting.randoms.all()[0]
Lastly, you can have many managers on your models, so feel free to create a LeastViewsManager()
or MostPopularManager()
.
Answer #6:
The other answers are either potentially slow (using order_by('?')
) or use more than one SQL query. Here’s a sample solution with no ordering and just one query (assuming Postgres):
random_instance_or_none = Model.objects.raw('''
select * from {0} limit 1
offset floor(random() * (select count(*) from {0}))
'''.format(Model._meta.db_table)).first()
Be aware that this will raise an index error if the table is empty. Write yourself a model-agnostic helper function to check for that.
Answer #7:
Just a simple idea how I do it:
def _get_random_service(self, professional):
services = Service.objects.filter(professional=professional)
i = randint(0, services.count()-1)
return services[i]
Answer #8:
Hi I needed to select a random record from a queryset who’s length I also needed to report (ie web page produced described item and said records left)
q = Entity.objects.filter(attribute_value='this or that')
item_count = q.count()
random_item = q[random.randomint(1,item_count+1)]
took half as long(0.7s vs 1.7s) as:
item_count = q.count()
random_item = random.choice(q)
I’m guessing it avoids pulling down the whole query before selecting the random entry and made my system responsive enough for a page that is accessed repeatedly for a repetitive task where users want to see the item_count count down.