5
$\begingroup$

I'm programming a learning software. It works with question-/answercards. I´m searching for a algorithm that gives me a higher probability for cards that the user has answered wrong.

My actual idea (edit: Inverse transform sampling) is that each card has an integer which indicates how often the user has answerd the question wrong. Count all integer-values, creating a random integer between 0 and the counted integer-values and use this integer to go through my cards and count their integers until I reached the random integer. Then I reach the integer I choose this card :-)

But there must be a better solution ;-)

Edit: Rejection Sampling

N = number of cards M = score of the highest card c = random (1 - N) x = random (1 - M)  if (x <= (score of card-nr: c)) accept card! else create new c & x and goto if-querry 

That means that cards with a higher score will choosen more often.

  • 0
    @Rahul Narain: good to know, thx2011-03-24

4 Answers 4

4

You solution seems ok if you don't mind sorting the cards each time. Here is a different method: choose a card at random and a number at random from 0 to the max card score. Accept the chosen card if the number is at most the card score. Otherwise, repeat. This method is rejection sampling on the graph of card scores.

  • 0
    @lhf: thx, i have correct the example and i will give it a try :-)2011-03-24
2

Here's a more efficient algorithm, requiring some space. You keep a lookup table containing the card to pick for each value of your random integer. The table is init by letting cell $i$ point at card $i$. When you increase the prominence of card $j$, just add a new cell pointing to $j$.

If you are memory-savvy, then you can use the following algorithm. Put all your cards in a balanced binary tree. Each card maintains both its own prominence and the sum of prominence of it and all its descendants. To select a card, use binary search. When you increase the prominence of a card, you need to update only its ancestors. So both operations take logarithmic time.

0

No, that's pretty much the best way of producing that distribution. You might want to add something that slowly decreases the weight of correctly-answered flashcards or your total counts just keep inflating, which becomes a problem after a while.

  • 0
    You're right that for that application, the slow running time doesn't matter. In some applications it might, and then you can be smarter and use a more efficient algorithm.2011-03-24
0

You probably don't want to make the initial numbers 0, or you'll never see a new card once you've had a wrong answer to something else.