12 February 2009

Shuffle things in .Net

Shuffling is complicated. There are two nice pages on CodingHorror:

http://www.codinghorror.com/blog/archives/001008.html

http://www.codinghorror.com/blog/archives/001015.html

The second link shows a chart of the counts of the different possible draws from a  card pack when you do it a few hundred thousand times. This detects the bias of an algorithm.

 

How not to shuffle # 1

Here we go for the approach of picking a number at random, seeing if we have selected it already, adding it to the selected numbers if we haven't, and picking another if we have.

Public Class Form1

    Private Shared rand As New Random
    Private dic As New Dictionary(Of Integer, Integer)
    Dim tb As New TextBox With {.Dock = DockStyle.Fill, .Multiline = True}

    Private Sub Form1_Load(ByVal sender As Object, ByVal e As System.EventArgs) Handles Me.Load
        Dim sw As Stopwatch = Stopwatch.StartNew
        Dim shuffled(9999999)() As Integer
        For i As Integer = 0 To shuffled.GetUpperBound(0)
            shuffled(i) = GetShuffledDeck(4)
        Next
        sw.Stop()
        For i As Integer = 0 To shuffled.GetUpperBound(0)
            Dim key As Integer = shuffled(i)(0) + shuffled(i)(1) * 10 + shuffled(i)(2) * 100 + shuffled(i)(3) * 1000
            If dic.ContainsKey(key) Then
                dic(key) = dic(key) + 1
            Else
                dic.Add(key, 1)
            End If
        Next
        Dim sb As New System.Text.StringBuilder
        sb.AppendLine(sw.ElapsedMilliseconds.ToString & "ms.")
        For Each kvp As KeyValuePair(Of Integer, Integer) In dic
            sb.Append(kvp.ToString & " ")
        Next
        tb.Text = sb.ToString
        Me.Controls.Add(tb)
    End Sub

    Function GetShuffledDeck(ByVal numCards As Integer) As Integer()
        Dim cards(numCards - 1) As Integer
        For index As Integer = 0 To cards.Length - 1 ' BTW (cards.Length - 1) will be optimized.
            Dim card As Integer
            Do
                ' We don't want two identical cards.
                card = rand.Next(1, numCards + 1)
            Loop While cards.Contains(card)
            cards(index) = card
        Next
        Return cards
    End Function

End Class

This is slow, (O(n^2). My results:

29400ms.
[4312, 416350] [3241, 417266] [4231, 417358] [3214, 416887]
[2134, 416303] [3421, 416548] [4321, 415916] [1324, 415585]
[1234, 417627] [3412, 416762] [4132, 415979] [1432, 416423]
[2143, 416603] [3142, 416542] [1423, 417515] [2314, 416934]
[2341, 417028] [1243, 416388] [2431, 415852] [1342, 417022]
[2413, 416309] [4213, 418019] [3124, 417387] [4123, 415397]

They don't look biased. Standard deviation from the mean is 653.
Certainly not as much as those on the codinghorror site. Let's redo that one...

How not to shuffle # 2

Using the same code with a different shuffle function:

Function GetShuffledDeck(ByVal numCards As Integer) As Integer()
    Dim cards(numCards - 1) As Integer
    For i As Integer = 0 To cards.Length - 1
        cards(i) = i + 1
    Next
    For i As Integer = 0 To cards.Length - 1
        Dim n As Integer = rand.Next(cards.Length)
        Dim temp As Integer = cards(i)
        cards(i) = cards(n)
        cards(n) = temp
    Next
    Return cards
End Function

3618ms.
[1234, 390450] [1243, 390582] [1324, 312355] [1342, 429289]
[1423, 429646] [1432, 547601] [2134, 390203] [2143, 429178]
[2314, 351753] [2341, 351524] [2413, 429756] [2431, 545840]
[3124, 350694] [3142, 430600] [3214, 312704] [3241, 430164]
[3412, 586053] [3421, 391312] [4123, 351567] [4132, 546387]
[4213, 429804] [4231, 391291] [4312, 390192] [4321, 391055]

There's bias there. s.d. = 71880!!!

How to shuffle # 1 - Knuth shuffle / Fisher-Yates

Here I initialize the Random object with a cryptographically strong seed. Bit better than the default? The default uses the Environment.TickCount, which could be used to work out likely starting seed values.... Not likely to actually happen....
The main difference is the algorithm though, which no longer introduces bias.

Imports System.Security.Cryptography

Public Class Form1

    Private Shared rand As Random
    Private dic As New Dictionary(Of Integer, Integer)
    Dim tb As New TextBox With {.Dock = DockStyle.Fill, .Multiline = True}

    Private Sub Form1_Load(ByVal sender As Object, ByVal e As System.EventArgs) Handles Me.Load
        ' get a random seed rather than using the default Random constructor, which
        ' just uses the Environment.TickCount
        Dim rng As New RNGCryptoServiceProvider
        Dim seedBytes(3) As Byte
        rng.GetBytes(seedBytes)
        rand = New Random(BitConverter.ToInt32(seedBytes, 0))
        Dim sw As Stopwatch = Stopwatch.StartNew
        Dim shuffled(9999999)() As Integer
        For i As Integer = 0 To shuffled.GetUpperBound(0)
            shuffled(i) = GetShuffledDeck(4)
        Next
        sw.Stop()
        For i As Integer = 0 To shuffled.GetUpperBound(0)
            Dim key As Integer = shuffled(i)(0) + shuffled(i)(1) * 10 + shuffled(i)(2) * 100 + shuffled(i)(3) * 1000
            If dic.ContainsKey(key) Then
                dic(key) = dic(key) + 1
            Else
                dic.Add(key, 1)
            End If
        Next
        Dim sb As New System.Text.StringBuilder
        sb.AppendLine(sw.ElapsedMilliseconds.ToString & "ms.")
        Dim keys As New List(Of Integer)(dic.Keys)
        keys.Sort()
        For Each key As Integer In keys
            sb.AppendFormat("[{0}, {1}] ", key, dic(key))
        Next
        tb.Text = sb.ToString
        Me.Controls.Add(tb)
    End Sub

    Function GetShuffledDeck(ByVal numCards As Integer) As Integer()
        Dim cards(numCards - 1) As Integer
        For i As Integer = 0 To cards.Length - 1
            cards(i) = i + 1
        Next
        For i As Integer = cards.Length - 1 To 0 Step -1
            Dim n As Integer = rand.Next(i + 1)
            Dim temp As Integer = cards(i)
            cards(i) = cards(n)
            cards(n) = temp
        Next
        Return cards
    End Function

End Class

3564ms.
[1234, 417700] [1243, 416060] [1324, 416157] [1342, 416886]
[1423, 416348] [1432, 416439] [2134, 416352] [2143, 417366]
[2314, 416459] [2341, 417555] [2413, 417086] [2431, 416996]
[3124, 415841] [3142, 417191] [3214, 416693] [3241, 416009]
[3412, 416599] [3421, 416026] [4123, 416222] [4132, 416616]
[4213, 416332] [4231, 417548] [4312, 417035] [4321, 416484]

Bias free. s.d. = 526

How to shuffle # 2 - Sorting with a GUID

Again we can just alter the function:

Function GetShuffledDeck(ByVal numCards As Integer) As Integer()
    Dim cards = Enumerable.Range(1, numCards)
    cards = cards.OrderBy(Function(x) Guid.NewGuid)
    Return cards.ToArray
End Function

20129ms.
[1234, 417906] [1243, 416614] [1324, 416582] [1342, 416302]
[1423, 416529] [1432, 416558] [2134, 416695] [2143, 416061]
[2314, 417049] [2341, 415870] [2413, 416690] [2431, 417183]
[3124, 415954] [3142, 416735] [3214, 415955] [3241, 415697]
[3412, 417235] [3421, 416427] [4123, 417626] [4132, 417726]
[4213, 417010] [4231, 415691] [4312, 418397] [4321, 415508]

Slow again. s.d. = 733

How to shuffle # 2 - Sorting with a load of random Doubles
Option Infer On

Imports System.Security.Cryptography

Public Class Form1

    Private Shared rand As Random
    Private dic As New Dictionary(Of Integer, Integer)
    Private tb As New TextBox With {.Dock = DockStyle.Fill, .Multiline = True}
    Private rng As New RNGCryptoServiceProvider

    Private Sub Form1_Load(ByVal sender As Object, ByVal e As System.EventArgs) Handles Me.Load
        Dim sw As Stopwatch = Stopwatch.StartNew
        Dim shuffled(9999999)() As Integer
        For i As Integer = 0 To shuffled.GetUpperBound(0)
            shuffled(i) = GetShuffledDeck(4)
        Next
        sw.Stop()
        For i As Integer = 0 To shuffled.GetUpperBound(0)
            Dim key As Integer = shuffled(i)(0) + shuffled(i)(1) * 10 + shuffled(i)(2) * 100 + shuffled(i)(3) * 1000
            If dic.ContainsKey(key) Then
                dic(key) = dic(key) + 1
            Else
                dic.Add(key, 1)
            End If
        Next
        Dim sb As New System.Text.StringBuilder
        sb.AppendLine(sw.ElapsedMilliseconds.ToString & "ms.")
        Dim keys As New List(Of Integer)(dic.Keys)
        keys.Sort()
        For Each key As Integer In keys
            sb.AppendFormat("[{0}, {1}] ", key, dic(key))
        Next
        tb.Text = sb.ToString
        Me.Controls.Add(tb)
    End Sub

    Function GetShuffledDeck(ByVal numCards As Integer) As Integer()
        Dim cards = Enumerable.Range(1, numCards)
        cards = cards.OrderBy(Function(x) RandomDouble(x))
        Return cards.ToArray
    End Function

    ' This returns a function that creates a random double. The Integer bit is just ignored. 
    Private Function RandomDouble() As Func(Of Integer, Double)
        ' 8 bytes for the double. 
        Dim bytes(7) As Byte
        ' Get cryptographically strong random bytes into the array.       
        rng.GetBytes(bytes)
        ' The function uses the bytes to create a double. x is ignored. 
        Dim getDouble = Function(x As Integer) BitConverter.ToDouble(bytes, 0)
        Return getDouble
    End Function

End Class

137380ms.
[1234, 416945] [1243, 416314] [1324, 416742]
[1342, 416652] [1423, 417382] [1432, 416208]
[2134, 417577] [2143, 416464] [2314, 415925]
[2341, 416543] [2413, 416615] [2431, 418076]
[3124, 416493] [3142, 415963] [3214, 417219]
[3241, 415923] [3412, 416774] [3421, 417394]
[4123, 416466] [4132, 416706] [4213, 415900]
[4231, 416736] [4312, 415534] [4321, 417449]

Slooooooooooooooooooooooooooooooow. - The cryptographic random numbers are very expensive!

s.d. = 605

 

So there you go. But, it's interesting to note atma's comments about experienced card players - they don't like their cards shuffled by the computer, they prefer human shuffled cards and can tell the difference.

No comments:

Post a Comment