## 12 February 2009

### Shuffle things in .Net

Shuffling is complicated. There are two nice pages on CodingHorror:

http://www.codinghorror.com/blog/archives/001008.html

http://www.codinghorror.com/blog/archives/001015.html

The second link shows a chart of the counts of the different possible draws from a  card pack when you do it a few hundred thousand times. This detects the bias of an algorithm.

##### How not to shuffle # 1

Here we go for the approach of picking a number at random, seeing if we have selected it already, adding it to the selected numbers if we haven't, and picking another if we have.

```Public Class Form1

Private Shared rand As New Random
Private dic As New Dictionary(Of Integer, Integer)
Dim tb As New TextBox With {.Dock = DockStyle.Fill, .Multiline = True}

Private Sub Form1_Load(ByVal sender As Object, ByVal e As System.EventArgs) Handles Me.Load
Dim sw As Stopwatch = Stopwatch.StartNew
Dim shuffled(9999999)() As Integer
For i As Integer = 0 To shuffled.GetUpperBound(0)
shuffled(i) = GetShuffledDeck(4)
Next
sw.Stop()
For i As Integer = 0 To shuffled.GetUpperBound(0)
Dim key As Integer = shuffled(i)(0) + shuffled(i)(1) * 10 + shuffled(i)(2) * 100 + shuffled(i)(3) * 1000
If dic.ContainsKey(key) Then
dic(key) = dic(key) + 1
Else
End If
Next
Dim sb As New System.Text.StringBuilder
sb.AppendLine(sw.ElapsedMilliseconds.ToString & "ms.")
For Each kvp As KeyValuePair(Of Integer, Integer) In dic
sb.Append(kvp.ToString & " ")
Next
tb.Text = sb.ToString
End Sub

Function GetShuffledDeck(ByVal numCards As Integer) As Integer()
Dim cards(numCards - 1) As Integer
For index As Integer = 0 To cards.Length - 1 ' BTW (cards.Length - 1) will be optimized.
Dim card As Integer
Do
' We don't want two identical cards.
card = rand.Next(1, numCards + 1)
Loop While cards.Contains(card)
cards(index) = card
Next
Return cards
End Function

End Class```

This is slow, (O(n^2). My results:

29400ms.
[4312, 416350] [3241, 417266] [4231, 417358] [3214, 416887]
[2134, 416303] [3421, 416548] [4321, 415916] [1324, 415585]
[1234, 417627] [3412, 416762] [4132, 415979] [1432, 416423]
[2143, 416603] [3142, 416542] [1423, 417515] [2314, 416934]
[2341, 417028] [1243, 416388] [2431, 415852] [1342, 417022]
[2413, 416309] [4213, 418019] [3124, 417387] [4123, 415397]

They don't look biased. Standard deviation from the mean is 653.
Certainly not as much as those on the codinghorror site. Let's redo that one...

##### How not to shuffle # 2

Using the same code with a different shuffle function:

```Function GetShuffledDeck(ByVal numCards As Integer) As Integer()
Dim cards(numCards - 1) As Integer
For i As Integer = 0 To cards.Length - 1
cards(i) = i + 1
Next
For i As Integer = 0 To cards.Length - 1
Dim n As Integer = rand.Next(cards.Length)
Dim temp As Integer = cards(i)
cards(i) = cards(n)
cards(n) = temp
Next
Return cards
End Function```

3618ms.
[1234, 390450] [1243, 390582] [1324, 312355] [1342, 429289]
[1423, 429646] [1432, 547601] [2134, 390203] [2143, 429178]
[2314, 351753] [2341, 351524] [2413, 429756] [2431, 545840]
[3124, 350694] [3142, 430600] [3214, 312704] [3241, 430164]
[3412, 586053] [3421, 391312] [4123, 351567] [4132, 546387]
[4213, 429804] [4231, 391291] [4312, 390192] [4321, 391055]

There's bias there. s.d. = 71880!!!

##### How to shuffle # 1 - Knuth shuffle / Fisher-Yates

Here I initialize the Random object with a cryptographically strong seed. Bit better than the default? The default uses the Environment.TickCount, which could be used to work out likely starting seed values.... Not likely to actually happen....
The main difference is the algorithm though, which no longer introduces bias.

```Imports System.Security.Cryptography

Public Class Form1

Private Shared rand As Random
Private dic As New Dictionary(Of Integer, Integer)
Dim tb As New TextBox With {.Dock = DockStyle.Fill, .Multiline = True}

Private Sub Form1_Load(ByVal sender As Object, ByVal e As System.EventArgs) Handles Me.Load
' get a random seed rather than using the default Random constructor, which
' just uses the Environment.TickCount
Dim rng As New RNGCryptoServiceProvider
Dim seedBytes(3) As Byte
rng.GetBytes(seedBytes)
rand = New Random(BitConverter.ToInt32(seedBytes, 0))
Dim sw As Stopwatch = Stopwatch.StartNew
Dim shuffled(9999999)() As Integer
For i As Integer = 0 To shuffled.GetUpperBound(0)
shuffled(i) = GetShuffledDeck(4)
Next
sw.Stop()
For i As Integer = 0 To shuffled.GetUpperBound(0)
Dim key As Integer = shuffled(i)(0) + shuffled(i)(1) * 10 + shuffled(i)(2) * 100 + shuffled(i)(3) * 1000
If dic.ContainsKey(key) Then
dic(key) = dic(key) + 1
Else
End If
Next
Dim sb As New System.Text.StringBuilder
sb.AppendLine(sw.ElapsedMilliseconds.ToString & "ms.")
Dim keys As New List(Of Integer)(dic.Keys)
keys.Sort()
For Each key As Integer In keys
sb.AppendFormat("[{0}, {1}] ", key, dic(key))
Next
tb.Text = sb.ToString
End Sub

Function GetShuffledDeck(ByVal numCards As Integer) As Integer()
Dim cards(numCards - 1) As Integer
For i As Integer = 0 To cards.Length - 1
cards(i) = i + 1
Next
For i As Integer = cards.Length - 1 To 0 Step -1
Dim n As Integer = rand.Next(i + 1)
Dim temp As Integer = cards(i)
cards(i) = cards(n)
cards(n) = temp
Next
Return cards
End Function

End Class```

3564ms.
[1234, 417700] [1243, 416060] [1324, 416157] [1342, 416886]
[1423, 416348] [1432, 416439] [2134, 416352] [2143, 417366]
[2314, 416459] [2341, 417555] [2413, 417086] [2431, 416996]
[3124, 415841] [3142, 417191] [3214, 416693] [3241, 416009]
[3412, 416599] [3421, 416026] [4123, 416222] [4132, 416616]
[4213, 416332] [4231, 417548] [4312, 417035] [4321, 416484]

Bias free. s.d. = 526

##### How to shuffle # 2 - Sorting with a GUID

Again we can just alter the function:

```Function GetShuffledDeck(ByVal numCards As Integer) As Integer()
Dim cards = Enumerable.Range(1, numCards)
cards = cards.OrderBy(Function(x) Guid.NewGuid)
Return cards.ToArray
End Function```

20129ms.
[1234, 417906] [1243, 416614] [1324, 416582] [1342, 416302]
[1423, 416529] [1432, 416558] [2134, 416695] [2143, 416061]
[2314, 417049] [2341, 415870] [2413, 416690] [2431, 417183]
[3124, 415954] [3142, 416735] [3214, 415955] [3241, 415697]
[3412, 417235] [3421, 416427] [4123, 417626] [4132, 417726]
[4213, 417010] [4231, 415691] [4312, 418397] [4321, 415508]

Slow again. s.d. = 733

##### How to shuffle # 2 - Sorting with a load of random Doubles
```Option Infer On

Imports System.Security.Cryptography

Public Class Form1

Private Shared rand As Random
Private dic As New Dictionary(Of Integer, Integer)
Private tb As New TextBox With {.Dock = DockStyle.Fill, .Multiline = True}
Private rng As New RNGCryptoServiceProvider

Private Sub Form1_Load(ByVal sender As Object, ByVal e As System.EventArgs) Handles Me.Load
Dim sw As Stopwatch = Stopwatch.StartNew
Dim shuffled(9999999)() As Integer
For i As Integer = 0 To shuffled.GetUpperBound(0)
shuffled(i) = GetShuffledDeck(4)
Next
sw.Stop()
For i As Integer = 0 To shuffled.GetUpperBound(0)
Dim key As Integer = shuffled(i)(0) + shuffled(i)(1) * 10 + shuffled(i)(2) * 100 + shuffled(i)(3) * 1000
If dic.ContainsKey(key) Then
dic(key) = dic(key) + 1
Else
End If
Next
Dim sb As New System.Text.StringBuilder
sb.AppendLine(sw.ElapsedMilliseconds.ToString & "ms.")
Dim keys As New List(Of Integer)(dic.Keys)
keys.Sort()
For Each key As Integer In keys
sb.AppendFormat("[{0}, {1}] ", key, dic(key))
Next
tb.Text = sb.ToString
End Sub

Function GetShuffledDeck(ByVal numCards As Integer) As Integer()
Dim cards = Enumerable.Range(1, numCards)
cards = cards.OrderBy(Function(x) RandomDouble(x))
Return cards.ToArray
End Function

' This returns a function that creates a random double. The Integer bit is just ignored.
Private Function RandomDouble() As Func(Of Integer, Double)
' 8 bytes for the double.
Dim bytes(7) As Byte
' Get cryptographically strong random bytes into the array.
rng.GetBytes(bytes)
' The function uses the bytes to create a double. x is ignored.
Dim getDouble = Function(x As Integer) BitConverter.ToDouble(bytes, 0)
Return getDouble
End Function

End Class
```

137380ms.
[1234, 416945] [1243, 416314] [1324, 416742]
[1342, 416652] [1423, 417382] [1432, 416208]
[2134, 417577] [2143, 416464] [2314, 415925]
[2341, 416543] [2413, 416615] [2431, 418076]
[3124, 416493] [3142, 415963] [3214, 417219]
[3241, 415923] [3412, 416774] [3421, 417394]
[4123, 416466] [4132, 416706] [4213, 415900]
[4231, 416736] [4312, 415534] [4321, 417449]

Slooooooooooooooooooooooooooooooow. - The cryptographic random numbers are very expensive!

s.d. = 605

So there you go. But, it's interesting to note atma's comments about experienced card players - they don't like their cards shuffled by the computer, they prefer human shuffled cards and can tell the difference.