Shuffling is complicated. There are two nice pages on CodingHorror:
http://www.codinghorror.com/blog/archives/001008.html
http://www.codinghorror.com/blog/archives/001015.html
The second link shows a chart of the counts of the different possible draws from a card pack when you do it a few hundred thousand times. This detects the bias of an algorithm.
How not to shuffle # 1
Here we go for the approach of picking a number at random, seeing if we have selected it already, adding it to the selected numbers if we haven't, and picking another if we have.
Public Class Form1 Private Shared rand As New Random Private dic As New Dictionary(Of Integer, Integer) Dim tb As New TextBox With {.Dock = DockStyle.Fill, .Multiline = True} Private Sub Form1_Load(ByVal sender As Object, ByVal e As System.EventArgs) Handles Me.Load Dim sw As Stopwatch = Stopwatch.StartNew Dim shuffled(9999999)() As Integer For i As Integer = 0 To shuffled.GetUpperBound(0) shuffled(i) = GetShuffledDeck(4) Next sw.Stop() For i As Integer = 0 To shuffled.GetUpperBound(0) Dim key As Integer = shuffled(i)(0) + shuffled(i)(1) * 10 + shuffled(i)(2) * 100 + shuffled(i)(3) * 1000 If dic.ContainsKey(key) Then dic(key) = dic(key) + 1 Else dic.Add(key, 1) End If Next Dim sb As New System.Text.StringBuilder sb.AppendLine(sw.ElapsedMilliseconds.ToString & "ms.") For Each kvp As KeyValuePair(Of Integer, Integer) In dic sb.Append(kvp.ToString & " ") Next tb.Text = sb.ToString Me.Controls.Add(tb) End Sub Function GetShuffledDeck(ByVal numCards As Integer) As Integer() Dim cards(numCards - 1) As Integer For index As Integer = 0 To cards.Length - 1 ' BTW (cards.Length - 1) will be optimized. Dim card As Integer Do ' We don't want two identical cards. card = rand.Next(1, numCards + 1) Loop While cards.Contains(card) cards(index) = card Next Return cards End Function End Class
This is slow, (O(n^2). My results:
29400ms.
[4312, 416350] [3241, 417266] [4231, 417358] [3214, 416887]
[2134, 416303] [3421, 416548] [4321, 415916] [1324, 415585]
[1234, 417627] [3412, 416762] [4132, 415979] [1432, 416423]
[2143, 416603] [3142, 416542] [1423, 417515] [2314, 416934]
[2341, 417028] [1243, 416388] [2431, 415852] [1342, 417022]
[2413, 416309] [4213, 418019] [3124, 417387] [4123, 415397]
They don't look biased. Standard deviation from the mean is 653.
Certainly not as much as those on the codinghorror site. Let's redo that one...
How not to shuffle # 2
Using the same code with a different shuffle function:
Function GetShuffledDeck(ByVal numCards As Integer) As Integer() Dim cards(numCards - 1) As Integer For i As Integer = 0 To cards.Length - 1 cards(i) = i + 1 Next For i As Integer = 0 To cards.Length - 1 Dim n As Integer = rand.Next(cards.Length) Dim temp As Integer = cards(i) cards(i) = cards(n) cards(n) = temp Next Return cards End Function
3618ms.
[1234, 390450] [1243, 390582] [1324, 312355] [1342, 429289]
[1423, 429646] [1432, 547601] [2134, 390203] [2143, 429178]
[2314, 351753] [2341, 351524] [2413, 429756] [2431, 545840]
[3124, 350694] [3142, 430600] [3214, 312704] [3241, 430164]
[3412, 586053] [3421, 391312] [4123, 351567] [4132, 546387]
[4213, 429804] [4231, 391291] [4312, 390192] [4321, 391055]
There's bias there. s.d. = 71880!!!
How to shuffle # 1 - Knuth shuffle / Fisher-Yates
Here I initialize the Random object with a cryptographically strong seed. Bit better than the default? The default uses the Environment.TickCount, which could be used to work out likely starting seed values.... Not likely to actually happen....
The main difference is the algorithm though, which no longer introduces bias.
Imports System.Security.Cryptography Public Class Form1 Private Shared rand As Random Private dic As New Dictionary(Of Integer, Integer) Dim tb As New TextBox With {.Dock = DockStyle.Fill, .Multiline = True} Private Sub Form1_Load(ByVal sender As Object, ByVal e As System.EventArgs) Handles Me.Load ' get a random seed rather than using the default Random constructor, which ' just uses the Environment.TickCount Dim rng As New RNGCryptoServiceProvider Dim seedBytes(3) As Byte rng.GetBytes(seedBytes) rand = New Random(BitConverter.ToInt32(seedBytes, 0)) Dim sw As Stopwatch = Stopwatch.StartNew Dim shuffled(9999999)() As Integer For i As Integer = 0 To shuffled.GetUpperBound(0) shuffled(i) = GetShuffledDeck(4) Next sw.Stop() For i As Integer = 0 To shuffled.GetUpperBound(0) Dim key As Integer = shuffled(i)(0) + shuffled(i)(1) * 10 + shuffled(i)(2) * 100 + shuffled(i)(3) * 1000 If dic.ContainsKey(key) Then dic(key) = dic(key) + 1 Else dic.Add(key, 1) End If Next Dim sb As New System.Text.StringBuilder sb.AppendLine(sw.ElapsedMilliseconds.ToString & "ms.") Dim keys As New List(Of Integer)(dic.Keys) keys.Sort() For Each key As Integer In keys sb.AppendFormat("[{0}, {1}] ", key, dic(key)) Next tb.Text = sb.ToString Me.Controls.Add(tb) End Sub Function GetShuffledDeck(ByVal numCards As Integer) As Integer() Dim cards(numCards - 1) As Integer For i As Integer = 0 To cards.Length - 1 cards(i) = i + 1 Next For i As Integer = cards.Length - 1 To 0 Step -1 Dim n As Integer = rand.Next(i + 1) Dim temp As Integer = cards(i) cards(i) = cards(n) cards(n) = temp Next Return cards End Function End Class
3564ms.
[1234, 417700] [1243, 416060] [1324, 416157] [1342, 416886]
[1423, 416348] [1432, 416439] [2134, 416352] [2143, 417366]
[2314, 416459] [2341, 417555] [2413, 417086] [2431, 416996]
[3124, 415841] [3142, 417191] [3214, 416693] [3241, 416009]
[3412, 416599] [3421, 416026] [4123, 416222] [4132, 416616]
[4213, 416332] [4231, 417548] [4312, 417035] [4321, 416484]
Bias free. s.d. = 526
How to shuffle # 2 - Sorting with a GUID
Again we can just alter the function:
Function GetShuffledDeck(ByVal numCards As Integer) As Integer() Dim cards = Enumerable.Range(1, numCards) cards = cards.OrderBy(Function(x) Guid.NewGuid) Return cards.ToArray End Function
20129ms.
[1234, 417906] [1243, 416614] [1324, 416582] [1342, 416302]
[1423, 416529] [1432, 416558] [2134, 416695] [2143, 416061]
[2314, 417049] [2341, 415870] [2413, 416690] [2431, 417183]
[3124, 415954] [3142, 416735] [3214, 415955] [3241, 415697]
[3412, 417235] [3421, 416427] [4123, 417626] [4132, 417726]
[4213, 417010] [4231, 415691] [4312, 418397] [4321, 415508]
Slow again. s.d. = 733
How to shuffle # 2 - Sorting with a load of random Doubles
Option Infer On Imports System.Security.Cryptography Public Class Form1 Private Shared rand As Random Private dic As New Dictionary(Of Integer, Integer) Private tb As New TextBox With {.Dock = DockStyle.Fill, .Multiline = True} Private rng As New RNGCryptoServiceProvider Private Sub Form1_Load(ByVal sender As Object, ByVal e As System.EventArgs) Handles Me.Load Dim sw As Stopwatch = Stopwatch.StartNew Dim shuffled(9999999)() As Integer For i As Integer = 0 To shuffled.GetUpperBound(0) shuffled(i) = GetShuffledDeck(4) Next sw.Stop() For i As Integer = 0 To shuffled.GetUpperBound(0) Dim key As Integer = shuffled(i)(0) + shuffled(i)(1) * 10 + shuffled(i)(2) * 100 + shuffled(i)(3) * 1000 If dic.ContainsKey(key) Then dic(key) = dic(key) + 1 Else dic.Add(key, 1) End If Next Dim sb As New System.Text.StringBuilder sb.AppendLine(sw.ElapsedMilliseconds.ToString & "ms.") Dim keys As New List(Of Integer)(dic.Keys) keys.Sort() For Each key As Integer In keys sb.AppendFormat("[{0}, {1}] ", key, dic(key)) Next tb.Text = sb.ToString Me.Controls.Add(tb) End Sub Function GetShuffledDeck(ByVal numCards As Integer) As Integer() Dim cards = Enumerable.Range(1, numCards) cards = cards.OrderBy(Function(x) RandomDouble(x)) Return cards.ToArray End Function ' This returns a function that creates a random double. The Integer bit is just ignored. Private Function RandomDouble() As Func(Of Integer, Double) ' 8 bytes for the double. Dim bytes(7) As Byte ' Get cryptographically strong random bytes into the array. rng.GetBytes(bytes) ' The function uses the bytes to create a double. x is ignored. Dim getDouble = Function(x As Integer) BitConverter.ToDouble(bytes, 0) Return getDouble End Function End Class
137380ms.
[1234, 416945] [1243, 416314] [1324, 416742]
[1342, 416652] [1423, 417382] [1432, 416208]
[2134, 417577] [2143, 416464] [2314, 415925]
[2341, 416543] [2413, 416615] [2431, 418076]
[3124, 416493] [3142, 415963] [3214, 417219]
[3241, 415923] [3412, 416774] [3421, 417394]
[4123, 416466] [4132, 416706] [4213, 415900]
[4231, 416736] [4312, 415534] [4321, 417449]
Slooooooooooooooooooooooooooooooow. - The cryptographic random numbers are very expensive!
s.d. = 605
So there you go. But, it's interesting to note atma's comments about experienced card players - they don't like their cards shuffled by the computer, they prefer human shuffled cards and can tell the difference.
No comments:
Post a Comment