Bug 354238

Summary: Future Optimization: Use Dictionary<string,int> for switch on string
Product: [Mono] Mono: Compilers Reporter: Juraj Skripsky <juraj>
Component: C#Assignee: Marek Safar <msafar>
Status: RESOLVED FIXED QA Contact: Mono Bugs <mono-bugs>
Severity: Enhancement    
Priority: P5 - None    
Version: SVN   
Target Milestone: ---   
Hardware: Other   
OS: Other   
Whiteboard:
Found By: --- Services Priority:
Business Priority: Blocker: ---
Marketing QA Status: --- IT Deployment: ---
Attachments: test case
cil code generated by gmcs
cil code generated by csc
performance test case

Description Juraj Skripsky 2008-01-16 18:12:11 UTC
Created attachment 190742 [details]
test case

Attached you'll find a simple C# program using a switch() on string.

- gmcs emits code equivalent to a chain of if-else statements
- csc emits the construction and initialization of a static Dictionary<string,int> object, and the switch opcode which uses the output of Dictionary<string,int>.TryGetValue.

The CIL generated by gmcs and csc is attached as well.

Issuing  'find -name "*.cs" | grep -v "Test" | xargs grep "case \""' in mcs/class will give you a rough idea how often a switch() on strings (with many case-values) is used (and whether it's worthwhile to implement this optimization).
Comment 1 Juraj Skripsky 2008-01-16 18:12:58 UTC
Created attachment 190743 [details]
cil code generated by gmcs
Comment 2 Juraj Skripsky 2008-01-16 18:13:35 UTC
Created attachment 190744 [details]
cil code generated by csc
Comment 3 Marek Safar 2008-01-16 20:52:06 UTC
This has been on my radar for quite long time.

I am actually waiting for faster Dictionary implementation to utilize the real benefits.
Comment 4 Juraj Skripsky 2008-01-17 14:14:50 UTC
Created attachment 190848 [details]
performance test case

I've done a little performance testing. The attached program compares the performance of a switch-no-string with a switch-on-dictionary<string,int>. It 
takes one argument - the string to feed into the switch.

The switch-no-string case is tested twice - the first time with non-interned string, the second with interned string.

Results on Mono:

[js@leonardo ~]$ mono switch.exe Hello
Switch:     415710
IsInterned: 426250

 -- string is interned from now on --

IsInterned: 328600
Switch:     317760
DictSwitch: 188090


[js@leonardo ~]$ mono switch.exe xxxxxxxx
Switch:     602920
IsInterned: 671700

 -- string is interned from now on --

IsInterned: 383080
Switch:     360620
DictSwitch: 151780


Interestingly "DictSwitch" is faster even when an interned string is feed into "Switch" _and_ the interned string is the first match. That's why I included a performance test for "String.IsInterned".
It turns out that a switch-on-string spents most of its time in "String.IsInterned"!
Comment 5 Juraj Skripsky 2008-01-17 14:16:46 UTC
PS to my last comment: I've meant "switch-on-string", not "switch-no-string" :-/
Comment 6 Marek Safar 2008-07-30 16:12:29 UTC
Fixed in SVN.