Category Archives: Lambda

Refactoring Switches Advanced

KevDog posted a question in response to my post about refactoring switch statements, Pulling out the Switch: It’s Time for a Whooping. I thought it would be good to go ahead and answer it as a post since it is a pretty interesting real world example of a somewhat difficult switch statement to get rid of.

Here is the original code:

public static DataFileParser GetParser(DataFile dataFile)
{
    switch ((FileType)dataFile.ValidationFormat.FileType)
    {
        case FileType.PDF:
            return new PdfParser(dataFile);
        case FileType.Image:
            return new ImageParser(dataFile);
        case FileType.CsvWithHeaderRow:
            return new CsvParser(dataFile, true);
        case FileType.Csv:
            return new CsvParser(dataFile, false);
        default:
            throw new NotImplementedException("There is no parser for " +
dataFile.ValidationFormat.FileType.ToString());
    }
}

Surprisingly simple solution

Try and say that 3 times fast.

I thought about this a bit and at first was having a hard time coming up with a solution.  Then I typed the code into an editor and realized how easy it is.

The trick here is that it looks like something other than the simple case of data mapping to logic, but it isn’t.

  • The logic in this case is the creation of the Parser.
  • The data is the file type.

Once you think of it in those terms you can easily solve it using the pattern I mention in my previous post on refactoring switch statements.

Switch be gone!

private static Dictionary<FileType, Func<DataFile, DataFileParser>> DataFileTypeToParserCreatorMap =
    new Dictionary<FileType, Func<DataFile, DataFileParser>>()
{
    {FileType.PDF, file => new PdfParser(file)},
    {FileType.Image, file => new ImageParser(file)},
    {FileType.CsvWithHeaderRow, file => new CsvParser(file, true)},
    {FileType.Csv, file => new CsvParser(file, false)}
};

public static DataFileParser GetParserRefactored(DataFile dataFile)
{
    Func<DataFile, DataFileParser> parserCreator;
    if(!DataFileTypeToParserCreatorMap.TryGetValue(
                             (FileType)dataFile.ValidationFormat.FileType,
                             out parserCreator))
        throw new NotImplementedException("There is no parser for " +
            dataFile.ValidationFormat.FileType.ToString());

    return parserCreator(dataFile);
}

That is the quickest solution that preserves the existing code as much as possible.

Another solution

With the first solution we pushed the object creation into the map.

If we can make the constructor for all the parsers the same, we can use reflection to dynamically create our instances by looking up the type in the dictionary.

In this example, I assume that we have refactored CsvParser to have a constructor that only takes one parameter and internally sets a value of usesHeader to false, and we have created a CsvWithHeaderParser that inherits from the CsvParser and sets usesHeader to true.

private static Dictionary<FileType, Type> DataFileTypeToParserTypeMap = new Dictionary<FileType, Type>()
{
    { FileType.PDF, typeof(PdfParser)},
    { FileType.Image, typeof(ImageParser)},
    { FileType.CsvWithHeaderRow, typeof(CsvWithHeaderParser)},
    { FileType.Csv, typeof(CsvParser)}
};

public static DataFileParser GetParser(DataFile dataFile)
{
    Type parserType;
    if(!DataFileTypeToParserTypeMap.TryGetValue(
            (FileType)dataFile.ValidationFormat.FileType,
            out parserType))
        throw new NotImplementedException("There is no parser for " +
                    dataFile.ValidationFormat.FileType.ToString());

    return (DataFileParser) Activator.CreateInstance(parserType, dataFile);
}

Pretty similar solution.  I prefer the first though for several reasons:

  • The refactor is localized, where the second solution has to touch other classes.
  • Reflection makes you lose compile time safety.
  • You may create a new parser that you want to have more parameters for the constructor.  With the second solution, you will have a hard time doing that.
  • The first solution gives you ultimate flexibility in setting up the constructor of the parser.  If you wanted to do 5 steps for a particular parser, you could.
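
Here is a minimal sketch of that last point: a map entry can be a statement lambda that runs several setup steps before handing back the parser.  The types below are hypothetical stand-ins, since the real DataFile and parser classes are not shown in this post, and the Delimiter and SkipBlankLines settings are invented purely for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for the types used in the post.
public enum FileType { Csv, CsvWithHeaderRow }

public class DataFile
{
    public FileType FileType { get; set; }
}

public abstract class DataFileParser
{
    protected DataFileParser(DataFile source) { Source = source; }
    public DataFile Source { get; }
}

public class CsvParser : DataFileParser
{
    public CsvParser(DataFile source, bool hasHeaderRow) : base(source)
    {
        HasHeaderRow = hasHeaderRow;
    }

    public bool HasHeaderRow { get; }
    public char Delimiter { get; set; } = ',';   // invented setting
    public bool SkipBlankLines { get; set; }     // invented setting
}

public static class ParserFactory
{
    private static readonly Dictionary<FileType, Func<DataFile, DataFileParser>> CreatorMap =
        new Dictionary<FileType, Func<DataFile, DataFileParser>>
        {
            // A one-line expression lambda, as in the post.
            { FileType.Csv, file => new CsvParser(file, false) },

            // A statement lambda can run as many setup steps as it needs.
            { FileType.CsvWithHeaderRow, file =>
                {
                    var parser = new CsvParser(file, true);
                    parser.Delimiter = ';';
                    parser.SkipBlankLines = true;
                    return parser;
                }
            }
        };

    public static DataFileParser GetParser(DataFile dataFile)
    {
        Func<DataFile, DataFileParser> creator;
        if (!CreatorMap.TryGetValue(dataFile.FileType, out creator))
            throw new NotImplementedException("There is no parser for " + dataFile.FileType);

        return creator(dataFile);
    }
}
```

The caller still just asks for a parser; the extra steps stay hidden inside the map entry that needs them.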

Anyway, next time you face a switch statement that is hard to refactor, try to break it into a mapping between data and logic.  There is always a solution.

Aspect Oriented Programming with Action<>

Aspect Oriented Programming (AOP) is a pretty great concept.

It is a little difficult to implement though.

To be honest, I don’t think I’ve ever really seen it successfully implemented.  I mean sure, I’ve seen examples of how you could use it for “cross-cutting” concerns like logging.

The problem is it is usually pretty difficult to use and the only real practical application I can ever come up with is logging.  I know, it is probably just my lack of knowledge in the area, but if you bear with me I’ll show you a neat little trick you can use to address cross-cutting concerns by doing something similar to what AOP does using Action<>.

Putting it all in one place

The main problem AOP tries to solve is taking aspects of your software that exist in many different places and condensing them into one place for you to maintain.

Exception handling and logging tend to be the most infamous of these cross-cutting concerns.  No doubt there are many places in your code where you catch an exception and the only thing you can really do is log it.

I’m going to show you an easy little way to do that using Action<>.

Giving credit where credit is due, I got this idea from some code that a coworker of mine, Subha Tarafdar, wrote.  (He is a genius.)

He wrote some code to basically do what I am going to show you, but with retrying database queries.  He was able to reduce many places in the code base where we had repeated logic to retry executing database queries when getting a deadlock or timeout.

Does this code belong to you?

public void MakeRice()
{
    try
    {
        _riceCooker.Cook();
    }
    catch (Exception exception)
    {
        // Don't care if this fails,
        // there is nothing we can do about it.
        Logger.Log(exception.Message);
    }
}

Ignore that I am catching a general exception here.  It is a bad practice, but sometimes all you are going to do is log whatever bad thing happens and move on.

It’s pretty common to do something in a try block and catch an exception only to log it.

Think about how many times this code or something similar to it might be sprinkled throughout your code base.

Action<> to the rescue

If you’re not familiar with Action<> take a look at this post I did that gives a very simple explanation for how it works.

We can take the logic of the try, catch, and log exceptions and put it into a method that only varies by what action we do.

public static void LogOnFailure(Action action)
{
    try
    {
        action();
    }
    catch (Exception exception)
    {
        Logger.Log(exception.Message);
    }
}

I’m not a big fan of static methods but in certain cases they make sense.  The alternative is to have all of this code sprinkled throughout your code base.

Now that we have this method, we can do anything we want and know that if there is an error it will be logged.

Check this out:

LogOnFailure(_riceCooker.Cook);
LogOnFailure(KickACat);
LogOnFailure(() =>
                    {
                        Wakeup();
                        SmellTheRoses();
                    }
            );

And if you decide you want to change how you log the error or what you do on it, you can change it all in one place.

Not just for logging

You can apply this kind of solution in many places where you have cross cutting concerns in your code base.

Here are a few suggestions for places you might consider this kind of a solution:

  • Retrying on failure logic
  • Using an alternative service for a failure (web service “A” failed, but we can use web service “B”)
  • Database connection and connection closing logic.  (Open connection, do something, close connection.)
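
As a rough sketch of the first two suggestions, the same Action-taking shape works for retrying and for falling back to an alternative.  This is not the code my coworker wrote; the names Resilient, RetryOnFailure, and WithFallback are made up for this example, and a real version would probably catch only specific exceptions (a deadlock or a timeout, say) and wait between attempts.

```csharp
using System;

public static class Resilient
{
    // Runs the action, making up to maxAttempts attempts in total.
    // If the final attempt also fails, its exception propagates to the caller.
    public static void RetryOnFailure(Action action, int maxAttempts)
    {
        if (maxAttempts < 1)
            throw new ArgumentOutOfRangeException(nameof(maxAttempts));

        for (int attempt = 1; ; attempt++)
        {
            try
            {
                action();
                return;
            }
            catch (Exception) when (attempt < maxAttempts)
            {
                // Swallow the failure and loop around for another attempt.
            }
        }
    }

    // Tries the primary action; if it throws, runs the fallback instead.
    public static void WithFallback(Action primary, Action fallback)
    {
        try
        {
            primary();
        }
        catch (Exception)
        {
            fallback();
        }
    }
}
```

Usage looks just like LogOnFailure: Resilient.RetryOnFailure(_riceCooker.Cook, 3) makes up to three attempts to cook the rice before giving up.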

Pulling out the Switch: It’s Time for a Whooping

In my previous post I talked about how if-else and switch statements are very similar in that they both ignore the problem of combining data with code.

Today I am going to show you how to refactor switch statements to alleviate that problem.

There are some varied opinions on how to refactor switch statements, which I believe derive from trying to treat all switch statements as the same.  I want to look at the kinds of switch statements that exist and explain why I recommend refactoring each one in a particular way.

Data to data Switch Statement

The first, most obvious kind of switch statement is one that maps one form of data to another form of data.

switch (state)
{
    case "Florida": return "Tallahassee";
    case "Idaho": return "Boise";
    case "Arizona": return "Phoenix";
    case "South Carolina": return "Columbia";
}

This example is clearly mapping one piece of data to another.  The best refactor for this situation is to use a map.  In C# it is a dictionary, in Java a map.

var stateToCapitolMap = new Dictionary<string, string>()
{
    {"Florida", "Tallahassee"},
    {"Idaho", "Boise"},
    {"Arizona", "Phoenix"},
    {"South Carolina", "Columbia"}
};

return stateToCapitolMap[state];

I cannot believe how many people argue against this refactoring.  It doesn’t look like much, but we have greatly separated logic from data and increased the maintainability of the code.

Before our refactoring, consider how you would read all the states and capitals from a file and insert them into the switch statement.  The only possible way would be through code generation.  Clearly this indicates a coupling of code and data.  The switch statement is formatted in such a way that it almost looks like data, but don’t let that fool you: it is code.

Consider the refactored example.  If we want to read the values from a file, it is simple.  So simple, that I’ll even show the code right here.

var stateToCapitolMap = new Dictionary<string, string>();
foreach (string line in File.ReadAllLines("StatesAndCapitols.txt"))
{
    string[] stateCapitol = line.Split(',');
    stateToCapitolMap.Add(stateCapitol[0], stateCapitol[1]);
}
return stateToCapitolMap[state];

Data to action Switch Statement

This kind of switch statement appears different than data to data, but it is actually very similar.  In this case we are mapping some data to a direct action that should be performed given that data.

Often this form of logic can be disguised by multiple actions happening in a case statement.

switch (move)
{
    case "Up": MoveUp();
        break;
    case "Down": MoveDown();
        break;
    case "Left": MoveLeft();
        break;
    case "Right": MoveRight();
        break;
    case "Combo":
        MoveUp();
        MoveUp();
        MoveDown();
        MoveDown();
        break;
}

We can take the same approach here because really this is a form of mapping data to data.  The second data item is essentially the name of a method to call to perform an action.  We can illustrate this intent easily in C#.  (In Java, you will need to wrap the action into a set of classes with a common interface.)

var moveMap = new Dictionary<string, Action>()
{
    {"Up", MoveUp},
    {"Down", MoveDown},
    {"Left", MoveLeft},
    {"Right", MoveRight},
    {"Combo", () => { MoveUp(); MoveUp(); MoveDown(); MoveDown(); }}
};

moveMap[move]();

What we are essentially doing now is a dynamic look-up of the method to call based on the data.  We could even make this example data driven from a text file that specified how to map a move to a method name or list of method names, but that is far beyond the scope of this post, and I don’t think I would recommend it unless you have a really good reason.

Data to multiple actions Switch Statement

If you are familiar with the techniques of refactoring switch statements, you may be shaking your head by now saying, “that guy is wrong, he needs to use a factory.”  Okay, well now we are going to do it.

In the data-to-action refactoring, I opted for the simplest solution that can work, instead of trying to over solve the problem by adding complexity in the form of a factory.

But, what happens when you have multiple switch statements in your code that operate on the same set of data?

switch (move)
{
    case "Up": MoveUp();
        break;
    case "Down": MoveDown();
        break;
    case "Left": MoveLeft();
        break;
    case "Right": MoveRight();
        break;
    case "Combo":
        MoveUp();
        MoveUp();
        MoveDown();
        MoveDown();
        break;
}

// ... somewhere else

switch (move)
{
    case "Up":
        moveName = "Basic Up Move";
        break;
    case "Down":
        moveName = "Basic Down Move";
        break;
    case "Left":
        moveName = "Basic Left Move";
        break;
    case "Right":
        moveName = "Basic Right Move";
        break;
    case "Combo":
        moveName = "Up Up Down Down Combo Move";
        break;
}

Sure, we could refactor these both into maps or dictionaries.  But what will happen when we try and add a new move?  We’ll have to remember to add logic in both places or we’ll have a problem.  In the prior example we recognized that data was being mapped to an action, so we represented that as succinctly in code as possible.

In this example, the same data is being mapped to some data that describes a move and actions to perform.  We need some way to house these attributes that belong to the data we are switching on together, and we would like to have this all in one place, so we don’t have to change the code in multiple places.

Our best solution here is to use a factory that gives us the right kind of object that implements the behavior that should be tied to the data we are currently switching on.

public interface IMove
{
    void DoMove();
    string GetFriendlyName();
}

public class UpMove : IMove
{
    public void DoMove()
    {
        // Do the move
    }

    public string GetFriendlyName()
    {
        return "Basic Up Move";
    }
}

public static class MoveFactory
{
    private static Dictionary<string, Func<IMove>> moveMap = new Dictionary<string, Func<IMove>>()
    {
        {"Up", () => { return new UpMove(); }},
        {"Down", () => { return new DownMove(); }},
        {"Left", () => { return new LeftMove(); }}
        // ...
    };

    public static IMove CreateMoveFromName(string name)
    {
        return moveMap[name]();
    }
}

You can see here that we are creating a factory which contains a dictionary which maps a move name to what kind of object to create.  Each move implements a common IMove interface.  (I only show some of the implementation here.)

Now in our code we can replace those switch statements with polymorphic behavior from our object returned from the factory.

IMove move = MoveFactory.CreateMoveFromName(name);
move.DoMove();
String friendlyName = move.GetFriendlyName();

The nice thing about this implementation is that if we try to add a new move the IMove interface will require us to implement all the proper methods.  We make a change in one place and the compiler reminds us what we need to do.

Don’t jump straight to the factory

You may have heard the argument between using a factory or a dictionary to refactor switch statements before.  What I am trying to show in this blog post is that it depends on your situation.

The simplest solution is a dictionary or map.  Once you have a second place you are mapping the same data, you should move to a factory.  The factory then contains the mapping between a piece of data and a class.

I also wanted to note here that I didn’t use enumerations.  In real code you should.  I avoided them here to prevent adding one more layer of abstraction so that my example would not require as much explanation.
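
For what it is worth, here is a minimal sketch of how the factory might look with an enumeration as the key.  The MoveType enum and the CreateMove method are assumptions added for this example, and only two moves are shown.  Raw text is converted to the enum once, at the edge, so the rest of the code never deals with magic strings.

```csharp
using System;
using System.Collections.Generic;

public enum MoveType { Up, Down }   // hypothetical enum for this sketch

public interface IMove
{
    void DoMove();
    string GetFriendlyName();
}

public class UpMove : IMove
{
    public void DoMove() { /* Do the move */ }
    public string GetFriendlyName() { return "Basic Up Move"; }
}

public class DownMove : IMove
{
    public void DoMove() { /* Do the move */ }
    public string GetFriendlyName() { return "Basic Down Move"; }
}

public static class MoveFactory
{
    private static readonly Dictionary<MoveType, Func<IMove>> moveMap =
        new Dictionary<MoveType, Func<IMove>>
        {
            { MoveType.Up, () => new UpMove() },
            { MoveType.Down, () => new DownMove() }
        };

    public static IMove CreateMove(MoveType type)
    {
        return moveMap[type]();
    }

    // Converts raw text such as "Up" into the enum before looking it up.
    public static IMove CreateMoveFromName(string name)
    {
        MoveType type;
        // Enum.TryParse also accepts numeric text like "7", so check IsDefined too.
        if (!Enum.TryParse(name, out type) || !Enum.IsDefined(typeof(MoveType), type))
            throw new ArgumentException("Unknown move: " + name);

        return CreateMove(type);
    }
}
```

With the enum in place, a typo like MoveType.Upp fails at compile time, where a misspelled string would only fail at run time.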

The Power of Func<>

I remember why I love C#.

After spending the last two years or so writing mainly Java code, getting back into Visual Studio felt a little awkward and painful.

Where did all my keyboard shortcuts go?  Why can’t I navigate to members in a class?  Oh, yes Resharper… ah that’s better.

But then it happened.

I was minding my own business writing some code… la la la

public bool IsYummy()
{
    if(!areOverridesLoaded)
    {
        LoadOverrides();  // loads the override from the database into this.theOverride
    }

    if(this.theOverride != null)
    {
        return theOverride.IsYummy.GetValueOrDefault(food.IsYummy);
    }

    return food.IsYummy;
}

public bool IsFruit()
{
    if(!areOverridesLoaded)
    {
        LoadOverrides();
    }

    if(this.theOverride != null)
    {
        return theOverride.IsFruit.GetValueOrDefault(food.IsFruit);
    }

    return food.IsFruit;
}

public bool IsVegetable()
{
    if(!areOverridesLoaded)
    {
        LoadOverrides();
    }

    if(this.theOverride != null)
    {
        return theOverride.IsVegetable.GetValueOrDefault(food.IsVegetable);
    }

    return food.IsVegetable;
}

Oh my, repeated code!  What ever shall I do?

Hmm… It varies by property names only.

Surely I can refactor out the override loading code.  Can’t put it in the constructor, because the goal is to lazy load the overrides from the database.

Func<> to the rescue!

This is the power of Func.  Of delegates really.  You can’t do this in Java folks… Hold on to your chair.  Here we go…. Whee…..

private bool ReturnValueOrDefault(Func<Food, bool?> foodProperty, bool defaultValue)
{
    if (!areOverridesLoaded)
    {
        LoadOverrides();
    }

    if (this.theOverride != null)
    {
        return foodProperty(this.theOverride).GetValueOrDefault(defaultValue);
    }

    return defaultValue;
}

public bool IsYummy()
{
    return ReturnValueOrDefault(o => o.IsYummy, food.IsYummy);
}

public bool IsFruit()
{
    return ReturnValueOrDefault(o => o.IsFruit, food.IsFruit);
}

public bool IsVegetable()
{
    return ReturnValueOrDefault(o => o.IsVegetable, food.IsVegetable);
}

Good has won…

Evil has been defeated…

Code has been deleted

Thank you Func<>

Don’t forget the Super Combo!

Also, check out the new page I created, a compilation of useful posts I have written about building software in an Agile way.

C# vs Java Part 3: The Frameworks (Network, Reflection, Security, Text)

Network Programming

Both C# and Java provide framework libraries for reading and writing data to a network.  Java is again hindered by the same kind of problems it has with file IO.  It is difficult to do network programming in Java.  Let’s compare downloading a web page in both languages.

Java:


URL url = new URL("http://google.com");
InputStreamReader streamReader = new InputStreamReader(url.openStream());
BufferedReader bufferedReader = new BufferedReader(streamReader);
String line = bufferedReader.readLine();

while (line != null)
{
    System.out.println(line);
    line = bufferedReader.readLine();
}

C#:


WebClient client = new WebClient();
String webPage = client.DownloadString("http://google.com");
Console.WriteLine(webPage);

In this area, the .NET base library is much larger than Java’s.  I didn’t really expect this would be the case, but when I compare java.net and java.nio to System.Net, it is pretty clear that System.Net has a much larger range of functionality.  Without drilling down into too much detail, here are some of the things that are missing or obscured in the Java base classes, but available in C#/.NET:

  • FTP support
  • A base mail package, that doesn’t have to be added separately.
  • TCP and UDP clients instead of using raw sockets
  • Ping
  • Peer to Peer

If we start to talk about Windows Communication Foundation (WCF), the divide grows much further.  WCF is much more than just a web services framework.  It is an adaptable distributed computing framework allowing a large range of protocols, all abstracted from the business logic.  The clear winner here, in my opinion, is C# and .NET.

Reflection

I have to admit, I am not that big of a fan of reflection.  I know it is “neato” and can do lots of cool things.  But I have seen too much “cool” reflection code that breaks when I use the refactor tool.  Sometimes though, even I have to admit, it is the only tool that will do the job.  When that is the case, you expect it to be easier than rocket science.  At least I do.

Java and C# both provide similar capabilities in reflection through the standard framework libraries.  With either language you can easily get a class, find out what methods are on that class and dynamically call them.  There is, however, one important difference which C# is able to provide through the use of lambda expressions.

I don’t want to go into the technical details here because they are pretty complicated, but C# will basically allow strongly typed reflection through the use of LINQ to create an expression tree that will provide the information to strongly type the reflected information.  This is pretty cool, although I still don’t like reflection.
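
To give a flavor of it without the gory details, here is a minimal sketch of the usual trick.  A lambda is received as an expression tree, and the member name is read out of the tree instead of being typed as a string.  The Person class and the MemberNames.Of helper are hypothetical names made up for this example.

```csharp
using System;
using System.Linq.Expressions;

public class Person   // hypothetical example type
{
    public string FirstName { get; set; }
    public int Age { get; set; }
}

public static class MemberNames
{
    // Receives a lambda such as p => p.FirstName as an expression tree
    // (data describing the code) instead of as a compiled delegate.
    public static string Of<T, TMember>(Expression<Func<T, TMember>> accessor)
    {
        var member = accessor.Body as MemberExpression;
        if (member == null)
            throw new ArgumentException("Expected a simple member access such as x => x.Name");

        return member.Member.Name;
    }
}
```

Calling MemberNames.Of((Person p) => p.FirstName) gives back "FirstName", which can then be handed to typeof(Person).GetProperty.  If somebody renames the property with the refactor tool, the lambda is renamed along with it, where a string literal would have been left behind.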

One more point here, the dynamic keyword in .NET 4.0 probably belongs here because it is reflection-like.  It will allow for very easy COM interoperability and allow certain parts of C# code to be dynamically typed instead of static.

Security

Both C# and Java frameworks are vast in areas of security.  Unfortunately, both are fairly complex and difficult to use or understand.  Couple this complexity with the changing demands of security and theories about security, spread across multiple platforms and deployment situations, and you get a mess.

Both choices have similar functionality in cryptography.  Both use provider models for authentication and support a wide variety of authentication services, including user defined services.  Both allow for role based security through a provider model.

The differences are what you would expect considering the language and framework targets.  C# and .NET are better equipped in a Windows environment and allow very easy use of Windows authentication schemes and Active Directory.  Java is more flexible, allowing easier interoperability with multiple operating systems and authentication methods, but at the cost of a slightly more complex and burdensome API.  C# and .NET allow the usage of Code Access Security (CAS), which is a very complex concept that basically allows individual level rights to be applied to sections of code controlled from the machine configuration.  Unfortunately, this turned out to be overly complex and something that almost no one used correctly.  For that reason Microsoft is getting rid of the concept of CAS in .NET Framework 4.0.

I really don’t like either choice for Security at this point.  Both are confusing, and there are no clear-cut best practices for applying security to the application.  I think both frameworks have a way to go to make security something that is very easy to implement and understand.  We will probably see a large amount of churn in this area because of the changing needs of security, as we transition from a primarily web based application model to this mixed model, using applications that are able to run outside of the browser but start their life inside, and hybrid systems utilizing both.

Text Manipulation

At some point in time every developer faces this challenge.  For this reason, I consider it an important part of any framework.  Some of the most common problems involve parsing files, splitting strings, and searching for patterns.  While regular expressions are part of the equation, they are not the only tool for the job, and as is frequently said:

Some people, when confronted with a problem, think
“I know, I’ll use regular expressions.”   Now they have two problems.

Let’s take a look at String.  It is very similar in Java and C#, but there are some subtle differences.  My first beef is that in Java I cannot compare strings with “string1” == “string2”, because == on objects compares references rather than contents.  In C#, this is fine because == on strings ends up just calling Equals.  Having done a large amount of string comparisons in Java, after having been able to do them with == in C#, I can say this is really annoying.  I have to say though, I do like substring better in Java than in C#.  In C#, Substring takes the beginning index and the number of characters to grab.  In Java, substring takes the beginning index and the ending index.  Much more often I have found that I know a beginning and ending index rather than a length.  Often in C# you will have to write confusing code like


String myString = "123FindMe456";

String newString = myString.Substring(myString.IndexOf("123"), myString.IndexOf("456") - myString.IndexOf("123"));

Which is pretty confusing, and I am always wondering if I am off by 1.  C# provides a method String.IsNullOrEmpty, which is very useful.  .NET 4.0 will provide IsNullOrWhiteSpace, which will be even more useful.  In Java, you have to check for null and empty separately, or write your own and then curse everyone for not using it.  String.Format is also nicer in C# vs Java.  I like being able to do String.Format(“The number {0} plus {1} is equal to {2}”, “one”, 3, “four”) vs in Java using String.format(“The number %s plus %d is equal to %s”, “one”, 3, “four”);

Both languages have string builders.  In C# I tend to use String.Split, because it is easy, and in Java I tend to use StringTokenizer, because I am always afraid I am going to insert an accidental regular expression in the Java String.split method.  Regular expressions are supported in both frameworks, although I think it is slightly easier to use the regular expressions in Java, while C# regular expressions are more powerful.  I couldn’t find a way to do a named capture in Java regular expressions.

Final words

In my mind, C#’s .NET framework is a little larger and easier to use than the standard Java framework.  I don’t think they are very far apart in functionality or ease-of-use in most areas, but there are some notable areas where .NET stands out.  I don’t have the time or knowledge to cover all the various aspects of the two languages across all possible comparison avenues, but I have tried to focus on what I think is important as a developer.  I am currently developing mostly in Java, but not too long ago I was developing mostly in C#, so I feel I can make some good judgments between the two.  In my final part of this series I will talk about the tooling support for Java and C#, then I will move onto more interesting things.  (I am starting to get sick of talking about Java vs C#.)

C# vs Java Part 1: Language

C# vs Java Part 2: Platforms

C# vs Java Part 3: Frameworks

C# vs Java Part 1: The Languages (Continued)

Continued from C# vs Java Part 1: The Languages

Primitive Types and Boxing

There are a few subtle differences here that can really throw a programmer off.  I think the best way to describe this section is by examples.

In C# this is valid


56.ToString();

In Java it looks like this


((Integer)56).toString();

Because both languages can autobox, I expect that I can use a primitive type like an object.  The C# syntax seems more natural to me.  I don’t want to have to think about Integer vs int.  Although a programmer should be aware of boxing, for the most part it should seem transparent.  Sometimes, though, it is too transparent.  Take a look at this innocent looking Java code.

if(myObject.isFantastic())
{
   // do something
}

If this code throws a null pointer exception, what can you assume?  Ok, now what if I told you myObject is not null?  Confused a bit?  Here is the strange part.  If isFantastic returns a Boolean instead of a boolean it will be automatically unboxed into a boolean.  If isFantastic returns a Boolean with a value of null, and it tries to unbox it, the result will be a null pointer exception.  The compiler is automatically doing this for you.

if((boolean)myObject.isFantastic())
{
   // do something
}

This happens to be the syntax that you must use for C#.  Is this a bad thing or a good thing?  There are arguments either way, but if you have been bitten by this automatic unboxing null pointer, you would probably say it was bad.

One final word on primitives.  The C# language allows a nullable primitive type by adding a ? after the type name.  So, you can declare a primitive that can be null by typing int? myNullableInt.  Whether this is good or bad, I am not completely sold on.  It is very useful when you use it properly to eliminate repetitive code, but in general nulls are bad.  Nulls create extra complexity and should be avoided if at all possible.
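
Here is a small sketch of how nullable primitives behave in C#, for comparison with the Java unboxing surprise above.  The NullableDemo class and its method names are made up for this example.

```csharp
using System;

public static class NullableDemo
{
    // Returns the override when one is present, otherwise the default.
    public static bool Resolve(bool? overrideValue, bool defaultValue)
    {
        return overrideValue ?? defaultValue;
    }

    // An explicit cast is required to treat a bool? as a bool;
    // it throws InvalidOperationException when the value is null.
    public static bool Unwrap(bool? value)
    {
        return (bool)value;
    }
}
```

The compiler will not let you write if (someNullableBool) without the cast or a check such as HasValue, so the risky unwrapping is at least visible in the code rather than done silently for you.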

Generics

The handling of generics between C# and Java is pretty transparent at a surface level.  I would say that when I am writing Java code vs C#, I don’t really think about the differences in generics.

Basically Java uses type erasure, and C# includes generics support in the IL code that is generated.   Translation: the Java compiler erases your generic types and replaces them with casts.  C# includes the generic types in the IL that runs inside the Common Language Runtime (CLR).

The easiest way to demonstrate the differences is with these code snippets, which I stole from a very informative post on generics in C# and Java.

static <T> void genericMethod (T t) {
    T newInstance = new T (); // error: type creation
    T[] array = new T [0];    // error: array creation

    Class c = T.class;        // error: Class querying

    List<T> list = new ArrayList<T> ();
    if (list instanceof List<String>) {}
    // error: illegal generic type for instanceof
}

Contrast this to C# generics implementation which allows these things

static void GenericMethod<T> (T t)
    where T : new() {
    T newInstance = new T (); // OK - new() constraint.
    T[] array = new T [0];    // OK

    Type type = typeof(T);    // OK

    List<T> list = new List<T> ();
    if (list is List<String>) {} // OK
}

A few times working in Java I have wanted to know the type of the generic parameter, but in general I haven’t seen the type erasure of Java pose much of a problem.  So why did Java choose to implement generics with erasure?  To maintain backwards compatibility with the JVM.  By not changing the structure of the .class files, compatibility is maintained.

Exceptions

There is only one major difference in exception handling between C# and Java.  Java contains the notion of checked exceptions, which are basically exceptions that must be handled by the caller of that code.  In Java you can declare “throws ExceptionType” in the method declaration, and any callers of that method must somehow handle that exception.  This seemed like a good idea, but in general it really tends to be an annoyance more than anything.  I have seen a large amount of Java code where the programmer simply declares an empty catch block and “eats” the exception in order to make the code compile, because he/she is required to catch a checked exception.  Checked exceptions also tend to break encapsulation, as changes to internal implementations are propagated upwards and can break the caller.

There is also a small difference in finally blocks.  In Java a finally block can contain a return or break statement; in C# it cannot.  The net effect of this is that you can have some strange Java code that returns both in the method body and in the finally block.  I originally wrote that C# does not process the finally block when an uncaught exception is thrown, while Java does, but it looks like I was wrong.  The CLI standard seems to be conflicting on this issue: 12.4.2.5, Overview of exception handling, indicates that the finally block is not called when there is an uncaught exception, yet when I tested this out with real code, the finally block was indeed called.  C# also offers the IDisposable interface with a using statement to allow for cleanup when objects go out of scope.

Syntactic Sugar and Other Niceties

This is the part from a language perspective where Java really takes a pounding.  C# has been implementing many language level features that make writing C# code very easy and nice.

LINQ – this feature allows writing code that looks more like a SQL query.  This is best demonstrated with an example.

string[] names = { "Burke", "Connor", "Frank",
                   "Everett", "Albert", "George",
                   "Harris", "David" };

IEnumerable<string> query = from s in names
                            where s.Length == 5
                            orderby s
                            select s.ToUpper();

foreach (string item in query)
    Console.WriteLine(item);

This “magic” is actually equivalent to:

IEnumerable<string> query = names
                            .Where(s => s.Length == 5)
                            .OrderBy(s => s)
                            .Select(s => s.ToUpper());

The best thing about LINQ is that it allows full IntelliSense support, so when you type “s.” it auto-suggests all the methods available on a string.

Extension Methods – basically this allows the creation of static methods that look like they are part of the class they extend.  LINQ uses this feature to create the simple syntax and allow any type that implements IEnumerable<T> to gain all the abilities that are provided in the System.Linq namespace, like the Where or OrderBy methods.

Properties – C# allows a programmer to not have to create getter and setter methods.  In addition, properties can be specified without even declaring the private member variable in the class.  A default is created and can still be later replaced with a user defined implementation so it does not violate encapsulation.

Operator Overloading – Specifically indexers.  It is nice to have a dictionary that you can put things in with dictionary["name"] = value as opposed to dictionary.put("name", value).

Anonymous Types and Var – C# allows code to create a new type inline, with properties automatically created.  This is very useful for cutting down large amounts of code.  Here is an example:

var productQuery =
    from prod in products
    select new { prod.Color, prod.Price };

foreach (var v in productQuery)
{
    Console.WriteLine("Color={0}, Price={1}", v.Color, v.Price);
}

You can see how with anonymous types and the var keyword, (which is still statically typed but allows for the compiler to specify the type instead of the programmer), we can create a type which we only need for a brief period of time.

?? – the null-coalescing operator in C# is useful for trimming down code that would normally need an if/else: if the value is null do this, otherwise copy the value.

As I said at the beginning of this section,  C# kind of pounds Java into the ground here as far as language features that are really just ease-of-use or syntactic sugar.  Java does have one syntactic sugar feature I can think of.

Static imports – With Java you can declare

import static java.lang.Math.PI;

This import statement will allow the use of the constant PI anywhere in the class file as if it were locally declared.
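As a minimal sketch, here is the static import in use. The class name CircleArea and its area method are hypothetical; java.lang.Math.PI is the real constant.

```java
import static java.lang.Math.PI;

public class CircleArea {
    // PI is used without the "Math." prefix thanks to the static import.
    public static double area(double radius) {
        return PI * radius * radius;
    }

    public static void main(String[] args) {
        System.out.println(area(2.0));
    }
}
```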

Conclusion

From a language standpoint I have to admit that I clearly prefer C# over Java.  I think it is very hard to objectively compare Java and C# and pick Java purely from a language perspective.  In the other parts of my series I will cover other areas where Java has more advantages.  I’m not against Java, I use it almost every day, but from a perspective of language design, it has been left behind in the dust.  Microsoft seems to have shaped the C# language design around ease of use for the programmer.  I look forward to seeing how Java advances to catch up with some of these C# features and make the language experience better.

What do you think?  Do you think I have fairly portrayed both languages from a developer perspective?

Lambda Extensible FizzBuzz 2.0 With LINQ

Many of you have probably heard of the FizzBuzz challenge:

Jeff Atwood blogged about it here

I think it started from here

Anyway, if you don’t want to click those links and you’re not familiar with it, it is a small programming problem designed to screen someone who can actually write code from someone who pretends to write code.  The problem is:

Write a program that prints the numbers from 1 to 100. But for multiples of three print “Fizz” instead of the number and for the multiples of five print “Buzz”. For numbers which are multiples of both three and five print “FizzBuzz”

The point is this is a simple problem.  Just simple enough to be interesting.  I actually got asked to solve this problem this week.  I solved it after bumbling through it way more than I should have (I was trying to think of an elegant solution on the spot.)  But it kept bugging me.  I knew there was a solution that I wanted to give it that would make me happy with feelings of maps, function pointers and LINQ.  Pretty much 3 of my favorite things.  So I thought about it for a bit and here is the solution I came up with (3rd cut.)

        public static void FizzBuzz()
        {
            // Each rule maps a predicate on the number to a function that produces the output.
            // Note: Dictionary does not guarantee enumeration order; this code relies on
            // the rules coming back in the order they were added.
            Dictionary<Func<int, bool>, Func<int, string>> rules = new Dictionary<Func<int, bool>, Func<int, string>>();
            rules.Add(x => x % 3 == 0, x => "Fizz");
            rules.Add(x => x % 5 == 0, x => "Buzz");
            rules.Add(x => x % 5 != 0 && x % 3 != 0, x => x.ToString());
            rules.Add(x => true, x => "\n");

            // Pair every number with every rule and keep the output of the rules that match.
            var output = from n in Enumerable.Range(1, 100)
                         from f in rules
                         where f.Key(n)
                         select f.Value(n);

            output.ToList().ForEach(x => Console.Write(x));
        }

That’s my crack at it. If you’re not familiar with all the C# magic going on here, basically it works like this:

You have a map from one lambda expression, which defines the rule we are checking for, to a second lambda expression, which produces the string to return if that rule matches.
Then we have a LINQ query that does a Cartesian join on the set of numbers from 1 to 100 and the 4 rules.
Finally, we take the output and for each string we write it out.

The 4th rule in the set makes sure that after evaluating the set of rules for each number a newline gets put in.

It is kind of interesting how thinking about different ways to solve a fairly simple problem can help you think about the tools you have in new ways.