Category Archives: LINQ

C# vs Java Part 3: The Frameworks (Network, Reflection, Security, Text)

Network Programming

Both C# and Java provide framework libraries for reading and writing data to a network.  Java is again hindered by the same kind of problems it has with file IO.  It is difficult to do network programming in Java.  Let’s compare downloading a web page in both languages.

Java:


URL url = new URL("http://google.com");
InputStreamReader streamReader = new InputStreamReader(url.openStream());
BufferedReader bufferedReader = new BufferedReader(streamReader);
String line = bufferedReader.readLine();

while (line != null)
{
    System.out.println(line);
    line = bufferedReader.readLine();
}

C#


WebClient client = new WebClient();
string webPage = client.DownloadString("http://google.com");
Console.WriteLine(webPage);

In this area, the .NET base library is much larger than Java’s.  I didn’t really expect this would be the case, but when I compare java.net and java.nio to System.Net, it is pretty clear that System.Net has a much larger range of functionality.  Without drilling down into too much detail, here are some of the things that are missing or obscured in the Java base classes, but available in C#/.NET:

  • FTP support
  • A base mail package, that doesn’t have to be added separately.
  • TCP and UDP clients instead of using raw sockets
  • Ping
  • Peer to Peer

If we start to talk about Windows Communication Foundation (WCF), the divide grows much further.  WCF is much more than just a web services framework.  It is an adaptable distributed computing framework allowing a large range of protocols, all abstracted from the business logic.  The clear winner here, in my opinion, is C# and .NET.

Reflection

I have to admit, I am not that big of a fan of reflection.  I know it is “neato” and can do lots of cool things.  But I have seen too much “cool” reflection code that breaks when I use the refactor tool.  Sometimes though, even I have to admit, it is the only tool that will do the job.  When that is the case, you expect it to be easier than rocket science.  At least I do.

Java and C# both provide similar capabilities in reflection through the standard framework libraries.  With either language you can easily get a class, find out what methods are on that class and dynamically call them.  There is, however, one important difference which C# is able to provide through the use of lambda expressions.

I don’t want to go into the technical details here because they are pretty complicated, but C# will basically allow strongly typed reflection through the use of LINQ to create an expression tree that will provide the information to strongly type the reflected information.  This is pretty cool, although I still don’t like reflection.

One more point here, the dynamic keyword in .NET 4.0 probably belongs here because it is reflection-like.  It will allow for very easy COM interoperability and allow certain parts of C# code to be dynamically typed instead of static.
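To make the basic capability concrete (get a class, list its methods, call one by name), here is a minimal Java sketch.  The class and method names (ReflectionDemo, callByName) are made up for illustration.  It also shows why refactoring tools and reflection don't mix: the method name is just a string, so a rename is only noticed at run time.

```java
import java.lang.reflect.Method;

// Illustrative only: ReflectionDemo and callByName are hypothetical names.
class ReflectionDemo {
    // Finds a public no-argument method by its name and invokes it on the target.
    static Object callByName(Object target, String methodName) {
        try {
            Method method = target.getClass().getMethod(methodName);
            return method.invoke(target);
        } catch (ReflectiveOperationException e) {
            // A renamed or missing method only shows up here, at run time.
            throw new IllegalStateException("Could not call " + methodName, e);
        }
    }

    public static void main(String[] args) {
        // List the public methods that String exposes.
        for (Method method : String.class.getMethods()) {
            System.out.println(method.getName());
        }
        // Call a method chosen by a string rather than by a compile-time reference.
        System.out.println(callByName("hello", "toUpperCase"));
    }
}
```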

Security

Both C# and Java frameworks are vast in areas of security.  Unfortunately, both are fairly complex and difficult to use or understand.  Couple this complexity with the changing demands of security and theories about security, spread across multiple platforms and deployment situations, and you get a mess.

Both choices have similar functionality in cryptography.  Both use provider models for authentication and support a wide variety of authentication services, including user defined services.  Both allow for role based security through a provider model.

The differences are what you would expect considering the language and framework targets.  C# and .NET are better equipped in a Windows environment and allow very easy use of Windows authentication schemes and Active Directory.  Java is more flexible, allowing easier interoperability with multiple operating systems and authentication methods, but at the cost of a slightly more complex and burdensome API.  C# and .NET allow the usage of Code Access Security (CAS), which is a very complex concept that basically allows individual level rights to be applied to sections of code, controlled from the machine configuration.  Unfortunately, this turned out to be overly complex and something that almost no one used correctly.  For that reason Microsoft is deprecating CAS policy in .NET Framework 4.0.

I really don’t like either choice for Security at this point.  Both are confusing, and there are no clear-cut best practices for applying security to the application.  I think both frameworks have a way to go to make security something that is very easy to implement and understand.  We will probably see a large amount of churn in this area because of the changing needs of security, as we transition from a primarily web based application model to this mixed model, using applications that are able to run outside of the browser but start their life inside, and hybrid systems utilizing both.

Text Manipulation

At some point in time every developer faces this challenge.  For this reason, I consider it an important part of any framework.  Some of the most common problems involve parsing files, splitting strings, and searching for patterns.  While regular expressions are part of the equation, they are not the only tool for the job, and as is frequently said:

Some people, when confronted with a problem, think
“I know, I’ll use regular expressions.”   Now they have two problems.

Let’s take a look at String.  It is very similar in Java and C#, but there are some subtle differences.  My first beef is that in Java I cannot compare string contents with “string1” == “string2”, because == compares object references; I have to call .equals instead.  In C#, this is fine because the == operator on strings ends up comparing the values, just like Equals.  Having done a large amount of string comparisons in Java, after having been able to do them with == in C#, I can say this is really annoying.  I have to say though, I do like substring better in Java than C#.  In C#, Substring takes the beginning index and the number of characters to grab.  In Java, substring takes the beginning index and the ending index.  Much more often I have found that I know a beginning and ending index rather than a length.  Often in C# you will have to write confusing code like


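String myString = "123FindMe456";

// Start just after "123"; the length is the index of "456" minus that start, giving "FindMe".
String newString = myString.Substring(myString.IndexOf("123") + 3, myString.IndexOf("456") - (myString.IndexOf("123") + 3));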

Which is pretty confusing, and I am always wondering if I am off by 1.  C# provides a method String.IsNullOrEmpty, which is very useful.  4.0 will provide IsNullOrWhiteSpace, which will be even more useful.  In Java, you have to check for null and empty separately, or write your own helper and then curse everyone for not using it.  String.Format is also nicer in C# vs Java.  I like being able to do String.Format(“The number {0} plus {1} is equal to {2}”, “one”, 3, “four”) vs in Java using String.format(“The number %s plus %d is equal to %s”, “one”, 3, “four”);
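On the Java side, those points look roughly like the sketch below.  The helper names (isNullOrEmpty, between) are my own inventions for illustration, not library methods, and between assumes both markers are actually present in the text.

```java
// Illustrative only: StringHelpers and its methods are hypothetical names.
class StringHelpers {
    // Hand-rolled stand-in for C#'s String.IsNullOrEmpty.
    static boolean isNullOrEmpty(String s) {
        return s == null || s.isEmpty();
    }

    // Returns the text between two markers using Java's (beginIndex, endIndex) substring.
    // Assumes both markers occur in the text; no error handling for a missing marker.
    static String between(String text, String startMarker, String endMarker) {
        int begin = text.indexOf(startMarker) + startMarker.length();
        int end = text.indexOf(endMarker, begin);
        return text.substring(begin, end);
    }

    public static void main(String[] args) {
        System.out.println(between("123FindMe456", "123", "456"));
        System.out.println(isNullOrEmpty(null));
        System.out.println(String.format("The number %s plus %d is equal to %s", "one", 3, "four"));
    }
}
```

Because substring takes an end index, there is no length arithmetic to get wrong once you know where the two markers are.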

Both languages have string builders.  In C# I tend to use String.Split, because it is easy, and in Java I tend to use StringTokenizer, because I am always afraid I am going to insert an accidental regular expression in the Java String.split method, which treats its argument as a regex.  Regular expressions are supported in both frameworks, although I think it is slightly easier to use the regular expressions in Java, while C# regular expressions are more powerful.  I couldn’t find a way to do a named capture in Java regular expressions.
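Here is a small Java sketch of the accidental regular expression I am afraid of, along with one way around it.  SplitDemo and splitLiteral are hypothetical names; Pattern.quote is the standard library call that turns a delimiter into a literal pattern.

```java
import java.util.regex.Pattern;

// Illustrative only: SplitDemo and splitLiteral are hypothetical names.
class SplitDemo {
    // Splits on a literal delimiter by quoting it, so regex metacharacters are not interpreted.
    static String[] splitLiteral(String text, String delimiter) {
        return text.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        // "." is a regex that matches any character, so every character is a delimiter
        // and only empty strings are left over, which split then drops.
        System.out.println("a.b.c".split(".").length);

        // Quoting the delimiter gives the three pieces most people expect.
        System.out.println(splitLiteral("a.b.c", ".").length);
    }
}
```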

Final words

In my mind, C#’s .NET framework is a little larger and easier to use than the standard Java framework.  I don’t think they are very far apart in functionality or ease-of-use in most areas, but there are some notable areas where .NET stands out.  I don’t have the time or knowledge to cover all the various aspects of the two languages across all possible comparison avenues, but I have tried to focus on what I think is important as a developer.  I am currently developing mostly in Java, but not too long ago I was developing mostly in C#, so I feel I can make some good judgments between the two.  In my final part of this series I will talk about the tooling support for Java and C#, then I will move onto more interesting things.  (I am starting to get sick of talking about Java vs C#.)

C# vs Java Part 1: Language

C# vs Java Part 2: Platforms

C# vs Java Part 3: Frameworks

C# vs Java Part 1: The Languages (Continued)

Continued from C# vs Java Part 1: The Languages

Primitive Types and Boxing

There are a few subtle differences here that can really throw a programmer off.  I think the best way to describe this section is by examples.

In C# this is valid


56.ToString();

In Java it looks like this


((Integer)56).toString();

Because both languages can autobox, I expect that I can use a primitive type like an object.  The C# syntax seems more natural to me.  I don’t want to have to think about Integer vs int.  Although a programmer should be aware of boxing, for the most part it should seem transparent.  Sometimes, though, it is too transparent.  Take a look at this innocent-looking Java code.

if(myObject.isFantastic())
{
   // do something
}

If this code throws a null pointer exception, what can you assume?  Ok, now what if I told you myObject is not null?  Confused a bit?  Here is the strange part.  If isFantastic returns a Boolean instead of a boolean it will be automatically unboxed into a boolean.  If isFantastic returns a Boolean with a value of null, and it tries to unbox it, the result will be a null pointer exception.  The compiler is automatically doing this for you.

if((boolean)myObject.isFantastic())
{
   // do something
}

An explicit cast like this happens to be what C# requires when converting a nullable bool (bool?) to a bool.  Is this a bad thing or a good thing?  There are arguments either way, but if you have been bitten by this automatic unboxing null pointer, you would probably say it was bad.
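Here is a small, self-contained Java sketch of the trap.  The isFantastic method here is a made-up stand-in that returns the Boolean wrapper type and sometimes returns null; the other names are hypothetical as well.

```java
// Illustrative only: UnboxingDemo and its methods are hypothetical names.
class UnboxingDemo {
    // Returns the wrapper type, so "no answer" can be represented as null.
    static Boolean isFantastic(boolean known) {
        return known ? Boolean.TRUE : null;
    }

    // Reports whether using the result directly in an if statement blows up.
    static boolean throwsOnUnbox(boolean known) {
        try {
            if (isFantastic(known)) { // implicit unboxing via booleanValue()
                return false;
            }
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    // A null-safe way to ask the same question: null simply counts as false.
    static boolean isFantasticOrFalse(boolean known) {
        return Boolean.TRUE.equals(isFantastic(known));
    }

    public static void main(String[] args) {
        System.out.println(throwsOnUnbox(true));
        System.out.println(throwsOnUnbox(false));
        System.out.println(isFantasticOrFalse(false));
    }
}
```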

One final word on primitives.  The C# language allows a nullable primitive type by putting a ? after the type name.  So, you can declare a primitive type which can be null by typing int? myNullableInt.  Whether this is good or bad, I am not completely sold on.  It is very useful when you use it properly to eliminate repetitive code, but in general nulls are bad.  Nulls create extra complexity and should be avoided if at all possible.

Generics

The handling of generics between C# and Java is pretty transparent at a surface level.  I would say that when I am writing Java code vs C#, I don’t really think about the differences in generics.

Basically, Java uses type erasure, and C# includes generics support in the IL code that is generated.  Translation: the Java compiler erases your generic types and replaces them with casts.  C# includes the generic types in the IL that runs inside the Common Language Runtime (CLR).

The easiest way to demonstrate the differences is with these code snippets, which I stole from a very informative post on generics in C# and Java.

static <T> void genericMethod (T t) {
T newInstance = new T (); // error: type creation
T[] array = new T [0];    // error: array creation

Class c = T.class;        // error: Class querying

List<T> list = new ArrayList<T> ();
if (list instanceof List<String>) {}
// error: illegal generic type for instanceof
}

Contrast this to C# generics implementation which allows these things

static void GenericMethod<T> (T t)
where T : new() {
T newInstance = new T (); // OK - new() constraint.
T[] array = new T [0];    // OK

Type type = typeof(T);    // OK

List<T> list = new List<T> ();
if (list is List<String>) {} // OK
}

A few times working in Java I have wanted to know the type of the generic parameter, but in general I haven’t seen the type erasure of Java pose much of a problem.  So why did Java choose to implement generics with erasure?  To maintain backwards compatibility with the JVM.  By not changing the structure of the .class files, compatibility is maintained.
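Erasure is easy to observe from Java itself.  The sketch below (hypothetical names throughout) shows that two differently parameterized lists share one runtime class.  It also shows a common workaround for the times you do need the type: pass a Class token in explicitly.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: ErasureDemo and its methods are hypothetical names.
class ErasureDemo {
    // Both lists have the same runtime class because the type argument is erased.
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<String>();
        List<Integer> numbers = new ArrayList<Integer>();
        return strings.getClass() == numbers.getClass();
    }

    // Workaround for "new T()": the caller supplies the type as a Class object.
    // Assumes the type has an accessible no-argument constructor.
    static <T> T newInstance(Class<T> type) {
        try {
            return type.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + type.getName(), e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());
        StringBuilder builder = newInstance(StringBuilder.class);
        System.out.println(builder.append("created via Class token"));
    }
}
```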

Exceptions

There is only one major difference in exception handling between C# and Java.  Java contains the notion of checked exceptions, which are basically exceptions that must be handled by the caller of that code.  In Java you can declare “throws ExceptionType” in the method declaration, and any callers of that method must either catch that exception or declare it themselves.  This seemed like a good idea, but in general really tends to be an annoyance more than anything.  I have seen a large amount of Java code where the programmer simply writes an empty catch block and “eats” the exception in order to make the code compile, because he/she is required to catch a checked exception.  Checked exceptions also tend to break encapsulation, as changes to internal implementations are propagated upwards and can break the caller.
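To show what “eating” the exception looks like, here is a minimal Java sketch with a made-up load method that declares a checked IOException.  One possible alternative is shown next to it: wrapping the checked exception in an unchecked one (UncheckedIOException, which assumes Java 8 or later) so the failure is not silently lost.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

// Illustrative only: CheckedExceptionDemo and its methods are hypothetical names.
class CheckedExceptionDemo {
    // A made-up operation that declares a checked exception.
    static String load(String name) throws IOException {
        if (name.isEmpty()) {
            throw new IOException("no name given");
        }
        return "contents of " + name;
    }

    // The anti-pattern: an empty catch block just to make the code compile.
    static String loadAndSwallow(String name) {
        try {
            return load(name);
        } catch (IOException e) {
            // exception "eaten": the caller never learns anything went wrong
        }
        return null;
    }

    // One alternative: rethrow as an unchecked exception so the failure stays visible.
    static String loadOrFail(String name) {
        try {
            return load(name);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(loadAndSwallow(""));
        System.out.println(loadOrFail("report.txt"));
    }
}
```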

There is also a small difference in finally blocks.  In Java a finally block can contain a return or break statement; in C# it cannot.  The net effect of this is that you can have some strange Java code that returns both in the method body and in the finally block.  I originally wrote here that C# does not process the finally block when an uncaught exception is thrown, while Java does.  It looks like I was wrong.  The CLI standard seems to be conflicting on this issue: 12.4.2.5, Overview of exception handling, indicates that when there is an uncaught exception the finally block is not called.  When I tested this out with real code, though, the finally block was indeed called.  In addition to finally, C# uses the IDisposable interface with a using statement to allow for clean up when objects go out of scope.
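The strange Java code I mean looks something like the sketch below (hypothetical names).  The return in the finally block wins over the one in the try block.  The second method shows the finally block running while an exception propagates; it catches the exception further up only so the demo can report what happened.

```java
// Illustrative only: FinallyDemo and its methods are hypothetical names.
class FinallyDemo {
    // Legal Java: the return in finally replaces the value returned from try.
    @SuppressWarnings("finally")
    static int returnsFromFinally() {
        try {
            return 1;
        } finally {
            return 2;
        }
    }

    // The inner finally block runs while the exception is propagating outwards.
    static String finallyRunsOnException() {
        StringBuilder log = new StringBuilder();
        try {
            try {
                throw new IllegalStateException("boom");
            } finally {
                log.append("finally ran");
            }
        } catch (IllegalStateException e) {
            log.append(", then caught ").append(e.getMessage());
        }
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(returnsFromFinally());
        System.out.println(finallyRunsOnException());
    }
}
```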

Syntactic Sugar and Other Niceties

This is the part from a language perspective where Java really takes a pounding.  C# has been implementing many language level features that make writing C# code very easy and nice.

LINQ – this feature allows writing code that looks more like a SQL query.  This is best demonstrated with an example.

string[] names = { "Burke", "Connor", "Frank",
                       "Everett", "Albert", "George",
                       "Harris", "David" };

IEnumerable<string> query = from s in names
                               where s.Length == 5
                               orderby s
                               select s.ToUpper();

foreach (string item in query)
     Console.WriteLine(item);

This “magic” is actually equivalent to:

IEnumerable<string> query = names
                            .Where(s => s.Length == 5)
                            .OrderBy(s => s)
                            .Select(s => s.ToUpper());

The best thing about LINQ is that it allows full IntelliSense support, so when you type “s.” it auto-suggests all the methods available on a String.

Extension Methods – basically this allows the creation of static methods that look like they are part of the class they extend.  LINQ uses this feature to create the simple syntax and allow any type that implements IEnumerable to gain all the abilities that are provided in the LINQ namespace like the Where or OrderBy method.

Properties – C# allows a programmer to not have to create getter and setter methods.  In addition, properties can be specified without even declaring the private member variable in the class.  A default is created and can still be later replaced with a user defined implementation so it does not violate encapsulation.

Operator Overloading – Specifically indexers.  It is nice to have a dictionary that you can put things in with dictionary["name"] = value as opposed to dictionary.put(“name”, value).

Anonymous Types and Var – C# will allow code to create a new type inline with properties automatically created.  This is very useful for reducing large amounts of code where needed.  Here is an example:

var productQuery =
    from prod in products
    select new { prod.Color, prod.Price };

foreach (var v in productQuery)
{
    Console.WriteLine("Color={0}, Price={1}", v.Color, v.Price);
}

You can see how with anonymous types and the var keyword, (which is still statically typed but allows for the compiler to specify the type instead of the programmer), we can create a type which we only need for a brief period of time.

?? – the null-coalescing operator in C# is useful for trimming down code that would normally have an if null do this/else copy value.

As I said at the beginning of this section,  C# kind of pounds Java into the ground here as far as language features that are really just ease-of-use or syntactic sugar.  Java does have one syntactic sugar feature I can think of.

Static imports – With Java you can declare

import static java.lang.Math.PI;

This import statement will allow the use of the constant PI anywhere in the class file as if it were locally declared.
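A complete, if tiny, Java sketch of the static import in use (CircleDemo and area are hypothetical names):

```java
import static java.lang.Math.PI;

// Illustrative only: CircleDemo and area are hypothetical names.
class CircleDemo {
    // PI is used as if it were declared in this class, without the Math. prefix.
    static double area(double radius) {
        return PI * radius * radius;
    }

    public static void main(String[] args) {
        System.out.println(area(2.0));
    }
}
```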

Conclusion

From a language standpoint I have to admit that I do clearly prefer C# over Java.  I think it is very hard to objectively compare Java and C# and pick Java purely from a language perspective.  In the other parts of my series I will cover other areas where Java has more advantages.  I’m not against Java, I use it almost every day, but from a perspective of language design, it has been left behind in the dust.  Microsoft seemed to make the C# language design adapt to the ease-of-use of the programmer.  I look forward to seeing how Java advances to catch up with some of these C# features and make the language experience better.

What do you think?  Do you think I have fairly portrayed both languages from a developer perspective?

Lambda Extensible FizzBuzz 2.0 With LINQ

Many of you have probably heard of the FizzBuzz challenge:

Jeff Atwood blogged about it here

I think it started from here

Anyway, if you don’t want to click those links and you’re not familiar with it, it is a small programming problem designed to screen someone who can actually write code from someone who pretends to write code.  The problem is:

Write a program that prints the numbers from 1 to 100. But for multiples of three print “Fizz” instead of the number and for the multiples of five print “Buzz”. For numbers which are multiples of both three and five print “FizzBuzz”

The point is this is a simple problem.  Just simple enough to be interesting.  I actually got asked to solve this problem this week.  I solved it after bumbling through it way more than I should have (I was trying to think of an elegant solution on the spot).  But it kept bugging me.  I knew there was a solution I wanted to give that would make me happy, with feelings of maps, function pointers and LINQ.  Pretty much 3 of my favorite things.  So I thought about it for a bit and here is the solution I came up with (3rd cut).

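        // Requires: using System; using System.Collections.Generic; using System.Linq;
        public static void FizzBuzz()
        {
            // Each rule maps a predicate on the number to the text it contributes.
            // Note: this relies on the Dictionary enumerating in insertion order,
            // which the documentation does not guarantee.
            Dictionary<Func<int, bool>, Func<int, string>> rules = new Dictionary<Func<int, bool>, Func<int, string>>();
            rules.Add(x => x % 3 == 0, x => "Fizz");
            rules.Add(x => x % 5 == 0, x => "Buzz");
            rules.Add(x => x % 5 != 0 && x % 3 != 0, x => x.ToString());
            rules.Add(x => true, x => "\n");

            // Pair every number with every rule and keep the output of the rules that match.
            var output = from n in Enumerable.Range(1, 100)
                         from f in rules
                         where f.Key(n)
                         select f.Value(n);

            output.ToList().ForEach(x => Console.Write(x));
        }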

That’s my crack at it. If you’re not familiar with all the C# magic going on here, basically it works like this:

You have a map which maps one lambda expression which defines the rule we are checking for to a second lambda expression which is what string to return if that rule matches.
Then we have a LINQ query that does a Cartesian join on the set of numbers from 1 to 100 and the 4 rules.
Finally, we take the output and for each string we write it out.

The 4th rule in the set makes sure that after evaluating the set of rules for each number a newline gets put in.
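For comparison, the same rule-map idea can be sketched in Java, assuming Java 8 or later for lambdas.  This is my own loose translation rather than a line-for-line port: FizzBuzzRules is a hypothetical name, a nested loop stands in for the LINQ query, and a LinkedHashMap is used so the rules are visited in the order they were added.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.IntFunction;
import java.util.function.IntPredicate;

// Illustrative only: FizzBuzzRules is a hypothetical name.
class FizzBuzzRules {
    // Builds the FizzBuzz output for the inclusive range [from, to].
    static String fizzBuzz(int from, int to) {
        // LinkedHashMap keeps insertion order, so rules are evaluated in the order added.
        Map<IntPredicate, IntFunction<String>> rules = new LinkedHashMap<>();
        rules.put(x -> x % 3 == 0, x -> "Fizz");
        rules.put(x -> x % 5 == 0, x -> "Buzz");
        rules.put(x -> x % 3 != 0 && x % 5 != 0, x -> Integer.toString(x));
        rules.put(x -> true, x -> "\n");

        // Every number is checked against every rule; matching rules append their text.
        StringBuilder out = new StringBuilder();
        for (int n = from; n <= to; n++) {
            for (Map.Entry<IntPredicate, IntFunction<String>> rule : rules.entrySet()) {
                if (rule.getKey().test(n)) {
                    out.append(rule.getValue().apply(n));
                }
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(fizzBuzz(1, 100));
    }
}
```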

It is kind of interesting how thinking about different ways to solve a fairly simple problem can help you think about the tools you have in new ways.