IDevResource.com - ATR Articles: Learning C# by Dr Richard Grimes

Learning C#, part 2

Interfaces

An interface is a declaration of a collection of class members. Interfaces do not contain implementation; instead, they indicate that a class that implements the interface must provide those members. Consider this interface:


public interface INamed
{
   String Name {get;}
   void PrintName();
}

This indicates that a class that implements this interface must have a read-only property called Name and a method called PrintName(). For example:


public class NamedClass : INamed
{
   protected String name = "NamedClass";
   public String Name
   {
      get{return name;}
   }
   public void PrintName()
   {
      Console.WriteLine("Name is {0}", name);
   }
}

The NamedClass class can be used like this:


NamedClass c = new NamedClass();
c.PrintName();
 
INamed ic = c;
ic.PrintName();

The output from PrintName() is the same in both cases. It is possible to prevent this from happening, that is, to ensure that the interface members are only accessed through an interface. To do this you have to qualify the member name with the interface name. The syntax is shown here:


public class NamedClass : INamed
{
   protected String name = "NamedClass";
   String INamed.Name
   {
      get{return name;}
   }
   void INamed.PrintName()
   {
      Console.WriteLine("Name is {0}", name);
   }
}

Notice that the interface members are not public. This means that they are not accessible through a NamedClass reference, but because they are qualified with the interface name, they are accessible through an interface:


INamed i = new NamedClass();
i.PrintName(); // compiles OK
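
To make the contrast concrete, here is a minimal sketch that repeats the explicitly implemented class and shows the two access routes side by side. The ExplicitDemo class is an assumption added for illustration; the commented-out line is the one the compiler would reject.

```csharp
using System;

public interface INamed
{
   String Name {get;}
   void PrintName();
}

public class NamedClass : INamed
{
   protected String name = "NamedClass";
   // Explicit implementations: no access modifier,
   // member name qualified with the interface name
   String INamed.Name
   {
      get{return name;}
   }
   void INamed.PrintName()
   {
      Console.WriteLine("Name is {0}", name);
   }
}

// Hypothetical driver class, not part of the original article
public class ExplicitDemo
{
   public static void Main()
   {
      NamedClass c = new NamedClass();
      // c.PrintName();         // would not compile: not visible through NamedClass
      ((INamed)c).PrintName();  // compiles: the cast yields an INamed reference
      INamed i = c;
      Console.WriteLine(i.Name);
   }
}
```

The cast in the second call is all that is needed to reach the member; no new object is created.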

A class can implement more than one interface. If you have two classes that each implement INamed and a second interface, IViewable (not shown here), the code can ask for just the behavior determined by one of these interfaces, for example INamed:


public class NVOne : INamed, IViewable
{
/* implement INamed and IViewable members*/
}
 
public class NVTwo : INamed, IViewable
{
/* implement INamed and IViewable members*/
}
 
public class A
{
   private void f(INamed n)
   {
      n.PrintName();
   }
   public void g()
   {
      NVOne n1 = new NVOne();
      f(n1);
      NVTwo n2 = new NVTwo();
      f(n2);
   }
}

Interfaces allow you to select the behavior that you want. In this code the method A.g() creates two objects, one each of two different types. These two objects are passed to the method A.f(), which uses just the INamed behavior of each of them. Interface programming like this is very powerful; indeed, it is the basis of COM.

A class can have a base class in addition to indicating that it implements interfaces. So, if NamedBase has the code for the INamed interface members:


public class NamedBase
{
   protected String name = "empty";
   public String Name
   {
      get{return name;}
   }
   public void PrintName()
   {
      Console.WriteLine("Name is {0}", name);
   }
}

You can use this with classes that implement the INamed interface, which will inherit the NamedBase implementation:


public class One : NamedBase, INamed
{
   public One()
   {
      name = "One";
   }
}
 
public class Two : NamedBase, INamed
{
   public Two()
   {
      name = "Two";
   }
}

These classes could be used like this:


INamed[] namedObjects = {new One(), new Two()};
foreach (INamed o in namedObjects)
   o.PrintName();

This creates an array with instances of the two object types. The foreach construct is used to access all the members in the array, and since the members are INamed, the code can call PrintName() irrespective of the actual type of the object.

Value Data Types

C# has data types similar to C++. However, the major difference is that all variables are affected by the C# unified type system, even seemingly non-object types. I say 'seemingly' because in .NET every type is actually an object and is an instance of a class or struct. Every .NET class has System.Object as the ultimate base class in the class hierarchy and what this means is that you can call the methods of the Object class through basic types like int.
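
As a small illustration of that last point, the sketch below calls three methods declared on System.Object through an ordinary int. The UnifiedTypeDemo class name is an assumption made for this example.

```csharp
using System;

// Hypothetical driver class, not part of the original article
public class UnifiedTypeDemo
{
   public static void Main()
   {
      int i = 42;
      // ToString(), GetType() and Equals() are all declared on System.Object
      Console.WriteLine(i.ToString());          // 42
      Console.WriteLine(i.GetType().FullName);  // System.Int32
      Console.WriteLine(i.Equals(42));          // True
   }
}
```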

C# distinguishes between value types and reference types. When you pass a variable as a function parameter, a value type is passed by value, that is, the actual value is passed to the function. When a reference type is passed it is not the actual value but a reference (an identifier that C# uses to identify the actual object) that is passed. When a function alters a parameter passed by value, it is the local copy that is altered:


void AddOne(int i)
{
   i = i + 1;
   Console.WriteLine("AddOne: {0}", i);
}
 
void UseIt()
{
   int j = 10;
   Console.WriteLine("j = {0}", j);
   AddOne(j);
   Console.WriteLine("j = {0}", j);
}

The results from this code will be:


j = 10
AddOne: 11
j = 10

In other words, the value of j in UseIt() is unchanged by calling AddOne() because a copy of the value of j is passed to the function. The i parameter of AddOne() is changed, but this only has an effect inside the function. To pass a value type by reference you have to use the ref modifier in both the function declaration and the function call (I'll explain why later). For example,


void AddOne(ref int i)
{
   i = i + 1;
   Console.WriteLine("AddOne: {0}", i);
}
 
void UseIt()
{
   int j = 10;
   Console.WriteLine("j = {0}", j);
   AddOne(ref j);
   Console.WriteLine("j = {0}", j);
}

These will give the following results:


j = 10
AddOne: 11
j = 11

That is, the value that AddOne() acts on is j declared in UseIt() because j was passed by reference.
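
A common use of ref is a method that has to change more than one of the caller's variables. The following is a minimal sketch; Swap() and the RefDemo class are hypothetical names introduced only for this illustration.

```csharp
using System;

// Hypothetical driver class, not part of the original article
public class RefDemo
{
   // Exchanges the caller's two variables; both are passed by reference
   public static void Swap(ref int a, ref int b)
   {
      int tmp = a;
      a = b;
      b = tmp;
   }

   public static void Main()
   {
      int x = 1;
      int y = 2;
      Swap(ref x, ref y);   // ref is required at the call site too
      Console.WriteLine("x = {0}, y = {1}", x, y);  // x = 2, y = 1
   }
}
```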

One reason for ref is to allow a method to return values through its parameters when the function return value is used to return a status code. The big problem with this approach comes when the method being called is on an object on another machine: whenever i is accessed (three times in AddOne() in the example above) a call is made across the network. C# has a way to get round this with the out specifier. This indicates that the parameter only carries data out of the method, so the method must assign it before returning:


void now(out String str)
{
   // str has not been assigned a value at this point
   DateTime dt = DateTime.Now;
   str = dt.ToString();
}

The DateTime.Now property returns a DateTime value initialized to the current date and time. ToString() is a method inherited from System.Object, which DateTime overrides to return a string form of the date. (Note that it would have been better for now() to return a String, but then I would not be able to make this point about out.) The use of the out modifier says that the value is created in the method and returned to the caller through the parameter. now() is called like this:


String current;
now(out current);
Console.WriteLine(current);

Again, the calling code has to use the out modifier too. The reason is that a method can be overloaded on whether a parameter carries one of these modifiers:


void f(A a)
{/* a is passed in, by reference or by value 
    depending on the type of A */}
void f(ref A a)
{/* a is passed by reference and so is 
    in/out */}
// void f(out A a) could replace the ref overload, but the two
// cannot both be declared: overloads may not differ only by
// ref versus out

The only way that the C# compiler can tell which overload of f() you want is if you specify it explicitly in the calling code.
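
The status-code pattern mentioned earlier can be sketched with out as follows. TryDivide() and the OutDemo class are hypothetical names chosen for this example; the return value reports success and the out parameter carries the result.

```csharp
using System;

// Hypothetical driver class, not part of the original article
public class OutDemo
{
   // The return value is the status; quotient is the real result
   public static bool TryDivide(int dividend, int divisor, out int quotient)
   {
      if (divisor == 0)
      {
         quotient = 0;   // an out parameter must be assigned on every path
         return false;
      }
      quotient = dividend / divisor;
      return true;
   }

   public static void Main()
   {
      int result;        // no initial value is needed before an out call
      if (TryDivide(12, 4, out result))
         Console.WriteLine("12 / 4 = {0}", result);
      if (!TryDivide(1, 0, out result))
         Console.WriteLine("cannot divide by zero");
   }
}
```

Unlike ref, the caller does not have to initialize the variable first, because the compiler knows the method will assign it.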

As mentioned earlier, when you create an object the variable you assign is a reference. Thus, when you pass an object in a method call and the method changes the object through its parameter, the original object is changed too, for example:


class Q
{
   public int i;
}
 
void AddOne(Q qq)
{
   qq.i = qq.i + 1;
   Console.WriteLine("AddOne: {0}", qq.i);
}
 
void UseIt()
{
   Q q = new Q();
   q.i = 10;
   Console.WriteLine("q.i = {0}", q.i);
   AddOne(q);
   Console.WriteLine("q.i = {0}", q.i);
}

In this case the reference held in q is copied into qq (because Q is a class), so both variables refer to the same object and AddOne() increments the i field of the object that q refers to. The output of this code would be:


q.i = 10
AddOne: 11
q.i = 11
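
It may help to separate two things a method can do with a reference parameter: change the object it refers to, or make the parameter refer to something else. The sketch below assumes the same Q class; Mutate(), Reassign() and ReferenceDemo are hypothetical names for this illustration.

```csharp
using System;

public class Q
{
   public int i;
}

// Hypothetical driver class, not part of the original article
public class ReferenceDemo
{
   // Changes the object that both the caller's variable and qq refer to
   public static void Mutate(Q qq)
   {
      qq.i = 99;
   }

   // Rebinds only the local copy of the reference;
   // the caller's variable still refers to the original object
   public static void Reassign(Q qq)
   {
      qq = new Q();
      qq.i = -1;
   }

   public static void Main()
   {
      Q q = new Q();
      q.i = 10;
      Mutate(q);
      Console.WriteLine("after Mutate: {0}", q.i);    // after Mutate: 99
      Reassign(q);
      Console.WriteLine("after Reassign: {0}", q.i);  // after Reassign: 99
   }
}
```

To let a method replace the caller's object, the parameter would need the ref modifier, just as for value types.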

C# has two kinds of value type, and these should be familiar to C++ developers: enum and struct. A struct is essentially a class that is passed by value. Structs can have the same members as classes and they can implement interfaces. However, you cannot derive from a struct, and a struct cannot have a parameterless constructor or field initializers. A struct declared as a local variable is created on the stack rather than on the garbage-collected heap, which is why structs are passed by value. All structs are derived from System.Object, so you can call its methods. For example, with this struct:


struct MyStruct
{
   public int i;
}

you can use it like this:


MyStruct ms;
ms.i = 10;
Console.WriteLine("{0}.i = {1}", 
   ms.ToString(), ms.i);

This will result in:


MyStruct.i = 10
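
The value semantics of a struct also show up in plain assignment. Here is a minimal sketch using the same MyStruct; the StructCopyDemo class is an assumption added for illustration.

```csharp
using System;

struct MyStruct
{
   public int i;
}

// Hypothetical driver class, not part of the original article
public class StructCopyDemo
{
   public static void Main()
   {
      MyStruct a;
      a.i = 10;
      MyStruct b = a;   // copies the whole value, not a reference
      b.i = 20;         // changes only the copy
      Console.WriteLine("a.i = {0}, b.i = {1}", a.i, b.i);  // a.i = 10, b.i = 20
   }
}
```

Had MyStruct been declared as a class, a and b would refer to one object and both fields would read 20.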

The basic types in C# are actually structs as shown in the following table:

Type	Struct	Description
sbyte	SByte	8 bit signed
short	Int16	16 bit signed
int	Int32	32 bit signed
long	Int64	64 bit signed
byte	Byte	8 bit unsigned
ushort	UInt16	16 bit unsigned
uint	UInt32	32 bit unsigned
ulong	UInt64	64 bit unsigned
float	Single	single precision floating point
double	Double	double precision floating point
char	Char	Unicode character
decimal	Decimal	decimal type with 28 to 29 significant digits

Notice the syntax for creating structs: you don't have to use new, because they are created 'inline'. However, if you wish you can use new and the operator will still create the struct on the stack:


int i = 1;
int j = new int(); j = 1;

The structs in the table all implement the IFormattable interface, whose single method is an overload of ToString(). This overload takes a format string which is used to generate a string. For example:


int i = 10;
Console.WriteLine("{0}", i.ToString("D4", null));

will write:


0010

to the console. The reason for this is that the format type "D4" means "four digits, giving 0 when a digit is empty". This overload takes two parameters: the first is the format string, while the second is a format provider to use to do the formatting. Since the default formatting of Int32 is all we need, we do not have to supply a format provider and so the special value of null is passed to indicate that no object exists.

There is a weird twist to these base type classes. A literal value is also an object, so you can call methods through that literal. The following is perfectly legal, if a bit strange looking:


Console.WriteLine("{0}", 
   45.ToString("D4", null));
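
In the released .NET class library the IFormattable method is ToString(String, IFormatProvider), and the sketch below uses that form. It tries a few standard numeric format strings on an int; the FormatDemo class name is an assumption for this example.

```csharp
using System;

// Hypothetical driver class, not part of the original article
public class FormatDemo
{
   public static void Main()
   {
      int i = 255;
      // D pads a decimal integer with leading zeros to the given width
      Console.WriteLine(i.ToString("D6", null));  // 000255
      // X formats as hexadecimal; a width pads with leading zeros
      Console.WriteLine(i.ToString("X", null));   // FF
      Console.WriteLine(i.ToString("X4", null));  // 00FF
      // The same specifiers work inside a composite format string
      Console.WriteLine("{0:D4}", 10);            // 0010
   }
}
```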

As the name suggests, an enum is an enumerated value, for example:


enum Rainbow
{Red, Orange, Yellow, Green, 
 Blue, Indigo, Violet}
 
void Custard()
{
   Rainbow color = Rainbow.Yellow;
   if (color == Rainbow.Yellow) 
      Console.WriteLine("custard color!");
}

Here, I have defined a new type called Rainbow which enumerates all the colors of the rainbow (well, 7 of them really). The Custard() method defines a variable of that type and assigns it a color. Since an enumeration is a value type you do not have to create it using new. Notice that the color has to be scoped using the enumeration name and the dot operator.

Variables of enumerations can be acted on by the common C# operators, so in this example I have used the equality operator == to test to see if the value is Rainbow.Yellow. If I wanted to change this variable to Rainbow.Green I could use the C# increment operator ++:


color++;
if (color == Rainbow.Yellow) 
   Console.WriteLine("custard color!");
if (color == Rainbow.Green) 
   Console.WriteLine("lime color!");
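
The increment works because each enumeration member has an underlying integer value. The following sketch, which assumes the same Rainbow enumeration, shows the conversions in both directions; the EnumDemo class is a hypothetical name for this illustration.

```csharp
using System;

enum Rainbow
{Red, Orange, Yellow, Green, 
 Blue, Indigo, Violet}

// Hypothetical driver class, not part of the original article
public class EnumDemo
{
   public static void Main()
   {
      Rainbow color = Rainbow.Yellow;
      // Members are numbered from zero unless values are given explicitly
      Console.WriteLine((int)color);             // 2
      // ToString() yields the member name
      Console.WriteLine(color.ToString());       // Yellow
      // An explicit cast converts an integer back to the enum type
      Rainbow next = (Rainbow)3;
      Console.WriteLine(next == Rainbow.Green);  // True
   }
}
```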

Value types can be converted to object which then allows you to perform type operations. This is a process called boxing and the inverse - conversion from an object to a value type - is called unboxing. For example:


int a = 42;
object o = a;   // boxing
Console.WriteLine(o.GetType().ToString());
int b = (int)o; // unboxing

The process of boxing causes a copy to be made, so


b = 43;
Console.WriteLine("{0} {1}", a, b);

results in 42 43 printed on the console.
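
One detail worth knowing is that unboxing has to name the type that was boxed; a conversion that would be legal between the value types themselves is not applied to the box. The sketch below shows this, using a try block (covered later in this article). The UnboxDemo class is an assumption for this example.

```csharp
using System;

// Hypothetical driver class, not part of the original article
public class UnboxDemo
{
   public static void Main()
   {
      int a = 42;
      object o = a;             // boxes an int
      try
      {
         long l = (long)o;      // fails: the box holds an int, not a long
         Console.WriteLine(l);
      }
      catch (InvalidCastException)
      {
         Console.WriteLine("cannot unbox an int as a long");
      }
      long ok = (long)(int)o;   // unbox to int first, then convert
      Console.WriteLine(ok);    // 42
   }
}
```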

Language Constructs

Throughout this article I have expressed that C# evolved from C++ and so much of the syntax is similar. In this section I want to explain some of the new language constructs and changes that you'll find.

C# has the same code flow statements as C++: do, while, if, switch. These work in essentially the same way as their namesakes, although there are some differences. The C# switch statement allows the branching to be made on integral and string values. It prohibits the fall-through effect of C++ switches, so this C++-style code will not compile as C#:


// C++ code
switch (x)
{
case 1:
   // handle 1
   break;
case 2:
   // handle 2
case 3:
   // handle 2 and 3
   break;
default:
   // handle everything else
}

Instead, C# allows you to go to a branch, explicitly:


// C# code
switch (x)
{
case 1:
   // handle 1
   goto default;
case 2:
   goto case 3;
case 3:
   // handle 2 and 3
   break;
default:
   // handle 1 and everything else
   break;
}

goto can also be used to make execution flow to a specified label similar to how it is done in C++.
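
The string branching mentioned above can be sketched as follows. DaysIn() and the SwitchDemo class are hypothetical names for this illustration. Note that case labels may still be stacked when the earlier labels have no statements of their own; it is only falling out of a non-empty section that is prohibited.

```csharp
using System;

// Hypothetical driver class, not part of the original article
public class SwitchDemo
{
   public static int DaysIn(String month)
   {
      switch (month)
      {
      case "April":
      case "June":
      case "September":
      case "November":
         // stacked labels share one section
         return 30;
      case "February":
         return 28;   // leap years are ignored in this sketch
      default:
         return 31;
      }
   }

   public static void Main()
   {
      Console.WriteLine(DaysIn("June"));     // 30
      Console.WriteLine(DaysIn("January"));  // 31
   }
}
```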

The one new statement is foreach, which acts on a collection type. A collection type is something that implements the IEnumerable interface. This interface has a single method called GetEnumerator() that returns the IEnumerator interface on an object that is used to enumerate the values. .NET provides several classes to do this, for example HashTable, ObjectList and Dictionary, but the collection you are most likely to use is an array.

foreach has a variable that is the current item in the enumerator, so in the following


foreach (int i in 
   new int[] {1, 2, 3, 5, 8, 13}) 
{
   Console.Write("{0}, ", i);
}

the collection is created on the fly and is an array of ints; the current item is the variable i, so the output of the code above is (including a separator after the last item)


1, 2, 3, 5, 8, 13, 
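
To show how foreach relates to IEnumerable, the sketch below walks the same array by hand through the IEnumerator returned from GetEnumerator(). This is roughly what the compiler arranges for you, not an exact expansion; the EnumeratorDemo class is an assumption for this example.

```csharp
using System;
using System.Collections;

// Hypothetical driver class, not part of the original article
public class EnumeratorDemo
{
   public static void Main()
   {
      int[] values = new int[] {1, 2, 3, 5, 8, 13};
      IEnumerator e = values.GetEnumerator();
      // MoveNext() advances to the next item and returns false at the end
      while (e.MoveNext())
      {
         int i = (int)e.Current;   // Current is typed as object
         Console.Write("{0} ", i);
      }
      Console.WriteLine();
   }
}
```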

Array declaration in C# is different from C++. The array type uses commas to indicate how many dimensions the array has, for example:


int[] one;
int[,] two;

one has a single dimension, whereas two has two dimensions. Since arrays are reference types, you have to use new to create an instance:


one = new int[4];

This makes one refer to a single-dimensional array with four items indexed from 0 to 3. As the foreach example above showed, you can provide initial values for array elements:


int[,] two = new int[,]{{1,2},{3,4}};

In this case the array has two dimensions of two items each. Array declarations have another format that allows for 'ragged' arrays, that is, arrays of arrays in which the inner arrays need not all be the same size, for example:


int[][] ragged = new int[2][];
ragged[0] = new int[]{1, 2};
ragged[1] = new int[]{3, 4, 5, 6};
foreach (int[] i in ragged)
{
   foreach(int j in i)
      Console.Write("{0} ", j);
   Console.WriteLine();
}

Here, ragged is an array of two arrays: the first inner array has two items, whereas the second has four. The output from this code is:


1 2
3 4 5 6
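
The difference between the two kinds of array can be inspected at run time through members of System.Array. This is a minimal sketch; the ArrayShapeDemo class is an assumption for this example.

```csharp
using System;

// Hypothetical driver class, not part of the original article
public class ArrayShapeDemo
{
   public static void Main()
   {
      int[,] two = new int[,]{{1,2},{3,4}};
      int[][] ragged = new int[2][];
      ragged[0] = new int[]{1, 2};
      ragged[1] = new int[]{3, 4, 5, 6};

      // A rectangular array is one object with several dimensions
      Console.WriteLine(two.Rank);          // 2
      Console.WriteLine(two.GetLength(0));  // 2
      Console.WriteLine(two.Length);        // 4
      // A ragged array is a one-dimensional array whose items are arrays
      Console.WriteLine(ragged.Rank);       // 1
      Console.WriteLine(ragged.Length);     // 2
      Console.WriteLine(ragged[1].Length);  // 4
   }
}
```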

C# allows you to define unary (+, -, !, ~, ++, --, true, or false), binary (+, -, *, /, %, &, |, ^, <<, >>, ==, !=, >, <, >=, or <=) and conversion operators. There are differences to what you can do in C++ (for example the unary dereference operator does not exist because there are no pointers), but the definition is similar.


class A
{
   public int val = 0;
   public static bool operator !(A a)
   {
      return a.val == 0;
   }
   public static bool operator !=(A a, A b)
   {
      return a.val != b.val;
   }
   public static bool operator ==(A a, A b)
   {
      return a.val == b.val;
   }
}

Here, I have defined an operator ! to test if an object has a zero value, and != to do a comparison between two objects of the same type; the first is a unary operator, the second is a binary operator. Because I have defined the != operator the compiler requires that I also define the == operator.

One difference between C# and C++ is that you cannot define an assignment operator; instead, you have to use a conversion operator. C# gives two options, implicit and explicit conversions. Implicit conversions occur in assignments whereas explicit conversions occur when you explicitly cast from one type to another:


A a = new A(10);
int i = (int)a; // explicit
a = 15;         // implicit

The code for these conversions is:


class A
{
   public int val = 0;
   public A(int x)
   {
      val = x;
   }
   public static explicit operator int(A a) 
   {
      return a.val;
   }
   public static implicit operator A(int a) 
   {
      return new A(a);
   }
}

C++ has the facility (through varargs and ...) to allow you to write methods with a variable number of parameters. C# has a similar facility with the params modifier. A parameter marked as such must be the last one in the parameter list and must be a single-dimension array, for example:


int Sum(params int[] vals)
{
   int sum = 0;
   foreach (int i in vals) sum += i;
   return sum;
}

Sum() can be called like this:


Console.WriteLine("Sum is {0}", Sum(1, 2, 3, 4));
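
Because the params parameter is an ordinary array, the method can also be called with no arguments or with an array that already exists. The sketch below repeats Sum() as a static method so that it can be run on its own; the ParamsDemo class is an assumption for this example.

```csharp
using System;

// Hypothetical driver class, not part of the original article
public class ParamsDemo
{
   public static int Sum(params int[] vals)
   {
      int sum = 0;
      foreach (int i in vals) sum += i;
      return sum;
   }

   public static void Main()
   {
      Console.WriteLine(Sum());                     // 0: vals is an empty array
      Console.WriteLine(Sum(new int[] {1, 2, 3}));  // 6: the array is passed as is
      Console.WriteLine(Sum(1, 2, 3, 4));           // 10: the compiler builds the array
   }
}
```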

The final aspect that I want to explore here is exceptions. C# uses the try-catch syntax familiar from C++ and adds a finally block. If the code in the try block throws an exception, the matching catch block is called. The code in the finally block is then executed however execution leaves the try block, whether normally or because of an exception.


try
{
   Console.WriteLine("in try");
   throw new Exception("exception thrown");
}
catch(Exception e)
{
   Console.WriteLine("in catch: {0}", 
      e.Message);
}
finally
{
   Console.WriteLine("in finally");
}

This will result in:


in try
in catch: exception thrown
in finally

Exceptions that are thrown have to be of type System.Exception, or a class derived from it. As you can see from the code above, if the exception is constructed with a string you can retrieve the string later through the Exception.Message property.
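
Deriving your own exception class lets calling code catch a specific failure separately from the general case. The following is a sketch under that assumption; OutOfCustardException, Serve() and ExceptionDemo are hypothetical names invented for this illustration.

```csharp
using System;

// Hypothetical exception type, not part of the original article
public class OutOfCustardException : Exception
{
   public OutOfCustardException(String message) : base(message) {}
}

// Hypothetical driver class, not part of the original article
public class ExceptionDemo
{
   public static String Serve(bool empty)
   {
      try
      {
         if (empty) throw new OutOfCustardException("no custard left");
         return "served";
      }
      catch (OutOfCustardException e)
      {
         // the more specific catch block must be listed first
         return "specific: " + e.Message;
      }
      catch (Exception e)
      {
         return "general: " + e.Message;
      }
      finally
      {
         // runs on every path out of the try block, including the returns
         Console.WriteLine("in finally");
      }
   }

   public static void Main()
   {
      Console.WriteLine(Serve(false));
      Console.WriteLine(Serve(true));
   }
}
```

If the two catch blocks were listed in the opposite order the compiler would reject the code, because the general handler would make the specific one unreachable.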

Of course exceptions are for exceptional conditions, and you really should not use them in situations like this, where one of the C# flow statements like if could be used. One confusing aspect of C# is that there is no equivalent of the C++ exception specification or the Java throws clause, so you must scrutinize a class's documentation to see if it could throw a specific exception type. If you cannot get this information from the class's documentation then you have no choice but to surround the use of the class with a try block and catch the generic System.Exception type.

Conversions from one type to another can lose information, for example:


uint ui = 0x10000;
ushort us = (ushort)ui;

After this (and assuming it was compiled with the default settings of csc) us will have a value of 0 because a ushort is 16 bits and ui has 17 bits of information. This may be your intention in doing the cast, but often this is an error. By default, the C# compiler will not check for overflow but this can be changed using the /checked switch. If you use /checked+ then all integer code will be checked for overflow and the previous code will throw a System.OverflowException exception. Of course, the switch has an effect on all code in a source file.

To specify that an explicit check is made for overflows on individual code blocks you can surround that code with a checked block:


uint ui = 0x10000;
ushort us;
try
{
   checked
   {
      us = (ushort)ui;
   }
}
catch(Exception e)
{
   Console.WriteLine(e.ToString());
}

This code will throw a System.OverflowException exception. There is a corresponding unchecked keyword that allows you to turn off overflow checking which is useful when you have compiled a file with /checked+.
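
Both keywords can also be applied to a single expression rather than a block. Here is a minimal sketch that puts the two side by side; the CheckedDemo class is an assumption for this example.

```csharp
using System;

// Hypothetical driver class, not part of the original article
public class CheckedDemo
{
   public static void Main()
   {
      uint ui = 0x10000;
      // unchecked: the upper bits are discarded silently
      ushort truncated = unchecked((ushort)ui);
      Console.WriteLine(truncated);   // 0
      try
      {
         // checked: the same cast now reports the lost information
         ushort us = checked((ushort)ui);
         Console.WriteLine(us);
      }
      catch (OverflowException)
      {
         Console.WriteLine("overflow detected");
      }
   }
}
```

Written this way, the behavior of each cast no longer depends on the compiler switch used for the file.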

Finally

If you like this article and know where I can get an MSDN subscription then let me know. My subscription through the MVP program (www.mvp.org) has floundered through bureaucracy and I need to be kept up to date.

Acknowledgements

Thanks to the following, who have pointed out errors and suggested changes. If you would like to comment, please get in touch ([email protected]).

Simon Robinson
Peter Drayton

Glossary

Item	Description
.NET	The general term for Microsoft's managed platform. It consists of the CLR execution engine that executes IL code compiled from .NET classes.
Assembly	Deployment unit of classes in .NET, for Windows this is a DLL.
Boxing	Conversion from a value type to an object
C#	C Sharp, a new language used to write managed code. Regarded as an evolution of C++.
CLR	The Common Language Runtime, the CLR loads and executes IL.
Garbage Collector	The part of the .NET CLR that manages objects. The GC keeps track of all references to an object, and when there are no more references it can destroy the object and free its resources.
IL	Intermediate Language. All .NET classes are compiled to IL, which means that classes written in one language can be used by classes written in another language.
Managed	All .NET classes are 'managed'. This means that aspects like memory allocation, memory protection, security and type safety are all managed by the CLR.
Managed Extensions for C++	Microsoft's C++ compiler will produce both managed and unmanaged code. The managed extensions for C++ are exposed as keywords that allow you to mark a class as a .NET class, such code is compiled to IL. Classes that are not marked as .NET classes will compile to native machine code.
Unmanaged	Code that does not run under the CLR.
VB.NET	A .NET language evolved from Visual Basic. VB.NET code only produces managed code and has significant differences with Visual Basic, in particular the data types that can be used.
