Mike Stall's .NET Debugging Blog

The price of complexity


My house was haunted. One of the lights would turn on or off at random times without anybody touching the switch.

The previous owner of our house had installed fancy dimmer light switches. On a whim, we replaced one of the fancy switches with a simple on/off switch.  As we took the old fancy switch out, we noticed that it had capacitors, a circuit board, resistors, even a microchip! That's a lot of different things to break down.

The new simple switch works great, and the associated light no longer toggles at random. So I conclude the fancy switch was the problem.

Furthermore, the dimmer switches were more complicated to use because they had some fancy pressure-touch thing: a quick press would toggle the light; a soft press would dim it. It took us a while to get used to them.

At Home Depot, the fancy switches were $40, whereas the simple on/off switches were about $3. The old simple switches throughout our house are still working, and we never really needed the dimming features anyway.

So the fancy switches were more expensive, harder to use, and haunted. Overall, not an ideal tradeoff for our needs. We'll stick with the simple on/off switches.

 

I think the same thing applies to software design. You can build complicated software with lots of features, but that also means more places where it can break down. If a simple solution does the trick, it may not only be cheaper to build, but also easier to use and cheaper to maintain in the long run.


Nice MSDN URLs


I noticed that MSDN finally has nice URLs for the BCL. (Or perhaps that should be "I finally noticed that...", depending on how long this has been available.)

So instead of:

http://msdn.microsoft.com/en-us/library/1009fa28.aspx

You can do:

http://msdn.microsoft.com/en-us/library/system.reflection.assembly.loadfrom.aspx

For overloaded methods, it takes you to the disambiguation page.

This actually makes it a lot easier to find BCL APIs when you know the name. Most of the time, I can now just type the name right into the URL. And I have higher confidence that the link won't get broken.

Stuff in Reflection that's not in Metadata


Previously, I mentioned some things in Metadata that aren't exposed in Reflection.  Here's an opposite case.

While metadata represents static bits on disk, Reflection operates in a live process with access to the CLR's loader. So reflection can represent things the CLR loader and type system may do that aren't captured in the metadata.

 

For example, an array T[] may implement interfaces, but which set of interfaces is not captured in the metadata. So consider the following code that prints out the interfaces an int[] implements:

using System;

class Foo
{
    static void Main()
    {
        Console.WriteLine("Hi");
        Console.WriteLine(Environment.Version);
        Type[] t2 = typeof(int[]).GetInterfaces();
        foreach (Type t in t2)
        {
            Console.WriteLine(t.FullName);
        }
        Console.WriteLine("done");
    }
}

Compile it once with the v1.1 compiler. Then run the same binary against v1.1 and v2.0. It's the same file (and therefore the same metadata) in both cases.
The V1 case is inconsistent because it doesn't show any interfaces; it should at least show a few built-ins like IEnumerable (typeof(IEnumerable).IsAssignableFrom(typeof(int[])) == true). But in the v2 case, reflection shows the interfaces, particularly the new 2.0 generic interfaces, that the CLR's type system added to the array. So the V2 list is not the same as the V1 list, and the difference is not captured in the metadata.

 

C:\temp>c:\WINDOWS\Microsoft.NET\Framework\v1.1.4322\csc.exe t.cs
Microsoft (R) Visual C# .NET Compiler version 7.10.6001.4
for Microsoft (R) .NET Framework version 1.1.4322
Copyright (C) Microsoft Corporation 2001-2002. All rights reserved.


C:\temp>t.exe
Hi
1.1.4322.2407
done

C:\temp>set COMPLUS_VERSION=v2.0.50727

C:\temp>t.exe
Hi
2.0.50727.1433
System.ICloneable
System.Collections.IList
System.Collections.ICollection
System.Collections.IEnumerable
System.Collections.Generic.IList`1[[System.Int32, mscorlib, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]]
System.Collections.Generic.ICollection`1[[System.Int32, mscorlib, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]]
System.Collections.Generic.IEnumerable`1[[System.Int32, mscorlib, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]]
done


Managed Dump debugging support for Visual Studio and ICorDebug


This is the longest I've gone without blogging, but our PDC announcements have stuff way too cool to stay quiet about.

If you saw PDC, you've heard that the CLR Debugging API, ICorDebug, will support dump-debugging. This enables any ICorDebug-based debugger (including Visual Studio and MDbg) to debug dump-files of .NET applications. The coolness goes well beyond that, but dump-debugging is just the easiest feature to describe.

This was not an overnight feature; it required some major architectural changes to be plumbed through the entire system. Specifically, when dump-debugging there's no 'live' debuggee, so you can no longer rely on a helper-thread running in the debuggee process to service debugging requests; you need a completely different model.

Rick Byers has an excellent description of the ICorDebug re-architecture in CLR 4.0.  He also describes some of the other advancements in the CLR Tools API space. Go read them.

MVP Summit 2009


For those going to the 2009 MVP Summit, I’ll be one of the speakers at the breakout sessions on March 2nd on Microsoft’s Main campus.

Virtual code execution via IL interpretation


As Soma announced, we just shipped VS2010 Beta1. This includes dump debugging support for managed code and a very cool bonus feature tucked in there that I’ll blog about today.

Dump-debugging (aka post-mortem debugging) is very useful and a long-requested feature for managed code.  The downside is that with a dump-file, you don’t have a live process anymore, and so property-evaluation won’t work. That’s because property evaluation is implemented by hijacking a thread in the debuggee to run the function of interest, commonly a ToString() or property-getter. There’s no live thread to hijack in post-mortem debugging.

We have a mitigation for that in VS2010. In addition to loading the dump file, we can also interpret the IL opcodes of the function and simulate execution to show the results in the debugger.

 

Here, I’ll just blog about the end-user experience and some top-level points. I’ll save the technical drill down for future blogs.

Consider the following sample:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Reflection;

public class Point 
{
    int m_x;
    int m_y;
    public Point(int x, int y)
    {
        m_x = x;
        m_y = y;
    }
    public override string ToString()
    {
        return String.Format("({0},{1})", this.X, this.Y);
    }

    public int X
    {
        get
        {
            return m_x;
        }
    }
    public int Y
    {
        get
        {
            return m_y;
        }
    }
}

public class Program
{
    static void Main(string[] args)
    {
        Dictionary<int, string> dict = new Dictionary<int, string>();
        dict[5] = "Five";
        dict[3] = "three";

        Point p = new Point(3, 4);
                
    }

    public static int  Dot(Point p1, Point p2)
    {
        int r2 = p1.X * p2.X + p1.Y * p2.Y;
        return r2;
    }

}

 

Suppose you have a dump-file from a thread stopped at the end of Main() (See newly added menu item “Debug | Save Dump As …”; load dump-file via “File | Open | File …”).

Normally, you could see the locals (dict, p) and their raw fields, but you wouldn’t be able to see the properties or ToString() values. So it would look something like this:

[Screenshot: watch window showing raw fields only; properties and ToString() values unavailable]

But with the IL interpreter, you can actually simulate execution. Here's what it looks like in the watch window:

[Screenshot: watch window showing property and ToString() values via the IL interpreter]

Which is exactly what you’d expect with live-debugging.  (In one sense, “everything still works like it worked before” is not a gratifying demo…)

The ‘*’ after the values indicates that they came from the interpreter. Note you still need to ensure that property-evaluation is enabled in “Tools | Options | Debugging”:

[Screenshot: the property-evaluation option under Tools | Options | Debugging]


How does it work?
The Interpreter gets the raw IL opcodes via ICorDebug and then simulates execution of those opcodes. For example, when you inspect “p.X” in the watch window, the debugger can get the raw IL opcodes:

.method public hidebysig specialname instance int32
        get_X() cil managed
{
  // Code size       12 (0xc)
  .maxstack  1
  .locals init ([0] int32 CS$1$0000)
  IL_0000:  nop
  IL_0001:  ldarg.0
  IL_0002:  ldfld      int32 Point::m_x
  IL_0007:  stloc.0
  IL_0008:  br.s       IL_000a
  IL_000a:  ldloc.0
  IL_000b:  ret
} // end of method Point::get_X

And then translate that ldfld opcode into an ICorDebug field fetch, the same way it would fetch “p.m_x”. The problem gets a lot harder than that (eg, how does it interpret a newobj instruction?), but that’s the basic idea.
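
To make that concrete, here’s a minimal sketch of the dispatch-loop idea, written against the get_X body above. This is nothing like the real implementation, and ReadFieldViaICorDebug is a hypothetical placeholder for the debugger’s actual ICorDebug field fetch:

using System;
using System.Collections.Generic;

// Hypothetical sketch of the interpreter's dispatch loop.
class TinyIlInterpreter
{
    // Placeholder: the real interpreter resolves the metadata token and fetches
    // the field's value through ICorDebug, the same way the watch window
    // fetches "p.m_x". This helper is made up for illustration.
    static object ReadFieldViaICorDebug(object target, int fieldToken)
    {
        throw new NotImplementedException();
    }

    // Simulates just the opcodes that appear in Point.get_X above.
    static object Interpret(byte[] il, object thisArg)
    {
        var stack = new Stack<object>();   // the simulated evaluation stack
        var locals = new object[4];        // simulated local variable slots
        int pc = 0;
        while (true)
        {
            byte opcode = il[pc++];
            switch (opcode)
            {
                case 0x00: break;                           // nop
                case 0x02: stack.Push(thisArg); break;      // ldarg.0
                case 0x7B:                                  // ldfld <token>
                    int token = BitConverter.ToInt32(il, pc);
                    pc += 4;
                    stack.Push(ReadFieldViaICorDebug(stack.Pop(), token));
                    break;
                case 0x0A: locals[0] = stack.Pop(); break;  // stloc.0
                case 0x06: stack.Push(locals[0]); break;    // ldloc.0
                case 0x2B: pc += 1 + (sbyte)il[pc]; break;  // br.s <target>
                case 0x2A: return stack.Pop();              // ret
                default:
                    // Anything unrecognized (or dangerous): abort the attempt.
                    throw new NotSupportedException("can't interpret; abort");
            }
        }
    }
}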

 

Other things it can interpret:

The immediate window is also wired up to use the interpreter when dump-debugging. Here are some sample things that work. Again, note the ‘*’ means the results came from the interpreter and the debuggee is not modified.

Simulating new objects:
? new Point(10,12).ToString()
"(10,12)"*

Basic reflection:
? typeof(Point).FullName
"Point"*

Dynamic method invocation:
? typeof(Point).GetMethod("get_X").Invoke(new Point(6,7), null)
0x00000006*

Calling functions, and even mixing debuggee data (the local variable ‘p’) with interpreter generated data (via the ‘new’ expression):
? Dot(p,new Point(10,20))
110*

 

It even works for Visualizers

Notice that it can even load the visualizer for the Dictionary (dict) and show you the contents as a pretty array view rather than just the raw view of buckets. Visualizers are their own dll, and we can verify that the dll is not actually loaded into the debuggee. For example, the Dictionary visualizer dll is Microsoft.VisualStudio.DebuggerVisualizers.dll, but that’s not in the module list:

[Screenshot: Modules window; Microsoft.VisualStudio.DebuggerVisualizers.dll is not loaded]

 

That’s because the interpreter has virtualized loading the visualizer dll into its own “virtual interpreter” space rather than the actual debuggee process space. That’s important because with a dump file, you can’t load a visualizer dll post-mortem.


Other issues:

There are lots of other details here that I’m skipping over, like:

  1. The interpreter is definitely not bullet proof. If it sees something it can’t interpret (like a pinvoke or dangerous code), then it simply aborts the interpretation attempt.
  2. The interpreter is recursive, so it can handle functions that call other functions. (Notice that ToString calls get_X.)
  3. How does it deal with side-effecting operations?
  4. How does it handle virtual dispatch call opcodes?
  5. How does it handle ecalls?
  6. How does it handle reflection?

 

Other advantages?

There are other advantages of IL interpretation for function evaluation, mainly that it addresses the “func-eval is evil” problems by essentially degenerating dangerous func-evals to safe field accesses.

  1. It is provably safe because it errs on the side of safety. The interpreter is completely non-invasive (it operates on a dump-file!).
  2. No accidentally executing dangerous code.
  3. Side-effect free func-evals. This is a natural consequence of it being non-invasive.
  4. Bullet proof func-eval abort.
  5. Bullet proof protection against recursive properties that stack-overflow.
  6. It allows func-eval to occur in places previously impossible, such as in dump-files, when the thread is in native code or retail code, or when there is no thread to hijack.

Closing thoughts

We realize that the interpreter is definitely not perfect. That’s part of why we chose to have it active in dump-files rather than replacing func-eval in regular execution. For dump-file scenarios, it took something that would have been completely broken and made many things work.

ICustomQueryInterface and CLR V4


CLR V4 fixes an issue with COM-interop that’s been bothering me for a while. The problem is that unless you’re using a PIA, you can often have either your caller or callee be managed code, but not both.

You can import the same COM-classic interface into managed code multiple times, and two components can naturally end up with two different .NET types representing the same single COM-classic interface. Furthermore, the types can be imported with different managed signatures because of things like [PreserveSig] attributes and different ways to marshal data types.

(PIAs are supposed to alleviate that by providing a single unified definition, but getting multiple components to agree on that unified definition is its own problem. CLR V4 added support to avoid requiring PIAs; see No-PIA.)

When managed code calls a COM interface that is implemented by managed code (Managed –> Native –> Managed), the CLR detects that the COM-object is really a managed implementation and creates a direct managed call (Managed –> Managed). So if your caller and callee are bound to different .NET types for the interface, the CLR won’t realize it’s a COM-interface call and will just fail on the .NET type mismatch.

CLR V4 fixes this by adding the ICustomQueryInterface interface, which lets a managed object act as if it’s native code when being called through COM-interop.

 

Where I hit this…

I hit this as I was writing debugger code in managed code. We had a managed application (MDbg) that used COM-interop to call into a native implementation of ICorDebug. That worked great (managed calling native). Later, we had some cases of creating a managed implementation of certain ICorDebug interfaces. But that failed from MDbg because the two sides had different COM-interop import definitions for ICorDebug.

This also just naturally starts showing up as people are porting more and more of their legacy systems to managed code.

 

Code sample demonstrating the problem

Say you have a COM-classic interface IFoo, imported by 2 different components:

[ComImport, InterfaceType(1), ComConversionLoss, Guid("FC3E287D-D659-4E1D-81D5-9D29398C7237")]
interface IFoo1
{
    [PreserveSig]
    int Thing(int x);
}

[ComImport, InterfaceType(1), ComConversionLoss, Guid("FC3E287D-D659-4E1D-81D5-9D29398C7237")]
interface IFoo2
{        
    void Thing(int x);
}

 

Now suppose you have a class C1 that implements IFoo1. IFoo1 and IFoo2 represent the same COM interface and have the same GUID, so you’d like to be able to use them interchangeably. However, they’re 2 different .NET types. typeof(IFoo1) != typeof(IFoo2).

Ideally, you’d like the following snippet to succeed and call Thing(5) and Thing(3) on the object instance obj.

static void Main(string[] args)
{
    Console.WriteLine("Hi!");

    object obj = new C1();
    obj = GetRCW(obj); // get a COM object for C1

    IFoo1 f1 = (IFoo1)obj;
    IFoo2 f2 = (IFoo2)obj; // Fails!!! obj is really a C1, and can’t cast to a IFoo2
    f1.Thing(5);
    f2.Thing(3);
}

Run that and you get an invalid cast exception because it can’t cast C1 to an IFoo2. It doesn’t understand that IFoo2 and IFoo1 are the same interface.

C:\TEMP>b.exe
Hi!

Unhandled Exception: System.InvalidCastException: Unable to cast object of type 'ConsoleApplication4.C1' to type 'ConsoleApplication4.IFoo2'.
   at ConsoleApplication4.Program.Main(String[] args)

 

ICustomQueryInterface to the rescue.

CLR 4 allows an object to implement ICustomQueryInterface and have fine-grained control over QI calls. The CLR detects that C1 is really a managed object via a QI for a secret interface, IManagedObject. C1 can intercept this QI call and fail it (by returning CustomQueryInterfaceResult.Failed), and thus look like a native object. It passes all other QI calls through to the RCW’s QI handling (by returning CustomQueryInterfaceResult.NotHandled).

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Runtime.InteropServices;

namespace ConsoleApplication4
{
    [ComImport, InterfaceType(1), ComConversionLoss, Guid("FC3E287D-D659-4E1D-81D5-9D29398C7237")]
    interface IFoo1
    {
        [PreserveSig]
        int Thing(int x);
    }

    [ComImport, InterfaceType(1), ComConversionLoss, Guid("FC3E287D-D659-4E1D-81D5-9D29398C7237")]
    interface IFoo2
    {        
        void Thing(int x);
    }


    class C1 : IFoo1, ICustomQueryInterface
    {

        static readonly Guid IID_IMarshal = new Guid("00000003-0000-0000-C000-000000000046");
        static readonly Guid IID_IManagedObject = new Guid("C3FCC19E-A970-11d2-8B5A-00A0C9B7C9C4");

        CustomQueryInterfaceResult ICustomQueryInterface.GetInterface(ref Guid iid, out IntPtr ppv)
        {
            if (iid == IID_IMarshal ||
                iid == IID_IManagedObject
                )
            {
                ppv = IntPtr.Zero;
                return CustomQueryInterfaceResult.Failed;
            }

            ppv = IntPtr.Zero;
            return CustomQueryInterfaceResult.NotHandled;
        }



        #region IFoo1 Members

        public int Thing(int x)
        {
            Console.WriteLine("Inside C1={0}", x);
            return 0;
        }

        #endregion
    }


   
    class Program
    {
        // Convert it to a RCW
        static object GetRCW(object o)
        {
            IntPtr ip = IntPtr.Zero;
            try
            {
                ip = Marshal.GetIUnknownForObject(o);
                return Marshal.GetObjectForIUnknown(ip);
            }
            finally
            {
                Marshal.Release(ip);
            }
        }

        static void Main(string[] args)
        {
            Console.WriteLine("Hi!");

            object c1 = new C1();
            object obj = GetRCW(c1);

            Console.WriteLine(c1.GetType().FullName); // ConsoleApplication4.C1
            Console.WriteLine(obj.GetType().FullName); // System.__ComObject

            IFoo1 f1 = (IFoo1)obj;
            IFoo2 f2 = (IFoo2)obj;
            f1.Thing(5);
            f2.Thing(3);
        }
    }
}

We call GetRCW() to convert the object to a Runtime Callable Wrapper (RCW) so that the CLR will actually do COM-interop dispatch. You can observe the GetType() calls on c1 vs. obj.

Notice that both calls, Thing(5) and Thing(3), now succeed. And they’re even going through different import signatures.

C:\TEMP>a.exe
Hi!
ConsoleApplication4.C1
System.__ComObject
Inside C1=5
Inside C1=3


Thanks to Paul Harrington (a dev on the Visual Studio Platform team) and Misha Shneerson for pointing me at the new functionality and the code snippet for C1.GetInterface(). Misha has a lot more information about CLR V4 COM-interop advances on his blog at http://blogs.msdn.com/mshneer/.

 

Writing a CLR Debugger in Python


Harry Pierson has written an excellent set of blog entries about writing a managed debugger in IronPython. He builds on the ICorDebug managed wrappers that we ship in Mdbg and explains many of the concepts for how to write a debugger, such as managing breakpoints. 

Read more about them here

Speaking at Lake County .NET User’s Group


I’ll be speaking at the Lake County .NET User’s Group (LCNUG) near Chicago, Illinois on September 24th.  I’ll be talking about new features in C# 4.0, including named and optional parameters, dynamic support, scripting, office interop and No-PIA (Primary-Interop-Assemblies) support.  

The permalink for the event is here.

If you’re in the area, swing on by!

Windows Phone 7


I recently got the newly released Windows Phone 7 (the Samsung Focus). So far, I love it! This is my first smart-phone.  It’s nice to join the 21st century.

I’m also poking around with how to write apps for it. It was easy to download C# Express and WP7 tools and get started with the emulator.

  1. I started with ScottGu’s blog announcing the release of WP7 Dev tools. You can download the tools here.
  2. I did the “My first WP7 app” tutorial, which shows hosting a web browser control. It worked flawlessly in the emulator.
  3. I also looked at Scott’s blog on writing a Twitter app.
  4. This is a nice MSDN summary of basic tasks.

WP7 user apps are written in WPF (or XNA). I never really needed WPF when writing debugging stacks, so I’m feeling the pain from the learning curve here.

Python Tools for VS


I’ve been having a great time using Python Tools for VS.  It’s a free download that provides CPython language support in Visual Studio 2010. The intellisense is pretty good (especially for a dynamic language!) and the debugger is useful to have. Having a good IDE is changing the way I view the language. Check out the homepage for a long list of features it supports. One other perk is that because it’s using the VS 2010 shell, it works with my favorite VS 2010 editor extensions.

Pyvot for Excel


I’m thrilled to see the availability of Pyvot, a Python package for manipulating tabular data in Excel. This is part of the Python Tools for Visual Studio (PTVS) ecosystem.

Check out the codeplex site at http://pytools.codeplex.com/wikipage?title=Pyvot or the tutorial on python.org.

Excel does expose an object model through COM, but it’s tricky to use. Pyvot provides a very simple Python programming experience that focuses on your data instead of Excel COM object trivia. Here are some of my favorite examples:

  • Easy to send Python data into Excel, manipulate it in Excel, and then send it back to Python.
  • If you ask for a column in Excel’s object model, it will give you back the entire Excel column, including the million empty cells, whereas Pyvot will just give you back the data you used.
  • Pyvot will recognize column header names from tables.
  • Pyvot makes it easy to compute new columns and add them to your table.
  • Pyvot makes it easy to connect to an existing Excel workbook, even if the workbook has not been saved to a file. (This involves scanning the running object table and doing smart name matching.) This lets you use Excel as a scratchpad for Python.
  • Pyvot works naturally with Excel’s existing auto-filters. This enables a great scenario where you can start with data in Python, send it to Excel and manipulate it with Excel auto-filters (sort it, remove bad values, etc.), and then pull the cleaned data back into Python.

Some other FAQs:

  1. What can’t Pyvot do? Pyvot is really focused on tabular data; Excel becomes a data-table viewer for Python. Pyvot is not intended to be a full Excel automation solution.
  2. How does Pyvot compare to VBA? a) Pyvot is just Python, so you can use the vast existing Python libraries. b) VBA is embedded in a single Excel workbook and is hard to share across workbooks; Pyvot is about real Python files that live outside the workbook and can be shared and managed under source control. c) VBA uses the Excel object model, whereas Pyvot provides a much simpler experience for tabular data.
  3. How does Pyvot compare to an Excel add-in? a) Pyvot runs entirely out-of-process, so you don’t need to worry about it crashing Excel on you. b) Excel add-ins, like VBA, use the Excel object model. c) Excel add-ins need to be installed; Pyvot is just loose Python files that don’t interfere with your Excel installation.

Anyway, if you need some Excel goodness, especially filters, check out Pyvot and PTVS.

OpenSource CSV Reader on Nuget


I did some volunteer work a few years ago that required processing lots of CSV files. I solved the problem by writing a C# CSV reader, which I wanted to share here. The basic features are:

  1. be easy to use
  2. read and write CSV files (and support tab and “|” delimiters too)
  3. create CSV files around IEnumerable<T>, dictionaries, and other sources.
  4. Provide a “linq to CSV” experience
  5. provide both in-memory mutable tables and streaming over large data sources (thank you polymorphism!)
  6. provide basic analysis operations like histogram, join, find duplicates, etc. The operations I implemented were driven entirely by the goals I had for my volunteer work.
  7. Read from Excel
  8. Work with Azure. (This primarily means no foolish dependencies, and support TextReader/TextWriter instead of always hitting the file system)

I went ahead and put it on GitHub at https://github.com/MikeStall/DataTable. And it’s available for download via NuGet (see “CsvTools”). It’s nice to share, and maybe somebody else will find this useful. But selfishly, I’ve used this library for quite a few tasks over the years, and putting it on GitHub and NuGet also makes it easier for me to find for future projects.

There are the obvious disclaimers here that this was just a casual side project I did as a volunteer, so use it as-is.

Step 1: Install “CsvTools” via Nuget:

When you right-click on the project references node, just select “Add Library Package Reference”. That brings up the NuGet dialog, which searches the online repository for packages. Search for “CsvTools” and you can install it instantly. It’s built against CLR 4.0 but has no additional dependencies.

[Screenshot: “Add Library Package Reference” dialog showing the CsvTools package]

 

Example 1: Loading from a CSV file

Here’s a CSV at: c:\temp\test.csv

name, species
Kermit, Frog
Ms. Piggy, Pig
Fozzy, Bear

To open and print the contents of the file:

using System;
using DataAccess; // namespace that Csv reader lives in

class Program
{
    static void Main(string[] args)
    {
        DataTable dt = DataTable.New.ReadCsv(@"C:\temp\test.csv");

        // Query via the DataTable.Rows enumeration.
        foreach (Row row in dt.Rows)
        {
            Console.WriteLine(row["name"]);
        }        
    }
}

There are a bunch of extension methods hanging off “DataTable.New” to provide different ways of loading a table. ReadCsv will load everything into memory, which allows mutation operations (see below).  But this also supports streaming operations via the methods with “lazy” in their name, such as ReadLazy().
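
For example, streaming over a large file might look like the following sketch (an assumption on my part: that ReadLazy takes a file path the same way ReadCsv does):

// Stream over a large CSV without loading it all into memory.
DataTable dtLazy = DataTable.New.ReadLazy(@"C:\temp\huge.csv");
foreach (Row row in dtLazy.Rows)
{
    Console.WriteLine(row["name"]);
}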

Example 2: Creating a CSV from an IEnumerable<T> and saving back to a file

Here’s creating a table from an IEnumerable<T>, and then saving that back to a TextWriter (in this case, Console.Out).

var vals = from i in Enumerable.Range(1, 10) select new { N = i, NSquared = i * i };
DataTable dt = DataTable.New.FromEnumerable(vals);
dt.SaveToStream(Console.Out);  


Which produces this CSV:

N,NSquared
1,1
2,4
3,9
4,16
5,25
6,36
7,49
8,64
9,81
10,100

 

Example 3: Mutations

DataTable is actually an abstract base class. There are two primary derived classes:

  1. MutableDataTable, which loads everything into memory, stores it in column-major order, and provides mutation operations.
  2. A streaming DataTable, which provides streaming access over rows. This is necessarily row-major order and doesn’t support mutation. The streaming classes are non-public derived classes of DataTable.

Most of the builder functions that load into memory actually return the derived MutableDataTable object anyway. A MutableDataTable is conceptually a giant 2D string array stored in column-major order, so adding new columns or rearranging columns is cheap, while adding rows is expensive. Here’s an example of some mutations:

static void Main(string[] args)
{
    MutableDataTable dt = DataTable.New.ReadCsv(@"C:\temp\test.csv");

    // Mutations
    dt.ApplyToColumn("name", originalValue => originalValue.ToUpper());
    dt.RenameColumn(oldName:"species", newName: "kind");
    
    
    int id = 0;
    dt.CreateColumn("id#", row => { id++; return id.ToString(); });

    dt.GetRow(1)["kind"] = "Pig!!"; // update in place by row
    dt.Columns[0].Values[2] = "Fozzy!!"; // update by column

    // Print out new table
    dt.SaveToStream(Console.Out);        
}

Produces and prints this table:

name,kind,id#
KERMIT,Frog,1
MS. PIGGY,Pig!!,2
Fozzy!!,Bear,3

 

There’s a builder function, DataTable.New.GetMutableCopy, which produces a mutable copy from an arbitrary DataTable.
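
For example, something like this sketch (reusing the file and column names from the earlier examples, and assuming ReadLazy behaves as described above) pulls a streamed table into memory so it can be mutated:

// Stream in the table, then copy it into memory so we can mutate it.
DataTable streamed = DataTable.New.ReadLazy(@"C:\temp\test.csv");
MutableDataTable copy = DataTable.New.GetMutableCopy(streamed);
copy.RenameColumn(oldName: "species", newName: "kind");
copy.SaveToStream(Console.Out);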

Example 4: Analysis

I needed some basic analysis functions, like join, histogram, select duplicates, sample, and where. These sit as static methods in the Analyze class.

Here’s an example of creating a table with random numbers, and then printing the histogram:

static void Main(string[] args)
{   
    // Get a table of 1000 random numbers
    Random r = new Random();
    DataTable dt = DataTable.New.FromEnumerable(
        from x in Enumerable.Range(1, 1000) 
        select r.Next(1, 10));

    Tuple<string,int>[] hist = Analyze.AsHistogram(dt, columnIdx: 0);
    
    // Convert the tuple[] to a table for easy printing
    DataTable histTable = DataTable.New.FromTuple(hist, 
        columnName1: "value",
        columnName2: "frequency");
    histTable.SaveToStream(Console.Out);
}

Produces this result:

value,frequency
9,151
8,124
2,118
7,110
3,107
5,104
1,101
6,99
4,86

ASP.Net WebAPI


I recently joined the ASP.Net team and have been working on WebAPI, which is a new .NET MVC-like framework for building HTTP web services. (This is certainly a change of pace from my previous life in the world of compilers and debuggers, but I’m having a blast.)

ScottGu gave a nice overview of WebAPI here and just announced that WebAPI has gone open source on CodePlex with Git. It’s nice to be able to check in a feature and then immediately blog about it.

A discussion forum for WebAPI is here. The codeplex site is here.


How WebAPI does Parameter Binding


Here’s an overview of how WebAPI binds parameters to an action method.  I’ll describe how parameters can be read, the set of rules that determine which technique is used, and then provide some examples.

 

[update] Parameter binding is ultimately about taking an HTTP request and converting it into .NET types so that you can have a better action signature.

The request message has everything about the request, including the incoming URL with query string, content body, headers, etc. Without parameter binding, every action would have to take the request message and manually extract the parameters, kind of like this:

public object MyAction(HttpRequestMessage request)
{
        // make explicit calls to get parameters from the request object
        int id = int.Parse(request.RequestUri.ParseQueryString().Get("id")); // need error logic!
        Customer c = request.Content.ReadAsAsync<Customer>().Result; // should be async!
        // Now use id and customer
}
  

That’s ugly, error-prone, repeats boilerplate code, misses corner cases, and is hard to unit test. You want the action signature to be something more relevant, like:

public object MyAction(int id, Customer c) { }

So how does WebAPI convert from a request message into real parameters like id and customer?

Model Binding vs. Formatters

There are 2 techniques for binding parameters: Model Binding and Formatters. In practice, WebAPI uses model binding to read from the query string and Formatters to read from the body. 

(1) Using Model Binding:

ModelBinding is the same concept as in MVC, which has been written about a fair amount (such as here). Basically, there are “ValueProviders” which supply pieces of data such as query string parameters, and then a model binder assembles those pieces into an object.

(2) Using Formatters:

Formatters (see the MediaTypeFormatter class) are just traditional serializers with extra metadata, such as the associated content type. WebAPI gets the list of formatters from the HttpConfiguration and then uses the request’s content-type to select an appropriate formatter. WebAPI has some default formatters: the default JSON formatter is JSON.NET, and there is also an Xml formatter and a FormUrl formatter that uses JQuery’s syntax.

The key method is MediaTypeFormatter.ReadFromStreamAsync, which looks like:

public virtual Task<object> ReadFromStreamAsync(
Type type,
Stream stream,
HttpContentHeaders contentHeaders,
IFormatterLogger formatterLogger)

Type is the parameter type being read, which is passed to the serializer. Stream is the request’s content stream. The read function then reads the stream, instantiates an object, and returns it.

HttpContentHeaders are just from the request message. IFormatterLogger is a callback interface that a formatter can use to log errors while reading (eg, malformed data for the given type).
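
To make the shape concrete, here’s a minimal sketch of a custom formatter that reads the Customer type (used in the examples below) from a made-up “text/x-customer” body of the form “name,age”. It’s written against the ReadFromStreamAsync overload shown above; the exact override signatures shifted between preview builds, so treat this as illustrative rather than definitive:

using System;
using System.IO;
using System.Net.Http.Formatting;
using System.Net.Http.Headers;
using System.Threading.Tasks;

// Sketch only: reads "name,age" into the Customer type from the examples below.
public class CustomerFormatter : MediaTypeFormatter
{
    public CustomerFormatter()
    {
        // Claim a made-up media type for illustration.
        SupportedMediaTypes.Add(new MediaTypeHeaderValue("text/x-customer"));
    }

    public override bool CanReadType(Type type)
    {
        return type == typeof(Customer);
    }

    public override bool CanWriteType(Type type)
    {
        return false; // read-only formatter for this sketch
    }

    public override Task<object> ReadFromStreamAsync(Type type, Stream stream,
        HttpContentHeaders contentHeaders, IFormatterLogger formatterLogger)
    {
        // Read the whole body and split "name,age" into a Customer.
        string[] parts = new StreamReader(stream).ReadToEnd().Split(',');
        var tcs = new TaskCompletionSource<object>();
        tcs.SetResult(new Customer { Name = parts[0], Age = int.Parse(parts[1]) });
        return tcs.Task;
    }
}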

Both model binding and formatters support validation and log rich error information.  However, model binding is significantly more flexible.

When do we use which?

Here are the basic rules to determine whether a parameter is read with model binding or a formatter:

  1. If the parameter has no attribute on it, then the decision is made purely on the parameter’s .NET type. “Simple types” use model binding; complex types use the formatters. A “simple type” includes: primitives, TimeSpan, DateTime, Guid, Decimal, String, or anything with a TypeConverter that converts from strings.
  2. You can use a [FromBody] attribute to specify that a parameter should be from the body.
  3. You can use a [ModelBinder] attribute on the parameter or the parameter’s type to specify that a parameter should be model bound. This attribute also lets you configure the model binder.  [FromUri] is a derived instance of [ModelBinder] that specifically configures a model binder to only look in the URI.
  4. The body can only be read once.  So if you have 2 complex types in the signature, at least one of them must have a [ModelBinder] attribute on it.

It was a key design goal for these rules to be static and predictable.

Only one thing can read the body

A key difference between MVC and WebAPI is that MVC buffers the content (eg, request body). This means that MVC’s parameter binding can repeatedly search through the body to look for pieces of the parameters. Whereas in WebAPI, the request body (an HttpContent) may be a read-only, infinite, non-buffered, non-rewindable stream.

That means that parameter binding needs to be very careful not to read the stream unless it’s guaranteed to bind a parameter. The action body may want to read the stream directly, so WebAPI can’t assume that it owns the stream for parameter binding. Consider this example action:

   
        // Action saves the request’s content into an Azure blob 
        public Task PostUploadfile(string destinationBlobName)
        {
            // string should come from URL, we’ll read content body ourselves.
            Stream azureStream = OpenAzureStorage(destinationBlobName); // stream to write to azure
            return this.Request.Content.CopyToStream(azureStream); // upload body contents to azure. 
        }

The parameter is a simple type, so it’s pulled from the query string. Since there are no complex types in the action signature, WebAPI never even touches the request content stream, and so the action body can freely read it.

Some examples

Here are some examples of various requests and how they map to action signatures.

/?id=123&name=bob
void Action(int id, string name) // both parameters are simple types and will come from url

 

/?id=123&name=bob
void Action([FromUri] int id, [FromUri] string name) // paranoid version of above.

void Action([FromBody] string name); // explicitly read the body as a string.

public class Customer {   // a complex object
  public string Name { get; set; }
  public int Age { get; set; }
}

/?id=123
void Action(int id, Customer c) // id from query string, c is a complex object, comes from body via a formatter.

void Action(Customer c1, Customer c2) // error! multiple parameters attempting to read from the body

void Action([FromUri] Customer c1, Customer c2) // ok, c1 is from the URI and c2 is from the body

void Action([ModelBinder(typeof(MyCustomBinder))] SomeType c) // Specifies a precise model binder to use to create the parameter.

[ModelBinder(typeof(MyCustomBinder))] public class SomeType { } // place attribute on type declaration to apply to all parameter instances
void Action(SomeType c) // attribute on c’s declaration means it uses model binding.

Differences with MVC

Here are some differences between MVC and WebAPI’s parameter binding:

  1. MVC only had model binders and no formatters. That’s because MVC would model bind over the request’s body (which it commonly expected to just be FormUrl encoded), whereas WebAPI uses a serializer over the request’s body.
  2. MVC buffered the request body, and so could easily feed it into model binding. WebAPI does not buffer the request body, and so does not model bind against the request body by default.
  3. WebAPI’s binding can be determined entirely statically based off the action signature types. For example, in WebAPI, you know statically whether a parameter will bind against the body or the query string. Whereas in MVC, the model binding system would search both body and query string.

MVC Style parameter binding for WebAPI


I described earlier how WebAPI binds parameters. The entire parameter binding behavior is determined by the IActionValueBinder interface and can be swapped out. The default implementation is DefaultActionValueBinder.

Here’s another IActionValueBinder that provides MVC parameter binding semantics. This lets you do things that you can’t do in WebAPI’s default binder, specifically:

  1. ModelBinds everything, including the body. Assumes the body is FormUrl encoded
  2. This means you can do MVC scenarios where a complex type is bound with one field from the query string and one field from the form data in the body.
  3. Allows multiple parameters to be bound from the body.

 

Brief description of IActionValueBinder

Here’s what IActionValueBinder looks like:

    public interface IActionValueBinder
    {
        HttpActionBinding GetBinding(HttpActionDescriptor actionDescriptor);
    }

This is called to bind the parameters. It returns an HttpActionBinding object, which is 1:1 with an ActionDescriptor and can be cached across requests. The interesting method on that binding object is:

    public virtual Task ExecuteBindingAsync(HttpActionContext actionContext, CancellationToken cancellationToken)

This executes the bindings for all the parameters and signals the task when completed. It will invoke model binding, formatters, or any other parameter binding technique. The parameters are added to the actionContext’s parameter dictionary.

You can hook IActionValueBinder to provide your own binding object, which can have full control over binding the parameters. This is a bigger hammer than adding formatters or custom model binders.

You can hook up an IActionValueBinder either through the service resolver or via the HttpControllerConfiguration attribute on a controller.

Example usage:

Here’s an example usage. Suppose you have this code on the server. It uses the HttpControllerConfiguration attribute, so all of the actions on that controller will use the binder. However, since it’s per-controller, it can still peacefully coexist with other controllers on the server.

    public class Customer
    {
        public string name { get; set; }
        public int age { get; set; }
    }

    [HttpControllerConfiguration(ActionValueBinder=typeof(MvcActionValueBinder))]
    public class MvcController : ApiController
    {
        [HttpGet]
        public void Combined(Customer item)
        {
        }
    }

And then here’s the client code to call that same action 3 times, showing the fields coming from different places.

        static void TestMvcController()
        {
            HttpConfiguration config = new HttpConfiguration();
            config.Routes.MapHttpRoute("Default", "{controller}/{action}", new { controller = "Home" });

            HttpServer server = new HttpServer(config);
            HttpClient client = new HttpClient(server);

            // Call the same action. Action has parameter with 2 fields. 

            // Get one field from URI, the other field from body
            {
                HttpRequestMessage request = new HttpRequestMessage
                {
                    Method = HttpMethod.Get,
                    RequestUri = new Uri("http://localhost:8080/Mvc/Combined?age=10"),
                    Content = FormUrlContent("name=Fred")
                };

                var response = client.SendAsync(request).Result;
            }

            // Get both fields from the body
            {
                HttpRequestMessage request = new HttpRequestMessage
                {
                    Method = HttpMethod.Get,
                    RequestUri = new Uri("http://localhost:8080/Mvc/Combined"),
                    Content = FormUrlContent("name=Fred&age=11")
                };

                var response = client.SendAsync(request).Result;
            }

            // Get both fields from the URI
            {
                var response = client.GetAsync("http://localhost:8080/Mvc/Combined?name=Bob&age=20").Result;
            }
        }
        static HttpContent FormUrlContent(string content)
        {
            return new StringContent(content, Encoding.UTF8, "application/x-www-form-urlencoded");
        }

 

The MvcActionValueBinder:

Here’s the actual code for the binder, under 100 lines. (Disclaimer: this requires the latest sources. I verified against this change; I had to fix an issue that allowed ValueProviderFactory.GetValueProvider to return null.)

Notice that it reads the body once per request, creates a per-request ValueProvider around the form data, and stashes that in request-local-storage so that all of the parameters share the same value provider. This sharing is essential because the body can only be read once.

// Example of MVC-style action value binder.
using System;
using System.Collections.Generic;
using System.Collections.Specialized;
using System.Globalization;
using System.Linq;
using System.Net.Http;
using System.Net.Http.Formatting;
using System.Threading;
using System.Threading.Tasks;
using System.Web.Http;
using System.Web.Http.Controllers;
using System.Web.Http.ModelBinding;
using System.Web.Http.ValueProviders;
using System.Web.Http.ValueProviders.Providers;

namespace Basic
{    
    // Binder with MVC semantics. Treat the body as KeyValue pairs and model bind it. 
    public class MvcActionValueBinder : DefaultActionValueBinder
    {
        // Per-request storage, uses the Request.Properties bag. We need a unique key into the bag. 
        private const string Key = "5DC187FB-BFA0-462A-AB93-9E8036871EC8";

        public override HttpActionBinding GetBinding(HttpActionDescriptor actionDescriptor)
        {
            MvcActionBinding actionBinding = new MvcActionBinding();
                                    
            HttpParameterDescriptor[] parameters = actionDescriptor.GetParameters().ToArray();
            HttpParameterBinding[] binders = Array.ConvertAll(parameters, p => DetermineBinding(actionBinding, p));

            actionBinding.ParameterBindings = binders;
                        
            return actionBinding;            
        }

        private HttpParameterBinding DetermineBinding(MvcActionBinding actionBinding, HttpParameterDescriptor parameter)
        {
            HttpConfiguration config = parameter.Configuration;

            var attr = new ModelBinderAttribute(); // use default settings
            
            ModelBinderProvider provider = attr.GetModelBinderProvider(config);
            IModelBinder binder = provider.GetBinder(config, parameter.ParameterType);

            // Alternatively, we could put this ValueProviderFactory in the global config.
            List<ValueProviderFactory> vpfs = new List<ValueProviderFactory>(attr.GetValueProviderFactories(config));
            vpfs.Add(new BodyValueProviderFactory());

            return new ModelBinderParameterBinding(parameter, binder, vpfs);
        }   

        // Derive from ActionBinding so that we have a chance to read the body once and then share that with all the parameters.
        private class MvcActionBinding : HttpActionBinding
        {                
            // Read the body upfront , add as a ValueProvider
            public override Task ExecuteBindingAsync(HttpActionContext actionContext, CancellationToken cancellationToken)
            {
                HttpRequestMessage request = actionContext.ControllerContext.Request;
                HttpContent content = request.Content;
                if (content != null)
                {
                    FormDataCollection fd = content.ReadAsAsync<FormDataCollection>().Result;
                    if (fd != null)
                    {
                        NameValueCollection nvc = fd.ReadAsNameValueCollection();

                        IValueProvider vp = new NameValueCollectionValueProvider(nvc, CultureInfo.InvariantCulture);

                        request.Properties.Add(Key, vp);
                    }
                }
                        
                return base.ExecuteBindingAsync(actionContext, cancellationToken);
            }
        }

        // Get a value provider over the body. This can be shared by all parameters. 
        // This gets the values computed in MvcActionBinding.
        private class BodyValueProviderFactory : ValueProviderFactory
        {
            public override IValueProvider GetValueProvider(HttpActionContext actionContext)
            {
                object vp;
                actionContext.Request.Properties.TryGetValue(Key, out vp);
                return (IValueProvider)vp; // can be null                
            }
        }
    }
}


How to bind to custom objects in action signatures in MVC/WebAPI


MVC provides several ways of binding your own arbitrary parameter types. I’ll describe some common MVC ways and then show how this applies to WebAPI too. You can view this as an MVC-to-WebAPI migration guide. (Related reading: How WebAPI binds parameters.)

Say we have a complex type, Location, which just has an X and a Y, and we want to create it by invoking a Parse(string) function. The question then becomes: how do I wire up my custom Parse(string) function into WebAPI’s parameter binding system?

Query string: /?loc=123,456  

And then this action gets invoked and the parameter is bound from the query string:

        public object MyAction(Location loc)
        {
            // expect that loc.X = 123, loc.Y = 456
        }

 

Here’s the C# code for my Location class, plus the essential parse function:

    // A complex type
    public class Location
    {        
        public int X { get; set; }
        public int Y { get; set; }

        // Parse a string into a Location object. "1,2" --> Loc(X=1,Y=2)
        public static Location TryParse(string input)
        {
            var parts = input.Split(',');
            if (parts.Length != 2)
            {
                return null;
            }

            int x,y;
            if (int.TryParse(parts[0], out x) && int.TryParse(parts[1], out y))
            {
                return new Location { X = x, Y = y };                
            }

            return null;
        }

        public override string ToString()
        {
            return string.Format("{0},{1}", X, Y);
        }
    }

 

Option Fail: what if I do nothing?

If you just define a Location class but don’t tell WebAPI/MVC about the parse function, it won’t know how to bind the parameter. It may make a best effort, but the Location parameter will be empty.

In WebAPI, Location is seen as a complex type, so WebAPI assumes it’s coming from the request’s body and tries to invoke a formatter on it. WebAPI will search for a formatter that matches the content type and claims to handle the Location type. The formatter likely won’t find anything in the body and will leave the parameter empty.

 

Option #1: Manually call the parse function

You can always take the string in the action signature and manually call the parse function yourself.

        public object MyAction1(string loc)
        {
            Location loc2 = Location.TryParse(loc); // explicitly convert string
            // now use loc2 ... 
        }

You can still do this in WebAPI, exactly as is.

What does WebAPI do under the hood? In WebAPI, the string parameter is seen as a simple type, and so it uses model binding to pull ‘loc’ from the query string.

 

Option #2: Use a TypeConverter to make the complex type be simple

Or we can do it with a TypeConverter. This just teaches the model binding system where to find the Parse() function for the given type.

    public class LocationTypeConverter : TypeConverter
    {
        public override bool CanConvertFrom(ITypeDescriptorContext context, Type sourceType)
        {
            if (sourceType == typeof(string))
            {
                return true;
            }
            return base.CanConvertFrom(context, sourceType);
        }

        public override object ConvertFrom(ITypeDescriptorContext context,
            System.Globalization.CultureInfo culture, object value)
        {
            if (value is string)
            {
                return Location.TryParse((string)value);
            }
            return base.ConvertFrom(context, culture, value);
        }
    }

And then add the appropriate attribute to the Location’s type declaration:

   [TypeConverter(typeof(LocationTypeConverter))]
   public class Location
   {  ... }

Now in both MVC and WebAPI, your action will get called and the Location parameter is bound:

 

public object MyAction(Location loc)        
{
   // use loc
}

What does WebAPI do under the hood? The presence of a TypeConverter that converts from string means that WebAPI classifies this as a “simple type”. Simple types use model binding. WebAPI will get ‘loc’ from the query string by matching the parameter name, see that the parameter’s type is Location, and then invoke the TypeConverter to convert from string to Location.

 

Option #3: Use a custom model binder

Another way is to use a custom model binder. This essentially just teaches the model binding system about the Location parse function. There are two key parts here:
  a) defining the model binder, and
  b) wiring it up to the system so that it gets used.

Part a) Writing a custom model binder:

Here’s the MVC version:

    public class LocationModelBinder : IModelBinder
    {
        public object BindModel(ControllerContext controllerContext, ModelBindingContext bindingContext)
        {
            string key = bindingContext.ModelName;
            ValueProviderResult val = bindingContext.ValueProvider.GetValue(key);
            if (val != null)
            {
                string s = val.AttemptedValue as string;
                if (s != null)
                {
                    return Location.TryParse(s);
                }
            }
            return null;
        }
    }

Of course, once you’ve written a custom model binder, you can do a lot more with it than just call a Parse() function. But that’s another topic…

Defining a custom model binder is very similar in WebAPI. We still have a corresponding IModelBinder interface, and the design pattern is the same, but the signature is slightly different:

    public bool BindModel(HttpActionContext actionContext, ModelBindingContext bindingContext)

MVC takes in a controller context, whereas WebAPI takes in an action context (which has a reference to a controller context). And MVC returns the object for the model, whereas WebAPI returns a bool and sets the model result on the binding context. (As a reminder, WebAPI and MVC share design patterns but have different types. So while you can often cut and paste code between them, you may need to touch up namespaces.)
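
For comparison, here’s a sketch of the same binder ported to WebAPI’s shape, written against the signature above (namespaces are the post-beta System.Web.Http ones):

using System.Web.Http.Controllers;
using System.Web.Http.ModelBinding;
using System.Web.Http.ValueProviders;

public class LocationModelBinder : IModelBinder
{
    public bool BindModel(HttpActionContext actionContext, ModelBindingContext bindingContext)
    {
        ValueProviderResult val = bindingContext.ValueProvider.GetValue(bindingContext.ModelName);
        if (val != null)
        {
            string s = val.AttemptedValue as string;
            if (s != null)
            {
                // Set the model on the context instead of returning it.
                bindingContext.Model = Location.TryParse(s);
                return bindingContext.Model != null;
            }
        }
        return false;
    }
}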

 

Part B) now we need to wire up the model binder.

In both MVC and WebAPI, there are 3 places you can do this.

1) The highest-precedence location is the one closest to the parameter: just add a [ModelBinder] attribute on the parameter in the signature.

        public object  MyAction2(
            [ModelBinder(typeof(LocationModelBinder))]
            Location loc) // Use model binding to convert
        {
            // use loc...
        }

This is the same in WebAPI. (In WebAPI, this was only supported after beta, so if you’re pre-RTM, you’ll need the latest sources.)

2) Add a [ModelBinder] attribute on the type’s declaration.

        [ModelBinder(typeof(LocationModelBinder))]
        public class Location { ... }

Same in WebAPI, as with #1.

3) Change it in a global config setting

In MVC, this goes in the global.asax file. An easy way is just like so:

       ModelBinders.Binders.Add(typeof(Location), new LocationModelBinder());            

In WebAPI, registration is on the HttpConfiguration object; WebAPI strictly goes through the service resolver. There is a gotcha: you need to register custom model binders at the front of the list, because the default list has MutableObjectModelBinder, which zealously claims all types and so would shadow your custom binder if it were just appended to the end.

            
            config.Services.Insert(typeof(System.Web.Http.ModelBinding.ModelBinderProvider),
                0, // Insert at front to ensure other catch-all binders don’t claim it first
                new LocationModelBinderProvider());
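
(LocationModelBinderProvider isn’t shown in this post; a minimal sketch, assuming the post-beta ModelBinderProvider base class, might look like this:)

using System;
using System.Web.Http;
using System.Web.Http.ModelBinding;

// Hypothetical provider: serve up our binder for Location and decline the rest.
public class LocationModelBinderProvider : ModelBinderProvider
{
    public override IModelBinder GetBinder(HttpConfiguration configuration, Type modelType)
    {
        if (modelType == typeof(Location))
        {
            return new LocationModelBinder();
        }
        return null; // let other registered providers handle everything else
    }
}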

And then in WebAPI, you still need to add an empty [ModelBinder] attribute on the parameter to tell WebAPI to look in the model binders instead of trying to use a formatter on it.

The [ModelBinder] doesn’t need to specify the binder type because you provided it in the config object.

        public object  MyAction2([ModelBinder] Location loc) // Use model binding to convert
        {
            // use loc...
        }

What does WebAPI do under the hood? In all 3 cases, WebAPI sees a [ModelBinder] attribute associated with the parameter (either on the Parameter or on the Parameter’s Type’s declaration). The model binder attribute can either supply the binder directly (as in cases #1 and #2) or fetch the binder from the config (case #3). WebAPI then invokes that binder to get a value for the parameter.

 

        

Other places to hook?

WebAPI is very extensible and you could try to hook other places too, but the ones above are the most common and easiest for this scenario. But for completeness sake, I’ll mention a few other options, which I may blog about more later:

  • For example, you could hook the IActionValueBinder (here’s an example of an MVC-style parameter binder), IHttpActionInvoker (to populate right before invoking the action), or even populate parameters through a filter.
  • By default, complex types try to come from the body, and the body is read via Formatters. So you could also try to provide a custom formatter. However, that’s not ideal because in our example, we wanted data from the query string and Formatters can’t read the query string.

How to create a custom value provider in WebAPI


Here’s how you can easily customize WebAPI parameter binding to include values from a source other than the URL. The short answer is that you add a custom ValueProvider and use model binding, just like in MVC.

ValueProviders are used to provide values for simple types and match on parameter name. ValueProviders serve up raw pieces of information that feed into the model binders. Model binders compose that information (eg, building collections or complex types) and do type coercion (eg, string to int, invoking type converters, etc).

Here’s a custom value provider that extracts information from the request headers. In this case, our action will get the userAgent and host from the headers. This doesn’t interfere with other parameters, so you can still get the id from the query string as normal and read the body.

    public class ValueProviderTestController : ApiController
    {
        [HttpGet]
        public object GetStuff(string userAgent, string host, int id)
        {   
            // userAgent and host are bound from the Headers. id is bound from the query string. 
            // This just echos back. Do something interesting instead.
            return string.Format(
@"User agent: {0},
host: {1}
id: {2}", userAgent, host, id);
        }
    }

 

So when I run it and hit it from a browser, I get a string back like so:

User agent: Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0),
host: localhost:8080
id: 45

Note that the client needs to set the headers in the request. Browsers will do this. But if you just call directly with HttpClient.GetAsync(), the headers will be empty.
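
For example, a test client would set the headers explicitly, something like this sketch (the "MyTestClient/1.0" value is made up, and the header names must line up with whatever HeaderValueProvider maps to parameter names):

// Hit the action with explicit headers so the value provider has data to bind.
HttpClient client = new HttpClient();
var request = new HttpRequestMessage(HttpMethod.Get,
    "http://localhost:8080/ValueProviderTest/GetStuff?id=45");
request.Headers.UserAgent.ParseAdd("MyTestClient/1.0");
request.Headers.Host = "localhost:8080";
string body = client.SendAsync(request).Result.Content.ReadAsStringAsync().Result;
Console.WriteLine(body);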

 

We define a HeaderValueProviderFactory class (source below), which derives from ValueProviderFactory and supplies model binding with the information from the headers.

We need to register the header value provider. We can do it globally in the config, like so:

            // Append our custom value provider to the list of value providers.
            config.Services.Add(typeof(ValueProviderFactory), new HeaderValueProviderFactory());

Or we can do it just on a specific parameter without touching global config by using the [ValueProvider] attribute, like so:

    public object GetStuff([ValueProvider(typeof(HeaderValueProviderFactory))] string userAgent)

The [ValueProvider] attribute just derives from the [ModelBinder] attribute and says “use the default model binding, but supply these value providers”.

What’s happening under the hood? For refresher reading, see How WebAPI does parameter binding. In this case, WebAPI sees that the parameter is a simple type (string), so it binds via model binding. Model binding gets a list of value providers (either from the attribute or the configuration) and then looks up each parameter name (userAgent, host, id) against that list. Model binding also does the composition and coercion.
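
To make the coercion step concrete, here's roughly what happens for the id parameter. This is a sketch: valueProvider stands in for whatever IValueProvider the binding context supplies.

    // Sketch: coerce a raw value for a simple-typed parameter.
    // A query-string value provider hands back "45" for the key "id"...
    ValueProviderResult result = valueProvider.GetValue("id");
    if (result != null)
    {
        // ...and ConvertTo performs the string-to-int coercion.
        int id = (int)result.ConvertTo(typeof(int));
    }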

Sources

WebAPI is an open source project, and some of this may need the post-beta source. For example, the service resolver has been cleaned up since beta, so it’s now easier to add new services.

Here’s the source for a test client. It includes a loop so that you can hit the service from a browser.

        static void TestHeaderValueProvider()
        {
            string prefix = "http://localhost:8080";
            HttpSelfHostConfiguration config = new HttpSelfHostConfiguration(prefix);
            config.Routes.MapHttpRoute("Default", "{controller}/{action}");

            // Append our custom value provider to the list of value providers.
            config.Services.Add(typeof(ValueProviderFactory), new HeaderValueProviderFactory());
            
            HttpSelfHostServer server = new HttpSelfHostServer(config);
            server.OpenAsync().Wait();

            try
            {
                // HttpClient will make the call, but won't set the headers for you. 
                HttpClient client = new HttpClient();
                var response = client.GetAsync(prefix + "/ValueProviderTest/GetStuff?id=20").Result;

                // Browsers will set the headers. 
                // Loop. You can hit the request via: http://localhost:8080/ValueProviderTest/GetStuff?id=40
                while (true)
                {
                    Thread.Sleep(1000);
                    Console.Write(".");
                }
            }
            finally
            {
                server.CloseAsync().Wait();
            }
        }

 

Here’s the full source for the provider.

 

using System.Globalization;
using System.Net.Http.Headers;
using System.Reflection;
using System.Web.Http.Controllers;
using System.Web.Http.ValueProviders;

namespace Basic
{
    // ValueProviderFactory. This is registered in the service resolver like so:
    //    config.Services.Add(typeof(ValueProviderFactory), new HeaderValueProviderFactory());
    public class HeaderValueProviderFactory : ValueProviderFactory
    {
        public override IValueProvider GetValueProvider(HttpActionContext actionContext)
        {
            HttpRequestHeaders headers = actionContext.ControllerContext.Request.Headers;
            return new HeaderValueProvider(headers);
        }
    }

    // ValueProvider for extracting data from headers for a given request message. 
    public class HeaderValueProvider : IValueProvider
    {
        readonly HttpRequestHeaders _headers;

        public HeaderValueProvider(HttpRequestHeaders headers)
        {
            _headers = headers;
        }

        // Headers doesn't support a property-bag lookup interface, so grab it with reflection.
        PropertyInfo GetProp(string name)
        {
            var p = typeof(HttpRequestHeaders).GetProperty(name,
                BindingFlags.Instance | BindingFlags.Public | BindingFlags.IgnoreCase);
            return p;
        }

        public bool ContainsPrefix(string prefix)
        {
            var p = GetProp(prefix);
            return p != null;
        }

        public ValueProviderResult GetValue(string key)
        {
            var p = GetProp(key);
            if (p != null)
            {
                object value = p.GetValue(_headers, null);
                string s = value.ToString(); // for simplicity, convert to a string
                return new ValueProviderResult(s, s, CultureInfo.InvariantCulture);
            }
            return null; // none
        }
    }
}

Excel on Azure

I amended my open-source CsvTools with an Excel reader. Once I read the Excel worksheet into a data table, I can use all the data-table operators from the core CsvTools, including enumeration, Linq over the rows, analysis, mutation, and saving back out as a CSV. So this gives me a Linq-to-Excel-on-Azure experience, which ought to win a buzzword bingo contest!

The Excel reader uses the OpenXml SDK, so it can run on Azure. This is useful because Excel as a COM object doesn’t run on servers, so I couldn’t upload Excel files to my ASP.NET projects without really fighting the security settings. With OpenXml, it’s easy since you’re just reading XML.
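
For a feel of what the OpenXml path looks like, here’s a minimal sketch that dumps cell text from the first worksheet part. This is my own illustration of the SDK, not the CsvTools implementation, and it assumes the workbook has a shared-string table (Excel stores most text that way):

    using System;
    using System.Linq;
    using DocumentFormat.OpenXml.Packaging;
    using DocumentFormat.OpenXml.Spreadsheet;

    // Sketch: read worksheet cells with the OpenXml SDK (no COM, server-safe).
    using (var doc = SpreadsheetDocument.Open(@"c:\temp\foo.xlsx", isEditable: false))
    {
        WorkbookPart wb = doc.WorkbookPart;
        WorksheetPart ws = wb.WorksheetParts.First(); // first part, not necessarily sheet #1
        var sst = wb.SharedStringTablePart.SharedStringTable;

        foreach (Row row in ws.Worksheet.Descendants<Row>())
        {
            foreach (Cell cell in row.Elements<Cell>())
            {
                string text = cell.CellValue == null ? "" : cell.CellValue.Text;

                // Text cells usually index into the shared-string table.
                if (cell.DataType != null && cell.DataType.Value == CellValues.SharedString)
                {
                    text = sst.ChildElements[int.Parse(text)].InnerText;
                }
                Console.Write(text + "\t");
            }
            Console.WriteLine();
        }
    }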

Here’s a little Azure MVC test page that demonstrates uploading an .xlsx file and displaying the contents on Azure:

(side note: deploying MVC to Azure is super easy, courtesy of this great tutorial).

I also need to give a shout-out to NuGet! The dependency management here was great. I have one NuGet package for the core CsvTools (which is just the CSV reader with no dependencies), and another package, CsvTools.Excel (which has a dependency on CsvTools and the OpenXml SDK).

The Excel reader is an extension method exposed off “DataTable.New”, so it’s easily discoverable.

Here’s a sample Excel sheet, foo.xlsx:

[Screenshot: a worksheet with Name and age columns]

And then the code to read it from C#:

private static void TestExcel()
{
    var dt = DataTable.New.ReadExcel(@"c:\temp\foo.xlsx");
    var names = from row in dt.Rows where int.Parse(row["age"]) > 10 select row["Name"];
    foreach (var name in names)
    {
        Console.WriteLine(name);
    }            
}

This example just reads the first worksheet in the workbook, which is the common case for my usage scenarios, where people are using Excel as a CSV format. It prints:

Ed
John

There are also some other overloads that return the whole list of worksheets.

public static IList<MutableDataTable> ReadExcelAllSheets(this DataTableBuilder builder, string filename);
public static IList<MutableDataTable> ReadExcelAllSheets(this DataTableBuilder builder, Stream input);
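
A quick usage sketch, assuming each MutableDataTable exposes the same Rows enumeration used in the sample above (the row-count line is illustrative):

    // Sketch: enumerate every worksheet in the workbook.
    var sheets = DataTable.New.ReadExcelAllSheets(@"c:\temp\foo.xlsx");
    foreach (var sheet in sheets)
    {
        // Each sheet comes back as a MutableDataTable.
        Console.WriteLine("rows: {0}", sheet.Rows.Count());
    }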
 

The reader is intended for Excel workbooks that represent tabular data and is not hardened against weird or malformed input.

Anyway, I’m finding this useful for some experiments, and sharing in case somebody else finds it useful too. 

(Now I just need to throw in a WebAPI parameter binding for DataTables, use WebAPI’s query string support, and add some data table Azure helpers and I will be the buzzword bingo champion!)
