Tuesday 30 October 2018

Inline declaration of variables

There will be a new interesting feature in the upcoming Delphi 10.3 Rio: inline declared variables. I have permission to blog a little about it.

Marco Cantù already wrote about this new feature. I am not going to explain it, as I think he already did a good job doing that.

I just want to demo one of the main advantages of them: they are block-local, i.e. they are initialized at the point of declaration and are finalized at the end of the begin-end block they were declared in. This means that an interface or other managed variable declared that way is finalized at the end of the block.

To demo this, I made a little talkative interface, which tells us what it is doing, and when it is being created and being destroyed. It also has a name. Here is the code:

  ITalking = interface
    procedure Talk;

  TTalking = class(TInterfacedObject, ITalking)
    FName: string;
    constructor Create(const Name: string);
    procedure Talk;
    destructor Destroy; override;


{ TTalking }

constructor TTalking.Create(const Name: string);
  FName := Name;
  Writeln(Format('Creating %s', [Name]));

destructor TTalking.Destroy;
  Writeln(Format('Destroying %s', [FName]));

procedure TTalking.Talk;
  Writeln(Format('Hi, %s talking here!', [FName]));


And here is the demo:

procedure Test;
  var I1: IInterface := TTalking.Create('One'); // one way to declare an interface
  Writeln('// Before block');
    Writeln('// Before first inline var...');
    var I2 := TTalking.Create('Two') as ITalking; // the other way: type inference
    var I3: ITalking := TTalking.Create('Three');
    Writeln('// Do something useful here');
  Writeln('// After block');
  var I4: ITalking := TTalking.Create('Four');
  Writeln('// After declaration of Four');

And the output is:

Creating One
// Before block

// Before first inline var...
Creating Two
Hi, Two talking here!
Creating Three
Hi, Three talking here!
// Do something useful here
Destroying Three
Destroying Two

// After block
Creating Four
// After declaration of Four
Destroying Four
Destroying One

As you can see, the two interfaces declared and initialized inside the inner block are destroyed at the end of the same block. The interfaces declared and initialized in the outer block are destroyed at the end of the function.

This creates some interesting opportunities, e.g. for RAII using interfaces, or other auto-finalized structures. RAII doesn't have to be used for smart pointers. It can also be used for many other things, like locks, mutexes, temporary changes of cursors, etc. i.e. anything that is changed temporarily and restored at the end of a certain period. Just declare the RAII object inside the block and the change will be restored at the end of it. Just pass it an anonymous function to tell it what it must do at the end of the block (or in case of an exception).

But more about this later. FWIW, I also like the type inference, or that it can be used for for-loops.

Friday 5 October 2018

Why I like Stack Exchange

Because you can get extremely interesting answers, like the excellent answer to the question "What is Chirped Pulse Amplification, and why is it important enough to warrant a Nobel Prize?" on Physics Stack Exchange:

The 2018 Nobel Prize in Physics has just been announced, with half going to Arthur Ashkin for his work on optical tweezers and half going to Gérard Mourou and Donna Strickland for developing a technique called "Chirped Pulse Amplification". While normally the Wikipedia page is a reasonable place to turn to, in this case it's pretty flat and not particularly informative. So:

  • What is Chirped Pulse Amplification? What is the core of the method that really makes it tick?
  • What pre-existing problems did its introduction solve?
  • What technologies does it enable, and what research fields have become possible because of it?

I am not a physicist and just know the usual school physics, but have no problem understanding the answer. Excellent!

Monday 1 October 2018

The current state of generics in Delphi

To avoid duplication of generated code, the compiler builders of Embarcadero have done a nice job. They introduced new instrinsics like IsManagedType, GetTypeKind and IsConstantType (see this Stackoverflow answer), so they could make a function like the following generate a call to the exact function for the parametric type directly. This means that the code below "runs" completely inside the compiler, even the code in the called InternalAddMRef and similar dispatching routines.

function TList.Add(const Value: T): Integer;
  if IsManagedType(T) then
    if (SizeOf(T) = SizeOf(Pointer)) and (GetTypeKind(T) <> tkRecord) then
      Result := FListHelper.InternalAddMRef(Value, GetTypeKind(T))
    else if GetTypeKind(T) = TTypeKind.tkVariant then
      Result := FListHelper.InternalAddVariant(Value)
      Result := FListHelper.InternalAddManaged(Value);
  end else
  case SizeOf(T) of
    1: Result := FListHelper.InternalAdd1(Value);
    2: Result := FListHelper.InternalAdd2(Value);
    4: Result := FListHelper.InternalAdd4(Value);
    8: Result := FListHelper.InternalAdd8(Value);
    Result := FListHelper.InternalAddN(Value);

That, in its turn, means that if you code something like:

  List: TList<string>;
  List := TList<string>.Create;



is in fact directly compiled as


i.e. TList.Add does not have any runtime code at all, the result of "calling" it (see above) is the generation of a direct call to the DoAddString function. That is a simple method, not generic, not virtual or dynamic, like all the other methods of TListHelper, so the unused functions can be eliminated by the linker. This also means there is hardly any duplication of generated code anymore, i.e. if another part of the same executable also uses TList<string>.Add it will use the same function.

No code duplication?

Well, this means that if the class is coded like above, there is no unnecessary duplication of generated code anymore. That is cool and can probably reduce the expected bloat caused by using generics.

But it also means that it greatly reduces one of the advantages associated with generics: no unnecessary duplication of source code. You write a routine once, and only the type it works on is parametrized. The compiler takes care of generating code for each used routine and parametric type.

But then you still get the dreaded bloat. So you either duplicate a lot of source code (take a look at the code of TListHelper in System.Generics.Collections to see what I mean; for example, the functions DoAddWideString, DoAddDynArray, DoAddInterface and DoAddString contain exactly the same code, except for the lines where FItems^ is cast each to a different pointer type), which is almost as much work as writing a separate TList for each type, or you use the naïve approach, writing code only once, as generics should actually be used. But then you could get bloat again, i.e. it is well possible that the same routine gets generated multiple times, in different units of the same executable.

What to do?

There is not much we can do, except to emphatically ask Embarcadero to put the kind of logic that was implemented in TList<T> and other generic classes to avoid duplication of generated code, in the compiler and linker, so we don't have to write huge helper classes nor get the bloat.

In the meantime, you can do what Embarcadero did: if your class is used a lot for many different types, do a lot of "copy and paste generics". Otherwise, if you only have to deal with a few types and in a few places, simply use the naïve approach. Or you use a hybrid approach, using the intrinsics mentioned above and implementing the most used helper functions for the types you need most.

Let's hope they get a lot of requests and implement these things in the compiler and linker, like other languages with generics, e.g. C# or C++, do.

Rudy Velthuis