digitalmars.D - printf() metaprogramming challenge
- Walter Bright (111/111) May 23 2019 While up at night with jetlag at DConf, I started toying about solving a...
- Jonathan Marler (111/229) May 23 2019 It uses mixin, so not pretty, but it works...
- Les De Ridder (42/43) May 23 2019 Similar solution:
- Alex (4/22) May 23 2019 this can all be simplified:
- Yuxuan Shui (11/15) May 23 2019 What a coincidence, I had this exact problem today as well. It
- Andrei Alexandrescu (2/24) May 23 2019 Did you try .expand with the tuple?
- Yuxuan Shui (2/27) May 23 2019 It's 1 character shorter to just write someTuple[0..$] :)
- ag0aep6g (28/32) May 23 2019 I don't know if this satisfies the "no extra overhead" rule. Maybe when
- bpr (4/15) May 23 2019 Are you sure this works for betterC? It's been a while for me,
- Radu (5/20) May 24 2019 Indeed it doesn't work with -betterC flag. Easily testable on
- Petar Kirov [ZombineDev] (10/34) May 24 2019 In cases like this, one needs to use the enum lambda trick:
- Sebastiaan Koppe (3/12) May 24 2019 Ohh, that is nice one. Thanks!
- Radu (3/21) May 24 2019 Yes, good point! I forgot about this trick.
- Petar Kirov [ZombineDev] (13/14) May 24 2019 Best verified on d.godbolt.org. Compare:
- Radu (5/20) May 24 2019 I used the same method to generate C header files for a betterC
- Andrei Alexandrescu (3/23) May 24 2019 Interesting. These problems seem to be implementation-specific, not
- Jacob Carlborg (11/40) May 24 2019 This is kind of nice, but I would prefer to have a complete
- Walter Bright (5/11) May 24 2019 C's sprintf is already @nogc nothrow and pure. Doing our own is not that...
- Jacob Carlborg (11/17) May 24 2019 Technically it's not pure because it access `errno`, that's what I meant...
- Walter Bright (14/28) May 24 2019 The C standard doesn't say printf can set errno. Be that as it may, I di...
- Andrei Alexandrescu (8/16) May 24 2019 This 100x. Once C++ variadics were out, everybody and their cat had an
- Jonathan Marler (32/48) May 24 2019 It took me about an hour to port this "float to string"
- Walter Bright (14/18) May 24 2019 https://github.com/ulfjack/ryu says: "The Java implementation differs fr...
- Jonathan Marler (10/33) May 24 2019 I didn't design an implementation in an hour, I just ported one :)
- Ola Fosheim Grøstad (10/14) May 25 2019 It is quite interesting that you get that performance without
- Patrick Schluter (12/22) May 25 2019 L1 instruction cache are small and the cost of code bloat is only
- Ola Fosheim Grøstad (8/16) May 25 2019 Yes, in benchmarking one should only test full applications… I
- Mike Franklin (20/24) May 24 2019 That may be true, but one problem with `printf` is it is much too
- Jonathan Marler (3/8) May 24 2019 My implementation is "pay for what you use". A pure D
- Andrei Alexandrescu (5/26) May 25 2019 The high impact part is the metaprogramming and introspection machinery....
- Ola Fosheim Grøstad (4/8) May 25 2019 Programmers are looking for solutions, not machinery…
- bpr (12/17) May 25 2019 I'd think you'd be commenting on the "Issue 5710" thread then, as
- Ola Fosheim Grøstad (22/32) May 25 2019 AFAIK, Ulf Adams is stating that the Java implementation is
- Ola Fosheim Grøstad (23/33) May 25 2019 AFAIK, Ulf Adams is stating that the Java specification is
- Joseph Rushton Wakeling (12/23) May 25 2019 FWIW the Ryu algorithm looks like a serious piece of work — see
- Walter Bright (15/29) May 25 2019 Thank you. I've saved a copy of the paper. It is indeed
While up at night with jetlag at DConf, I started toying with solving a small
problem. In order to use printf(), the format specifiers in the printf format
string have to match the types of the rest of the parameters. This is well
known to be brittle and error-prone, especially when refactoring the types of
the arguments.
(Of course, this is not a problem with writefln() and friends, but that isn't
available in the dmd front end, nor when using betterC. Making printf better
would mesh nicely with betterC. Note that many C compilers have extensions to
tell you if there's a mismatch, but they won't fix it for you.)
I thought why not use D's metaprogramming to fix it. Some ground rules:
1. No extra overhead
2. Completely self-contained
3. Only %s specifiers are rewritten
4. %% is handled
5. diagnose mismatch between number of specifiers and number of arguments
Here's my solution:
int i;
dprintf!"hello %s %s %s %s betty\n"(3, 4.0, &i, "abc".ptr);
gets rewritten to:
printf("hello %d %g %p %s betty\n", 3, 4.0, &i, "abc".ptr);
The code at the end accomplishes this. Yay!
But what I'd like to do is extend it to convert a `string s` argument into a
`cast(int)s.length, s.ptr` tuple and use the "%.*s" specifier for it.
I completely failed at that. I suspect the language has a deficiency in
manipulating expression tuples.
Does anyone see a way to make this work?
Note: In order to minimize template bloat, I refactored most of the work into a
regular function, minimizing the size of the template expansions.
------ Das Code ------------
import core.stdc.stdio : printf;

template Seq(A ...) { alias Seq = A; }

int dprintf(string f, A ...)(A args)
{
    enum Fmts = Formats!(A);
    enum string s = formatString(f, Fmts);
    __gshared const(char)* s2 = s.ptr;
    // The slices are hard-coded for the four-argument example above
    return printf(Seq!(s2, args[0..2], args[2..4]));
}

template Formats(T ...)
{
    static if (T.length == 0)
        enum Formats = [ ];
    else static if (T.length == 1)
        enum Formats = [Spec!(T[0])];
    else
        enum Formats = [Spec!(T[0])] ~ Formats!(T[1 .. T.length]);
}

template Spec(T : byte) { enum Spec = "%d"; }
template Spec(T : short) { enum Spec = "%d"; }
template Spec(T : int) { enum Spec = "%d"; }
template Spec(T : long) { enum Spec = "%lld"; }
template Spec(T : ubyte) { enum Spec = "%u"; }
template Spec(T : ushort) { enum Spec = "%u"; }
template Spec(T : uint) { enum Spec = "%u"; }
template Spec(T : ulong) { enum Spec = "%llu"; }
template Spec(T : float) { enum Spec = "%g"; }
template Spec(T : double) { enum Spec = "%g"; }
template Spec(T : real) { enum Spec = "%Lg"; }
template Spec(T : char) { enum Spec = "%c"; }
template Spec(T : wchar) { enum Spec = "%c"; }
template Spec(T : dchar) { enum Spec = "%c"; }
template Spec(T : immutable(char)*) { enum Spec = "%s"; }
template Spec(T : const(char)*) { enum Spec = "%s"; }
template Spec(T : T*) { enum Spec = "%p"; }

/******************************************
 * Replace %s format specifiers in f with corresponding specifiers in A[].
 * Other format specifiers are left as is.
 * Number of format specifiers must match A.length.
 * Params:
 *      f = printf format string
 *      A = replacement format specifiers
 * Returns:
 *      replacement printf format string
 */
string formatString(string f, string[] A ...)
{
    string r;
    size_t i;
    size_t ai;
    while (i < f.length)
    {
        if (f[i] != '%' || i + 1 == f.length)
        {
            r ~= f[i];
            ++i;
            continue;
        }
        char c = f[i + 1];
        if (c == '%')
        {
            r ~= "%%";
            i += 2;
            continue;
        }
        assert(ai < A.length, "not enough arguments");
        string fmt = A[ai];
        ++ai;
        if (c == 's')
        {
            r ~= fmt;
            i += 2;
            continue;
        }
        r ~= '%';
        ++i;
        continue;
    }
    assert(ai == A.length, "not enough formats");
    return r;
}
----- End Of Das Code ----------
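The rewrite step can be checked entirely at compile time. Below is a minimal sketch, not the code from the post: `rewriteFormat` is a hypothetical, simplified stand-in for formatString that only handles %s and %%, and (unlike the original) does not consume an argument for other specifiers. Putting the call inside a static assert forces it through CTFE.

```d
// Simplified stand-in for formatString (hypothetical name).
// Only "%s" consumes a replacement; "%%" is copied through unchanged.
string rewriteFormat(string f, string[] specs...)
{
    string r;
    size_t ai;
    for (size_t i = 0; i < f.length; ++i)
    {
        if (f[i] == '%' && i + 1 < f.length)
        {
            if (f[i + 1] == '%')
            {
                r ~= "%%";
                ++i;            // skip the second '%'
                continue;
            }
            if (f[i + 1] == 's')
            {
                assert(ai < specs.length, "more %s specifiers than arguments");
                r ~= specs[ai++];
                ++i;            // skip the 's'
                continue;
            }
        }
        r ~= f[i];
    }
    assert(ai == specs.length, "more arguments than %s specifiers");
    return r;
}

// Evaluated by the compiler, so a wrong rewrite fails the build.
static assert(rewriteFormat("hello %s %s\n", "%d", "%g") == "hello %d %g\n");
static assert(rewriteFormat("100%% of %s", "%u") == "100%% of %u");

void main() {}
```

Because the function is an ordinary function rather than a template, it is compiled once and only the small dprintf wrapper is instantiated per call site, which is the template-bloat point made in the note above.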
May 23 2019
On Thursday, 23 May 2019 at 19:33:15 UTC, Walter Bright wrote:
> [snip]
It uses mixin, so not pretty, but it works...
void main()
{
    int i = 0;
    dprintf!"hello %s %s %s %s betty\n"(3, 4.0, &i, "abc".ptr);
    const msg = "AAA!";
    dprintf!"A dstring '%s'\n"(msg[0 .. 3]);
}

template Seq(A ...) { alias Seq = A; }

int dprintf(string f, A ...)(A args)
{
    import core.stdc.stdio : printf;
    enum Fmts = Formats!(A);
    enum string s = formatString(f, Fmts);
    __gshared const(char)* s2 = s.ptr;
    // Build the text of the printf call at compile time
    enum call = function() {
        import std.conv : to;
        string printfCall = "printf(s2";
        foreach(i, T; A)
        {
            static if (is(T : string))
            {
                // "%.*s" takes an int precision, so the length is cast to int
                printfCall ~= ", cast(int)args[" ~ i.to!string
                    ~ "].length, args["
                    ~ i.to!string ~ "].ptr";
            }
            else
            {
                printfCall ~= ", args[" ~ i.to!string ~ "]";
            }
        }
        return printfCall ~ ")";
    }();
    //pragma(msg, call); // uncomment to see the final call
    return mixin(call);
}

template Formats(T ...)
{
    static if (T.length == 0)
        enum Formats = [];
    else static if (T.length == 1)
        enum Formats = [Spec!(T[0])];
    else
        enum Formats = [Spec!(T[0])] ~ Formats!(T[1 .. T.length]);
}

template Spec(T : byte) { enum Spec = "%d"; }
template Spec(T : short) { enum Spec = "%d"; }
template Spec(T : int) { enum Spec = "%d"; }
template Spec(T : long) { enum Spec = "%lld"; }
template Spec(T : ubyte) { enum Spec = "%u"; }
template Spec(T : ushort) { enum Spec = "%u"; }
template Spec(T : uint) { enum Spec = "%u"; }
template Spec(T : ulong) { enum Spec = "%llu"; }
template Spec(T : float) { enum Spec = "%g"; }
template Spec(T : double) { enum Spec = "%g"; }
template Spec(T : real) { enum Spec = "%Lg"; }
template Spec(T : char) { enum Spec = "%c"; }
template Spec(T : wchar) { enum Spec = "%c"; }
template Spec(T : dchar) { enum Spec = "%c"; }
template Spec(T : string) { enum Spec = "%.*s"; }
template Spec(T : immutable(char)*) { enum Spec = "%s"; }
template Spec(T : const(char)*) { enum Spec = "%s"; }
template Spec(T : T*) { enum Spec = "%p"; }

/******************************************
 * Replace %s format specifiers in f with corresponding specifiers in A[].
 * Other format specifiers are left as is.
 * Number of format specifiers must match A.length.
 * Params:
 *      f = printf format string
 *      A = replacement format specifiers
 * Returns:
 *      replacement printf format string
 */
string formatString(string f, string[] A ...)
{
    string r;
    size_t i;
    size_t ai;
    while (i < f.length)
    {
        if (f[i] != '%' || i + 1 == f.length)
        {
            r ~= f[i];
            ++i;
            continue;
        }
        char c = f[i + 1];
        if (c == '%')
        {
            r ~= "%%";
            i += 2;
            continue;
        }
        assert(ai < A.length, "not enough arguments");
        string fmt = A[ai];
        ++ai;
        if (c == 's')
        {
            r ~= fmt;
            i += 2;
            continue;
        }
        r ~= '%';
        ++i;
        continue;
    }
    assert(ai == A.length, "not enough formats");
    return r;
}
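
The string-building step can be looked at in isolation. What follows is a rough sketch under assumptions, not part of the program above: `argList` is a hypothetical helper that produces only the argument text that would be spliced into the printf call, so the generated string can be inspected with a static assert before it is ever mixed in.

```d
// Hypothetical helper: returns the text of the argument list for types A.
// String-like arguments become "cast(int) args[i].length, args[i].ptr".
string argList(A...)()
{
    import std.conv : to;
    string r;
    foreach (i, T; A)   // unrolled at compile time over the type sequence
    {
        static if (is(T : const(char)[]))
            r ~= ", cast(int) args[" ~ i.to!string ~ "].length, args["
                ~ i.to!string ~ "].ptr";
        else
            r ~= ", args[" ~ i.to!string ~ "]";
    }
    return r;
}

// The generated text is an ordinary compile-time string.
static assert(argList!(int, string)
    == ", args[0], cast(int) args[1].length, args[1].ptr");
static assert(argList!(double) == ", args[0]");

void main() {}
```

std.conv is used only during CTFE here, so nothing from it ends up in the generated call itself.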
May 23 2019
On Thursday, 23 May 2019 at 22:48:33 UTC, Jonathan Marler wrote:
> It uses mixin, so not pretty, but it works...
Similar solution:
--- printf.d 2019-05-24 00:48:44.840543714 +0200
+++ printf_s.d 2019-05-24 00:52:47.829178613 +0200
@@ -1,13 +1,12 @@
 import core.stdc.stdio : printf;
 
-template Seq(A ...) { alias Seq = A; }
-
 int dprintf(string f, A ...)(A args)
 {
     enum Fmts = Formats!(A);
     enum string s = formatString(f, Fmts);
+    alias args_ = Args!(args);
     __gshared const(char)* s2 = s.ptr;
-    return printf(Seq!(s2, args[0..2], args[2..4]));
+    mixin( q{return printf(s2, } ~ args_ ~ q{);} );
 }
 
 template Formats(T ...)
@@ -42,6 +41,22 @@
 template Spec(T : const(char)*) { enum Spec = "%s"; }
 template Spec(T : T*) { enum Spec = "%p"; }
 
+template Spec(T : string) { enum Spec = "%.*s"; }
+
+template Args(A ...)
+{
+    static if (A.length == 0)
+        enum Args = "";
+    else static if (A.length == 1)
+        enum Args = Arg!(A[0]);
+    else
+        enum Args = Arg!(A[0]) ~ ", " ~ Args!(A[1 .. A.length]);
+}
+
+template Arg(alias string arg) { enum Arg = "cast(int)"~arg.stringof~".length,"~arg.stringof~".ptr"; }
+
+template Arg(alias arg) { enum Arg = arg.stringof; }
+
 /******************************************
  * Replace %s format specifiers in f with corresponding specifiers in A[].
  * Other format specifiers are left as is.
May 23 2019
> template Spec(T : byte) { enum Spec = "%d"; }
> template Spec(T : short) { enum Spec = "%d"; }
> template Spec(T : int) { enum Spec = "%d"; }
> template Spec(T : long) { enum Spec = "%lld"; }
> template Spec(T : ubyte) { enum Spec = "%u"; }
> template Spec(T : ushort) { enum Spec = "%u"; }
> template Spec(T : uint) { enum Spec = "%u"; }
> template Spec(T : ulong) { enum Spec = "%llu"; }
> template Spec(T : float) { enum Spec = "%g"; }
> template Spec(T : double) { enum Spec = "%g"; }
> template Spec(T : real) { enum Spec = "%Lg"; }
> template Spec(T : char) { enum Spec = "%c"; }
> template Spec(T : wchar) { enum Spec = "%c"; }
> template Spec(T : dchar) { enum Spec = "%c"; }
> template Spec(T : string) { enum Spec = "%.*s"; }
> template Spec(T : immutable(char)*) { enum Spec = "%s"; }
> template Spec(T : const(char)*) { enum Spec = "%s"; }
> template Spec(T : T*) { enum Spec = "%p"; }
This can all be simplified:
static foreach (k, v; ["byte":"%d", "short":"%d", ...])
    mixin(`template Spec(T : `~k~`) { enum Spec = "`~v~`"; }`);
The string mixin is not necessary, but it is easier than using an AliasSeq.
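
A complete, compilable form of this idea might look like the sketch below. It is an assumption-laden variant rather than the exact code suggested: it iterates over an array of [type, specifier] pairs instead of an associative array, so the generation order is fixed, and it lists only three types for brevity.

```d
// Generate one Spec specialization per table row with static foreach.
// Each row is [type name, printf specifier]; the table is illustrative only.
static foreach (p; [["int", "%d"], ["double", "%g"], ["const(char)*", "%s"]])
    mixin(`template Spec(T : ` ~ p[0] ~ `) { enum Spec = "` ~ p[1] ~ `"; }`);

// The generated templates behave like the hand-written ones.
static assert(Spec!int == "%d");
static assert(Spec!double == "%g");
static assert(Spec!(const(char)*) == "%s");

void main() {}
```

Adding a type then means adding one row to the table rather than another template declaration.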
May 23 2019
On Thursday, 23 May 2019 at 19:33:15 UTC, Walter Bright wrote:
> [snip]
> I completely failed at that. I suspect the language has a deficiency in manipulating expression tuples.
What a coincidence, I had this exact problem today as well. It seems that currently the only way to do this is either with mixins or with a tuple.
Assuming you already have all of the arguments in a tuple:
    auto args = tuple(...);
and args[x] is a string, you can do this:
    auto args_prime = tuple(args[0..x], args[x].length, args[x].ptr, args[x+1..$]);
You then need to do some template magic to expand all such arguments... Using a mixin is probably better.
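
For a single string at a known position, the tuple splice can be sketched as below. This is only an illustration under assumptions: `spliceAt1` is a hypothetical helper that handles a string at index 1 and nothing else, and it relies on std.typecons, so it does not address the general case or betterC.

```d
import core.stdc.stdio : printf;
import std.typecons : tuple;

// Hypothetical helper: replaces the string field at index 1 with
// an (int length, pointer) pair and keeps the other fields as they are.
auto spliceAt1(T)(T args)
{
    return tuple(args.expand[0 .. 1],
                 cast(int) args[1].length, args[1].ptr,
                 args.expand[2 .. $]);
}

void main()
{
    string name = "betty";
    auto expanded = spliceAt1(tuple(3, name, 4.0));
    // .expand turns the tuple back into an argument list for printf
    printf("hello %d %.*s %g\n", expanded.expand);
}
```

Generalising this to any number of strings at arbitrary positions is the "template magic" mentioned above, which is why the mixin route may end up simpler.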
May 23 2019
On 5/23/19 6:58 PM, Yuxuan Shui wrote:
> On Thursday, 23 May 2019 at 19:33:15 UTC, Walter Bright wrote:
>> [snip]
>> I completely failed at that. I suspect the language has a deficiency in manipulating expression tuples.
> What a coincidence, I had this exact problem today as well. It seems currently the only way to do this is either with mixins, or using tuple.
> Assuming you already have all of the arguments in a tuple:
>     auto args = tuple(...);
> And args[x] is a string, you can do this:
>     auto args_prime = tuple(args[0..x], args[x].length, args[x].ptr, args[x+1..$]);
> You then need to do some template magic to expand all such arguments... Using mixin is probably better.
Did you try .expand with the tuple?
May 23 2019
On Friday, 24 May 2019 at 00:41:31 UTC, Andrei Alexandrescu wrote:
> On 5/23/19 6:58 PM, Yuxuan Shui wrote:
>> On Thursday, 23 May 2019 at 19:33:15 UTC, Walter Bright wrote:
>>> [snip]
>>> I completely failed at that. I suspect the language has a deficiency in manipulating expression tuples.
>> What a coincidence, I had this exact problem today as well. It seems currently the only way to do this is either with mixins, or using tuple.
>> Assuming you already have all of the arguments in a tuple:
>>     auto args = tuple(...);
>> And args[x] is a string, you can do this:
>>     auto args_prime = tuple(args[0..x], args[x].length, args[x].ptr, args[x+1..$]);
>> You then need to do some template magic to expand all such arguments... Using mixin is probably better.
> Did you try .expand with the tuple?
It's 1 character shorter to just write someTuple[0..$] :)
May 23 2019
On 23.05.19 21:33, Walter Bright wrote:
> But what I'd like it to do is to extend it to convert a `string s` argument into `cast(int)s.length, s.ptr` tuple and use the "%.*s" specifier for it.
[...]
> Does anyone see a way to make this work?
I don't know if this satisfies the "no extra overhead" rule. Maybe when `arrlen` and `arrptr` are inlined?
int dprintf(string f, A ...)(A args)
{
    enum Fmts = Formats!(A);
    enum string s = formatString(f, Fmts);
    __gshared const(char)* s2 = s.ptr;
    import std.meta: staticMap;
    return printf(Seq!(s2, staticMap!(arg, args)));
}

template arg(alias a)
{
    static if (is(typeof(a) == string))
        alias arg = Seq!(arrlen!a, arrptr!a);
    else
        alias arg = a;
}

// "%.*s" takes an int precision, so the length is cast to int
auto arrlen(alias a)() { return cast(int) a.length; }
auto arrptr(alias a)() { return a.ptr; }

template Spec(T : string) { enum Spec = "%.*s"; }

void main()
{
    int i;
    dprintf!"hello %s %s %s %s betty %s\n"(3, 4.0, &i, "abc".ptr, "foobar");
}

// ... rest of the code unchanged ...
May 23 2019








