A Note on Python

January 7, 2025

The advantages of Python are well known, and the language remains hyped, but let’s look at some of its other sides.

Line Count

Many solutions are promoted by stressing how few lines something requires. That’s especially the case with Python, but it doesn’t take into account the whole Software Development Life Cycle (SDLC).

One does not only want to churn out some idiomatic example code. Instead, one has a large, complex software project that needs to be robust and withstand quality assurance during a long product lifetime. Hence one looks for qualities other than whether the very initial effort requires little code.

Contracts Matter

Python’s type system relies on so-called duck typing: “If it walks like a duck and it quacks like a duck, then it must be a duck”. First of all, it means that if a caller doesn’t live up to the contract of the API, the result is an obscure crash somewhere far down in the called code, and the caller has to figure out what went wrong. This is a problem because in QA there’s the mantra that exhaustive testing is impossible, which is one reason why we have to rely on code robustness.

I also think types are natural documentation. “What does this function do? What does it expect?” Types document that. Hence I argue, conceptually, for abstract base types such as Swift’s protocols, Java’s interfaces and so forth.
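To make the point concrete, here is a minimal sketch in Python itself. The names Quacker, Duck, Robot and make_noise are hypothetical and chosen only for illustration. With plain duck typing the mistake surfaces as an AttributeError inside the called code. Declaring the contract with typing.Protocol documents what the function expects, and lets the API reject a bad argument at its boundary.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Quacker(Protocol):
    """The contract: anything passed to make_noise() must be able to quack."""

    def quack(self) -> str: ...


class Duck:
    def quack(self) -> str:
        return "Quack!"


class Robot:
    """Does not fulfil the contract: it has no quack() method."""

    def beep(self) -> str:
        return "Beep."


def make_noise_untyped(animal):
    # Duck typing: a wrong argument fails only once the missing method is called.
    return animal.quack()


def make_noise(animal: Quacker) -> str:
    # The signature documents the expectation, and the check fails early
    # with a message that names the contract.
    if not isinstance(animal, Quacker):
        raise TypeError(f"{type(animal).__name__} does not implement Quacker")
    return animal.quack()
```

The runtime check only verifies that a quack attribute exists, not its signature. A static checker could catch more before the code runs, but that is a separate tool and not part of the language's execution model.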

Versus MATLAB

I have replaced MATLAB with Python for quantitative finance, but what I miss from commercially backed products such as the former is good documentation. For instance, the documentation for NumPy and Pandas lacks a user-centric perspective, and instead reflects what (we) programmers perhaps enjoy doing. For some reason, open source doesn’t seem to attract people to writing documentation. The problem could use systematic attention, and it is low-hanging fruit for Google Summer of Code, for instance.

Python is slow, and I see the solution in the computation area as a hack: the code path jumps into C/C++ land using NumPy and Pandas. This is fine, but the code becomes convoluted, and it reminds me of Perl’s “write once, understand never.” It feels as if Python is being pushed into an area where the result is a poor design.

Conclusion

What is Python useful for? I believe light, small tasks, such as doing something computationally advanced that isn’t a complex software project. The lack of typing means you cannot (or perhaps shouldn’t) write large applications, and performance may matter as well.

However, for shorter computation and scripting tasks it is excellent, and it is indeed in that area that it receives publicity.

Posted by Frans Englich
Filed in Uncategorized
Tags: coding, data-science, finance, matlab, programming, python, quant

Open Tech > Proprietary Vertical Silos

November 21, 2024

Amid the usual frantic activity in the tech sphere, I would say the innovation we see is limited, while the monopolies, in the form of vertical silos, grow enormously. Apple’s ecosystem is closed, and the same applies to, for instance, Meta’s.

As a result, innovation at the market level remains limited. For instance, the Internet and the Web have advanced suboptimally in recent years. They are primarily used as infrastructure by these corporations. This needs to change.

Or take e-books: books and reading, the very foundation of our civilisation, are locked by proprietary DRM. Perhaps open DRM via blockchain, combined with suitable governance, is the way to go. This needs to be looked at further.

One effect of the lack of development of open technologies is that the common denominator remains low. Innovation stays within the vertical silos, consumers become dependent on them, and the market remains stagnant.

My advice to those who want to see real innovation that isn’t locked to companies is to get involved in the development of open specifications and technologies.

Posted by Frans Englich
Filed in Uncategorized
Tags: art, books, consumer rights, DRM, entertainment, governance, legal, open-source, software, technology, Technology and Society

AI? Less Computer More Me, Please

June 20, 2024

AI is the current mega trend, and soon our mobiles and laptops will have the functionality integrated, bringing it closer to our daily lives.

While new technology such as AI surely has its uses, it can also be healthy to take a wider, critical perspective. With AI we can for instance:

Have emails written for us in a manner we perhaps don’t have the skills to present in person
Generate texts, such as CVs or cover letters, that we perhaps cannot back up
Generate texts that we don’t necessarily understand, can’t fact-check or can’t vouch for

My question is:

What do we human beings gain from computers being sophisticated on their own?

It is a shift, where computers have gone from being tools that assist to taking on a role of their own.

Here are areas where innovation is badly needed:

Mental health has plummeted, seemingly in connection with social media and mobiles. Is it VR glasses we need?
With the Google Effect, also known as digital amnesia, people tend to forget what they searched for. Considering the widespread usage of search engines in our lives, improvement in this area would be massive. (The scientific replication seems questionable, though, and the effect can be problematised.)
The fast food markets managed to dopamine-hack customers with their perfected products, and currently social media are successful at that as well, again at the cost of customers’ health. A form of “intelligent digital dopamine administration”, as wishful as it is, would mean the technology’s destructive impact is reduced.
Reading comprehension on screens is less efficient, and probably the majority read most of their content on screens. This is a big thing. Screens are massively marketed and used, but still they are largely less efficient than books. The causes could be many. Perhaps an invention of e-ink books would be a massive productivity boost.

The young IT entrepreneur is hailed and the markets value as they do. I believe it’s wise to question in what directions they take us.

AI and other new technologies are exciting, but ensuring that computers are helpful and constructive for us is imperative.

Posted by Frans Englich
Filed in AI, Hardware, Software development, Technology and Society
Tags: AI, Hardware, Software development, Technology and Society

XML to QObjects: QXmlToQObjectCreator

October 23, 2008

Thank you to all who attended Dev Days 2008 in Munich. For me it was really great to talk to so many users and hear about all the baffling projects that people pull off with Qt. And of course, to hear how people use Qt’s XML support and what they need from it.

One customer told me how sub-classes of QObject are used for representing data, and are converted to such from XML. So, why not add a little helper class to Qt for this?

The class, which is currently only a research idea and missed the feature freeze for Qt 4.5, is called QXmlToQObjectCreator, and hopefully the documentation explains it all:

QXmlToQObjectCreator API Documentation

In other words, it’s a very simple class that builds a QObject tree corresponding to the output of QXmlQuery. The current sketched code is pasted here, for those interested.

In what way can this class be made more useful?

Posted by Frans Englich
Filed in HTML/XML/XHTML, Qt, QtXmlPatterns

XSL-T and Qt

September 10, 2008

A couple of weeks ago, I merged the development branch for XSL-T into our main line, heading for Qt 4.5. The idea is that Qt will carry an XSL-T 2.0 implementation that is, as usual, cross-platform, solidly documented, and easy to use.

Using it should be straightforward. Either on the command line:

xmlpatterns yourStylesheet.xsl yourInputDocument -param myParam=myValue

Or using the C++ API[1]:

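QXmlQuery myQuery(QXmlQuery::XSLT20);
myQuery.bindVariable("myParam", QVariant("myValue"));
// Wrap the location in a QUrl: the QString overload of setQuery()
// takes the query source itself, not the address to load it from.
myQuery.setQuery(QUrl("http://example.com/myStylesheet.xsl"));
QFile out("outFile.xml");
out.open(QIODevice::WriteOnly);

myQuery.evaluateTo(&out);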

See the documentation for the QXmlQuery class on the overloads available for setQuery() and evaluateTo(), for instance.

However, because XSL-T 2.0 is such a beast (I agree that it’s larger than XQuery), we’ve decided to do this according to the “release early, release often” approach. The first release, in Qt 4.5, will carry a subset, which will subsequently be complemented in Qt 4.6. The current status is documented on the main page for the QtXmlPatterns module, which can be viewed in the documentation snapshot.

Therefore, while the current implementation probably falls short for more complex applications (such as DocBook XSL), it can run simpler things, users can plan ahead, and we trolls can receive feedback on which features and APIs are missing and what needs focus. So feel free to do that: send a mail to [email protected], or say hello on IRC (FransE, on Freenode).

The code is accessible through the Qt snapshots.

What is XSL-T anyway?

XSL-T is a programming language for transforming XML into XML, HTML or text. Some implementations, such as QtXmlPatterns or Saxon, provide mechanisms to map XML to other data sources and hence widen the scope of the language by letting the XML act as an abstract interface. Wikipedia has a good article on XSL-T. Version 2.0 of XSL-T extends the language heavily by putting a rigid type system and data model in the backbone, and adds many features that were a pain to miss when programming in XSL-T 1.0. XSL-T 2.0 uses XPath 2.0, and shares the same large function library as XQuery.

Over time, Java bindings through QtJambi and ECMAScript bindings through QtScript will likely arrive.

Posted by Frans Englich
Filed in HTML/XML/XHTML, Qt, QtXmlPatterns

QIODevice and QXmlQuery

December 11, 2007

I have not yet seen an API for XQuery in which integrating the data model, atomic values, nodes and all, into the interfacing language has been a walk in the park.

At the top of the list of things people tend to ask on the forums is “How do I get XML represented as a sequence of bytes in Java/C++ into my query?” The desired result is clear (a tree fragment for the query to operate on), but the way to get there is not that obvious, if you ask me.

There is no “bytestream” type in XQuery. Should the user build the tree herself and then pass the tree to the query? Should the implementation in some voodooish way be instructed how to treat a string or custom type? Shouldn’t the query engine do it, so that its scope of analysis is increased and it’s done the way it prefers?

What I sense has been the problem with some solutions is that they mix the data, the bytestream, with its interpretation.

In Qt this manifests itself in the question of how the content of a QIODevice should appear in a QXmlQuery. The way it’s now provided is that when a QIODevice is bound to a variable using QXmlQuery::bindVariable(), the query sees a URI (an instance of xs:anyURI) which behind the scenes maps to the QIODevice the user bound. Hence, if the purpose is to build an XML document, one passes the URI to the builtin fn:doc() function.

I hope this is clean. Since it’s handled like any other URI, custom extensions stay at a minimum, error reporting is consistent, and the interpretation hasn’t been coupled with the data. For instance, later on I hope to merge in support for XInclude and XQuery Update, and in those cases the URI is again simply passed to, for instance, fn:put().

One can lean quite heavily on URIs and the abstraction the XPath Data Model provides, it seems.
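The idea of keeping the bytes apart from their interpretation can be sketched in a few lines. This is a toy model in Python, not Qt's implementation: QueryContext, bind_device, the "device:" URI scheme and the doc() method are all hypothetical names. The point is only that the bound variable holds an opaque URI, and that parsing happens when a doc()-like function is asked to resolve it.

```python
import io
import xml.etree.ElementTree as ET


class QueryContext:
    """Hypothetical stand-in for a query engine's variable bindings."""

    def __init__(self):
        self._devices = {}   # generated URI -> byte stream
        self.variables = {}  # variable name -> value the query sees

    def bind_device(self, name, device):
        # The query never sees the bytes, only an opaque URI that maps to them.
        uri = f"device:{name}"
        self._devices[uri] = device
        self.variables[name] = uri

    def doc(self, uri):
        # Interpretation happens here, on demand: the bytes are parsed as XML
        # only because doc() was called on the URI.
        if uri not in self._devices:
            raise LookupError(f"cannot resolve {uri}")
        return ET.parse(self._devices[uri]).getroot()


ctx = QueryContext()
ctx.bind_device("input", io.BytesIO(b"<greeting>hello</greeting>"))
root = ctx.doc(ctx.variables["input"])
print(root.tag, root.text)  # greeting hello
```

Because an unknown URI fails in the same place as any other unresolvable URI would, error reporting stays in one code path, which is the consistency argument made above.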

Posted by Frans Englich
Filed in Uncategorized

Query Your Toaster

November 15, 2007

People have asked for Qt’s XQuery & XPath support to not be locked to a particular tree backend such as QDom, but to be able to work on arbitrary backends.

Any decent implementation (such as XQilla or Saxon) provides that nowadays in some way or another, but I’d say Patternist’s approach is novel, with its own share of advantages. So let me introduce what Qt’s snapshot carries.

<ul>
    {
        for $file in $exampleDirectory//file[@suffix = "cpp"]
        order by xs:integer($file/@size)
        return <li>
                    {string($file/@fileName)}, size: {string($file/@size)}
                  </li>
    }
</ul>

and the query itself was set up with:

QXmlQuery query;
FileTree fileTree(query.namePool());
query.setQuery(&file, QUrl::fromLocalFile(file.fileName()));
query.bindVariable("exampleDirectory", fileTree.nodeFor(QLibraryInfo::location(QLibraryInfo::ExamplesPath)));
if(!query.isValid())
     return InvalidQuery;
QFile out;
out.open(stdout, QIODevice::WriteOnly);
query.serialize(&out);

These two snippets are taken from the example found in examples/xmlpatterns/filetree/, which, in about 250 lines of code, virtualizes the file system into an XML document.

In other words, with the tree backend FileTree that the example has, it’s possible to query the file system, without converting it to a textual XML document or anything like that.

And that’s what the query does: it finds all the .cpp files found on any level in Qt’s example directory, and generates an HTML list, ordered by their file size. Maybe generating a view for image files in a folder would have been a tad more useful.

The usual approach to this is an abstract interface/class for dealing with nodes, which brings disadvantages such as heap allocations, and the need to allocate such structures oneself and hence to be able to affect the implementation of what one is going to query.

But a long time ago Patternist was rewritten to use Qt’s items & models pattern, which means any existing structure can be queried, without touching it. That’s what the FileTree class does: it subclasses QSimpleXmlNodeModel and hands out QXmlNodeModelIndex instances, which are light, stack-allocated values.

Combined with the engine trying to evaluate in a streamed and lazy manner to the degree that it thinks it can, this means fairly efficient solutions should be doable.

So what does this mean? It means that if you would like to, you can relatively cheaply use the XQuery language on top of your custom data structure, as long as it is somewhat hierarchical.

For instance, a backend could bridge the QObject tree, such that the XQuery language could be used to find Human Interface Guideline violations within widgets; molecular patterns in a chemistry application can concisely be identified with a two- or three-line XPath expression, and the documentation carries on with a couple of other examples. No need to convert QWidgets to nodes, or force a compact representation to sub-class an abstract interface.

A case I find intriguing would be a web robot that models the links between different pages as a graph, and finds invalid documents & broken links using the doc-available() function, or reports URIs that a website shouldn’t be linking to (such as a public site referencing intranet pages).
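The items & models idea can be sketched outside Qt as well. The following is a rough Python illustration, not the FileTree example's code: NodeIndex, FileTreeModel and cpp_files_by_size are hypothetical names, and the attribute names simply mirror those used in the query above. The index is a small value that identifies a node, while the model answers questions about it on demand, so the file system is never copied into a tree.

```python
import os
from typing import Iterator, NamedTuple


class NodeIndex(NamedTuple):
    """A light value identifying a node; it holds no tree structure itself."""
    path: str


class FileTreeModel:
    """Hypothetical model that answers questions about indexes on demand."""

    def children(self, index: NodeIndex) -> Iterator[NodeIndex]:
        if os.path.isdir(index.path):
            for name in sorted(os.listdir(index.path)):
                yield NodeIndex(os.path.join(index.path, name))

    def descendants(self, index: NodeIndex) -> Iterator[NodeIndex]:
        # Lazy, depth-first walk, similar in spirit to the // axis.
        # Note: this sketch does not guard against symlink cycles.
        for child in self.children(index):
            yield child
            yield from self.descendants(child)

    def attribute(self, index: NodeIndex, name: str):
        if name == "fileName":
            return os.path.basename(index.path)
        if name == "suffix":
            return os.path.splitext(index.path)[1].lstrip(".")
        if name == "size":
            return os.path.getsize(index.path)
        raise KeyError(name)


def cpp_files_by_size(model: FileTreeModel, root: NodeIndex):
    """Rough equivalent of the query above: .cpp files ordered by size."""
    hits = [i for i in model.descendants(root)
            if os.path.isfile(i.path) and model.attribute(i, "suffix") == "cpp"]
    hits.sort(key=lambda i: model.attribute(i, "size"))
    return [(model.attribute(i, "fileName"), model.attribute(i, "size"))
            for i in hits]
```

Calling cpp_files_by_size(FileTreeModel(), NodeIndex("some/directory")) returns (name, size) pairs. A different model class with the same three methods could expose any other hierarchical structure to the same function.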

Our API freeze is approaching. If something is needed but missing, let me know.

Posted by Frans Englich
Filed in HTML/XML/XHTML, Qt, QtXmlPatterns, Software development

Integrating Compiler Messages

October 23, 2007

Attention to detail is ok, but compiler messages have historically not received it. Here’s an example of GCC’s output:

qt/src/xml/query/expr/qcastingplatform.cpp: In member function 'bool CastingPlatform::prepareCasting():
qt/src/xml/query/expr/qcastas.cpp:117: instantiated from here
qt/src/xml/query/expr/qcastingplatform.cpp:85: error: no matching function for call to 'locateCaster(int)'
qt/src/xml/query/expr/qcastingplatform.cpp:93: note: candidates are: locateCaster(const bool&)

Typically, compiler messages have been subject to crude printf approaches, and dignity has been left out: localization, translation, consistency in quoting style (for instance), adapting language to users (e.g., not phrasing things the way compiler engineers prefer), good English, and just generally looking sensible.

Solving that requires quite some work, and that’s probably why it is often left out. To have line numbers, error codes, names of functions, and whatever else available and flowing through the system requires quite some plumbing and room in the design.

Another thing is that nowadays we really should expect compiler messages within IDEs or other graphical applications to be sanely typeset. If not, we’ve lost ourselves in all this UNIX stuff. Keywords and important phrases should be italic, emphasized, or colorized, depending on the GUI style.

For shuffling compiler messages around, it is customary to pass a set of properties: a URI, line number, column number, a descriptive string, and possibly an error code. Apart from falling short of the goals outlined in this text, this approach runs into a problem which I think is illustrated by the above example with GCC: what does one do if the message involves several locations?

Even if a message involves several locations, it is still one message and should be treated and presented as such. The approach of using a struct with properties falls short here, and chops the message into as many parts as it has locations.
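The chopping effect is easy to see in a small sketch. The Python below is illustrative only: Location, Message and to_flat_records are hypothetical names, not part of any compiler or of Qt. It models the GCC example as one message with three locations, then shows what the customary flat record forces on it.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Location:
    uri: str
    line: int
    column: Optional[int] = None  # the GCC output above carries no columns


@dataclass
class Message:
    """One message, any number of locations (hypothetical structure)."""
    text: str
    code: Optional[str] = None
    locations: List[Location] = field(default_factory=list)


def to_flat_records(message: Message):
    # What the customary (uri, line, column, text, code) record forces on us:
    # one record per location, so a single message is chopped into several.
    return [(loc.uri, loc.line, loc.column, message.text, message.code)
            for loc in message.locations]


gcc_error = Message(
    text="no matching function for call to 'locateCaster(int)'",
    locations=[
        Location("qt/src/xml/query/expr/qcastas.cpp", 117),
        Location("qt/src/xml/query/expr/qcastingplatform.cpp", 85),
        Location("qt/src/xml/query/expr/qcastingplatform.cpp", 93),
    ],
)
records = to_flat_records(gcc_error)
print(len(records))  # 3
```

Keeping the list of locations inside the message preserves the unit, but a consumer still has no way to tell which part of the text each location belongs to.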

For Patternist I wanted to make an attempt at improving messages. So far it is an improvement at least. For instance, for this message that the command line tool patternist outputs:

the installed QAbstractMessageHandler was passed a QSourceLocation and a message which read:

Operator + is not available between atomic values of type xs:integer and xs:string.

It was subsequently converted to local encoding and formatted with ECMA-48 color codes. (The format is not spec’d yet, it will probably be XHTML with specified class ids.)

While using markup for the message is a big improvement (it opens the door for formatting and all), this API still has the problem of dealing with multiple locations.

What is the solution to that?

Striking the balance between programmatic interpretation (so that, for instance, source document navigation is doable) and a message that reads naturally as one coherent unit is to… maybe duplicate the information, but each time tailored for a particular consumer?

In <l:location href="myQuery.xq" line="57" column="3">myQuery.xq at line 57, column 3</l:location>, function fn:doc() failed with code XPTY0004: the file <l:location href="myDocument.xml" line="93" column="9">myDocument.xml failed to parse at line 93, column 9</l:location>: unexpected token &.

This is complicated by the fact that language strings cannot be concatenated, since that prevents translation. But I think the above paragraph is possible to implement. As above, the message reads coherently, but still allows programmatic extraction. A language string and formatted data sit at opposite extremes, and maybe markup is the balance between them.
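As a rough check that such markup can serve both consumers, here is a small Python sketch using only the standard library. It assumes a variant of the message above that is wrapped in a root element, with the l prefix bound to a made-up namespace URI and the ampersand escaped, so that it is well-formed XML. The names NS, MESSAGE, locations and plain_text are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI; the text above only shows the "l" prefix.
NS = "http://example.com/locations"

MESSAGE = (
    f'<message xmlns:l="{NS}">In '
    '<l:location href="myQuery.xq" line="57" column="3">myQuery.xq at line 57, '
    'column 3</l:location>, function fn:doc() failed with code XPTY0004: the file '
    '<l:location href="myDocument.xml" line="93" column="9">myDocument.xml failed '
    'to parse at line 93, column 9</l:location>: unexpected token &amp;.</message>'
)


def locations(markup: str):
    """For tools: the machine-readable locations, in document order."""
    root = ET.fromstring(markup)
    return [(el.get("href"), int(el.get("line")), int(el.get("column")))
            for el in root.iter(f"{{{NS}}}location")]


def plain_text(markup: str) -> str:
    """For humans: the same message as one coherent sentence."""
    return "".join(ET.fromstring(markup).itertext())
```

An IDE could call locations() to offer navigation to both places, while a terminal could print plain_text() unchanged. A translator would work on the whole marked-up sentence, which avoids concatenating language strings.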

Would this give good compiler messages and allow slick IDE integration? If not, what would?

Posted by Frans Englich
Filed in Qt, QtXmlPatterns, Software development

XPath & XQuery in Qt

September 18, 2007

The Qt snapshots now include support for XPath 2.0 and XQuery 1.0.

The idea is that Qt 4.4, as part of its XML library, will ship with a C++ API for running and evaluating such queries. On the side, too, is a command line tool called patternist, for quickly testing queries, scripting and old-school web solutions. But who cares, blogs with screenshots are the thing:

Stronger XML support in Qt has been consistently asked for by users over a long time, with XPath being one of the main requests. Hopefully Patternist, with the help of KDE folks, users, and customers expressing what’s missing, will meet those needs. Considering the similarities of XQuery and XSL-T, Patternist also serves as a foundation for implementing XSL-T, if so decided.

For KDE folks all this might ring a bell. Patternist was indeed first developed for a long time in the KDE repository, as part of KDOM. We just thought it would be of a lot more use as part of Qt.

And I think exactly that makes this exciting. W3C’s XQuery working group has registered an astonishing number of exciting implementations. But for users, reliability is what matters in the end: whether bugs will be fixed, whether people can answer questions, whether the piece is maintained and documented. Persistence. Trolltech swiftly carries this on its shoulders (assuming I brush my teeth and all that).

Combined with the fact that Qt is open source, as is the Patternist SDK used for development, this is like eating some nasty chocolate while at the same time singing a little duet with Miss Piggy. I can’t sing, nor can Piggy (although she tries), but you get my point.

Humble modesty aside, it is worth mentioning that this still needs work. About 94% of the test suite is passed, the API needs more work, and there are performance issues.

Nailing test cases and trimming code paths are problems that have known solutions (though typically horrible to carry out). It is harder to know what people need and how they need it. It’s hard to guess what kind of APIs or extensions Amarok or KOffice or a GNOME or web application needs.

If you have input, feel free to add a comment to the blog, send a report to Trolltech, grab me (FransE) on the Open Projects IRC network, or ask a question or two on the qt-interest mailing list.

The documentation starts over here.

Posted by Frans Englich
Filed in Qt, QtXmlPatterns

Representing XML

January 11, 2007

Patternist, the XQuery/XPath/XSL-T framework, is abstracted to be able to use different tree implementations, in concept like Saxon. Up until now, Patternist has been using one that wrapped Qt’s QDom. When I started writing that very first tree backend, it was with the purpose of bootstrapping the rest of the code: a temporary solution that got the job done until the solution for production use arrived. QDom’s massive memory usage (my measurements say roughly 18 times the document size) is people’s usual complaint. The reason I stalled was that the XPath Data Model simply couldn’t be implemented with QDom, let alone efficiently. So what now?

This blog entry tinkers, although without accompanying code, with how to represent XML.

Read the rest of this entry »

Posted by Frans Englich
Filed in HTML/XML/XHTML