
Oct. 27th, 2012

Getting back into it (part 3)

This post is about software development, but I'm disappointed to say it's not about feedparser or listparser development.

I'm back to working 14+ hour days, and much of my time has been spent writing automation scripts in a custom scripting language that can be interpreted by Tera Term. It has a feature set that makes what I'm doing fairly easy, but it lacks niceties that I'm accustomed to. For instance, everything is global. No, everything. There is no scope. Loops have break but not continue. Subroutines exist but not functions (no arguments, no return values...probably because it has no scope). Nevertheless, I've been able to accomplish a great deal with it.

Now I'm tackling a new problem: browser automation. I frequently work with HTTP-based interfaces, and it's pretty tedious. At first I thought "I need simplicity. I'll just use iMacros." Then I tried it and slapped myself, because while it seemed very easy to use, it couldn't be used as part of a larger script. So I installed Selenium. By the time I left work today I had some promising results, and I expect to have a great example by the end of tomorrow.

The biggest problem for me will be navigating through the stupid web interfaces of third-party vendors. Those guys lurvs their stupid frames, their Internet Explorer-only JavaScript, and their Flash-based login screens that look like they're sitting on the edge of still waters that reflect what you're typing. Thank God for search engines, because I would never have overcome the simple issues I ran into today.

It's made me hungry to get another feedparser release out the door!

Feb. 12th, 2012

A case for forgetting

My software never forgets what I tell it. I've accumulated over a dozen dozen passwords, door codes, and PINs over the years, and while I don't use most of those 150-some services these days, they're still clogging my password manager. My address book has hundreds of entries spanning almost a decade, and while all of those people are memorable, most I haven't thought about for years. My music player is full of music that I bought years ago but no longer enjoy, or would rather not hear outside of the holiday season.

What I want is software that can archive information I don't need or want, but can retrieve it when necessary. Most of the birthdays on my calendar are for people in my address book that I lost contact with long ago. I haven't logged into the Seventeen or Bust website for years, but I might again one day (despite the lecherous name, it's actually a math thing I ran into while taking courses on Chaos Theory). I don't want Christmas music to pop up when my music player's set to random, but that doesn't mean I want to delete it. Archiving and later retrieving forgotten content is a common concept in email clients, but for some reason its musical equivalent is embodied only by "Best of the Decade" compilations from Reader's Digest.

I hope one day I have software that meets this need. I'm also waiting anxiously for the "Best of the 90's" three disc set: I haven't listened to that one Tal Bachman song in about a minute!


Mar. 27th, 2011

Predictions and facts

After porting feedparser to Python 3, I've consistently tested every change on Python 2.4 through Python 3.1. That's four versions of Python 2 and two versions of Python 3. I've also started creating coverage reports for each test run so I can ensure that the tests are reasonably thorough. Unfortunately, the test suite takes at least five minutes to run on each version of Python 3 on my computer, so after installing Python 3.2, a full test run takes almost 20 minutes!

Dissatisfied, I sat down and used the inexorably awesome cProfile module to find out what was taking forever in Python 3.0. It turned out that pprint() was being called almost 3 million times. WTF?! Where is pprint() even being used?
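For anyone who hasn't pulled call counts out of cProfile before, here is a minimal sketch of the idea. It is not feedparser's test harness: `format_failure`, `check`, `run_tests`, and `count_calls` are hypothetical names made up for illustration, and the "test suite" is just a loop of passing comparisons.

```python
import cProfile
import pprint
import pstats


def format_failure(expected, actual):
    # Hypothetical helper: builds a pretty failure message with pprint.
    return "expected %s, got %s" % (
        pprint.pformat(expected),
        pprint.pformat(actual),
    )


def check(expected, actual):
    # Hypothetical test helper that builds its message up front,
    # whether or not the comparison fails.
    message = format_failure(expected, actual)
    return expected == actual, message


def run_tests(count):
    # Hypothetical stand-in for a test suite: every comparison passes.
    return [check({"n": i}, {"n": i}) for i in range(count)]


def count_calls(func, name, *args):
    """Profile func(*args); return the total calls to functions named `name`."""
    profiler = cProfile.Profile()
    profiler.runcall(func, *args)
    stats = pstats.Stats(profiler)
    total = 0
    # stats.stats maps (filename, lineno, funcname) to
    # (primitive calls, total calls, total time, cumulative time, callers).
    for (_filename, _lineno, funcname), entry in stats.stats.items():
        if funcname == name:
            total += entry[1]
    return total


if __name__ == "__main__":
    print(count_calls(run_tests, "format_failure", 1000))
```

For everyday use, `python -m cProfile -s calls your_script.py` prints the same kind of table sorted by call count, which is usually enough to spot a function that is being called far more often than it should be.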

Well. It turns out that pretty error messages were being preemptively created for every test, regardless of whether the test failed or not. You know -- just in case. I moved two lines of code, and Python 3.0 through 3.2 are now running faster than Python 2. I never would have guessed that pprint() would cause me so much watch-checking aggravation, but thank goodness I had tools to tell me precisely where to look for a solution!
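The shape of that kind of fix can be sketched in a few lines. This is an illustration under assumptions, not the actual feedparser test code: `format_failure`, `check_eager`, and `check_lazy` are hypothetical names, and the `calls` counter exists only to make the difference visible.

```python
import pprint

# Counter used only to show how often the expensive formatting runs.
calls = {"format": 0}


def format_failure(expected, actual):
    # Hypothetical helper: the expensive, pretty-printed failure message.
    calls["format"] += 1
    return "expected:\n%s\ngot:\n%s" % (
        pprint.pformat(expected),
        pprint.pformat(actual),
    )


def check_eager(expected, actual):
    # Builds the message before knowing whether it is needed.
    message = format_failure(expected, actual)
    if expected != actual:
        raise AssertionError(message)


def check_lazy(expected, actual):
    # Builds the message only when the comparison actually fails.
    if expected != actual:
        raise AssertionError(format_failure(expected, actual))


if __name__ == "__main__":
    for i in range(1000):
        check_eager({"n": i}, {"n": i})
    print("eager:", calls["format"])

    calls["format"] = 0
    for i in range(1000):
        check_lazy({"n": i}, {"n": i})
    print("lazy:", calls["format"])
```

Both helpers raise the same error on a failing comparison. The only difference is that the lazy version pays for pprint when a test fails instead of on every test.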

With that Aesop fresh in my mind, I've been really frustrated by a situation I've found myself in. I've been interacting with a guy who's been telling me that I need to use foo to fix all of my problems. I've expressed doubt multiple times since there's no profiling information, no performance numbers, and definitely no facts to support his case. Unfortunately, this guy keeps waving his hands and confidently predicting that foo will solve all of my problems and make me a cup of hot cocoa when I come in from the cold. Recently, however, I found out that foo might actually make it more difficult to get performance numbers. The stupid thing might actually insulate itself from profiling and reporting tools! When I brought this to his attention he replied (paraphrasing and emphasis my own): "It shouldn't be an issue. I PREDICT that we'll be able to PREDICT where the problems are."

Frankly, I don't need a prophet. I need an application profile. I need performance numbers. I need call counts and execution plans and logs. I won't be snookered by confidence, and you shouldn't be either.

Dec. 18th, 2010

Goodbye, Liferea

I've not seen eye-to-eye with the developers of my current feed aggregator, Liferea, for a long time. I became aware that we had different ideas about how a feed aggregator should work when they tried implementing a new message count in the notification icon, but happily that decision was reversed.

Most recently the difference was hammered home when I read they had implemented what I consider to be a non-feature: sorting your feeds in alphabetical order. This has been something that I've wanted for a long time, but they implemented it as a context menu option instead of as something that happens automatically. Most astonishing was the last bit of the blog entry:

In several discussion about the need of this feature the usual argument was about the feed lists being much too long and a specific feed to hard to locate. Sorting all feeds would speed up the feed lookup. To everyone having this problem: please use folders! Organize your feeds in topic folders, don't add more than ~10 feeds to a folder. Don't even start loosing the overview!

I personally do believe the sorting option shouldn't even be necessary. Using folders has too many advantages (recursive viewing, hiding read items) to not utilize them.

See, and I do use the folders to organize my feeds into categories, but the developers don't grasp the problem I'm having: if I want to remove a feed, I have to find the thing first. If I add a feed, I have to manually sort it so that I can always find it later. I never run into this when reading feeds; I run into it when organizing feeds. My inconvenience is trivialized by Lars Lindner's broad assertion that "you're not using the software correctly", and the conciliatory solution reflects a deep misunderstanding of my use case.

That said, I'm not writing this to condemn the developers' choice in how the software works, but instead to highlight the divide between how we think it should function. I respect that they want the software to work one way, and are writing it with that use case in mind (although why-oh-why they would ever add a feature without understanding why they're doing so is beyond me). But it's appropriate for me to find another piece of software that works by default the way I think it should.

Liferea is a powerful, featureful feed aggregator that has served me very reliably for a long time. I've felt several times during my years of use that it doesn't completely mesh with how I think, and I think that being told "you're not using the software correctly" is a good reason to set off in search of software that I will use correctly...I just hope that it's as good as Liferea.