kurtmckee, posts by tag: listparser - LiveJournal

listparser 0.18

I'm pleased to announce the release of listparser 0.18!

This release simply replaces the regular expression-based RFC 822 date parser with procedural code. The package is available on PyPI.

listparser is a Python module that parses subscription lists (also called reading lists) and returns all of the feeds and subscription lists that it finds. It supports OPML, RDF+FOAF, and the iGoogle exported settings format, and runs in Python 2.5 through Python 3.5, Jython, and PyPy.

download | documentation | source code

Tags: listparser, release

Posted at 01:35 pm

I'll be in the Netherlands

Hey everybody, I'm going to be in Zaltbommel and Waardenburg, Netherlands starting February 4th for two weeks. If anyone would like to get together with me while I'm there, please let me know! I'd be delighted to meet with you, have a drink, and talk about feedparser or Python in general (or anything else for that matter)!

UPDATE: I'm going to be in Skopje, Macedonia starting February 18th for a week. I'd love to meet with anyone around that area, too!

Tags: feedparser, life, listparser, python, work

Posted at 05:21 pm

Date parsing

I have lost patience with the RFC 822 date parsing in both feedparser and listparser. Back in 2009 when I started writing listparser I decided to use regular expressions to turn RFC 822 date strings into Python datetime objects. Earlier this year when I discovered that feedparser's RFC 822 parser had copied code from Python's rfc822 module I stripped it out and replaced it with the code I'd written for listparser.

Over time it's been necessary to tweak the code to support additional variations: extra commas, extra whitespace, swapped days and months, non-standard timezone modifications...so this weekend I decided to look at what the regular expression currently looks like. The result is not pretty:

(?:(?P<dayname>mon|tue|wed|thu|fri|sat|sun), )?(?P<day> *\\d{1
,2}) (?P<month>jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec
)(?:[a-z]*,?) (?P<year>(?:\\d{2})?\\d{2})(?: (?P<hour>\\d{2}):
(?P<minute>\\d{2})(?::(?P<second>\\d{2}))? (?:etc/)?(?P<tz>ut|
gmt(?:[+-]\\d{2}:\\d{2})?|[aecmp][sd]?t|[zamny]|[+-]\\d{4}))?

What's worse, to support swapped days and months it's necessary to create a second regular expression to match that, too. So I decided to rewrite the code using str.split() and a couple of dictionaries. I then ran timing tests on the whole affair, and I'm pretty pleased with the results so far: the new code runs just barely faster than the current regular expression-based code. I expect the new parser to land in feedparser after I integrate it into listparser.
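To give a feel for the str.split() approach, here is a minimal sketch. It is not the code that ships in listparser or feedparser: the names MONTHS, TIMEZONES, DAY_NAMES, and parse_rfc822() are made up for illustration, the timezone table covers only a handful of zones, and two-digit years follow the RFC 2822 convention (00-49 means 20xx, 50-99 means 19xx), which is an assumption on my part.

```python
import datetime

# Hypothetical lookup tables; the names are illustrative only.
MONTHS = {
    'jan': 1, 'feb': 2, 'mar': 3, 'apr': 4, 'may': 5, 'jun': 6,
    'jul': 7, 'aug': 8, 'sep': 9, 'oct': 10, 'nov': 11, 'dec': 12,
}
# Offsets from UTC, in hours, for a few named zones.
TIMEZONES = {
    'ut': 0, 'gmt': 0, 'z': 0,
    'est': -5, 'edt': -4, 'cst': -6, 'cdt': -5,
    'mst': -7, 'mdt': -6, 'pst': -8, 'pdt': -7,
}
DAY_NAMES = ('mon', 'tue', 'wed', 'thu', 'fri', 'sat', 'sun')


def parse_rfc822(date):
    """Return a naive UTC datetime, or None if the string can't be parsed."""
    # Treat commas as whitespace so extra commas and spaces don't matter.
    parts = date.lower().replace(',', ' ').split()
    # Drop an optional leading day name such as "mon" or "monday".
    if parts and parts[0][:3] in DAY_NAMES:
        parts = parts[1:]
    if len(parts) < 3:
        return None
    # Tolerate swapped days and months ("Dec 17" as well as "17 Dec").
    if parts[0][:3] in MONTHS:
        parts[0], parts[1] = parts[1], parts[0]
    try:
        day = int(parts[0])
        month = MONTHS[parts[1][:3]]
        year = int(parts[2])
        if len(parts[2]) == 2:
            # Assumed convention for two-digit years (see the note above).
            year += 2000 if year < 50 else 1900
        hour = minute = second = 0
        if len(parts) > 3:
            time_parts = [int(p) for p in parts[3].split(':')]
            time_parts += [0] * (3 - len(time_parts))  # seconds are optional
            hour, minute, second = time_parts[:3]
        offset = datetime.timedelta(0)
        if len(parts) > 4:
            tz = parts[4]
            if tz[0] in '+-' and len(tz) == 5:
                # Numeric offsets such as "-0600".
                sign = -1 if tz[0] == '-' else 1
                offset = sign * datetime.timedelta(
                    hours=int(tz[1:3]), minutes=int(tz[3:5]))
            else:
                offset = datetime.timedelta(hours=TIMEZONES[tz])
        # Subtract the offset to convert local time to UTC.
        return datetime.datetime(year, month, day, hour, minute, second) - offset
    except (KeyError, ValueError):
        return None
```

Because the swapped day and month are handled by a single comparison against the month dictionary, there is no need for a second pattern; parse_rfc822('Dec 17, 2012 11:27 GMT') and parse_rfc822('17 Dec 2012 11:27 GMT') go through the same code path.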

Tags: feedparser, listparser

Posted at 11:27 am

listparser 0.17 - "Territory expansion"

I'm pleased to announce that listparser 0.17 is available for immediate download! This release features support for Python 2.4 through 3.3, Jython 2.5.2 and 2.5.3, as well as PyPy 1.8.0. The codebase runs on all of these with no modification, and should still also run on IronPython 2.6.2 (although I'm currently not able to test this).

You can download a copy from the Python Package Index, or clone the git repository at GitHub. Bug reports and pull requests are always accepted at GitHub.

listparser is a Python library that parses subscription lists (also called reading lists) and returns all of the feeds, subscription lists, and "opportunity" URLs that it finds. It supports OPML, RDF+FOAF, and the iGoogle exported settings format.

Tags: listparser

Posted at 03:10 am

listparser now has a unified codebase

listparser now supports both Python 2 and Python 3 with a single codebase. I'd wondered how difficult it would be, and it turns out: not very! I'm delighted with the results, particularly because I can now generate coverage reports that take into account every interpreter that listparser runs on (including Jython, and excepting IronPython).

As a side benefit, I can take this experience and apply it to the feedparser codebase in the future. However, I have some bug reports and documentation changes to tend to first...

listparser is a Python library that parses subscription lists (also called reading lists) and returns all of the feeds, subscription lists, and "opportunity" URLs that it finds. It supports OPML, RDF+FOAF, and the iGoogle exported settings format.

Tags: listparser

Posted at 11:00 pm

listparser 0.16 - "Refresh"

I'm pleased to announce that a new version of listparser has been released!

The big change in this release is that users can now more easily install listparser in Python 3 environments, thanks to an updated setup.py file. This is made possible by running listparser through the 2to3 tool automatically if setup.py detects it is being run by a Python 3 interpreter.

Download it, and report back if you find any bugs!

listparser is a Python library that parses subscription lists (also called reading lists) and returns all of the feeds, subscription lists, and "opportunity" URLs that it finds. It supports OPML, RDF+FOAF, and the iGoogle exported settings format.

[ homepage | documentation | bugs ]

Tags: listparser, release

Posted at 03:14 am

listparser v0.15 - "A special day"

I am absolutely thrilled to announce the latest version of listparser. The big news is that it now supports IronPython, which means that listparser runs on three different Python interpreters (CPython, Jython, and now IronPython)!

There's a complication with IronPython, however; it doesn't ship with any XML parsers, which means listparser is dead in the water. Happily, there's a file called pyexpat.py that, when placed in the Python path, allows listparser and IronPython to work together. In my case I simply put pyexpat.py in the same directory as listparser.py.

Download it, use it, and report any bugs you find!

listparser is a Python library that parses subscription lists (also called reading lists) and returns all of the feeds, subscription lists, and "opportunity" URLs that it finds. It supports OPML, RDF+FOAF, and the iGoogle exported settings format.

[ homepage | documentation | repository | bugs ]

Tags: listparser

Posted at 05:37 am

listparser v0.14 - "A good year"

I'm pleased to announce the release of listparser v0.14!

The big news is that I added support for the relevant Yandex.ru FOAF extensions (translations from Google and Babelfish). With this, LiveJournal FOAF files are now supported (example). This is great news for LiveJournal users with feed readers that use listparser, as they could now subscribe to their own FOAF file and follow all of their friends' LiveJournal blogs!

Other than that, I cleaned up the code and improved the documentation. Check it out!

listparser is a Python library that parses subscription lists (also called reading lists) and returns all of the feeds, subscription lists, and "opportunity" URLs that it finds. It supports OPML, RDF+FOAF, and the iGoogle exported settings format.

[ homepage | download | repository | documentation | bugs ]

Tags: listparser

Posted at 10:34 am

LiveJournal FOAF file support

I'm really pleased to announce that, after reporting a LiveJournal bug over a year ago, the one-word patch - strangely attributed to another user - will soon be running live on the site! In celebration, I've started patching listparser to support LiveJournal FOAF files. My hope is that feed readers using listparser will allow users to subscribe to their friends list using their FOAF file.

I expect to release a new version soon, so be on the lookout!

P.S. - I may have found another bug in the LiveJournal code, how exciting is that?

UPDATED: I've reported the bug. I hope it gets fixed, and that this time they credit the fix to me instead of the intermediary support guy.

Tags: listparser

Posted at 01:37 am

listparser v0.13 - "Revelations"

I'm pleased to announce the release of listparser v0.13! This is an important bugfix release, and I recommend everyone upgrade immediately.

Bug fixes

The last release of listparser contained an infinite loop bug in the Injector code. Large documents that contained undeclared character references would trigger the bug, which occurred because the cache never got cleared, and each call to read() would return the same cached content over and over and over. This is fixed, and indeed the Injector code has been significantly simplified.
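For anyone curious about this class of bug, here is a minimal sketch of a stream wrapper that hands out cached content exactly once. It is not the actual Injector code; the CachingReader class and its peek_size parameter are hypothetical and only illustrate why the cache has to be emptied as it is served.

```python
import io


class CachingReader(object):
    """Hypothetical wrapper that serves pre-read content before the stream."""

    def __init__(self, stream, peek_size=16):
        self.stream = stream
        # Content read ahead of time, for example to inspect the start
        # of the document before handing it to the XML parser.
        self.cache = stream.read(peek_size)

    def read(self, size=-1):
        if self.cache:
            if size is None or size < 0:
                # Hand out the whole cache plus the rest of the stream.
                data, self.cache = self.cache + self.stream.read(), b''
            else:
                # Hand out part of the cache and keep only what is left.
                # If the cache were left untouched here, every call would
                # return the same bytes and a consumer that loops until
                # read() returns b'' would never terminate.
                data, self.cache = self.cache[:size], self.cache[size:]
            return data
        return self.stream.read(size)
```

A consumer that calls read(8) in a loop sees the cached bytes first, then the remainder of the underlying stream, and finally an empty bytes object that ends the loop.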

Additionally, I fixed a long-standing intermittent bug in the unit tests. Back when I released version 0.10, I mentioned that I was occasionally seeing a test fail; after a little sleuthing I discovered that the unit tests were sometimes starting to run before the server thread was ready to accept localhost URL requests. This bug is now fixed.
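One common way to close this kind of race is to have the server thread signal when it is ready and make the tests wait for that signal. The sketch below shows the idea with threading.Event and the Python 3 standard library; it is not listparser's test code, and Handler, ServerThread, and fetch_from_test_server() are hypothetical names.

```python
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b'<opml/>'
        self.send_response(200)
        self.send_header('Content-Length', str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the test output quiet


class ServerThread(threading.Thread):
    def __init__(self):
        super().__init__()
        self.daemon = True
        self.ready = threading.Event()
        self.server = None

    def run(self):
        # Port 0 asks the operating system for any free port.
        self.server = HTTPServer(('127.0.0.1', 0), Handler)
        # The socket is bound and listening once the constructor returns,
        # so it is now safe to let the tests make requests.
        self.ready.set()
        self.server.serve_forever()


def fetch_from_test_server():
    thread = ServerThread()
    thread.start()
    # Block until the server thread says it can accept connections.
    if not thread.ready.wait(timeout=5):
        raise RuntimeError('server thread did not start')
    port = thread.server.server_address[1]
    conn = HTTPConnection('127.0.0.1', port, timeout=5)
    try:
        conn.request('GET', '/')
        return conn.getresponse().read()
    finally:
        conn.close()
        thread.server.shutdown()
        thread.server.server_close()
        thread.join()
```

The important part is the call to ready.wait() before the first request: without it, the test can try to connect before the server socket exists.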

Unit testing

I've overhauled and modularized the unit testing code; most significantly, very few of the test files are retrieved from the server thread anymore, which has noticeably sped up the test suite. Additionally, a number of tests now call the function they're meant to test (such as _mkfile() and _rfc822()), rather than pushing everything through parse(). Finally, more tests were added, bringing the code coverage very close to 100% again.

Get it!

With all of these changes, I think that this may be the most stable listparser release ever. You can download listparser from PyPI and copy listparser.py into your project, or you can install it easily using easy_install:

$ easy_install listparser

Have fun!

listparser is a Python library that parses subscription lists (also called reading lists) and returns all of the feeds, subscription lists, and "opportunity" URLs that it finds. It supports OPML, RDF+FOAF, and the iGoogle exported settings format.

[ homepage | download | repository | documentation | bugs ]

Tags: listparser

Posted at 07:18 am

Apr. 24th, 2015

listparser 0.18

Jan. 13th, 2013

I'll be in the Netherlands

Dec. 17th, 2012

Date parsing

Dec. 16th, 2012

listparser 0.17 - "Territory expansion"

Apr. 18th, 2012

listparser now has a unified codebase

Dec. 17th, 2011

listparser 0.16 - "Refresh"

Nov. 15th, 2010

listparser v0.15 - "A special day"

Oct. 22nd, 2010

listparser v0.14 - "A good year"

Sep. 19th, 2010

LiveJournal FOAF file support

Feb. 1st, 2010

listparser v0.13 - "Revelations"
