Last night a long-running project of mine went live: pg_docbot v2.
For years, Jan Wieck provided a helper bot (rtfm_please) in the #postgresql IRC channel on the freenode network. Because of protocol changes on the freenode network, this bot stopped working. Together with some others we decided to write a quick and dirty new bot. As it is with dirty hacks, not everything was optimal: after timeouts the bot was not able to reconnect - more precisely, the POE framework did not even recognize the timeout. Extending the bot and adding new functionality was also complicated. For a while I collected all these problems in my personal bugtracker, and about two years ago I started a full rewrite.
Some of the new key features:
- pg_docbot's channel limit is gone: a user on the freenode network can only join 20-something channels, so the new bot was designed from the ground up to handle multiple IRC connections and circumvent this problem
- function to identify stale URLs: the new ?lost command shows all unconnected URLs
- registered users are now either "op" or "admin": all operators can issue ?learn and ?forget, admins can - of course - do everything
- new command to post to all channels: the ?wallchan command lets the bot post to all channels
- i18n: every channel has a configured language, default is English - all messages in this channel are posted in the configured language (if translation is available)
- watchdog on board: every session is monitored and reconnected, if necessary - no more "ads: can you please restart the bot?"
- nickname handling: every session monitors its (registered) nickname and will reclaim the nick if necessary, NickServ handling is included now as well
- commands are recognized in different languages: a nice add-on, by-product of i18n, most commands can be used in different languages - like "search" (English) and "suche" (German)
- bot can join and leave channels on the fly: not much to say about this, just that you can have the bot in a temporary PostgreSQL channel if you like
- channels can have passwords now: this works both for configured channels and for channels joined on the fly
- autojoin channels: configured but not joined channels are rejoined after a while, also it is possible to configure but not autojoin channels
- statistics: the bot keeps anonymous statistics about its usage, like ?search, ?learn, ?forget and so on
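To illustrate the first point in the list, here is a minimal sketch of how channels could be spread over several IRC connections so that no single connection runs into the per-user channel limit. This is not the bot's actual code: the function name `assign_channels` and the limit of 20 are assumptions made up for this example (the real limit on the network is only known here as "20-something").

```python
from typing import Dict, List

# Assumed per-connection limit, for illustration only.
MAX_CHANNELS_PER_CONNECTION = 20


def assign_channels(
    channels: List[str], limit: int = MAX_CHANNELS_PER_CONNECTION
) -> Dict[int, List[str]]:
    """Split channels into groups of at most `limit`, one group per connection."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    sessions: Dict[int, List[str]] = {}
    for index, channel in enumerate(channels):
        # Integer division picks the connection: 0..limit-1 -> 0, and so on.
        sessions.setdefault(index // limit, []).append(channel)
    return sessions


if __name__ == "__main__":
    # Hypothetical channel names, just to show the grouping.
    channels = [f"#pg-{n}" for n in range(45)]
    for session_id, joined in assign_channels(channels).items():
        print(f"connection {session_id}: {len(joined)} channels")
```

Each group would then be handled by its own connection, with its own nickname, watchdog and nick reclaim logic.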
There is still a lot to do, not all of my tickets are closed. If you want pg_docbot talking in your language, please send me translations. The pg_docbot code is on git.postgresql.org.
Next things on my todo list:
- verify each URL from time to time: mark unreachable as invalid
- intelligent sort order: not yet sure how to solve this problem, right now there is no specific sort order
- move pg_docbot to PostgreSQL infrastructure
- web interface: the bot should redirect the user to its website if there are more than, let's say, 2 or 3 URLs, to avoid flooding the IRC channels
- integration in postgresql.org website: the pg_docbot database contains very useful knowledge, there are plans to integrate this into the search on the main website
- integration with explain.depesz.com: every time the bot sees a link from a paste site, it should scan the content and generate a posting on explain.depesz.com
- monitor planet.postgresql.org: publish new postings in IRC channels
- allow better search: like using a regexp
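As a rough idea of what the regexp search from the last item could look like, here is a small sketch. It is only an illustration under assumptions: the real bot keeps its keywords and URLs in a database, while this example uses a hypothetical in-memory dictionary (`DOCS`) with placeholder URLs, and the function name `regex_search` is invented.

```python
import re
from typing import Dict, List

# Hypothetical stand-in for the keyword -> URLs data, with placeholder URLs.
DOCS: Dict[str, List[str]] = {
    "vacuum": ["https://example.org/sql-vacuum"],
    "vacuumdb": ["https://example.org/app-vacuumdb"],
    "explain": ["https://example.org/sql-explain"],
}


def regex_search(pattern: str, docs: Dict[str, List[str]] = DOCS) -> List[str]:
    """Return the URLs of all keywords matching `pattern` (case-insensitive)."""
    try:
        compiled = re.compile(pattern, re.IGNORECASE)
    except re.error:
        # An invalid pattern from a user yields no results instead of a crash.
        return []
    urls: List[str] = []
    for keyword in sorted(docs):
        if compiled.search(keyword):
            for url in docs[keyword]:
                if url not in urls:  # avoid posting the same URL twice
                    urls.append(url)
    return urls


if __name__ == "__main__":
    for url in regex_search("^vac"):
        print(url)
```

Catching the compile error matters here because the pattern comes straight from IRC users, and a broken pattern should not take the bot down.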