New PostgreSQL pg_docbot is live

Posted by ads' corner on Friday, 2012-08-17
Posted in [Perl][Postgresql-News]

Last night a long-running project of mine went live: pg_docbot v2.

For years, Jan Wieck provided a helper bot (rtfm_please) in the #postgresql IRC channel on the freenode network. Protocol changes on freenode eventually broke this bot, so a few of us decided to write a quick and dirty replacement. As usual with dirty hacks, not everything was optimal: after a timeout the bot was not able to reconnect - more precisely, the POE framework did not even notice the timeout. Extending the bot and adding new functionality was also complicated. For a while I collected all these problems in my personal bug tracker, and about two years ago I started a full rewrite.

Some of the new key features:

  • pg_docbot’s channel limit is gone: a user on the freenode network can only join 20-something channels, so the new bot was designed from the ground up to handle multiple IRC connections and circumvent this limit
  • function to identify stale urls: the new ?lost command shows all unconnected urls
  • registered users are now either “op” or “admin”: all operators can issue ?learn and ?forget, admins can - of course - do everything
  • new command to post to all channels: the ?wallchan command lets the bot post to all channels
  • i18n: every channel has a configured language, default is English - all messages in this channel are posted in the configured language (if translation is available)
  • watchdog on board: every session is monitored and reconnected, if necessary - no more “ads: can you please restart the bot?”
  • nickname handling: every session monitors its (registered) nickname and reclaims the nick if necessary, NickServ handling is included now as well
  • commands are recognized in different languages: a nice by-product of i18n, most commands can be used in different languages - like “search” (English) and “suche” (German)
  • bot can join and leave channels on the fly: not much to say about this, just that you can have the bot in a temporary PostgreSQL channel if you like
  • channels can have passwords now: this works both for configured channels and for channels joined on the fly
  • autojoin channels: configured but not joined channels are rejoined after a while, also it is possible to configure but not autojoin channels
  • statistics: the bot keeps anonymous statistics about command usage, like ?search, ?learn, ?forget and so on
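
To illustrate the idea behind the multi-connection design, here is a minimal sketch of how a list of channels could be spread over several IRC sessions. It is written in Python for illustration only and is not the bot's actual (Perl) code. The limit of 20 channels per connection, the `Session` class and the `assign_channels` function are assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import List

# Assumed per-connection limit; the real freenode limit is only
# described as "20-something" above.
MAX_CHANNELS_PER_SESSION = 20


@dataclass
class Session:
    """One IRC connection (hypothetical structure for this sketch)."""
    number: int
    channels: List[str] = field(default_factory=list)


def assign_channels(channels: List[str],
                    limit: int = MAX_CHANNELS_PER_SESSION) -> List[Session]:
    """Fill sessions one after another, opening a new one when the
    current session has reached the channel limit."""
    sessions: List[Session] = []
    for name in channels:
        if not sessions or len(sessions[-1].channels) >= limit:
            sessions.append(Session(number=len(sessions) + 1))
        sessions[-1].channels.append(name)
    return sessions


if __name__ == "__main__":
    wanted = [f"#channel{i}" for i in range(45)]
    for session in assign_channels(wanted):
        print(session.number, len(session.channels))
```

With 45 channels and a limit of 20, this yields three sessions. A real bot would additionally have to keep this assignment stable across reconnects, which the sketch does not attempt.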

There is still a lot to do; not all of my tickets are closed. If you want pg_docbot talking in your language, please send me translations. The pg_docbot code is on

Next things on my todo list:

  • verify each URL from time to time: mark unreachable as invalid
  • intelligent sort order: not yet sure how to solve this problem, right now there is no specific sort order
  • move pg_docbot to PostgreSQL infrastructure
  • web interface: the bot should redirect the user to the website if there are more than, say, 2 or 3 URLs, to avoid flooding the IRC channels
  • integration in website: the pg_docbot database contains very useful knowledge, there are plans to integrate this into the search on the main website
  • integration with every time the bot sees a link from a paste site, it should scan the content and generate a posting on
  • monitor publish new postings in IRC channels
  • allow better search: like using a regexp
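
The planned web interface redirect could look roughly like the following Python sketch (again only an illustration, not the bot's Perl code). The threshold of 3, the `build_reply` function and the search page address are hypothetical; no such website exists yet.

```python
from typing import List
from urllib.parse import quote_plus

# Hypothetical values for this sketch only.
SEARCH_PAGE = "https://docbot.example.invalid/search?q="
MAX_INLINE_URLS = 3


def build_reply(keyword: str, urls: List[str],
                max_inline: int = MAX_INLINE_URLS,
                search_page: str = SEARCH_PAGE) -> List[str]:
    """Return the lines the bot would post for a search result.

    Up to max_inline URLs are posted directly; anything beyond that is
    replaced by a single pointer to the website to avoid flooding.
    """
    if not urls:
        return [f"nothing found for '{keyword}'"]
    if len(urls) <= max_inline:
        return list(urls)
    link = search_page + quote_plus(keyword)
    return [f"{len(urls)} results for '{keyword}', see {link}"]
```

The point of the design is that the number of lines posted to a channel is bounded, no matter how many URLs match a keyword.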

Here is a list of pg_docbot functionality:

  • ?search: also known as ??<keyword>, search for a specific keyword or url
  • ?help: output help
  • ?help <topic>: output help for <topic>
  • ?learn <keyword> <url>: assign <keyword> to <url> in database, operator only
  • ?forget <keyword>: forget the keyword, operator only
  • ?forget <url>: forget the url, operator only
  • ?status: output statistics, admin only, command channel only
  • ?say <channel> <text>: say <text> in <channel>, admin only
  • ?wallchan <text>: say <text> in all channels, admin only
  • ?lost: show all unconnected URLs (that is, URLs without an associated keyword), admin only, command channel only
  • ?key <keyword>: search for all urls connected to <keyword>, operator only, command channel only
  • ?url <url>: search for all keywords connected to <url>, operator only, command channel only
  • ?join <channel> <session number> lang:<language> pass:<password>: channel and session number are required, admin only
  • ?leave <channel>: leave <channel>, admin only
  • ?quit: shutdown the bot
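
As an example of the command syntax, the following Python sketch parses a ?join line. It is an illustration only and not taken from the bot's Perl source. The `JoinRequest` class and the `parse_join` function are made up for this example, and treating lang: and pass: as optional is an assumption based on only channel and session number being listed as required.

```python
from dataclasses import dataclass
from typing import Optional

USAGE = "usage: ?join <channel> <session number> lang:<language> pass:<password>"


@dataclass
class JoinRequest:
    """Parsed form of a ?join command (hypothetical structure)."""
    channel: str
    session: int
    language: Optional[str] = None
    password: Optional[str] = None


def parse_join(line: str) -> JoinRequest:
    """Parse '?join <channel> <session number> lang:.. pass:..'."""
    parts = line.split()
    if len(parts) < 3 or parts[0] != "?join":
        raise ValueError(USAGE)
    try:
        session = int(parts[2])
    except ValueError:
        raise ValueError("session number must be an integer") from None
    request = JoinRequest(channel=parts[1], session=session)
    # Remaining words are key:value options in any order.
    for option in parts[3:]:
        key, sep, value = option.partition(":")
        if not sep or not value:
            raise ValueError(USAGE)
        if key == "lang":
            request.language = value
        elif key == "pass":
            request.password = value
        else:
            raise ValueError(f"unknown option: {key}")
    return request
```

For instance, `parse_join("?join #pg-temp 2 lang:de pass:secret")` gives a request for channel #pg-temp on session 2 with language "de" and a password. Whether the caller is an admin would have to be checked separately.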
