
PostgreSQL Weekly News - August 21, 2011

The original article can be found at:

== PostgreSQL Weekly News - August 21, 2011 ==

== PostgreSQL Product News ==

MyJSQLView 3.30, a GUI tool that can be used with PostgreSQL,
has been released.

pgpool-II 3.1.0 beta1, a connection pooler and more,
has been released.

A German-language tutorial for PostgreSQL 9.0 has been released.

pgwatch 1.0beta2, a monitoring tool for PostgreSQL,
has been released.

== PostgreSQL Jobs for August ==

== PostgreSQL Local ==

Postgres Open 2011, a conference focused on the disruption of the
database industry through PostgreSQL, will take place September 14-16,
2011 in Chicago, Illinois at the Westin Michigan Avenue
hotel.

PG-Day Denver 2011 will be held on Friday, October 21, 2011 at the
Auraria Campus near downtown Denver, Colorado.

PostgreSQL Conference West (#PgWest) will be held September 27-30,
2011 at the San Jose Convention Center in San Jose, California, USA.

PostgreSQL Conference Europe 2011 will be held October 18-21
in Amsterdam.

pgbr will be held in Sao Paulo, Brazil, on November 3-4, 2011.

PGConf.DE 2011 is the German-language PostgreSQL conference
and will take place on November 11, 2011 at the Rheinisches Industriemuseum
in Oberhausen, Germany. The call for papers is open.

== PostgreSQL in the News ==

Planet PostgreSQL:

This weekly PostgreSQL newsletter was produced by David Fetter.

Submit news and announcements by Sunday at 3:00pm Pacific
time. Please send English-language items to, German to, Italian to, Spanish to

== Reviews ==

== Applied Patches ==

Tom Lane pushed:

- Fix unsafe order of operations in foreign-table DDL commands.  When
  updating or deleting a system catalog tuple, it's necessary to
  acquire RowExclusiveLock on the catalog before looking up the tuple;
  otherwise a concurrent VACUUM FULL on the catalog might move the
  tuple to a different TID before we can apply the update.  Coding
  patterns that find the tuple via a table scan aren't at risk here,
  but when obtaining the tuple from a catalog cache, correct ordering
  is important; and several routines in foreigncmds.c got it wrong.
  Noted while running the regression tests in parallel with VACUUM
  FULL of assorted system catalogs.  For consistency I moved all the
  heap_open calls to the starts of their functions, including a couple
  for which there was no actual bug.  Back-patch to 8.4 where
  foreigncmds.c was added.

- Fix race condition in relcache init file invalidation.  The previous
  code tried to synchronize by unlinking the init file twice, but that
  doesn't actually work: it leaves a window wherein a third process
  could read the already-stale init file but miss the SI messages that
  would tell it the data is stale.  The result would be bizarre
  failures in catalog accesses, typically "could not read block 0 in
  file ..." later during startup.  Instead, hold RelCacheInitLock
  across both the unlink and the sending of the SI messages.  This is
  more straightforward, and might even be a bit faster since only one
  unlink call is needed.  This has been wrong since it was put in (in
  2002!), so back-patch to all supported releases.

- Preserve toast value OIDs in toast-swap-by-content for
  CLUSTER/VACUUM FULL.  This works around the problem that a catalog
  cache entry might contain a toast pointer that we try to dereference
  just as a VACUUM FULL completes on that catalog.  We will see the
  sinval message on the cache entry when we acquire lock on the toast
  table, but by that point we've already told tuptoaster.c "here's the
  pointer to fetch", so it's difficult from a code structural
  standpoint to update the pointer before we use it.  Much less
  painful to ensure that toast pointers are not invalidated in the
  first place.  We have to add a bit of code to deal with the case
  that a value that previously wasn't toasted becomes so; but that
  should be a seldom-exercised corner case, so the inefficiency
  shouldn't be significant.  Back-patch to 9.0.  In prior versions, we
  didn't allow CLUSTER on system catalogs, and VACUUM FULL didn't
  result in reassignment of toast OIDs, so there was no problem.

- Fix incorrect order of operations during sinval reset processing.
  We have to be sure that we have revalidated each nailed-in-cache
  relcache entry before we try to use it to load data for some other
  relcache entry.  The introduction of "mapped relations" in 9.0 broke
  this, because although we updated the state kept in relmapper.c
  early enough, we failed to propagate that information into relcache
  entries soon enough; in particular, we could try to fetch pg_class
  rows out of pg_class before we'd updated its relcache entry's
  rd_node.relNode value from the map.  This bug accounts for Dave
  Gould's report of failures after "vacuum full pg_class", and I
  believe that there is risk for other system catalogs as well.  The
  core part of the fix is to copy relmapper data into the relcache
  entries during "phase 1" in RelationCacheInvalidate(), before
  they'll be used in "phase 2".  To try to future-proof the code
  against other similar bugs, I also rearranged the order in which
  nailed relations are visited during phase 2: now it's pg_class
  first, then pg_class_oid_index, then other nailed relations.  This
  should ensure that RelationClearRelation can apply
  RelationReloadIndexInfo to all nailed indexes without risking use of
  not-yet-revalidated relcache entries.  Back-patch to 9.0 where the
  relation mapper was introduced.

- Forget about targeting catalog cache invalidations by tuple TID.
  The TID isn't stable enough: we might queue an sinval event before a
  VACUUM FULL, and then process it afterwards, when the target tuple
  no longer has the same TID.  So we must invalidate entries on the
  basis of hash value only.  The old coding can be shown to result in
  various bizarre, hard-to-reproduce errors in the presence of
  concurrent VACUUM FULLs on system catalogs, and could easily result
  in permanent catalog corruption, up to and including complete loss
  of tables.  This commit is just a minimal fix that removes the
  unsafe comparison.  We should remove transmission of the tuple TID
  from sinval messages altogether, and then arrange to suppress the
  extra message in the common case of a heap_update that doesn't
  change the key hashvalue.  But that's going to be much more
  invasive, and will only produce a probably-marginal performance
  gain, so it doesn't seem like material for a back-patch.  Back-patch
  to 9.0.  Before that, VACUUM FULL refused to do any tuple moving if
  it found any INSERT_IN_PROGRESS or DELETE_IN_PROGRESS tuples (and
  CLUSTER would give up altogether), so there was no risk of moving a
  tuple that might be the subject of an unsent sinval message.

- Revise sinval code to remove no-longer-used tuple TID from inval
  messages.  This requires adjusting the API for syscache callback
  functions: they now get a hash value, not a TID, to identify the
  target tuple.  Most of them weren't paying any attention to that
  argument anyway, but plancache did require a small amount of fixing.
  Also, improve performance a trifle by avoiding sending duplicate
  inval messages when a heap_update isn't changing the catcache lookup

- Fix two issues in plpython's handling of composite results.  Dropped
  columns within a composite type were not handled correctly.  Also,
  we did not check for whether a composite result type had changed
  since we cached the information about it.  Jan Urbański, per a bug
  report from Jean-Baptiste Quenot

- Update 9.1 release notes to reflect commits through today.  Also do
  another pass of copy-editing.

- Explain max_prepared_transactions requirement in isolation tests'
  README.  Now that we have a test that requires nondefault settings
  to pass, it seems like we'd better mention that detail in the
  directions about how to run the tests.  Also do some very minor

- Tag 9.1rc1.

- Fix performance problem when building a lossy tidbitmap.  As pointed
  out by Sergey Koposov, repeated invocations of tbm_lossify can make
  building a large tidbitmap into an O(N^2) operation.  To fix, make
  sure we remove more than the minimum amount of information per call,
  and add a fallback path to behave sanely if we're unable to fit the
  bitmap within the requested amount of memory.  This has been wrong
  since the tidbitmap code was written, so back-patch to all supported

Peter Eisentraut pushed:

- Add "Reason code" prefix to internal SSI error messages.  This makes
  it clearer that the error message is perhaps not supposed to be
  understood by users, and it also makes it somewhat clearer that it
  was not accidentally omitted from translation.  Idea from Heikki
  Linnakangas, except that we don't mark "Reason code" for translation
  at this point, because that would make the implementation too

- Adjust regression tests for error message change

- Use less cryptic variable names

- Make pg_basebackup progress report translatable.  Also fix a
  potential portability bug, because INT64_FORMAT is only guaranteed
  to be available with snprintf, not fprintf.

- MacOS -> Mac OS.  Josh Kupershmidt

- Move \r out of translatable strings.  The translation tools are very
  unhappy about seeing \r in translatable strings, so move it to a
  separate fprintf call.

- Translation updates

- Improve detection of Python 3.2 installations.  Because of ABI
  tagging, the library version number might no longer be exactly the
  Python version number, so do extra lookups.  This affects
  installations without a shared library, such as ActiveState's
  installer.  Also update the way to detect the location of the
  'config' directory, which can also be versioned.  Ashesh Vashi

- Change PyInit_plpy to external linkage.  Module initialization
  functions in Python 3 must have external linkage, because
  PyMODINIT_FUNC does dllexport on Windows-like platforms.  Without
  this change, the build with Python 3 fails on Windows.

- Hide unused variable warnings under Python 3

Bruce Momjian pushed:

- In pg_upgrade, avoid dumping orphaned temporary tables.  This makes
  the pg_upgrade schema matching pattern match pg_dump/pg_dumpall.
  Fix for 9.0, 9.1, and 9.2.  Report and proposed bug fix by David

- In pg_upgrade, don't copy visibility map files from clusters that
  did not have crash-safe visibility maps to clusters that expect
  crash-safety.  Request from Robert Haas.

- Implement src/tools/copyright as a Perl program, so anyone can run
  it.  David Fetter

- Add executable bit to file.

- Remove use of 'tie' in perl for;  instead use normal
  file open/close.

- Fix problem with regex in copyright test.  Report and fix by Kris

- Fix to properly use 'tie' function.  Kris Jurka

- Have thread_test create its test files in the current directory,
  rather than /tmp.  Also cleanup C defines and add comments.  Per
  report by Alex Soto

Heikki Linnakangas pushed:

- Fix bogus comment that claimed that the new BACKUP METHOD line in
  backup_label was new in 9.0.  Spotted by Fujii Masao.

- If backup-end record is not seen, and we reach end of recovery from
  a streamed backup, throw an error and refuse to start up. The
  restore has not finished correctly in that case and the data
  directory is possibly corrupt.  We already errored out in case of
  archive recovery, but could not during crash recovery because we
  couldn't distinguish between the case that pg_start_backup() was
  called and the database then crashed (must not error, data is OK),
  and the case that we're restoring from a backup and not all the
  needed WAL was replayed (data can be corrupt).  To distinguish those
  cases, add a line to backup_label to indicate whether the backup was
  taken with pg_start/stop_backup(), or by streaming (i.e.
  pg_basebackup).  This is a different implementation than what I
  committed to 9.2 a week ago.  That implementation was not
  back-patchable because it required re-initdb.  Fujii Masao

- Fix comment about which version had BACKUP METHOD line in
  backup_label, again.  It was invalidated again by Fujii's patch to

- Teach pg_controldata and pg_resetxlog about the new
  backupEndRequired field in control file.

- Strip whitespace from SQL blocks in the isolation test suite. This
  is purely cosmetic, it removes a lot of IMHO ugly whitespace from
  the expected output.

- Add an SSI regression test that tests all interesting permutations
  in the order of begin, prepare, and commit of three concurrent
  transactions that have conflicts between them.  The test runs for a
  quite long time, and the expected output file is huge, but this test
  caught some serious bugs during development, so seems worthwhile to
  keep. The test uses prepared transactions, so it fails if the server
  has max_prepared_transactions=0. Because of that, it's marked as
  "ignore" in the schedule file.  Dan Ports
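The end-of-recovery decision described in the backup-end item above can be
sketched as follows.  This is an illustrative Python model, not the server's
C code: the function names are hypothetical, the label text is made up, and
the values "streamed" and "pg_start_backup" are assumed spellings of the
BACKUP METHOD line.  It only models the crash-recovery case the commit
message discusses.

```python
def backup_method(label_text):
    """Return the value of the BACKUP METHOD line, or None if it is absent."""
    for line in label_text.splitlines():
        if line.startswith("BACKUP METHOD:"):
            return line.split(":", 1)[1].strip()
    return None


def must_refuse_startup(label_text, backup_end_seen):
    """Decide whether reaching end of recovery should be treated as an error.

    A streamed backup whose backup-end record was never replayed may be
    corrupt, so startup is refused.  A crash after pg_start_backup() leaves
    the data directory intact, so startup proceeds.
    """
    return backup_method(label_text) == "streamed" and not backup_end_seen


# Made-up label contents, for illustration only.
streamed_label = "BACKUP METHOD: streamed\n"
manual_label = "BACKUP METHOD: pg_start_backup\n"

print(must_refuse_startup(streamed_label, backup_end_seen=False))
print(must_refuse_startup(manual_label, backup_end_seen=False))
```
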

Magnus Hagander pushed:

- Adjust total size in pg_basebackup progress report when reality
  changes.  When streaming including WAL, the size estimate will
  always be incorrect, since we don't know how much WAL is included.
  To make sure the output doesn't look completely unreasonable, this
  patch increases the total size whenever we go past the estimate, to
  make sure we never go above 100%.

- Adjust wording now that estimated size can increase.  Per comment
  from Fujii Masao.
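The progress-report adjustment in the pg_basebackup item above amounts to
clamping the estimate upward.  A minimal sketch in Python, assuming a
hypothetical helper progress_line and sizes in kB (this is not the
pg_basebackup source, and the output format is invented):

```python
def progress_line(done_kb, total_kb):
    """Return the (possibly raised) total and a progress string.

    When more data has been received than was estimated, the estimate is
    raised to match, so the reported percentage never exceeds 100.
    """
    if done_kb > total_kb:
        total_kb = done_kb
    percent = 100 * done_kb // total_kb if total_kb else 0
    return total_kb, f"{done_kb}/{total_kb} kB ({percent}%)"


total, line = progress_line(250, 200)
print(line)
```

The caller keeps the returned total for the next report, so the displayed
estimate only ever grows.
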

Andrew Dunstan pushed:

- Properly handle empty arrays returned from plperl functions.  Bug
  reported by David Wheeler, fix by Alex Hunsaker.

Robert Haas pushed:

- Remove obsolete README file.  Perhaps we ought to add some other
  kind of documentation here instead, but for now let's get rid of
  this woefully obsolete description of the sinval machinery.

- Make lazy_vacuum_rel call pg_rusage_init only if needed.
  do_analyze_rel already does it this way.  Euler Taveira de Oliveira

- Typo fix.

- Allow sepgsql regression tests to be run from a user homedir.
  KaiGai Kohei, with some changes by me.

- Fix contrib/sepgsql and contrib/xml2 to always link required
  libraries.  contrib/xml2 can get by without libxslt; the relevant
  features just won't work.  But if it doesn't have libxml2, or if
  sepgsql doesn't have libselinux, the link succeeds but the module
  then fails to work at load time.  To avoid that, link the required
  libraries unconditionally, so that it will be clear at link-time
  that there is a problem.  Per discussion with Tom Lane and KaiGai

- Clean up 'chkselinuxenv' script.  Eliminate dependencies on "which",
  as we don't really need that to be installed for proper testing.
  Don't number the tests, as that increases the footprint of every
  patch that wants to add or remove tests.  Make the test output more
  informative, so that it's a bit easier to see what went right (or
  wrong).  Spelling and grammar improvements.

== Rejected Patches (for now) ==

No one was disappointed this week :-)

== Submitted Patches ==

Joachim Wieland sent in another revision of the patch to provide
facilities for exporting and using snapshots.

Magnus Hagander sent in a patch intended to address some infelicities
in the representation of timestamptzs in replication.

KaiGai Kohei sent in three patches to unify DROP into a single

Heikki Linnakangas and Alexander Korotkov traded new revisions of the
patch to speed up GiST index builds.

Fujii Masao sent in two revisions of a patch to fix some issues in
cascading replication.

Jeevan Chalke sent in a patch to allow the same cursor names in nested

Magnus Hagander sent in another revision of the patch to implement

Josh Kupershmidt sent in a patch to fix up the pg_comments view.

Greg Smith sent in a patch that tracks and displays the accumulated
cost when autovacuum is running.  Code by Noah Misch and Greg Smith.

Josh Kupershmidt sent in a patch to fix some infelicities in

Shigeru HANADA sent in two more revisions of the patch which gives the
format of FDW options.

KaiGai Kohei sent in two more revisions of the patch to allow access
to the userspace access vector cache.

Wojciech Muła sent in a patch to fix some infelicities in PL/pgsql's
handling of %TYPE in arrays.

