Skip to content

Wöchentlicher PostgreSQL Newsletter - 01. August 2010

Der Originalartikel befindet sich unter:

== Wöchentlicher PostgreSQL Newsletter - 01. August 2010 ==

PostgreSQL 9.0 Beta 4 ist erschienen. Testen!

Job::Machine 0.011, eine zuverlässige Perl Job Queue die PostgreSQL
verwendet, ist erschienen.

pgAdmin 1.10.5, ein GUI-basierendes Management
Werkzeug für PostgreSQL, ist erschienen.

pg_sample 0.01, ein Werkzeug zum Exportieren von kleinen
Beispieldatensätzen aus einer großen PostgreSQL Datenbank, ist

== PostgreSQL Jobs im August ==

== PostgreSQL Lokal ==

Reuven Lerner gibt einen 5-tägigen PostgreSQL Kurs vom 1. bis 5.
August am Hi-Tech College in Herzliya, Israel.

FrOSCon 2010 findet in St. Augustin, Deutschland am 21. und 22. August
2010 statt. Die Deutsche PostgreSQL Usergruppe hat ihren eigenen Devroom
und sucht noch Vorträge. Weitere Informationen unter:

Der Call for Papers für West ist bis zum 5. September 2010 offen.
Details unter:

== PostgreSQL in den News ==

Planet PostgreSQL:

Dieser wöchentliche PostgreSQL Newsletter wurde erstellt von David

Sende Neuigkeiten und Ankündigungen bis Sonntag, 15 Uhr Pazifischer
Zeit. Bitte sende englische Beiträge an, deutsche an, italienische an

== Reviews ==

KaiGai Kohei reviewed Robert Haas's patch to add get_whatever_oid

== Applied Patches ==

Peter Eisentraut committed:

- In pgsql/doc/src/sgml/xfunc.sgml, add table creation and population
  to example.  From John Gage.

- In pgsql/doc/src/sgml/perform.sgml, fix grammar.  Backpatched to

- In pgsql/contrib/xml2/README.xml2, spelling fix.

- Spelling fixes.

- In pgsql/src/bin/psql/common.c, show psql timing output even in
  quiet mode.  These two settings ought to be independent of each

- In pgsql/contrib/chkpass/, remove wildly obsolete type
  attribute externallength from chkpass.

- Fix indentation of verbatim block elements.  Block elements with
  verbatim formatting (literallayout, programlisting, screen,
  synopsis) should be aligned at column 0 independent of the
  surrounding SGML, because whitespace is significant, and indenting
  them creates erratic whitespace in the output.  The CSS stylesheets
  already take care of indenting the output.  Assorted markup
  improvements to go along with it.

- Translation updates for 9.0beta4.

Alvaro Herrera committed:

- In PL/Perl[U], ensure $_SHARED is declared in the main:: namespace,
  per bugs #5570 and #5571.  Author: Alex Hunsaker

Robert Haas committed:

- In postgresql.conf.sample, use on/off rather than true/false.  We
  used to be consistent about this, but my recent patch to add a
  restart_after_crash GUC failed to follow the existing convention.
  Report and patch from Fujii Masao.

- In pgsql/doc/src/sgml/plpgsql.sgml, fix typo in PL/pgsql code
  example.  Backpatch to 8.4.  Marc Cousin.  Review by Kevin Grittner.

- Add ssl_cipher() and ssl_version() functions to contrib/sslinfo.
  Review by Dave Page.

- In pgsql/src/backend/access/heap/heapam.c, fix possible page
  corruption by ALTER TABLE .. SET TABLESPACE.  If a zeroed page is
  present in the heap, ALTER TABLE .. SET TABLESPACE will set the LSN
  and TLI while copying it, which is wrong, and heap_xlog_newpage()
  will do the same thing during replay, so the corruption propagates
  to any standby.  Note, however, that the bug can't be demonstrated
  unless archiving is enabled, since in that case we skip WAL logging
  altogether, and the LSN/TLI are not set.  Back-patch to 8.0; prior
  releases do not have tablespaces.  Analysis and patch by Jeff Davis.
  Adjustments for back-branches and minor wordsmithing by me.

- In pgsql/contrib/fuzzystrmatch/fuzzystrmatch.c, avoid using
  text_to_cstring() in levenshtein functions.  Operating directly on
  the underlying varlena saves palloc and memcpy overhead, which
  testing shows to be significant.  Extracted from a larger patch by
  Alexander Korotkov.

- Make details of the Numeric representation private to numeric.c.
  Review by Tom Lane.

- Make psql distinguish between unique indices and unique constraints.
  Josh Kupershmidt.  Reviewing and kibitzing by Kevin Grittner and me.

Tom Lane committed:

- Rewrite the rbtree routines so that an RBNode is the first field of
  struct representing a tree entry, rather than being a separately
  piece of storage.  This API is at least as clean as the old one (if
  more so --- there were some bizarre choices in there) and it permits
  very substantial memory savings, on the order of 2X in ginbulk.c's

- Fix an additional set of problems in GIN's handling of lossy page
  pointers.  Although the key-combining code claimed to work correctly
  if its input contained both lossy and exact pointers for a single
  page in a single TID stream, in fact this did not work, and could
  not work without pretty fundamental redesign.  Modify keyGetItem so
  that it will not return such a stream, by handling lossy-pointer
  cases a bit more explicitly than we did before.  Per followup
  investigation of a gripe from Artur Dabrowski.  An example of a
  query that failed given his data set is select count(*) from
  search_tab where (to_tsvector('german', keywords ) @@
  to_tsquery('german', 'ee:* | dd:*')) and (to_tsvector('german',
  keywords ) @@ to_tsquery('german', 'aa:*')); Back-patch to 8.4 where
  the lossy pointer code was introduced.

- In pgsql/src/backend/tsearch/ts_selfuncs.c, add some knowledge about
  prefix matches to tsmatchsel().  It's not terribly bright, but it
  beats assuming that a prefix match behaves identically to an exact
  match, which is what the code was doing before :-(.  Noted while
  experimenting with Artur Dobrowski's example.

- Fix ANALYZE's ancient deficiency of not trying to collect stats for
  expression indexes when the index column type (the opclass
  opckeytype) is different from the expression's datatype.  When
  coded, this limitation wasn't worth worrying about because we had no
  intelligence to speak of in stats collection for the datatypes used
  by such opclasses.  However, now that there's non-toy estimation
  capability for tsvector queries, it amounts to a bug that ANALYZE
  fails to do this.  The fix changes struct VacAttrStats, and
  therefore constitutes an API break for custom typanalyze functions.
  Therefore we can't back-patch it into released branches, but it was
  agreed that 9.0 isn't yet frozen hard enough to make such a change
  unacceptable.  Ergo, back-patch to 9.0 but no further.  The API
  break had better be mentioned in 9.0 release notes.

- Back-patch fix for renaming asyncCommitLSN to asyncXactLSN.  AIUI
  this was supposed to go into 9.0 as well as HEAD.

- Fix an ancient typo that prevented the detection of conflicting
  fields when interval input "invalid" was specified together with
  other fields.  Spotted by Neil Conway with the help of a clang
  warning.  Although this has been wrong since the interval code was
  written more than 10 years ago, it doesn't affect anything beyond
  which error message you get for a wrong input, so not worth
  back-patching very far.

- Tweak a couple of macros in the regex code to suppress compiler
  warnings from "clang".  The VERR changes make an assignment
  unconditional, which is probably easier to read/understand anyway,
  and one can hardly argue that it's worth shaving cycles off the case
  of reporting another error when one has already been detected.  The
  INSIST change limits where that macro can be used, but not in a way
  that creates a problem for any existing call.

- In pgsql/src/template/darwin, don't try to force use of
  -no-cpp-precomp on OS X.  It's been five years since Apple shipped a
  compiler that needed this switch, and there's increasing interest in
  using other compilers that won't accept the switch at all.  Better
  to let anybody who still needs the switch inject it via CPPFLAGS.
  Per gripe from Neil Conway.

  Also, fix minor memory leaks in code called by ginEntryInsert, in
  particular in ginInsertValue and entryFillRoot, as well as
  ginEntryInsert itself.  These leaks resulted in the GIN index build
  context continuing to bloat even after we'd filled it to
  maintenance_work_mem and started to dump data out to the index.  In
  combination these fixes restore the GIN index build code to honoring
  the maintenance_work_mem limit about as well as it did in 8.4.
  Speed seems on par with 8.4 too, maybe even a bit faster, for a
  non-pathological case in which HEAD was formerly slower.  Back-patch
  to 9.0 so we don't have a performance regression from 8.4.

- Fix potential failure when hashing the output of a subplan that
  produces a pass-by-reference datatype with a nontrivial projection
  step.  We were using the same memory context for the projection
  operation as for the temporary context used by the hashtable
  routines in execGrouping.c.  However, the hashtable routines feel
  free to reset their temp context at any time, which'd lead to
  destroying input data that was still needed.  Report and diagnosis
  by Tao Ma.  Back-patch to 8.1, where the problem was introduced by
  the changes that allowed us to work with "virtual" tuples instead of
  materializing intermediate tuple values everywhere.  The earlier
  code looks quite similar, but it doesn't suffer the problem because
  the data gets copied into another context as a result of having to
  materialize ExecProject's output tuple.

- In pgsql/src/backend/executor/nodeLockRows.c, fix oversight in new
  EvalPlanQual logic: the second loop over the ExecRowMark list in
  ExecLockRows() forgot to allow for the possibility that some of the
  rowmarks are for child tables that aren't relevant to the current
  row.  Per report from Kenichiro Tanaka.

- In pgsql/doc/src/sgml/installation.sgml, work around a documentation
  toolchain problem by replacing the "AIX-fixlevels" table with a
  <variablelist> carrying the same information.  Previously the 9.0
  documentation was failing to build as a US-size PDF file.  It's
  quite obscure what the real problem is or why this avoids it, but we
  need a hack now so we can build docs for beta4.  In passing do a bit
  of editing in the AIX installation docs, in particular remove a
  long-obsolete claim that the regression tests are likely to fail.

- In pgsql/src/backend/commands/tablecmds.c, fix another longstanding
  problem in copy_relation_data: it was blithely assuming that a local
  char[] array would be aligned on at least a word boundary.  There
  are architectures on which that is pretty much guaranteed to NOT be
  the case ... and those arches also don't like non-aligned memory
  accesses, meaning that log_newpage() would crash if it ever got
  invoked.  Even on Intel-ish machines there's a potential for a large
  performance penalty from doing I/O to an inadequately aligned
  buffer.  So palloc it instead.  Backpatch to 8.0 --- 7.4 doesn't
  have this code.

- In pgsql/src/include/catalog/pg_proc.h, clean up some
  inconsistencies in the volatility marking of various I/O related
  functions.  Per today's discussion, we will henceforth assume that
  datatype I/O functions are either stable or immutable, never
  volatile.  (This implies in particular that domain CHECK constraint
  expressions shouldn't be volatile, since domain_in executes them.)
  In turn, functions that execute the I/O functions of arbitrary
  datatypes should always be labeled stable.  This affects the
  labeling of array_to_string, which was unsafely marked immutable,
  and record_in, record_out, record_recv, record_send, domain_in,
  domain_recv, which were over-conservatively marked volatile.  The
  array I/O functions were already marked stable, which is correct per
  this policy but would have been wrong if we maintained domain_in as
  volatile.  Back-patch to 9.0, along with an earlier fix to correctly
  mark cash_in and cash_out as stable not immutable (since they depend
  on lc_monetary).  No catversion bump --- the implications of this
  are not currently severe enough to justify a forced initdb.

- In pgsql/doc/src/sgml/release-9.0.sgml, update release notes for 9.0
  beta 4.  Back-patch some changes that were made only in HEAD.

- Improved version of patch to protect pg_get_expr() against misuse:
  look through join alias Vars to avoid breaking join queries, and
  move the test to someplace where it will catch more possible ways of
  calling a function.  We still ought to throw away the whole thing in
  favor of a data-type-based solution, but that's not feasible in the
  back branches.  This needs to be back-patched further than 9.0, but
  I don't have time to do so today.  Committing now so that the fix
  gets into 9.0beta4.

- Rewrite the key-combination logic in GIN's keyGetItem() and
  scanGetItem() routines to make them behave better in the presence of
  "lossy" index pointers.  The previous coding was outright incorrect
  for some cases, as recently reported by Artur Dabrowski: scanGetItem
  would fail to return index entries in cases where one index key had
  multiple exact pointers on the same page as another key had a lossy
  pointer.  Also, keyGetItem was extremely inefficient for cases where
  a single index key generates multiple "entry" streams, such as an @@
  operator with a multiple-clause tsquery.  The presence of a lossy
  page pointer in any one stream defeated its ability to use the
  opclass consistentFn, resulting in probing many heap pages that
  didn't really need to be visited.  In Artur's example case, a query
  like WHERE tsvector @@ to_tsquery('a & b') was about 50X slower than
  the theoretically equivalent WHERE tsvector @@ to_tsquery('a') AND
  tsvector @@ to_tsquery('b') The way that I chose to fix this was to
  have GIN call the consistentFn twice with both TRUE and FALSE values
  for the in-doubt entry stream, returning a hit if either call
  produces TRUE, but not if they both return FALSE.  The code handles
  this for the case of a single in-doubt entry stream, but punts
  (falling back to the stupid behavior) if there's more than one lossy
  reference to the same page.  The idea could be scaled up to deal
  with multiple lossy references, but I think that would probably be
  wasted complexity.  At least to judge by Artur's example, such cases
  don't occur often enough to be worth trying to optimize.  Back-patch
  to 8.4.  8.3 did not have lossy GIN index pointers, so not subject
  to these problems.

- In pgsql/src/backend/tsearch/ts_selfuncs.c, tweak tsmatchsel() so
  that it examines the structure of the tsquery whenever possible (ie,
  whenever the tsquery is a constant), even when no statistics are
  available for the tsvector.  For example, foo @@ 'a & b'::tsquery
  can be expected to be more selective than foo @@ 'a'::tsquery,
  whether or not we know anything about foo.  We use
  DEFAULT_TS_MATCH_SEL as the assumed selectivity of individual query
  terms when no stats are available, then combine the terms according
  to the query's AND/OR structure as usual.  Per experimentation with
  Artur Dabrowski's example.  (The fact that there are no stats
  available in that example is a problem in itself, but nonetheless
  tsmatchsel should be smarter about the case.) Back-patch to 8.4 to
  keep all versions of tsmatchsel() in sync.

Simon Riggs committed:

- Add explicit regression tests for ALTER TABLE lock levels.  Use this
  to catch a couple of lock level assignments that slipped through
  manual testing, per Peter Eisentraut.

- Rename asyncCommitLSN to asyncXactLSN to reflect changed role in
  9.0.  Transaction aborts now record their LSN to avoid corner case
  behaviour in Streaming Replication/Hot Standby, hence change of name
  of variables and functions.  As pointed out by Fujii Masao.
  Cosmetic changes only.

Marc Fournier committed:

- Tag for beta4.

== Abgelehnte Patches (bis jetzt) ==

No one was disappointed this week :-)

== Eingesandte Patches ==

Robert Haas sent in a patch to add ssl_version() and ssl_cipher() to

KaiGai Kohei sent in two more patches implementing SECURITY LABEL.

Boxuan Zhai sent in two more revisions of the WIP patch for MERGE.

Alexander Korotkov sent in two more revisions of the patch to make
levenshtein multibyte-aware in contrib/fuzzystrmatch.

Fujii Masao sent in another revision of the patch to do synchronous

Simon Riggs sent in another patch to reduce the needed locking for

Florian Pflug sent in another revision of the patch to show individual
statement latencies in pgbench output.

Erik Rijkers sent in a patch to fix an oversight in the variadic
function docs.

Robert Haas sent in another revision of the patch to reduce the
on-disk size of NUMERICs.

Zoltan Boszormenyi sent in another WIP patch to add a lock_timeout

Henk Enting sent in a patch to help with the intersection of CHECK
constraints and multiple inheritance.


No Trackbacks


Display comments as Linear | Threaded

No comments

Add Comment

Enclosing asterisks marks text as bold (*word*), underscore are made via _word_.
E-Mail addresses will not be displayed and will only be used for E-Mail notifications.
To leave a comment you must approve it via e-mail, which will be sent to your address after submission.
Form options