The Google Summer of Code 2021 (GSoC) for the PostgreSQL Project is wrapped up. The timeline this year was shortened to half, compared to previous years. That’s good, because smaller projects can be worked on, and students have a chance to cope with a changing environment at home and university. On the other hand, the shorter time doesn’t allow diving into more complex projects. Nevertheless, with the help of all mentors, six students successfully concluded their projects.
In no particular order:
Add more monitoring to pg_systat
pg_systat is systat
for PostgreSQL. It allows you to monitor various PostgreSQL statistics tables from a terminal window. Prior to the summer, it only monitored the database, table, index, vacuum, and replication statistics provided by the PostgreSQL catalog tables.
This project added monitoring for other aspects of PostgreSQL. pg_systat
is now capable of monitoring data from:
- pg_stat_progress_copy catalog table
- pg_stat_statements extension
- pg_buffercache extension
In addition to documentation improvements, there is also an Asciinema video demoing the program.
There is also a presentation describing the summer’s work.
Create procedural language extension for the Julia programming language
PostgreSQL allows developers more freedom to extend the provided capabilities in an application-specific manner compared to other popular open-source Relational Databases. One example of this is the ability to write user-defined functions in other languages besides SQL and C.
Julia is still a new language but with a growing community and increasing popularity due to its several strong points, among which are the Python-like straightforward syntax, speed, and interoperability with other languages.
This GSoC project expands the existing PL/Julia project by adding even more useful features. Here is a summary of what was implemented over the summer:
- Added support for more data types as input and output:
NULL
, boolean, numeric types, composite types. Arrays of base types can now be passed as input arguments to PL/Julia functions. Users can also return the above, or sets of the above from PL/Julia UDFs. - Added trigger support - users can write trigger functions in PL/Julia
- Added event trigger support
- Added support for the DO command
- Added functions for database access from PL/Julia:
spi_exec(query, limit)
andspi_exec(query)
for SQL-statement execution,spi_fetchrow(cursor)
andspi_cursor_close(cursor)
to return rows and to close the cursor respectively,spi_prepare(query, argtypes)
to prepare and save an execution plan andspi_exec_prepared(plan, args, limit)
to execute a previously prepared plan.
A brief presentation of the above is found here.
Documentation with examples of usage.
Currently the extension works for PostgreSQL version 13 and Julia versions >= 1.6.
Develop Performance Farm Benchmarks and Website
The existing Performance Farm had a small website. This year’s GSoC project greatly expanded and improved the website. It now lets you search for specific commits, compare different results and has nice and shiny graphs all over the place.
The improvements are not yet rolled out, because they depend on a newer Django version. However the sysadmin team already said that the upgrade will happen eventually.
Improve pgeu-system for Conference Management
The conference management system used by quite a number of PostgreSQL conferences (originally named pgeu-system (mirror)) got a new form system, which allows better previews including sending preview emails to the author. In addition to better previews, it’s now possible to write the text in Markdown format and the system will handle the preview for both website and emails.
pgagroal: Metrics and monitoring
This project focused on expanding the collected metrics and expose them through the Prometheus interface. The requirements were driven by the Grafana 8 dashboard to provide the administrators with an overview of how the connection pool is performing in various categories. Furthermore, the GSoC project improved parts of the implementation such that it is easier to add new metrics and in a more fine-grained detail in the future. The GSoC work will be available in pgagroal 1.3.0 scheduled for release September 7th, 2021.
Junduo Dong (@An-DJ) did an excellent job on this project which will benefit the pgagroal community as a whole.
WAL-G optimization backup storage by remote-copy APIs
WAL-G supports two backup types: full and delta. When doing a full backup, WAL-G sends all database files to the cloud. Delta backups store only changed files. The goal of Alexander Slesarev’s project was to optimize the full backup creation process.
The full backup pushing process could take several hours for big databases, but in some cases, many files stay unchanged from the last backup, so sending them to the cloud again is not necessary.
To reduce network load and time spent in full backup creation, unchanged files could be copied from the last backup using the cloud storage remote copy API. During the GSoC 2021, Alexander designed and implemented this idea.
As a result of Alexander’s summer project, WAL-G got a new type of backup files composer called “copy composer”. Benchmark results show that copy composer is highly effective for databases with a large amount of cold data compared to hot data.
More details are available in Alexander’s blog post.
Mentors
All in all another successful GSoC project, and we hope that students stay with the project for a longer time.
Thank you to all mentors (in no particular order):
- Jesper Pedersen
- Andrey Borodin
- Lætitia Avrot
- Mark Wong
- Jonathan Katz
- Stephen Frost
- Mark Rofail
- Ilaria Battiston
- Greg Mullane
- Gabrielle Roth
- David Steele
- Daniil Zakhlystov
- Fabrízio Mello
- (me)