PostgreSQL reconsiders its process-based model [LWN.net]


By Jonathan Corbet
June 19, 2023

In the fast-moving open-source world, programs can come and go quickly; a
tool that has many users today can easily be eclipsed by something better
next week. Even in this environment, though, some programs endure for a
long time. As an example, consider the PostgreSQL database system, which
traces its history back to 1986. Making fundamental changes to a large
code base
with that much history is never an easy task. As fundamental changes go,
moving PostgreSQL away from its process-oriented model is not a small one,
but it is one that the project is considering seriously.

A PostgreSQL instance runs as a large set of cooperating processes,
including one for each connected client. These processes communicate
through a number of shared-memory regions using an elaborate library that
enables the creation of complex data structures in a setting where not all
processes have the same memory mapped at the same address. This model has
served the project well for many years, but the world has changed a lot
over the history of this project. As a result, PostgreSQL developers are
increasingly thinking that it may be time to make a change.
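
To see why shared data structures need special care when processes map the
same memory at different addresses, consider the following minimal sketch
in C. It is not PostgreSQL's actual library; the names rel_off, resolve(),
push_front(), and demo_two_mappings() are invented for illustration. The
idea is to store offsets from the start of the region rather than raw
pointers. Copying the region into a second buffer stands in, as an
assumption of this sketch, for a second process mapping the same memory at
a different address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define REGION_SIZE 256

/* Offset from the start of the region; 0 is reserved to mean "null". */
typedef size_t rel_off;

struct node {
    int     value;
    rel_off next;   /* an offset, not a pointer, so it is valid in any mapping */
};

/* Turn an offset into a real pointer for one particular mapping. */
static struct node *resolve(void *base, rel_off off)
{
    return off == 0 ? NULL : (struct node *)((char *)base + off);
}

/* Store a node in the given slot and link it in front of the list. */
static rel_off push_front(void *base, rel_off head, size_t slot, int value)
{
    rel_off off = (slot + 1) * sizeof(struct node);   /* skip offset 0 */
    assert(off + sizeof(struct node) <= REGION_SIZE);

    struct node *n = resolve(base, off);
    n->value = value;
    n->next = head;
    return off;
}

static int sum_list(void *base, rel_off head)
{
    int sum = 0;
    for (struct node *n = resolve(base, head); n != NULL;
         n = resolve(base, n->next))
        sum += n->value;
    return sum;
}

/* Build a list in one buffer, then walk it through a copy at another address. */
int demo_two_mappings(void)
{
    void *map_a = calloc(1, REGION_SIZE);
    void *map_b = malloc(REGION_SIZE);
    if (map_a == NULL || map_b == NULL) {
        free(map_a);
        free(map_b);
        return -1;
    }

    rel_off head = 0;
    head = push_front(map_a, head, 0, 10);
    head = push_front(map_a, head, 1, 20);
    head = push_front(map_a, head, 2, 30);

    memcpy(map_b, map_a, REGION_SIZE);   /* "second mapping" of the region */
    int sum = sum_list(map_b, head);     /* the offsets still resolve */

    free(map_a);
    free(map_b);
    return sum;
}
```

Had the nodes held raw pointers, the copy would still point back into the
first buffer. In a single address space, every thread sees the same
addresses and ordinary pointers suffice.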

A proposal

At the beginning of June, Heikki Linnakangas, seemingly following up on
some in-person conference discussions, posted a proposal to move
PostgreSQL to a threaded model:

“I feel that there is now pretty strong consensus that it would be a
good thing, more so than before. Lots of work to get there, and
lots of details to be hashed out, but no objections to the idea at
a high level.

“The purpose of this email is to make that silent consensus explicit.”

The message gave a quick overview of some of the challenges involved in
making such a move, and acknowledged, in an understated way, that this
transition “surely cannot be done fully in one release”. One thing
that was missing was a discussion of why this big change would be
desirable, but that was filled in as the discussion went on. As Andres
Freund put it:

“I think we’re starting to hit quite a few limits related to the
process model, particularly on bigger machines. The overhead of
cross-process context switches is inherently higher than switching
between threads in the same process – and my suspicion is that that
overhead will continue to increase. Once you have a significant
number of connections we end up spending a *lot* of time in TLB
misses, and that’s inherent to the process model, because you can’t
share the TLB across processes.”

He also pointed out that the process model imposes costs on development,
forcing the project to maintain a lot of duplicated code, including several
memory-management mechanisms that would be unneeded in a single address
space. In a later message, he added that it would be possible to share
state
more efficiently between threads, since they all run within the same
address space.

The reaction of some developers, though, made it clear that the “pretty
strong consensus” cited by Linnakangas might not be quite that strong after
all. Tom Lane said: “I think this will be a disaster. There is far too
much code that will get broken”. He added later that the cost of this
change would be “enormous”, that it would create “more than one
security-grade bug”, and that the benefits would not justify the cost.
Jonathan Katz suggested
that there might be other work that should have a higher priority. Others
worried that losing the isolation provided by separate processes could make
the system less robust overall.

Still, many PostgreSQL developers seem to be cautiously in favor of at
least exploring this change. Robert Haas said
that PostgreSQL does not scale well on larger systems, mostly as a result
of the resources consumed by all of those processes. “Not all databases
have this problem, and PostgreSQL isn’t going to be able to stop having it
without some kind of major architectural change”. Just switching to
threads might not be enough, he said, but he suggested that this change
would enable a number of other improvements.

How to get there

Moving the core of the PostgreSQL server into a single address space will
certainly present a number of challenges. The biggest one, as pointed out
by Haas and others, would appear to be the server’s “widespread and often
gratuitous use of global variables”. Globals work well enough when each
server process has its own set, but that approach clearly falls apart when
threads, which all share a single copy of every global, are used instead.
According to Konstantin Knizhnik, there are about 2,000 such variables
currently used by the PostgreSQL server.

A couple of approaches to this problem were discussed. One was pulling all
of the global variables into a big “session state” structure that would be
thread-local. That idea quickly loses its appeal, though, when one
considers trying to create and maintain a 2,000-member structure, so the
project is unlikely to go this way. The alternative is to simply throw all
of the globals into thread-local storage, an approach that is easy and
would work, but heavy use of thread-local storage would exact a performance
penalty that would reduce the benefits of the switch to threads in the
first place. Haas said that marking globals specially (to put them into
thread-local storage, among other things) would be a beneficial project
in its own right, as that would be a good first step in reducing their use.
Freund agreed,
saying that this effort would pay off even if the switch to threads never
happens.
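
As a rough illustration of the thread-local approach, here is a minimal
sketch using C11's _Thread_local storage class and POSIX threads. The
variable session_query_count and the functions around it are hypothetical
and do not come from the PostgreSQL source. They only show how a global
that was once private to each process can be kept private to each thread
with a one-word change to its declaration.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/*
 * Hypothetical per-session global. Without _Thread_local, both threads
 * below would increment one shared counter; with it, each thread gets its
 * own copy, much as each process did in the process-based model.
 */
static _Thread_local int session_query_count = 0;

static void execute_query(void)
{
    session_query_count++;
}

struct session_result {
    int queries_to_run;
    int observed_count;
};

/* Thread entry point standing in for one client session. */
static void *session_main(void *arg)
{
    struct session_result *res = arg;

    for (int i = 0; i < res->queries_to_run; i++)
        execute_query();
    res->observed_count = session_query_count;   /* this thread's copy only */
    return NULL;
}

/* Run two concurrent sessions; returns 0 on success, -1 on failure. */
int run_two_sessions(int n_a, int n_b, int *count_a, int *count_b)
{
    assert(count_a != NULL && count_b != NULL);

    pthread_t a, b;
    struct session_result ra = { n_a, 0 };
    struct session_result rb = { n_b, 0 };

    if (pthread_create(&a, NULL, session_main, &ra) != 0)
        return -1;
    if (pthread_create(&b, NULL, session_main, &rb) != 0) {
        pthread_join(a, NULL);
        return -1;
    }
    pthread_join(a, NULL);
    pthread_join(b, NULL);

    *count_a = ra.observed_count;
    *count_b = rb.observed_count;
    return 0;
}
```

Each session sees only its own count, regardless of how the two threads
interleave. The sketch says nothing about the performance cost of doing
this for thousands of variables, which is the concern raised in the
discussion.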

But, Freund cautioned,
moving global variables to thread-local storage is the easiest part of the
job:

“Redesigning postmaster, defining how to deal with extension
libraries, extension compatibility, developing tools to make
developing a threaded postgres feasible, dealing with freeing
session lifetime memory allocations that previously were freed via
process exit, making the change realistically reviewable,
portability are all much harder.”
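
The point about session-lifetime allocations can be made concrete with a
small sketch. When a session is a process, anything it never frees is
reclaimed by the kernel at exit. When a session is a thread, that memory
stays allocated until something releases it explicitly. The arena below is
a deliberately simple, hypothetical design (session_arena, session_alloc(),
and session_release() are invented names), not a description of
PostgreSQL's memory management. It tracks every allocation made on behalf
of a session so that all of them can be freed when the session ends.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Header placed in front of every allocation. The max_align_t member
 * keeps the payload that follows suitably aligned for any type.
 */
union chunk_header {
    union chunk_header *next;
    max_align_t         align;
};

struct session_arena {
    union chunk_header *chunks;   /* every live allocation for this session */
    size_t              live;     /* number of allocations not yet released */
};

/* Allocate memory that belongs to the session; returns NULL on failure. */
void *session_alloc(struct session_arena *arena, size_t size)
{
    assert(arena != NULL);

    union chunk_header *c = malloc(sizeof *c + size);
    if (c == NULL)
        return NULL;
    c->next = arena->chunks;
    arena->chunks = c;
    arena->live++;
    return c + 1;                 /* payload starts right after the header */
}

/*
 * Free everything the session allocated. In the process model, exiting the
 * process did this implicitly; a session thread has to do it on the way out.
 * Returns the number of allocations that were freed.
 */
size_t session_release(struct session_arena *arena)
{
    assert(arena != NULL);

    size_t freed = 0;
    while (arena->chunks != NULL) {
        union chunk_header *c = arena->chunks;
        arena->chunks = c->next;
        free(c);
        freed++;
    }
    arena->live = 0;
    return freed;
}
```

A session thread would call session_release() just before it terminates.
Code that allocates with plain malloc() and relies on process exit to clean
up is exactly what would have to be found and converted.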

An interesting point that received surprisingly little attention in the
discussion is that Knizhnik has already done a threads port of
PostgreSQL. The global-variable
problem, he said, was not that difficult. He had more trouble with
configuration data, error handling, signals, and the like. Support for
externally maintained extensions will be a challenge. Still, he saw some
significant benefits in working in the threaded environment. Anybody who
is thinking about taking on this project would be well advised to look
closely at this work as a first step.

Another complication that the PostgreSQL developers have in mind is that of
supporting both the process-based and thread-based modes, perhaps
indefinitely. The need to continue to support running in the process-based
mode would make it harder to take advantage of some of the benefits offered
by threads, and would significantly increase the maintenance burden
overall. Haas, though, is not convinced that it would ever be possible to
remove support for the
process-based mode. Threads might not perform better for all use cases, or
some important extensions may never gain support for running in threads.
The removal of process support is, as he noted, a question that can only
really be considered once threads are working well.

That point is, obviously, a long way into the future, assuming it arrives
at all. While the outcome of the discussion suggests that most PostgreSQL
developers think that this change is good in the abstract, there are also
clearly concerns about how it would work in practice. And, perhaps more
importantly, nobody has, yet, stepped up to say that they would be willing
to put in the time to push this effort forward. Without that crucial
ingredient, there will be no switch to threads in any sort of foreseeable
future.

