2010-09-15

Installing Postgres


I am trying the release candidate for PostgreSQL (more commonly known as Postgres) 9.0, on my Mac. [Update: Now using 9.0.1.] Version 8.4.4 is current, but given its imminent release I thought I'd give 9 a whirl.

Know that 9.0 is not a radical revision. Original plans labelled it 8.5. Some major features are being added, making it a big enough change that they decided to call it 9.0. But apparently the core architecture is the same. The bundled REAL Studio driver ("PostgreSQL Plugin.rbx") has not been updated for v9, but is working well for me so far as of R.S. 2010 Release 4.1.

Caveat: This discussion is focused on Mac OS X. Some details may vary for other platforms.
Caveat 2: I'm no expert on this topic. Take with a grain of salt. I'm just sharing my current understanding.

Where to Download

Postgres is an open-source project with its own web site.  Some commercial companies are closely involved, employing key developers and providing technical support and other services. One of these companies, EnterpriseDB provides some of the binary (compiled and ready to install) builds. So, for the Mac OS X release, you go to the Postgres.org downloads page offering 3 different download options, including a download link that takes you to EnterpriseDB. That link was for the "One click installer" option which I chose rather than the Fink or MacPorts option.

Terms

A "cluster" in Postgres can mean either of two totally different things:

  • Multiple computers cooperating in a way as to provide fail-over or other such features.
    - This is the meaning used commonly in most computing topics.
  • A single installation of Postgres on a single computer.
    - This is the usual meaning in a Postgres discussion. 
I have no idea how the term "cluster" came to be used this way in Postgres history. But get used to it. If you simply download the Postgres installer, run it once on a single Mac, you have a "Postgres cluster". The meaning is singular in the sense that you have one copy of the software installed, but plural in the sense that Postgres supports multiple database simultaneously running. Your Postgres cluster may have one database listing all the books and music you own, while also running another database with your sports league statistics. The data is stored in separate files on disk, but are both managed by Postgres simultaneously.

Computer > Cluster > Database > Schema > Table > Columns > Rows

A Mac has a Postgres cluster installed. That cluster will have at least one database (named 'postgres' to match the user account created when Postgres is installed), and you can create additional databases. In each database you have at least one schema named 'public'. Generally you can do all your work in that schema. Then you define one or more tables. Each table has one or more columns. After that you can create the actual records in the database, the "rows".

Installation Guide

Preview the steps taken during installation by reading this thorough Installation Guide. There is one apparent error: A screenshot shows port 5433, but the default is actually 5432. You can skip reading the Non-interactive installation section. However, do pay attention to the Uninstallation section at the end, as you cannot just toss Postgres in the Trash like a common app.

Specify UTF-8

From what I can gather, generally speaking, we be using UTF-8 as the character encoding used to store our data in Postgres. You can specify the character encoding for each database individually. But more convenient is specifying the encoding for the Postgres cluster, which then becomes the default for any databases created later.

You specify the cluster's character encoding when installing Postgres. I'm not sure, but you may not be able to change this later. The default is mysterious as it is platform-specific. I've been advised to override this to choose a specific encoding. This is done by choosing one of the "locales" with "UTF8" in its name. Apparently Postgres is unfortunately combining two ideas that should really be handled separately:

  • Locale - What rules should be followed for sorting, interpreting dates, etc.?
  • Character Encoding - How should character codepoint numbers are allowed, what letters do they represent, and how should they be written to disk?


For work here in the United States, I want the locale rules usually followed for US English. And I want character encoding to be UTF-8.  So I choose "en_US.UTF-8". This "Locale" setting is the only one I alter during installation; For all others I go with the default.

"postgres" User

The installer ran well without incident on my Mac OS X 10.6.4 (Snow Leopoard) MacBook. I went with all the defaults. Installer runs for several minutes. Your system password is required, for an unusual purpose: To create another Unix user on your Mac. A user named "postgres" is created, and you'll be prompted to invent a password for this user. Be sure to write that password down. That is the password for utilities to connect to the database installation, but is also that new user's password.

The main place for the installation is /Library/PostgreSQL/. Nested is a "9.0" folder, inside of which you'll find a "data" folder. Note that you do not have access to that "data" folder; it is owned by the "postgres" user. Postgres is one of the most secure database systems available. The blocked folder is one aspect of that security. If any of the other user accounts is compromised, rogue software will not have direct access to the database files.

So how do we access the database from our usual user account if we can't get to the database directly? We use utility programs, such as "pgAdmin" to connect to the database using password credentials. For example, to backup the data from the database, you can use "pg_dump" to extract the entire database definition and data as SQL script files.

StackBuilder

After the installer completes, the "StackBuilder" utility runs. This tool is a convenience for downloading updated or additional pieces to your Postgres installation. Using StackBuilder is optional, and you can skip it for your first time usage.

Notice the "PostgreSQL 9.0"  folder in /Applications. You'll find the admin tool "pgAdmin III.app" and a "Documentation" folder.


Shared Memory(?!)

Screenshot of Mac OS X installer disk image (DMG)
Last came one confusing speed bump in my installation experience: 'shared memory'. While the issue is confusing, don't be scared… It may turn out to be a non-issue for you as it was for me.

Next to the installer on the DMG is a README. The document is short, with a strange and confusing discussion about critical "shared memory" settings in BSD systems, which includes Mac OS X. It talks about memory amounts for settings with names like shmmax, shmmin, shmmni, shmseg, and shmall. Also read the linked web page.

You needn't concern yourself with these settings just yet. The one-click installer for Postgres 9.0.1 automatically creates a "sysctl.conf" file in the folder "etc" folder at the root of your boot drive:
/etc/sysctl.conf
You must restart your Mac for those settings to take effect.

The most important thing those settings do is raise the limit on the amount of memory used by Postgres from 4 megs to 33 megs. That is sufficient as a beginner starting out with Postgres. But is still a skimpy amount of memory for Postgres to operate. As your databases grow, and for deployment to a production server, you must re-visit this topic and tweak that "systctl.conf" file. When that time comes, see my blog entry and the Postgres doc.

That concludes the installation portion of my Postgres experience. Postgres should automatically be running at this point. In the Activity Monitor program, see several processes named 'postgres' and owned by the 'postgres' user. Launch the newly installed "pgAdmin" app installed in your Applications folder. With pgAdmin, you can create a new database, tables, columns, and rows.

No comments:

Post a Comment