2011-05-25

Work Music

soma fm → Streams. Ambient streams. Keeps the hands typing and the brain coding.

In AAC format, as well as the older MP3. iTunes-friendly links.

One stream in particular works for me: "Mission Control" where ambient music is ingeniously overlaid with the chatter patter of real NASA mission control personnel and astronauts -- sometimes live, sometimes recorded.

2011-05-21

Location of Postgres Logs on a Mac

For Postgres 9.0.4 on Mac OS X 10.6.7, I found logs here:

  /Library/PostgreSQL/9.0/data/pg_log
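
Note that the "data" folder enclosing pg_log belongs to the 'postgres' Unix user, so from your own Terminal session you will likely need "sudo" to peek inside. Something like this should work (the date-stamped log file name below is only an example; list the folder first to see the actual names):

  sudo ls -lt /Library/PostgreSQL/9.0/data/pg_log
  sudo tail -f /Library/PostgreSQL/9.0/data/pg_log/postgresql-2011-05-21_000000.log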

Accessing Postgres On The Command-Line on Mac OS X

I simply wanted to restart the Postgres server. I'm trying to enable SSL/TLS, and such settings are checked only when Postgres starts. Postgres is set to auto-start when the Mac boots, if you install using EnterpriseDB's free Mac installer.  But I want to restart Postgres manually, without rebooting the Mac.

You'd think restarting Postgres would be easy, but no. The pgAdmin program is not built to stop or start the Postgres server, at least not that I could find. The command-line tool "pg_ctl" can check the status of a server as well as start and stop it. That command even has a handy "restart" option that effectively does both the stop and the start. But I cannot run pg_ctl from my own command line in the Terminal app. The command must be run as the 'postgres' Unix user created by the Postgres installer.

So, the answer is simple, right? Run the command with "sudo", as the superuser? That usually is the answer to Mac command-line limitations. But no, not in this case. The command must be run as another user, not just with enhanced privileges. So a bit of googling might lead you to the "su" command to act as another user, like this:
  su postgres
This fails, though I don't understand why. After much googling, I finally found the trick:
  sudo su - postgres
It takes the combination of "sudo" and "su". Also notice the hyphen surrounded by spaces; it gives you a login shell with the 'postgres' user's own environment.

Now I can (I mean the 'postgres' user can) run Postgres utility programs. The next problem is that those utilities are not found automatically from the command line. So instead of just typing "pg_ctl" in the Terminal, type the entire path to the program, such as:
  /Library/PostgreSQL/9.0/bin/pg_ctl

Tip: Rather than type, just find the programs in the Finder (or the very handy Path Finder app), then drag-and-drop to the Terminal app.

Yet another problem to solve. Some of those commands need to know where the Postgres "data" folder is. Again, you can drag and drop the path, or use Path Finder's Copy Path > Unix menu item.

So for example, to check the status of the Postgres server, run these two lines in the Terminal. Run them separately, one after the other.
  sudo su - postgres
  /Library/PostgreSQL/9.0/bin/pg_ctl status -D /Library/PostgreSQL/9.0/data
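
To actually restart the server, which was my original goal, the same pattern should work using pg_ctl's "restart" option (it stops and then starts the server, so settings like SSL get re-read):

  sudo su - postgres
  /Library/PostgreSQL/9.0/bin/pg_ctl restart -D /Library/PostgreSQL/9.0/data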

To avoid including the full paths, you can tell your Mac to include those folders when looking for programs. But that’s a topic for another day.

I'm using Postgres 9.0.4 on Mac OS X 10.6.7.

2011-05-17

I Wish Postgres Had…

Here is my wish list for Postgres features.

  • Feedback system
    I have yet to learn how to submit bugs & feature requests to the Postgres developers. So if anyone knows how I should relay this list, please post a comment to tell me.
  • Multiply String function
    It can be surprisingly handy to have a command that takes a string, repeats that string a certain number of times, and returns the result. One usage is padding a string with spaces, underscores, or periods, as seen in a book's Table Of Contents. The command could be added to this collection, and might look like this:
    Multiply( string, numberOfTimes ) --> String
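
    As it happens, Postgres ships a built-in function "repeat( string, number )" that appears to cover this ground. Still, here is a minimal PL/pgSQL sketch of what a home-grown version might look like; the function and variable names are mine, for illustration only.

CREATE OR REPLACE FUNCTION multiply_( string_ TEXT, number_of_times_ INTEGER )
RETURNS TEXT
LANGUAGE plpgsql
AS $BODY$
DECLARE
    result_ TEXT := '';
BEGIN
    FOR i_ IN 1 .. number_of_times_ LOOP -- If number_of_times_ is less than 1, the loop never runs and an empty string is returned.
        result_ := result_ || string_;
    END LOOP;
    RETURN result_;
END;
$BODY$;

-- Usage, padding a Table-Of-Contents line with periods:
-- SELECT 'Chapter One ' || multiply_( '.', 20 ) || ' 42';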

2011-05-10

Postgres Server Settings

Postgres has hundreds of settings in its configuration. Nearly all of them have defaults appropriate for most of us most of the time. Here are a few you may want to consider changing.

These settings are found in the pgAdmin app menu Tools > Server Configuration > postgresql.conf.

• superuser_reserved_connections

Sets aside a few of the maximum number of connections for use only by the Postgres superuser, usually named 'postgres'. Without this reservation, if your server reaches its connection limit, you would not be able to log on as an administrator. That's not good, so keep at least the default of 3 reserved connection slots.
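
For example, the relevant line in postgresql.conf might look like the following. Note that changing this particular setting requires a server restart to take effect:

  superuser_reserved_connections = 3

You can verify the running value from any SQL session:

  SHOW superuser_reserved_connections;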

2011-05-09

Tips On Creating Login Forms

This article in Smashing Magazine has a bunch of tips and techniques to consider when building Login forms and Account Creation forms. For example, for web forms it gives alternatives to those annoying Captchas, such as the Honeypot Captcha and timestamp analysis.

2011-05-03

Uninstalling Postgres

Dang! I lost the password to one of my Postgres installations. Fortunately, it had not yet been used, so no great loss. But how to uninstall both the Postgres software and the 'postgres' Unix user on Mac OS X?

I found this posting telling me how to uninstall. I've written my own explanation below.

I did this with Postgres 9.0.4 with the Mac installer provided by EnterpriseDB, on Mac OS X 10.6.7 Snow Leopard.

The steps are:
  1. Run the uninstaller app.
  2. Delete the enclosing folder, and provide Mac system password to authorize deleting the 'data' folder.
  3. Delete the 'postgres' Unix user.

Details

Run the uninstaller app installed as part of the Postgres package. However, the uninstaller is not located in the Applications > PostgreSQL folder. Instead, look in the root folder (your hard disk) > Library > PostgreSQL > 9.0 > uninstall-postgresql.app. Running that app will remove all of Postgres except for two pieces:

• The folder enclosing the uninstaller, and the all-important "data" folder containing any databases you created.

The "data" folder is special because it belongs to the "postgres" Unix user created on your Mac during installation. So the admin Mac user in which you are using the Finder does not have direct access to that folder. Fortunately the regular admin Mac user can move that folder to the trash though you will be prompted for your Mac password.

• 'postgres' Unix user

Killing that Unix user, and its password, requires the use of Terminal.app found in the "Utilities" folder of the "Applications" folder.

Type the following into the Terminal:
    sudo dscl . delete /users/postgres

The 'sudo' means you are invoking special privileges as a system administrator, so you'll be prompted to enter your Mac system password. The 'dscl' command operates on the Mac's Directory Service, where local user accounts such as 'postgres' are recorded.
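
To confirm the user is really gone, you might list the local users again and look for 'postgres'; this should produce no output:

    dscl . list /users | grep postgres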

With that you are done. You may now reinstall Postgres. Only this time, write down your password in a safe place immediately, as I learned the hard way.

If for some reason you need or want to use the command line to uninstall Postgres rather than the uninstaller app, see the posting linked above.

2011-05-01

Automatically Record History of Field Changes in Postgres (Dynamic Triggers in PL/pgSQL)

Hoorah! I was able to complete my attempt at writing a single PL/pgSQL function in Postgres 9.0.4 to create history records tracking individual field value changes generically for all my tables. Some developers call this an "audit trail", though an accountant might say otherwise.

Special thanks to John DeSoi on the General Postgres mailing list for pointing me to a crucial code example to make PL/pgSQL interpret: 
   "OLD." || myColumnNameVar
as:
   OLD.first_name    (for example)

The crucial line is:
   EXECUTE 'SELECT ($1).' || ri.column_name || '::text' INTO t USING NEW;

I'm new to SQL and Postgres, so I may be wrong but my interpretation of that line is:
Since there appears to be no way to make the PL/pgSQL interpreter interpret our desired string, the trick is to look outside of PL/pgSQL and instead use the SQL interpreter. I had incorrectly been thinking of the PL/pgSQL interpreter as the same thing as the SQL interpreter, or rather as incorporating the SQL interpreter. Once I saw that line of example code, I realized the two interpreters are separate. The language, the syntax, of PL/pgSQL is a superset of SQL, but the interpreters are separate, with one calling the other.
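
To illustrate the idea outside of a trigger, here is a tiny sketch of my own. The function name is mine, and the "person_" table with its "first_name" column in the usage comment is hypothetical. PL/pgSQL cannot evaluate the string itself, but it can build a SELECT statement as text and hand that to the SQL interpreter via EXECUTE, passing the row along as parameter $1:

CREATE OR REPLACE FUNCTION get_column_as_text_( row_ ANYELEMENT, column_name_ TEXT )
RETURNS TEXT
LANGUAGE plpgsql
AS $BODY$
DECLARE
    result_ TEXT;
BEGIN
    -- Build "SELECT ($1).column_name::text" as a string, then let the SQL interpreter run it.
    EXECUTE 'SELECT ($1).' || column_name_ || '::text' INTO result_ USING row_;
    RETURN result_;
END;
$BODY$;

-- Usage: SELECT get_column_as_text_( p, 'first_name' ) FROM person_ p;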

Brilliant! It works, and it seems to be fast enough, at least for my needs.

I also found this blog post on working around the impossibility of dynamic triggers in PL/pgSQL.

I'll share my current code & table structure below. Caveat: This code has not yet been thoroughly tested, nor has it been deployed. I only finalized it today.

[A] I'm working on a greenfield project, where:
  • I've built every table to have a primary key of type uuid named exactly "pkey_".
  • Every table has a TIMESTAMPTZ field named "record_modified_".

My approach below hard-codes these assumptions.

[B] I have this "history_" table:

CREATE TABLE history_
(
 pkey_ uuid NOT NULL DEFAULT uuid_generate_v1mc(), -- The primary key for this table, though no primary key constraint was created (for the sake of performance and conservation). This column and timestamp_ column are the only two columns about this table itself. All other columns are about the inserted/modified/deleted record in some other table.
 table_name_ character varying(120) NOT NULL, -- Name of table whose row is being affected (inserted, deleted, or modified).
 column_name_ character varying(120) NOT NULL, -- Name of the column in some other table whose row value is being modified. This column's value is empty string if the operation was DELETE.
 timestamp_ timestamp with time zone NOT NULL DEFAULT clock_timestamp(), -- The moment this record was created. Using the clock_timestamp() function as a default, to capture the actual moment in time rather than moment when transaction began.
 db_user_name_ character varying(120) NOT NULL DEFAULT "current_user"(), -- The name of the Postgres user logged in to this database connection/session.
 app_name_ character varying(120) NOT NULL DEFAULT current_setting('application_name'::text), -- The name of the application connected to the database. May also include the version number of app, and the name of the human user authenticated within the app.
 old_value_ character varying(120) NOT NULL DEFAULT ''::character varying,
 new_value_ character varying(120) NOT NULL DEFAULT ''::character varying,
 uuid_ uuid NOT NULL, -- The UUID of the row being affected, the row being inserted, updated, or deleted. Assumes every table whose history is being recorded uses the 'uuid' data type as its primary key.
 operation_ character varying(120) NOT NULL, -- What database operation resulted in this trigger running: INSERT, UPDATE, DELETE, or TRUNCATE.
 table_oid_ oid NOT NULL, -- The oid of the table whose record is being modified. May be helpful if a table name changes over time.
 ordinal_position_of_column_ integer NOT NULL, -- The position of the affected column in its table. Every new column gets a number, incremented by one for each. This may be helpful when analyzing changes across a stretch of time during which a column's name was changed. Apparently columns have no oid, so we are recording this number instead.
 transaction_began_ timestamp with time zone NOT NULL DEFAULT transaction_timestamp() -- The time when the current transaction began. Can act like a transaction identifier, to group multiple "history_" rows of the same transaction together. This is not foolproof, as multiple transactions could possibly start in the same split-second moment. Assuming the computer's clock has a fine resolution, the chance of a coincidence should be quite minuscule.
);
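
Note that the uuid_generate_v1mc() default relies on the "uuid-ossp" contrib module, which must be loaded into your database before this table definition will work. On Postgres 9.0 that means running the module's SQL script; the path below is my guess at the EnterpriseDB layout, so adjust for your own installation and database name:

  psql -U postgres -d your_database -f /Library/PostgreSQL/9.0/share/postgresql/contrib/uuid-ossp.sql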


I do not have a primary key constraint for this table. The "pkey_" column acts as a primary key, but there is no need for an index or uniqueness testing on this special table. I suppose I could even drop this column entirely, since this kind of logging table will never have related records, nor do we ever care about identifying a particular row. I'll leave it for now, just for the heck of it.

[C] For every table I want to track field-level value changes, I create a trigger like this:

CREATE TRIGGER XXX_trigger_history_
AFTER INSERT OR UPDATE OR DELETE ON XXX_
FOR EACH ROW EXECUTE PROCEDURE make_history_();


where 'XXX' is the name of the particular table.
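
For example, with a hypothetical table named "customer_", that reads:

CREATE TRIGGER customer_trigger_history_
AFTER INSERT OR UPDATE OR DELETE ON customer_
FOR EACH ROW EXECUTE PROCEDURE make_history_();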

[D] I created this function:

CREATE OR REPLACE FUNCTION make_history_()
RETURNS TRIGGER
LANGUAGE plpgsql
AS
$BODY$
/*     Purpose: Make a history of changes to most fields in the table calling this trigger function.
       This function works by detecting each change in value for important fields in a certain table,
       then creating a row in the "history_" table for each such change.
    Software developers often call this kind of feature an "audit trail". I avoid using that term in this context, as a real
    audit trail in accounting terms involves more than this simple field-change tracking.
*/
/*    © 2011 Basil Bourque. This source code may be used freely forever by anyone taking full responsibility for doing so, without warranty.
  
    Thanks so very much to John DeSoi of the pgsql-general@postgresql.org mailing list for pointing me to this crucial code example:
    http://wiki.postgresql.org/wiki/PL/pgSQL_Dynamic_Triggers
    Before reading that example, my previous efforts led me to conclude a generic history facility written in PL/pgSQL was impossible.
*/
/*     We make these assumptions about any table using this function in its trigger:
           • Has a primary key named "pkey_" of type uuid.
           • Has a field tracking the datetime the record was last modified, named "record_modified_" of type timestamptz.
           • Is in the default/current schema.
       While it might be nice to rewrite this function to escape these assumptions, I've spent all my energies to get this far.
    I welcome feedback from anyone who wants to take this further.
*/



/*

For each table on which you want history, create a trigger by executing SQL like this:
CREATE TRIGGER XXX_trigger_history_
AFTER INSERT OR UPDATE OR DELETE ON XXX_
FOR EACH ROW EXECUTE PROCEDURE make_history_();


where XXX is the name of the specific table.
*/

/*     Notes:
      
    The 'OLD' and 'NEW' variables represent the entire row whose INSERT/UPDATE/DELETE caused this trigger to run.
       The 'TG_xxx' variables are special variables created automatically by Postgres for the trigger function.
       For example, TG_OP indicates which modification operation is happening: INSERT, UPDATE, or DELETE.
       http://www.postgresql.org/docs/current/static/plpgsql-trigger.html
       "clock_timestamp()" gets the actual time at the moment of execution. In contrast, most other timestamp
       functions return the time when the current transaction began.
    For more information, see: http://www.postgresql.org/docs/current/static/functions-datetime.html
  
    The "history_" table also includes a column "transaction_began_" defaulting to "transaction_timestamp()". This timestamp can act
    like a transaction identifier, to group multiple "history_" rows of the same transaction together. This is not foolproof, as
    multiple transactions could possibly start in the same split-second moment. Assuming the computer's clock has a fine resolution,
    the chance of a coincidence should be quite minuscule. If someone knows a way to get a true transaction id, please share.
*/

/*  History:
    2011-04-31    • Published on the general Postgres mailing list.
    2011-05-01    • Revised to not ignore the "record_created_" field.
                  • Published on my blog at http://crafted-software.blogspot.com/.
*/

DECLARE
    ri RECORD; -- About this data type "RECORD": http://www.postgresql.org/docs/current/static/plpgsql-declarations.html#PLPGSQL-DECLARATION-RECORDS
    oldValue TEXT;
    newValue TEXT;
    isColumnSignificant BOOLEAN;
    isValueModified BOOLEAN;
BEGIN
    /*RAISE NOTICE E'\n    Running function: make_history_ ----------------\n\n    Operation: %\n    Schema: %\n    Table: %\n',
        TG_OP,
        TG_TABLE_SCHEMA,
        TG_TABLE_NAME;*/

    IF (TG_OP = 'INSERT') OR (TG_OP = 'UPDATE') THEN
        NEW.record_modified_ := clock_timestamp(); -- Record the moment this row is being saved. (Caveat: in an AFTER trigger, assigning to NEW does not alter the stored row; a BEFORE trigger would be needed for that to stick.)
      
        FOR ri IN
            -- Fetch a ResultSet listing columns defined for this trigger's table.
            SELECT ordinal_position, column_name, data_type
            FROM information_schema.columns
            WHERE
                table_schema = TG_TABLE_SCHEMA -- The information_schema stores plain (unquoted) names, so compare directly rather than via quote_ident().
            AND table_name = TG_TABLE_NAME
            ORDER BY ordinal_position
        LOOP
            -- For each column in this trigger's table, copy the OLD & NEW values into respective variables.
            -- NEW value
            EXECUTE 'SELECT ($1).' || ri.column_name || '::text' INTO newValue USING NEW;
            -- OLD value
            IF (TG_OP = 'INSERT') THEN   -- If operation is an INSERT, we have no OLD value, so use an empty string.
                oldValue := ''::varchar;
            ELSE   -- Else operation is an UPDATE, so capture the OLD value.
                EXECUTE 'SELECT ($1).' || ri.column_name || '::text' INTO oldValue USING OLD;
            END IF;
            -- Make noise for debugging.
            /*RAISE NOTICE E'\n    Column #: %\n    Name: %\n    Type: %\n    Old: %\n    New: %\n',
                ri.ordinal_position,
                ri.column_name,
                ri.data_type,
                oldValue,
                newValue;*/
              
            --    ToDo: Add code to throw an Exception if the primary key value is changing (other than from NULL on an INSERT).
          
            --     ToDo: Add code to ignore columns whose data type does not cast well to TEXT/VARCHAR.
          
            --    Ignore some columns:
            --        • Those whose names are marked with '_x_'.
            --        • The primary key.
            --        • Our timestamp field recording the row's most recent modification.
            isColumnSignificant := (position( '_x_' in ri.column_name ) < 1) AND (ri.column_name <> 'pkey_') AND (ri.column_name <> 'record_modified_');
            IF isColumnSignificant THEN
                isValueModified := oldValue IS DISTINCT FROM newValue;  -- NULL-safe test: if this field in the table was modified, make history.
                IF isValueModified THEN
                    /*RAISE NOTICE E'Inserting history_ row for INSERT or UPDATE.\n';*/
                    INSERT INTO history_ ( operation_, table_oid_, table_name_, uuid_, column_name_, ordinal_position_of_column_, old_value_, new_value_ )
                    VALUES ( TG_OP, TG_RELID, TG_TABLE_NAME, NEW.pkey_, ri.column_name::VARCHAR, ri.ordinal_position, COALESCE( oldValue, '' )::VARCHAR, COALESCE( newValue, '' )::VARCHAR );
                END IF;
            END IF;
        END LOOP;
  
        RETURN NEW;
    ELSIF (TG_OP = 'DELETE') THEN
        /*RAISE NOTICE E'Inserting history_ row for DELETE.\n';*/
        --    Similar to INSERT above, but refers to OLD instead of NEW, and passes empty values for last 4 fields.
        INSERT INTO history_ ( operation_, table_oid_, table_name_, uuid_, column_name_, ordinal_position_of_column_, old_value_, new_value_ )
        VALUES ( TG_OP, TG_RELID, TG_TABLE_NAME, OLD.pkey_, ''::VARCHAR, 0, ''::VARCHAR, ''::VARCHAR );
        RETURN OLD;
      
    END IF;
     /* Should never reach this point. Branching in code above should always reach a call to RETURN. */
    RAISE EXCEPTION 'Unexpectedly reached the bottom of this function without calling RETURN.';
END;

$BODY$;



[end of code]

The beauty of this is that a history record is created for each field edit regardless of the avenue by which the data was changed. If my desktop app modifies data, a history row is created. If my web app does so, a history row is created. Ditto if I use interactive SQL to fix some data, or use pgAdmin to affect data. In all cases, history is made.
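
If you want to see it in action, a quick smoke test might look like this. The "customer_" table here is hypothetical, built to match the assumptions above: a "pkey_" uuid primary key and a "record_modified_" timestamptz field.

CREATE TABLE customer_ (
    pkey_ uuid NOT NULL DEFAULT uuid_generate_v1mc() PRIMARY KEY,
    record_modified_ TIMESTAMPTZ NOT NULL DEFAULT clock_timestamp(),
    name_ VARCHAR(120) NOT NULL DEFAULT ''
);

CREATE TRIGGER customer_trigger_history_
AFTER INSERT OR UPDATE OR DELETE ON customer_
FOR EACH ROW EXECUTE PROCEDURE make_history_();

INSERT INTO customer_ ( name_ ) VALUES ( 'Alice' ); -- Makes history: name_ went from '' to 'Alice'.
UPDATE customer_ SET name_ = 'Alicia';              -- Makes history: name_ went from 'Alice' to 'Alicia'.
DELETE FROM customer_;                              -- Makes one history row recording the DELETE.

SELECT operation_, table_name_, column_name_, old_value_, new_value_ FROM history_ ORDER BY timestamp_;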