PostgreSQL: Data is Important
At our engineering all-hands meeting, I presented a 5-minute lightning talk on the merits of PostgreSQL. In the spirit of being able to search for data, I converted the deck into the following blog post. Enjoy!
PostgreSQL: Data is Important
But Why Postgres?
- Solves some problems with migrations
- Full-featured indexes
- Robust constraints
- Streaming replication
- Sophisticated query planner
- PostGIS
- Full text search
- Advanced SQL features
Migration Improvements
No downtime for most DDL migrations. Instantly performs:
- dropping and renaming columns
- adding columns (if not
NOT NULL) - creating indexes (
CREATE INDEX CONCURRENTLY)
Transactional DDL statements (not even Oracle has this)
-
wrap migrations in a SQL transaction
-
roll back everything in migration on a partial failure
Indexes
-
CREATE INDEX ON USERS md5(id) -
CREATE UNIQUE INDEX ON payment_sources (user_id, status) WHERE status = 'active'; -
inverted indexes
- full-text search, array operations (inclusion tests, etc), ltrees
- very extensible
-
- can put indexes on different disk than the tables.
-
fully decoupled from table store
Constraints
CHECKconstraints- foreign key constraints available for all tables
- exclusion constraints (new in 9.0)
benefits of validation in the DB
-
reduces dependencies on monolithic app
-
improves consistency guarantees
-
deeper level of safeguards
-
don’t have to worry about prefetching associations
-
just as easy to test
Streaming Replication
-
new in 9.0
-
streams WALs to standby servers while being generated
-
less latency than file-based replication
-
more consistent than statement-based replication (MySQL <5.1)
-
synchronous capabilities coming in 9.1 guarantees 2-safe durability
Sophisticated Query Planner
-
cost-based (as opposed rule-based) query optimizer
-
heuristics gathered on tables to optimize better
-
costs of random seek, seq scan, etc. configurable
- e.g making random seeks on an SSD not a bad thing to the optimizer
- genetic algorithms to plan complex queries.
Better EXPLAIN
MySQL sample
+----+-------------+---------------------+-------+--------------------------------------+--------------------------------------+---------+------+------+------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------------------+-------+--------------------------------------+--------------------------------------+---------+------+------+------------------------------+
| 1 | PRIMARY | NULL | NULL | NULL | NULL | NULL | NULL | NULL | Select tables optimized away |
| 2 | DERIVED | bacon_cheeseburgers | range | index_bacon_cheeseburgers_on_user_id | index_bacon_cheeseburgers_on_user_id | 5 | NULL | 1 | Using where |
+----+-------------+---------------------+-------+--------------------------------------+--------------------------------------+---------+------+------+------------------------------+
Postgres Sample
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=8.31..8.32 rows=1 width=0)
-> GroupAggregate (cost=0.00..8.30 rows=1 width=12)
Filter: ((min(bacon_cheeseburgers.created_at) + '7 days'::interval) < max(bacon_cheeseburgers.created_at))
-> Index Scan using index_bacon_cheeseburgers_on_user_id on bacon_cheeseburgers (cost=0.00..8.27 rows=1 width=12)
Index Cond: ((user_id >= 69) AND (user_id <= 70))
Also… if you have a subquery, postgres won’t execute it
PostGIS (for geo queries)
-
extensible GIS system
-
R tree indexes for built-in geometric data types
-
geometric data types
-
free census data (TIGER) for geocoding (addresses→coordinates)
-
Rails integration (example from railsonpostgresql.com)
Restaurant.first(:conditions => ["the_geom && ?", Polygon.from_coordinates([[[x_min, y_min], [x_min, y_max], ...]]], 4269)])
Full Text Search
-
full-featured text indexing
-
dictionaries (languages, stopwords, synonyms)
-
customizable search ranking algorithms
-
mature
-
lighter weight solution than solr/lucene
Advanced SQL Features
-
EXCEPTandINTERSECTstatements in addition toUNION -
pl/*, custom aggs, UDFs, etc.
-
Triggers
-
Windowing functions (example from postgresql.org) (for advanced analytics)
SELECT depname, empno, salary, rank() OVER (PARTITION BY depname ORDER BY salary DESC) FROM empsalary; depname | empno | salary | rank -----------+-------+--------+------ develop | 8 | 6000 | 1 develop | 11 | 5200 | 2 develop | 9 | 4500 | 3 personnel | 2 | 3900 | 1 personnel | 5 | 3500 | 2 sales | 1 | 5000 | 1 sales | 4 | 4800 | 2
Comments
Get support help at squareup.com/support. We'll delete off-topic comments.