databases: architecture | client | select | where | dates | join | aggregate | sort and limit | insert, update, and delete | schema | sequences | indices | import and export | script | function | query tuning | user | python | ruby | help | admin
sql: types | casts | literals | dates | identifiers | operators | functions | distinct | qualified * | regular expressions | sequences | group by | aggregation functions | window functions | where clause subqueries | join | from clause subquery | as | limit and offset | with | null | sets and multisets | session objects | scripts | reflection | sargable expressions | transactions | idempotent sql
postgresql | mysql | sqlite | |
---|---|---|---|
version used |
9.1 | MariaDB 5.5 with InnoDB | 3.7 |
show version | client: $ psql --version server: > show server_version; |
client: $ mysql --version server: > status; |
$ sqlite3 --version |
architecture | |||
engine | in separate server process which communicates with client via TCP port or Unix domain socket | in separate server process which communicates with client via TCP port or Unix domain socket | contained in client library |
data | database consists of tables which represent relations | database consists of tables which represent relations | database consists of tables which represent relations |
files | each table corresponds to one or more files on the file system | database is stored in a single file | |
persistence | a write ahead log is used to ensure durability without flushing tables to disk after each transaction | depends on storage engine; in InnoDB a redo log is used to ensure durability without flushing tables to disk after each transaction | database file is updated after each transaction |
indices |
tables can have any number of indices | tables can have any number of indices | tables can have any number of indices |
transactions | transactions can contain DML and DDL; row and table locking is available | transactions can contain DML; table locking is available; InnoDB storage engine uses row locking to implement transactions | |
security | password protected user accounts with fine grained permissions | password protected user accounts with fine grained permissions | security and permissions as provided by file system for database file |
server side language | PL/pgSQL; server can be extended to support other languages | SQL/PSM | none |
client | |||
postgresql | mysql | sqlite | |
invoke client |
$ psql -U $USER -h localhost | $ mysql -u $USER -h localhost -p | $ sqlite3 DB_FILE |
client help | \? | ? \? \h help |
.help |
default port |
5432 | 3306 | none |
show databases | SELECT datname FROM pg_database; \l |
SHOW databases; | .databases |
switch database |
\connect database; | use database | attach "/tmp/db2" as db2; to disconnect: detach db2; |
current database |
SELECT current_database(); | SELECT database(); | All attached files are current; names in one file may shadow another |
chdir |
\cd /tmp | none | none |
shell command |
\! ls | system ls | none |
run script | \i setup.sql | source setup.sql | .read setup.sql |
redirect output to file | \o /tmp/output.txt to discontinue redirection: \o |
tee /tmp/output.txt to discontinue redirection: notee |
.output /tmp/output.txt to discontinue redirection: .output stdout |
set column delimiter | \f "\t" | none | .separator "\t" |
toggle between aligned and unaligned output | \a | none | none |
select | |||
postgresql | mysql | sqlite | |
select * | SELECT * FROM customers; |
SELECT * FROM customers; |
SELECT * FROM customers; |
project columns | SELECT name, age FROM customers; |
SELECT name, age FROM customers; |
SELECT name, age FROM customers; |
exclude column |
none | none | none |
project expression | SELECT 'Mr. ' || name, age + 3 FROM customers; |
SELECT concat('Mr. ', name), age + 3 FROM customers; |
SELECT 'Mr. ' || name, age + 3 FROM customers; |
rename column | SELECT name AS moniker FROM customers; |
SELECT name AS moniker FROM customers; |
SELECT name AS moniker FROM customers; |
where | |||
postgresql | mysql | sqlite | |
filter rows | SELECT * FROM customers WHERE name = 'Ed'; |
SELECT * FROM customers WHERE name = 'Ed'; |
SELECT * FROM customers WHERE name = 'Ed'; |
comparison operators | SELECT * FROM customers WHERE age > 21; SELECT * FROM customers WHERE age < 21; SELECT * FROM customers WHERE age >= 21; SELECT * FROM customers WHERE age <= 21; SELECT * FROM customers WHERE age <> 21; |
||
multiple conditions on field | SELECT * FROM customers WHERE age >= 21 AND age <= 65; |
||
logical operators |
AND OR NOT | ||
like | SELECT * FROM customers WHERE name LIKE 'Ed%'; |
||
dates | |||
postgresql | mysql | sqlite | |
current timestamp | SELECT now(); SELECT CURRENT_TIMESTAMP; |
select now(); select current_timestamp; |
select current_timestamp; |
join | |||
postgresql | mysql | sqlite | |
inner join | SELECT * FROM customers c JOIN orders o ON c.id = o.customer_id; |
SELECT * FROM customers c JOIN orders o ON c.id = o.customer_id; |
|
left outer join | include customers with no orders: SELECT * FROM customers c LEFT JOIN orders o ON c.id = o.customer_id; |
include customers with no orders: SELECT * FROM customers c LEFT JOIN orders o ON c.id = o.customer_id; |
|
full outer join | include customers with no orders and orders with no customers: SELECT * FROM customers c FULL JOIN orders o ON c.id = o.customer_id; |
SELECT * FROM customers c LEFT JOIN orders o ON c.id = o.customer_id UNION SELECT * FROM customers c RIGHT JOIN orders o ON c.id = o.customer_id; |
|
cartesian join | SELECT * FROM rows, columns; |
SELECT * FROM rows, columns; |
|
aggregate | |||
postgresql | mysql | sqlite | |
row count |
SELECT count(*) FROM customers; | select count(*) from customers; | select count(*) from customers; |
count of rows satisfying condition | SELECT count(*) FROM customers WHERE age > 21; |
||
count distinct | SELECT count(DISTINCT name) FROM customers; |
||
group by | SELECT count(*) FROM customers GROUP BY age; |
||
aggregation operators | |||
sort and limit | |||
postgresql | mysql | sqlite | |
sort in ascending order | SELECT * FROM customers ORDER BY name; |
select * from baz order by foo; |
select * from baz order by foo; |
sort in descending order | SELECT * FROM customers ORDER BY name DESC; |
select * from baz order by foo desc; |
select * from baz order by foo desc; |
sort by multiple columns | SELECT * FROM customers ORDER BY age, name; |
||
single row | SELECT * FROM customers WHERE name = 'Ed' LIMIT 1; |
select * from customers where name = 'Ed' limit 1; |
select * from customers where name = 'Ed' limit 1; |
limit | first 10 customers: SELECT * FROM customers ORDER BY name LIMIT 10; |
select * from customers limit 10; |
select * from customers limit 10; |
offset | second 10 customers: SELECT * FROM customers ORDER BY name LIMIT 10 OFFSET 10; |
||
insert, update, and delete | |||
postgresql | mysql | sqlite | |
insert | INSERT INTO customers (name, age) VALUES (34, 'Ed'); |
insert into foo (bar, baz) values (1, 'one'); |
insert into foo (bar, baz) values (1, 'one'); |
update | UPDATE customers SET age = 35 WHERE name = 'Ed'; |
update foo set baz = 'une' where bar = 1; |
update foo set baz = 'une' where bar = 1; |
merge | |||
delete | DELETE FROM customers WHERE name = 'Ed'; |
delete from foo where bar = 1; |
delete from foo where bar = 1; |
delete all rows | DELETE FROM customers; faster: TRUNCATE customers; |
delete from foo; faster on 5.0.3 and later: truncate foo; |
delete from foo; |
schema | |||
postgresql | mysql | sqlite | |
create table | CREATE TABLE customers ( name TEXT, age INT ); |
create table foo ( bar int, baz text ); |
create table foo ( bar int, baz text ); |
drop table |
DROP TABLE customers; | DROP TABLE customers; | DROP TABLE customers; |
show tables |
\d | show tables; | .tables |
describe table |
\d table | desc table; | .schema table |
export schema |
$ pg_dump -a db > db.sql | $ mysqldump --d db > db.sql | |
describe document |
table determines row type | table determines row type | |
sequences | |||
postgresql | mysql | sqlite | |
increment | |||
indices | |||
postgresql | mysql | sqlite | |
show indices | \di | show index from table; | .indices |
create index | CREATE INDEX foo_bar_idx on foo (bar); | InnoDB requires that the max length of a text or varchar column be less than 767 characters create index foo_bar_idx on foo (bar); |
create index foo_bar_idx on foo ( bar ); |
drop index | DROP INDEX foo_bar_idx; | drop index foo_bar_idx on foo; | drop index foo_bar_idx; |
create unique index | CREATE UNIQUE INDEX foo_bar_idx ON foo (bar); | create unique index foo_bar_idx on foo (bar); | create unique index foo_bar_idx on foo ( bar ); |
create compound index | CREATE INDEX foo_bar_baz_idx ON foo (bar, baz); | create index foo_bar_baz_idx on foo (bar, baz); | create index foo_bar_baz_idx on foo (bar, baz); |
index hint | |||
import and export | |||
postgresql | mysql | sqlite | |
import csv | $ echo $'1,"one, two, three"\n2,four\n3,"five\nsix\nseven"' > /tmp/test.csv $ echo 'create table test_csv ( col1 int, col2 text );' | psql $ ( echo 'copy test_csv from stdin with (format csv); '; cat /tmp/test.csv ) | psql trim header if there is one: ( echo 'copy test_csv from stdin with (format csv); '; sed -n '2,$p' /tmp/test.csv ) | psql |
$ echo $'1,"one, two, three"\n2,four\n3,"five\nsix\nseven"' > /tmp/test.csv $ sqlite3 > create table test_csv ( col1 int, col2 text ); > .mode csv > .import /tmp/test.csv test_csv |
|
export csv | $ echo 'copy foo to stdout with (format csv);' | psql > /tmp/foo.csv | grant FILE on *.* to 'joe'@'localhost'; SELECT * INTO OUTFILE '/tmp/dump.csv' FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' LINES TERMINATED BY '\n' FROM foo; |
.mode csv .output /tmp/foo.csv select * from foo; |
script | |||
postgresql | mysql | sqlite | |
run script | \i foo.sql $ psql -f foo.sql |
source foo.sql $ mysql db < foo.sql |
.read foo.sql |
function | |||
postgresql | mysql | sqlite | |
show functions | \df; | show function status; show procedure status; select routine_name from information_schema.routines; |
|
show function source | \df+ func_name; | ||
show built-in functions | select proname from pg_proc; select routine_name from information_schema.routines; |
||
define function | |||
query tuning | |||
postgresql | mysql | sqlite | |
explain plan | EXPLAIN SELECT * FROM customers; | EXPLAIN SELECT * FROM customers; | explain select * from foo; |
query stats | EXPLAIN ANALYZE SELECT * FROM customers; | .stats on .stats off |
|
timer | .timer on .time off |
||
stats tables | |||
user | |||
postgresql | mysql | sqlite | |
current user | select current_user; | select user(); | none |
list users | select usename from pg_user; |
table only readable by root: select user from mysql.user; |
none |
create user | at sql prompt: > create role fred with superuser createdb createrole login; at cmd line; will prompt for privileges: $ createuser fred |
create user 'fred'@'localhost' identified by 'abc123'; | none |
switch user | set role fred; | ||
drop user | > drop role fred; $ dropuser fred |
drop user 'fred'@'localhost'; | none |
set password | alter user fred with password 'xyz789'; | set password for 'fred'@'localhost' = password('xyz789'); | |
grant | grant select on test.foo to 'fred'@'localhost'; | ||
grant all | table foo in database test: grant all on test.foo to 'fred'@'localhost'; all tables in database test: grant all on test.* to 'fred'@'localhost'; |
||
revoke | revoke all on test.* from 'fred'@'localhost'; | ||
python | |||
postgresql | mysql | sqlite | |
install driver | $ sudo pip install psycopg | make sure MySQL development files are installed: $ sudo pip install MySQL-python |
Python ships with a driver |
import driver | import psycopg2 | import MySQLdb | import sqlite3 |
connect open, close |
conn = psycopg2.connect(database='foo') conn.close() |
conn = MySQLdb.Connect( db='cust', user='joe', passwd='xyz789', host='127.0.0.1') conn.close() |
conn = sqlite3.connect('/PATH/TO/DBFILE') conn.close() |
cursor create, close |
cur = conn.cursor() cur.close() |
cur = conn.cursor() cur.close() |
cur = conn.cursor() cur.close() |
execute | cur.execute('select * from bar') | cur.execute("select * from bar") | cur.execute('select * from bar') |
bind variable | cur.execute('select * from foo where bar = %s', vars=[1]) | cur.execute("select * from foo where bar = %s", (1,)) | cur.execute('select * from foo where bar = ?', (1,)); |
fetch all results | # returns list of tuples: rows = cur.fetchall() |
rows = cur.fetchall() | # returns list of tuples: rows = cur.fetchall() |
iterate through results | for row in cur: print(row[0]) |
for row in cur: print(row[0]) |
for row in cur: print(row[0]) |
fetch one result | # returns a tuple: row = cur.fetchone() |
# returns a tuple: row = cur.fetchone() |
# returns a tuple: row = cur.fetchone() |
transaction | |||
ruby | |||
postgresql | mysql | sqlite | |
install driver | $ sudo gem install ruby-pg | $ sudo gem install mysql | Ruby ships with a driver |
import driver | require 'pg' | require 'mysql' | require 'sqlite3' |
connect open, close |
conn = PGconn.open(:dbname => 'foo') ?? |
conn = Mysql.new conn.select_db("foo") ?? |
conn = SQLite3::Database.new "/tmp/db" conn.close() |
execute | result = conn.exec("select * from foo;") | stmt = con.prepare('select * from foo') stmt.execute |
rows = conn.execute("select * from foo") |
bind variable | stmt = con.prepare('select * from foo where bar = ?') stmt.execute(1) |
rows = conn.execute("select * from foo where bar = ?", [1]) | |
number of rows returned | result.cmdtuples | stmt.num_rows | rows.size |
fetch a row | # hash with column names as keys: result[0] |
# returns array: stmt.fetch |
rows[0] |
iterate through results | result.each do |row| puts row["bar"] end |
stmt.each do |row| puts row[0] end |
rows.each do |row| puts row[0] end |
transaction | |||
help | |||
postgresql | mysql | sqlite | |
man page | $ man 1 psql $ man 7 copy $ man 7 create_table |
||
admin | |||
postgresql | mysql | sqlite | |
admin user | postgres | root | none |
server process | postgres | mysqld | none |
start server | |||
stop server | |||
config file | |||
reload config file | |||
data directory | $ postgres -D /PATH/TO/DATA/DIR | $ mysqld --datadir /PATH/TO/DATA/DIR | specified on command line |
create database | at sql prompt: > create database foo; at command line: $ createdb foo |
User must have 'create' privilege. at sql prompt: > create database foo; at command line: $ mysqladmin create foo |
|
drop database | > drop database foo; $ dropdb foo |
User must have 'drop' privilege. at sql prompt: > drop database foo; at command line: $ mysqladmin drop foo |
|
backup database | $ pg_dump foo > /tmp/foo.sql $ pg_dump -F=c foo > /tmp/foo.postgres |
$ mysqldump foo > /tmp/foo.sql | |
restore database | $ psql -f /tmp/foo.sql $ pg_restore -d foo /tmp/foo.postgres |
$ mysql < /tmp/foo.sql | |
_________________________________ | _________________________________ | _________________________________ |
version used
The version used to test the examples in this sheet.
show version
How to determine the version of a database engine.
mysql:
MySQL supports different storage engines. Each storage engine has its own size limits, and features such as indexes, transactions, locking and foreign key support aren't available for all storage engines.
Here is how to determine the storage engine used for a table:
select engine
from information_schema.tables
where table_schema = 'test'
and table_name = 'foo';
Architecture
engine
The location of the database engine.
data
How data is organized in a database.
files
How data is stored in files on the file system.
postgresql:
Tables are split into multiple files when they exceed 2G; large attributes are stored in separate TOAST files.
persistence
What durability guarantee is made and how this is accomplished.
indices
Are indices available and what can be indexed.
transactions
Are transactions available and what can participate in a transaction.
security
Available security features.
server side language
Whether a server side programming language is available.
Client
invoke client
How to invoke the command line client.
postgresql:
If the database user is not specified, it will default to the operating system user. If the database is not specified, it will default to the operating system user. If the host is not specified, psql will attempt to connect to a server on the local host using a Unix domain socket.
client help
How to get a list of commands available at the command line client prompt.
default port
The default port used by the client to connect to the server.
The default ports used by PostgreSQL and MySQL are defined in /etc/services.
show databases
List the available databases on a server.
switch database
How to switch between databases when using the command line SQL prompt.
current database
chdir
shell command
client startup file
The name of the startup file used by the client.
custom prompt
Select
select *
project columns
exclude column
project expression
rename column
Where
filter rows
comparison operators
has key
multiple conditions on field
logical operators
like
Join
inner join
left outer join
full outer join
cartesian join
Aggregate
Sort and Limit
sort in ascending order
sort in descending order
sort by multiple columns
single row
limit
offset
Insert, Update, and Delete
insert
update
delete
delete all rows
Schema
create table
drop table
show tables
List the tables in the current database.
describe table
Show the columns for a table and their types.
Sequences
increment
Indices
Import and Export
import tab delimited
import csv
import json
export tab delimited
export csv
export json
Script
run sql script
How to run a SQL script at the command line.
Function
show functions
List the stored functions in the current database.
show function source
User
current user
list users
create user
switch user
drop user
set password
grant
grant all
revoke
Query Tuning
Python
Ruby
Admin
admin user
server process
start server
stop server
config file
reload config file
create database
How to create a database.
postgresql:
The user must have the CREATEDB privilege. When creating the database from the command line using createdb, the PostgreSQL user can be specified using the -U option.
drop database
How to drop a database.
backup database
restore database
Writing SELECT queries for open-source databases.
The reader is assumed to have written SELECT queries with FROM, WHERE, GROUP BY, HAVING, and ORDER BY clauses.
When we say that something is standard, we mean it conforms to the most recent SQL standard.
When we say that something is portable, we mean works on PostgreSQL, MySQL, and SQLite.
SQL
Types
A list of portable types:
- BOOLEAN
- INTEGER or INT
- REAL
- DOUBLE PRECISION
- NUMERIC(total digits, fractional digits)
- NUMERIC(total digits)
- CHARACTER(len) or CHAR(len)
- CHARACTER VARYING(len) or VARCHAR(len)
- TIMESTAMP
- DATE
- TIME
Note that NUMERIC(len) defines an integer type.
mysql:
MySQL maps BOOLEAN to TINYINT(1); REAL and DOUBLE PRECISION to DOUBLE; NUMERIC to DECIMAL.
Casts
This is the standard and portable way to cast:
SELECT cast('7' AS INTEGER) + 3;
The standard calls for implicit casts between numeric types.
The standard also calls for implicit casts between character types. In particular, character types can be concatenated, and the length of the concatenation type is the sum of the length of the argument types.
postgresql:
Other ways to cast:
> SELECT '7'::INTEGER + 3;
> SELECT INTEGER '7' + 3;
The type of string operations is TEXT, which is a character type of unlimited length.
It is an error to attempt to insert a string that is too long into a column with fixed or maximum length.
mysql:
When concatenating character types, the length of the type of the concatenation is the sum of the length of the type of the arguments.
MySQL silently truncates strings that are too long on insert.
sqlite:
SQLite does not enforce character type length limits.
Literals
The standard reserves these keywords: NULL, TRUE, and FALSE.
Numeric literals work like you would expect.
SQL strings are admirably simple. Single quote delimited, double the single quote to escape, double pipe for concatenation.
postgresql:
This code results in a type mismatch error:
SELECT TRUE = 1;
Converting a string containing a hex digit to an integer:
> select x'3bb'::int;
int4
------
955
The chr() function takes an integer representing a Unicode point as an argument:
> SELECT 'one' || chr(10) || 'two' || chr(10) || 'three';
?column?
----------
one +
two +
three
> SELECT chr(x'3bb'::int);
chr
-----
λ
There is syntax for strings with C-style backslash escapes:
select E'one\ntwo\nthree';
mysql:
TRUE and FALSE are synonyms for 1 and 0.
The || operator is used for logical disjunction. Use the concat() function for string concatenation.
SELECT concat('one', char(10), 'two');
+--------------------------------+
| concat('one', char(10), 'two') |
+--------------------------------+
| one
two |
+--------------------------------+
sqlite:
SQLite does not have TRUE and FALSE literals. Use 1 and 0 instead.
Strings can be single quote or double quote delimited.
Dates
The standard provides the keywords CURRENT_TIME, CURRENT_DATE, and CURRENT_TIMESTAMP for observing the clock.
There are no date literals; strings are used instead. Inserting a string into a TIME, DATE, or TIMESTAMP column will work if ISO 8601 format is used.
> CREATE TABLE foo (a DATE, b TIME, c TIMESTAMP);
> INSERT INTO foo VALUES ('2012-10-19', '18:00:00', '2012-10-19 18:00:00');
This works in both MySQL and PostgreSQL:
> SELECT date('2012-10-19'), time('18:00:00'), timestamp('2012-10-19 18:00:00');
EXTRACT, TIMESTAMP from DATE and TIME, STRFTIME, STRPTIME
INTERVAL type and date arithmetic
- TIMESTAMP - TIMESTAMP
- TIMESTAMP + INTERVAL and INTERVAL + TIMESTAMP
- INTERVAL + INTERVAL and INTERVAL - INTERVAL
- INTERVAL * NUMERIC and NUMERIC * INTERVAL
- INTERVAL / NUMERIC
mysql:
MySQL does not have an INTERVAL type. Subtracting two TIMESTAMPs yields a NUMERIC(20, 0) and subtracting two DATEs yields a NUMERIC(11, 0).
Identifiers
According to the standard, identifiers with unusual characters should be double quoted. A literal double quote is represented by two double quotes.
mysql:
MySQL uses backticks `` instead of double quotes "" to quote identifiers.
Operators
- AND OR NOT
- < > <= >= = != (<>)
- BETWEEN IN
- + - * / % ^
- & | # ~ << >>
- || LIKE (ESCAPE)
-- select rows where foo.x ends with percent sign
--
SELECT *
FROM foo
WHERE x LIKE '%\%' ESCAPE '\';
check mysql and sqlite
Functions
how to get a list of functions
MySQL 5.5 Functions and Operators
MySQL 5.5 Function Index
SQLite Core Functions
SQLite Aggregate Functions
SQLite Date and Time Functions
DISTINCT
Some of the places DISTINCT can be used:
- SELECT DISTINCT expr ...
- SELECT count(DISTINCT expr)
- UNION DISTINCT
- INTERSECT DISTINCT
- EXCEPT DISTINCT
UNION ALL, INTERSECT ALL, and EXCEPT ALL can be used to indicate multiset operations. UNION DISTINCT, INTERSECT DISTINCT, and EXCEPT DISTINCT indicate set operations. Since this is the default the use of DISTINCT is superfluous.
Qualified *
-- Example of a qualified *: only
-- return rows from foo:
--
SELECT foo.*
FROM foo
JOIN bar
ON foo.x = bar.x
Regular Expressions
SIMILAR TO Postgres
Sequences
Here is the SQL standard syntax for external sequences:
CREATE SEQUENCE foo;
CREATE SEQUENCE bar START WITH 1000 INCREMENT BY 10 MAXVALUE 2000;
SELECT NEXT VALUE FOR foo;
ALTER SEQUENCE foo RESTART WITH 10;
Here is the SQL standard syntax for internal sequences. None of the open source databases support this syntax, however.
CREATE TABLE foo (
foo_id INTEGER GENERATED ALWAYS AS IDENTITY (
START WITH 1
INCREMENT BY 1
MAXVALUE 10000)
)
postgresql:
PostgreSQL lacks SQL standard syntax for reading external sequences:
CREATE SEQUENCE foo;
CREATE SEQUENCE bar START WITH 1000 INCREMENT BY 10 MAXVALUE 2000;
SELECT nextval('foo');
ALTER SEQUENCE foo RESTART WITH 10;
SELECT setval('foo', 10);
The keywords WITH and BY are optional.
How to create an internal sequence:
CREATE TABLE foo
foo_id SERIAL,
foo_desc TEXT
);
mysql:
MySQL does not have external sequences.
sqlite:
SQLite does not have external sequences.
CREATE TABLE foo (
foo_id INTEGER PRIMARY KEY AUTOINCREMENT,
foo_desc TEXT
);
GROUP BY
GROUP BY, HAVING, and ORDER BY clauses can refer to SELECT list items by ordinal number. I don't think this is in the standard, but the feature is in PostgreSQL, MySQL, and SQLite.
Aggregation Functions
postgresql | mysql | sqlite | |
---|---|---|---|
count, sum, min, max, avg | count, sum, min, max, avg | count, sum, min, max, avg | |
count(distinct *) count(distinct expr) sum(distinct expr) |
count(distinct *) count(distinct expr) sum(distinct expr) |
count(distinct *) count(distinct expr) sum(distinct expr) |
|
bit_and bit_or |
bit_and bit_or |
||
bool_and bool_or |
|||
string_agg(expr, delim) | group_concat(expr) group_concat(expr separator delim) |
||
array_agg | |||
stddev_samp stddev_pop var_samp var_pop |
stddev_samp stddev_pop var_samp var_pop |
||
cor(X, Y) cov_samp(X, Y) cor_pop(X, Y) regr_intercept(X, Y) regr_slope(X, Y) |
Window Functions
limits on use
- can they be used with group by
- window functions in WHERE
- different window functions in SELECT
usefulness
- pct of total
- pct of category
- cumsum
- rank
> SELECT state, fruit, avg(x) FROM produce GROUP BY state;
ERROR: column "produce.fruit" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: SELECT state, fruit, avg(x) FROM produce GROUP BY state;
^
> SELECT state, fruit, avg(x) OVER (PARTITION BY state) FROM produce;
state | fruit | avg
------------+------------+------------------------
arizona | banana | 1.00000000000000000000
california | orange | 8.6000000000000000
california | banana | 8.6000000000000000
california | apple | 8.6000000000000000
california | banana | 8.6000000000000000
california | orange | 8.6000000000000000
nevada | banana | 6.3333333333333333
nevada | apple | 6.3333333333333333
nevada | orange | 6.3333333333333333
oregon | grapefruit | 2.5000000000000000
oregon | grapefruit | 2.5000000000000000
washington | grapefruit | 2.5000000000000000
washington | apple | 2.5000000000000000
WHERE Clause Subqueries
Subqueries can be used in a WHERE clause with EXISTS, IN, and the comparison operators: = < > <= >= != (<>).
JOIN
The following two queries are equivalent.
SELECT *
FROM a, b
WHERE a.x = b.x
AND a.y > 0;
SELECT *
FROM a
JOIN b
ON a.x = b.x
WHERE a.y > 0;
The latter form is perhaps preferred. The latter separates the join condition from the expression, keeping the expression simpler. Each JOIN clause must have an ON clause, reducing the chance of writing a Cartesian join by accident.
To perform an outer join—LEFT, RIGHT, or FULL—one must use a JOIN clause.
JOINs can be used to replace (NOT) EXISTS with a subquery:
SELECT *
FROM customers c
WHERE NOT EXISTS (
SELECT customer_id
FROM orders o
WHERE c.id = o.customer_id
);
SELECT c.*
FROM customers c
LEFT JOIN orders o
ON c.id = o.customer_id
WHERE o.customer_id is NULL;
FROM Clause Subqueries
Subqueries inside parens can appear in FROM and JOIN clauses. They must be given an alias.
AS
Select list items and tables in FROM and JOIN clauses can be given an alias using AS. If the aliased item is a table or column its previous name is hidden. Use of the AS keyword is optional and can be omitted.
LIMIT and OFFSET
The standard is:
OFFSET start { ROW | ROWS }
FETCH { FIRST | NEXT } [ count ] { ROW | ROWS } ONLY
Of these databases, only PostgreSQL provides a mechanism for selecting the row number, and it requires using a window function: row_number() OVER ().
WITH
CREATE TABLE directed_edge (
start_node TEXT,
end_node TEXT
);
INSERT INTO directed_edge VALUES ( 'a', 'b');
INSERT INTO directed_edge VALUES ( 'b', 'c');
INSERT INTO directed_edge VALUES ( 'c', 'd');
INSERT INTO directed_edge VALUES ( 'x', 'y');
WITH RECURSIVE directed_path(start_node, end_node) AS (
SELECT start_node, end_node
FROM directed_edge
UNION
SELECT dp.start_node, de.end_node
FROM directed_path AS dp
JOIN directed_edge de
ON dp.end_node = de.start_node
)
SELECT *
FROM directed_path;
start_node | end_node
------------+----------
a | b
b | c
c | d
x | y
a | c
b | d
a | d
NULL
NULL Handling in SQLite Versus Other Database Engines
- NULL propagates in arithmetic: NULL + 0 is NULL, NULL * 0 is NULL.
- NULLs distinct in UNIQUE
- NULLs not distinct according to DISTINCT
- NULL is FALSE in CASE: "CASE WHEN null THEN 1 ELSE 0 END"
- THREE VALUE LOGIC: NULL OR TRUE is TRUE, NULL AND FALSE is FALSE.
-- return 'bar' if foo is NULL
coalesce(foo, 'bar')
-- return NULL if foo is 'bar'
nullif(foo, 'bar')
Sets and Multisets
SELECT 'foo', 3 UNION SELECT 'bar', 7;
SELECT 'foo', 3 INTERSECT SELECT 'bar', 7;
SELECT 'foo', 3 EXCEPT SELECT 'bar', 7;
ALL and DISTINCT can be used after UNION, INTERSECT, and EXCEPT to indicate multiset or set operations. Set operations (i.e. DISTINCT) are the default.
Session Objects
Scripts
Temporary tables and variables.
Reflection
The standard calls for a schema called INFORMATION_SCHEMA. The starting point for learning about a database is:
SELECT * FROM INFORMATION_SCHEMA.TABLES;
The standard also provides these:
> SELECT CURRENT_USER;
> SELECT CURRENT_ROLE;
> SELECT CURRENT_SCHEMA;
> SELECT CURRENT_CATALOG;
sqlite:
SQLite does not have INFORMATION_SCHEMA. Use the .schema command to get a list of tables and their DDL.
Sargable Expressions
Sargable (en.wikipedia.org)
Transactions
Idempotent SQL
Idempotent DDL scripts are desirable. CREATE TABLE statements fail if the table already exists. Both PostgreSQL and MySQL support DROP TABLE foo IF EXISTS; which is not part of the standard.
MERGE (MySQL REPLACE)
TEMP tables and WITH.
Query information_schema. This requires a language which can branch.