-
Notifications
You must be signed in to change notification settings - Fork 1
/
Copy pathsql.rmd
615 lines (489 loc) · 12.8 KB
/
sql.rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
---
title: "SQL cheatsheet"
author: Emil Rehnberg
bibliography: refs.bib
csl: shiki.csl
output:
pdf_document:
highlight: zenburn
html_document:
toc_float: TRUE
css: styles.css
---
```{r set-options, echo=FALSE, cache=FALSE}
options(width = 200)
```
```{r echo=FALSE, eval=FALSE}
library(rmarkdown); library(shiny)
rmdFilePath <- "sql.rmd"
rmarkdown::render(rmdFilePath, output_format="html_document") # "all_document"
```
```{r echo=FALSE, message=FALSE}
set.seed(308)
library(magrittr)
```
## 目的
Cheatsheet for SQL. It's a place to dump typical annotated SQL-code.
### General / Tricks
Sections for ANSI, Transact-SQL, Postgresql, MySQL
### ANSI { .tabset .tabset-fade .tabset-pills }
#### Overview
Standard SQL and "common" SQL (e.g. window functions)
#### Rollbacks
- use `BEGIN TRAN(SACTION)` to open a transaction
- `BEGIN TRAN T1` names the transaction `T1`
- use-case: applying nested/multiple transactions
- using `WITH MARK [descr]` with named transaction makes the (named) log restorable
```sql
BEGIN TRAN T1;
UPDATE table1 ...;
BEGIN TRAN M2 WITH MARK;
UPDATE table2 ...;
COMMIT TRAN M2;
UPDATE table3 ...;
COMMIT TRAN T1;
```
#### Find first missing key/number
Main idea: use <code>NOT EXISTS</code> to <i>catch</i> when <i>key + 1 does not exist</i> ( and `NOT EXISTS` catches this)
```sql
SELECT
CASE
WHEN NOT EXISTS(
SELECT *
FROM dbo.T1
WHERE key = 1)
THEN 1
ELSE (...subquery for min NA...)
END AS ...
...
```
Returns 1 if 1 doesn't exist, otherwise, it returns the subquery result. The subquery returns the minimum missing value >1
```sql
SELECT MIN(A.key) + 1 as missing
FROM dbo.T1 AS A
WHERE NOT EXISTS(
SELECT *
FROM dbo.T1 AS B
WHERE B.key = A.key + 1)
```
The `NOT EXISTS` predicate returns `TRUE` only for values in `T1` that are right before a gap.
A nice solution using `OUTER JOIN`.
- Identify `NULL`s in outer rows for smallest missing key (keys > 1)
```sql
SELECT MIN(A.keycol) + 1
FROM dbo.T1 AS A
LEFT OUTER JOIN dbo.T1 AS B
ON B.keycol = A.keycol + 1
WHERE B.keycol IS NULL;
```
To put your knowledge of the `EXISTS` predicate into action, try to solve the following problem. Notice that keycol must be positive
```sql
USE tempdb;
GO
IF OBJECT_ID('dbo.T1') IS NOT NULL
DROP TABLE dbo.T1;
GO
CREATE TABLE dbo.T1
(
keycol INT ...,
datacol VARCHAR(10) ...
);
INSERT INTO dbo.T1(keycol, datacol)
VALUES
(3, 'a'),
(4, 'b'),
(6, 'c'),
(7, 'd');
```
Write a query that returns the lowest missing key (starting at 1). E.g. the table is currently populated with the keys 3, 4, 6, and 7. Query should return the value 1. If you insert two more rows, with the keys 1 and 2, your query should return 5.
Alternative solution (perhaps harder to read)
```sql
SELECT COALESCE(MIN(A.keycol) + 1, 1)
FROM dbo.T1 AS A
WHERE
NOT EXISTS(
SELECT * FROM dbo.T1 AS B
WHERE B.keycol= A.keycol + 1)
AND EXISTS(
SELECT * FROM dbo.T1
WHERE keycol = 1);
```
#### Removing duplicates with DELETE
use `ROW_NUMBER` over unique traits and `DELETE` where row number is over 1
```sql
WITH Dups AS (
SELECT orderid, custid, empid, orderdate,
ROW_NUMBER()
OVER(PARTITION BY orderid
ORDER BY (SELECT 0)) AS rn
FROM dbo.OrdersDups
)
DELETE FROM Dups
WHERE rn > 1;
```
calculate `ROW_NUMBER` and `RANK` over unique traits and delete where they differ
```sql
WITH Dups AS (
SELECT orderid, custid, empid, orderdate,
ROW_NUMBER()
OVER(PARTITION BY orderid
ORDER BY (SELECT 0)) AS rn,
RANK()
OVER(PARTITION BY orderid
ORDER BY (SELECT 0)) AS rnk
FROM dbo.OrdersDups
)
DELETE FROM Dups
WHERE rn <> rnk;
```
#### last order for each customer
Extract last order for each customer
standard SQL
```sql
SELECT id, customer_id, order_date, shipping_country
FROM orders o1
JOIN (
SELECT customer_id AS cid,
MAX(order_date) AS maxdate
FROM orders
GROUP BY customer_id
) o2
ON o1.customer_id = o2.cid
AND o1.order_date = o2.{{maxdate}};
```
#### Relational division
return orders that contain a basket with products 2, 3, and 4.
```sql
SELECT orderid
FROM (SELECT
orderid,
COUNT(CASE WHEN productid = 2 THEN productid END) AS P2,
COUNT(CASE WHEN productid = 3 THEN productid END) AS P3,
COUNT(CASE WHEN productid = 4 THEN productid END) AS P4
FROM dbo.OrderDetails
GROUP BY orderid) AS P
WHERE P2 = 1 AND P3 = 1 AND P4 = 1;
```
derived table returns `A` and `B` rows from
```
order_id P2 P3 P4
-------- ---- ---- --
A 1 1 1
B 1 1 1
C NULL 1 1
D NULL NULL 1
```
#### Logical and bit
for logicals expression in BIT to access tools yielding True/False
```sql
SELECT
CAST(CASE WHEN [lf].[InvoiceDimDateId] > 0 THEN 1
ELSE 0
END
AS BIT)
FROM dbo.Order
```
### Postgresql { .tabset .tabset-fade .tabset-pills }
#### Overview
open source dialect, boosting standard ANSI sql.
#### last order for each customer
Extract last order for each customer
standard SQL
```sql
SELECT DISTINCT ON(customer_id)
*
FROM orders
ORDER BY customer_id::remember last order,
order_date DESC;
```
```sql
id | customer_id | order_date | shipping_country
----+-------------+------------+------------------
3 | 2 | 2015-12-12 | NO
2 | 4 | 2015-10-05 | JA
(2 rows)
```
### T-SQL { .tabset .tabset-fade .tabset-pills }
#### Overview
MS sql dialect transact-sql for SQL server
#### Date to integer convertion
to convert a datetime or date column/value to a dateId in integer use the `CONVERT` function
```sql
CONVERT(VARCHAR(8), StartTime, 112)
```
#### Triggers
triggers work like callback::programming terms. The following prints 'INSERT detected' after an `INSERT`
```sql
CREATE TRIGGER trgCustsINSERT
ON Custs
AFTER INSERT
AS
PRINT 'INSERT detected.';
GO
```
#### Adding DATETIME and TIME
```sql
DECLARE @d DATETIME = '2013-02-18T18:34:40.330',
@t TIME(7) = '00:11:00.0000000';
SELECT DATEADD(
SECOND,
DATEDIFF(SECOND, 0, @t),
@d);
```
```sql
...
-----------------------
2013-02-18 18:45:40.330
```
#### partitioned views
two orthogonal classifications
- physical placement of the component tables
- local: all tables constituting a partitioned view are located in a single instance of SQL Server
- distributed: tables are located across two or more instances
- updatability
- updatable: has a single column constraint on each participating table to determine which table a row belongs to
- not updatable: -
#### Comparing Partitioned Views / Tables
partitioned table
parts must reside in the same DB
must have same constraints and primary keys
partitioned view
can be split up over servers and machines
can be slower and quicker than table defending on the systems
optimazation is typically more complex
more objects
#### create empty table
```sql
SELECT TOP (0) column_list
INTO table
FROM other_table
```
#### emptying a table
keep table columns, reset ID counting, no logging.
```sql
TRUNCATE TABLE tableName;
```
#### archive deleted records
archive deleted data with `DELETE OUTPUT (INTO)`
```sql
WHILE 1 = 1
BEGIN
DELETE TOP(5000) FROM orders
OUTPUT deleted.orderid, deleted.date
INTO archive(orderid, date)
WHERE date < '20060101';
IF @@rowcount < 5000 BREAK;
END
```
#### delete lots of rows in a table
if you want delete lots of rows in a table
- TRUNCATE TABLE not doing it
- want logging
- want to keep a few rows
- technique is useful when you want to archive deletions w DELETE OUTPUT
```sql
DELETE in batches
WHILE 1 = 1
BEGIN
DELETE TOP (5000)
FROM orders
WHERE orderdate < '20060101';
IF @@rowcount <= 5000 BREAK;
END
```
#### Apply clause
The `APPLY` operator involves one or two steps:
1. Apply Right Table Expression to Left Table Rows
2. Add Outer Rows
`CROSS APPLY` uses 1., `OUTER APPLY` uses both.
I think of `APPLY` as a way to loop over a set of rows (something which might be nice when you want to match on a set element-by-element)
query returning the two (highest order IDs) orders for each
customer:
```sql
SELECT C.customerid, C.city, A.orderid
FROM dbo.Customers AS C
CROSS APPLY
(SELECT TOP (2) O.orderid, O.customerid
FROM dbo.Orders AS O
WHERE O.customerid = C.customerid
ORDER BY orderid DESC) AS A;
```
This query generates the following output:
```sql
custid city orderid
------ ------ -------
FRNDO Madrid 2
FRNDO Madrid 1
KRLOS Madrid 5
KRLOS Madrid 4
MRPHS Zion 6
```
Notice that FISSA is missing from the output because the table expression A returned an empty set for it. If you also want to return customers who placed no orders, use `OUTER APPLY` as follows:
```sql
SELECT C.customerid, C.city, A.orderid
FROM dbo.Customers AS C
OUTER APPLY
(SELECT TOP (2) O.orderid, O.customerid
FROM dbo.Orders AS O
WHERE O.customerid = C.customerid
ORDER BY orderid DESC) AS A;
```
This query generates the following output:
```sql
custid city orderid
------ ------ -------
FISSA Madrid NULL
FRNDO Madrid 2
FRNDO Madrid 1
KRLOS Madrid 5
KRLOS Madrid 4
MRPHS Zion 6
```
#### Grouping ID
- helps with identifying grouped rows
- instead of matching on `NULL`s
- accepts a list of attributes and constructs an integer bitmap for each attribute
- bit is 0 grouping set member attributes, 1 otherwise
- the rightmost bit represents the rightmost input attribute
- `GROUPING_ID(a, b, c, d)` returns
- 0 (0×8 + 0×4 + 0×2 + 0×1) for grouping set (a, b, c, d)
- 1 (0×8 + 0×4 + 0×2 + 1×1) for grouping set (a, b, c)
- 2 (0×8 + 0×4 + 1×2 + 0×1) for grouping set (a, b, d)
- 3 (0×8 + 0×4 + 1×2 + 1×1) for grouping set (a, b)
- ...
```sql
SELECT
GROUPING_ID(
custid,
empid,
YEAR(date),
MONTH(date),
DAY(date) ) AS grp_id,
c_id, e_id,
YEAR(date) AS year,
MONTH(date) AS mnth,
DAY(date) AS day,
SUM(qty) AS qty
FROM dbo.Orders
GROUP BY
CUBE(c_id, e_id),
ROLLUP(YEAR(date), MONTH(date), DAY(date))
```
`grp_id` value 25 is grouping set `(year, mnth)`. I.e. the second and third bits (value 2,4).
```sql
grp_id c_id e_id year mnth day qty
------ ---- ---- ---- ---- ---- ---
0 C 3 2006 4 18 22
16 NULL 3 2006 4 18 22
0 A 3 2006 8 2 10
24 NULL NULL 2006 4 18 22
25 NULL NULL 2006 4 NULL 22
16 NULL 3 2006 8 2 10
24 NULL NULL 2006 8 2 10
25 NULL NULL 2006 8 NULL 10
0 D 3 2006 9 7 30
16 NULL 3 2006 9 7 30
...
```
#### INSERT EXEC stored procedure
```sql
CREATE PROC dbo.GetRows
@n AS INT = 10
AS
SELECT ...
...
CREATE PROC ...
@to_rownum AS INT ...
AS
...
INSERT
INTO #CachedPages
EXEC dbo.GetRows
@n = @to_rownum::set proc argument @n to @to_rownum;
...
```
#### PIVOT clause
- operator allowing you to rotate, or pivot, data between columns and rows, performing aggregations along the way.
- requires alias
- will otherwise return Incorrect syntax near ';'.
```sql
SELECT *
FROM (
SELECT
empid,
YEAR(orderdate) AS year,
val
FROM Sales.OrderValues)
PIVOT(SUM(val)
FOR year
IN([2006],[2007],[2008]) AS P);
```
query generates the following
```
empid 2006 2007 2008
----- -------- --------- --------
3 18223.96 108026.17 76562.75
6 16642.61 43126.38 14144.16
9 9894.52 26310.39 41103.17
7 15232.16 60471.19 48864.89
1 35764.52 93148.11 63195.02
4 49945.12 128809.81 54135.94
2 21757.06 70444.14 74336.56
5 18383.92 30716.48 19691.90
8 22240.12 56032.63 48589.54
```
e.g. query for the total value of orders handled by each employee for each order year. output to have a row for each employee, a column for each order year, and the total value in the intersection of each employee and year.
#### Relational division
return orders that contain a basket with products 2, 3, and 4.
using PIVOT:
```sql
SELECT orderid
FROM (
SELECT orderid, productid
FROM dbo.OrderDetails) AS D
PIVOT(COUNT(productid)
FOR productid
IN([2],[3],[4])) AS P
WHERE [2] = 1 AND [3] = 1 AND [4] = 1;
```
derived table returns `A` and `B` rows from
```
order_id P2 P3 P4
-------- ---- ---- --
A 1 1 1
B 1 1 1
C NULL 1 1
D NULL NULL 1
```
### SQLite { .tabset .tabset-fade .tabset-pills }
#### Overview
Simple embedded sql dialect. Usable in just about any embedded software!!
#### run code for db
from command line
```sql
sqlite3 ex2.db < ex2.sql
```
#### print schema from db
from command line
```sql
sqlite3 ex2.db .schema
```
```sql
CREATE TABLE person (
id INTEGER PRIMARY KEY,
first_name TEXT,
last_name TEXT,
age INTEGER
);
CREATE TABLE pet (
id INTEGER PRIMARY KEY,
name TEXT,
breed TEXT,
age INTEGER,
dead INTEGER
);
CREATE TABLE person_pet (
person_id INTEGER,
pet_id INTEGER
);
```