Posted by
ads' corner
on
Saturday, 2020-12-12 Posted in [Sql]
Your career as a hacker brings you more and more unreasonable tasks. Today you arrive at the ferry station, head to your next gate (no one mentioned if you even got an ice cream), and figure out where to seat. Even though you are the first person in the waiting area that does not stop you from knowing the seating habits for all other passengers, and calculate their seating patterns.
Task 1: There are some seats, and some floor tiles. Every seat is empty. Seating follows some rules: 1) if a seat is empty, and no adjacent seat is occupied, someone will absolutely sit there 2) if a seat is currently occupied, but four or more adjacent seats are also occupied, the person will instantly jump up, disconnect all charging devices, and find a new seat 3) no one sits on the floor 4) this will not occupy all seats - which seems to be a good thing in Corona-19 times. Repeat that pattern until the seats do not change between two repeats. Then count the number of occupied seats.
Task 2: Adjacent seats spawn across floor tiles, so it can be a couple steps in each of the 8 possible directions. Looks like people don’t like to look at other people here. Oh, and now it’s 5 or more seats which must be occupied, before someone jumps up and changes seat. Again repeat the pattern of people getting up and finding new seats all one by one, until everyone is satisfied with their current seats. And then hope no additional passenger arrives!
My first thought was that this is one of the tasks which should not be solved in a database. It also reminds me of Game of Life. Nevertheless, here we are, at the ferry station, killing time with predicting seating habits instead of getting another ice cream.
I looked at this a bit, and decided to solve this with a couple of stored procedures, to follow a pattern. Turns out that it solves the problem, but is ridiculous slow. Maybe don’t repeat this approach. Also along the way I am storing intermediate results in tables - maybe I should have spent some time figuring out a way to store everything in one blob.
Preparations
As usual, I store the data in a table. It also turns out that I have to say good bye to the SERIAL I was using, and have to use “identity” instead. Later on I’m cloning the table, and LIKE can copy identity, but not a SERIAL definition.
I also need three stored procedures:
One to access each seat by specifying the x and y coordinate, and optional the table name
One to return the number of rows (y direction), optional parameter is the table name
One to return the number of characters (x direction), optional parameter is the table name
There are 99 rows with seats. Quite a large waiting area for a small ferry.
1
2
3
4
5
SELECT COUNT(*) FROM day11;
count
-------
99
(1row)
The stored procedure to access a specific seat:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
CREATEORREPLACEFUNCTION day11_seat (IN x_in INT, IN y_in INT, IN table_in TEXTDEFAULT'day11')
RETURNSTEXTAS$$DECLARE
r RECORD;
BEGINEXECUTE'SELECT SUBSTRING(value FROM $1 FOR 1) AS value
FROM '|| quote_ident(table_in) ||'
WHERE id = $2'USING x_in, y_in
INTO r;
RETURN r.value;
END;
$$LANGUAGEplpgsqlIMMUTABLE;
The function to return the number of columns:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
CREATEORREPLACEFUNCTION day11_max_x (IN table_in TEXTDEFAULT'day11')
RETURNSINTAS$$DECLARE
r RECORD;
BEGINEXECUTE'SELECT CAST(LENGTH(value) AS INT) AS length
FROM '|| quote_ident(table_in) ||'
WHERE id = 1'INTO r;
RETURN r.length;
END;
$$LANGUAGEplpgsqlIMMUTABLE;
And the number of rows:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
CREATEORREPLACEFUNCTION day11_max_y (IN table_in TEXTDEFAULT'day11')
RETURNSINTAS$$DECLARE
r RECORD;
BEGINEXECUTE'SELECT CAST(COUNT(*) AS INT) AS count
FROM '|| quote_ident(table_in) ||''INTO r;
RETURN r.count;
END;
$$LANGUAGEplpgsqlIMMUTABLE;
All of this is not elegant, but gives me a chance to look at it from a programmatic point of view.
Task 1
First of all I need to know which seats are adjacent to any given seat I’m looking for. This is solved by another stored procedure. As input I specify x and y of the current seat I’m looking at, and optional the table name. Inside this function I need to know the min and max boundaries of the field. Min is always 1 in both x and y direction, max is calculated by day11_max_x() and day11_max_y(). Then I go from -1 to +1, and add this in both x and y direction to my current seat position. This results in 9 different fields, and each field is checked if it is the current seat (not checked), and if it is inside the boundaries. Every valid seat position is returned, whereas the function does return the adjacent seats as a table.
CREATEORREPLACEFUNCTION day11_adjacent_seats_task1 (IN x_in INT, IN y_in INT, OUT x_out INT, OUT y_out INT, IN table_in TEXTDEFAULT'day11')
RETURNSSETOFRECORDAS$$DECLARE
max_x INT;
max_y INT;
BEGINSELECT day11_max_x() INTO max_x;
SELECT day11_max_y() INTO max_y;
-- make sure we have a couple rows and columns
IF max_x <2THENRAISEEXCEPTION'max_y is invalid: %', max_x;
ENDIF;
IF max_y <2THENRAISEEXCEPTION'max_y is invalid: %', max_y;
ENDIF;
-- make sure the input stays inside the boundaries
IF x_in <1OR x_in > max_x THENRAISEEXCEPTION'x_in is invalid: %', x_in;
ENDIF;
IF y_in <1OR y_in > max_y THENRAISEEXCEPTION'y_in is invalid: %', y_in;
ENDIF;
-- loop over from -1 to 1, and calculate all seats
FOR loop_x IN-1..1LOOPFOR loop_y IN-1..1LOOP-- do not calculate the same seat as the input seat
CONTINUEWHEN loop_x =0AND loop_y =0;
x_out := x_in + loop_x;
y_out := y_in + loop_y;
IF x_out >0AND x_out <= max_x AND y_out >0AND y_out <= max_y THENRETURNNEXT;
ENDIF;
ENDLOOP;
ENDLOOP;
END;
$$LANGUAGEplpgsqlIMMUTABLE;
The next function is looping over all seats, and applying the rules on each seat. For this the function has two parameters: the input table, and the output table. The output table is dropped (if it exists), and re-created, based on the input table. Then the function loops over y first (to go line by line), and then loops over x in each line (to go over every character). The day11_seat() returns the current seat from the input table, and then the loop over the result of day11_adjacent_seats_task1() applies the rules on each seat. At the end of each line, the result is written as a new line into the output table.
-- floor: .
-- empty: L
-- occupied: #
CREATEORREPLACEFUNCTION day11_calculate_seats_task1 (IN table_in TEXT, IN table_out TEXT)
RETURNSVOIDAS$$DECLARE
max_x INT;
max_y INT;
lineTEXT;
current_seat TEXT;
new_seat TEXT;
adjacent_seat TEXT;
occupied_seats INT;
adjacent_seats RECORD;
BEGINSELECT day11_max_x(table_in) INTO max_x;
SELECT day11_max_y(table_in) INTO max_y;
-- make sure we have a couple rows and columns
IF max_x <2THENRAISEEXCEPTION'max_y is invalid: %', max_x;
ENDIF;
IF max_y <2THENRAISEEXCEPTION'max_y is invalid: %', max_y;
ENDIF;
-- drop old output table if exists
EXECUTE'DROP TABLE IF EXISTS '|| quote_ident(table_out);
EXECUTE'CREATE TABLE '|| quote_ident(table_out) ||' (LIKE '|| quote_ident(table_in) ||' INCLUDING ALL)';
FOR loop_y IN1.. max_y LOOPline:='';
FOR loop_x IN1.. max_x LOOP
current_seat := day11_seat(loop_x, loop_y, table_in);
IF current_seat ='.'THENline:=line|| current_seat;
ELSE
occupied_seats :=0;
FOR adjacent_seats INSELECT*FROM day11_adjacent_seats_task1(loop_x, loop_y, table_in) LOOPSELECT day11_seat(adjacent_seats.x_out, adjacent_seats.y_out, table_in) INTO adjacent_seat;
IF adjacent_seat ='#'THEN
occupied_seats := occupied_seats +1;
ENDIF;
ENDLOOP;
-- here I know how many adjacent seats are occupied
-- If a seat is empty (L) and there are no occupied seats adjacent to it, the seat becomes occupied.
-- If a seat is occupied (#) and four or more seats adjacent to it are also occupied, the seat becomes empty.
-- Otherwise, the seat's state does not change.
-- Floor (.) never changes; seats don't move, and nobody sits on the floor.
IF current_seat ='L'AND occupied_seats =0THEN
new_seat :='#';
ELSIF current_seat ='#'AND occupied_seats >=4THEN
new_seat :='L';
ELSE
new_seat := current_seat;
ENDIF;
line:=line|| new_seat;
ENDIF;
ENDLOOP;
EXECUTE'INSERT INTO '|| quote_ident(table_out) ||' (value) VALUES ($1)'USINGline;
ENDLOOP;
RETURN;
END;
$$LANGUAGEplpgsqlVOLATILE;
Now that I have a function which can run one re-seating, I need to do that until nothing changes. The next function runs the previous function in a loop (with some safeguard built in), creates new tables and table names, and compares the content of the two tables afterwards. If they are the same, the loop if aborted, and the number of occupied seats is counted.
CREATEORREPLACEFUNCTION day11_flip_seats_task1 (IN table_in TEXT)
RETURNSINTEGERAS$$DECLARE
max_x INT;
max_y INT;
num_diffs RECORD;
table_source TEXT;
table_out TEXT;
table_num INTDEFAULT0;
occupied_seats INTDEFAULT0;
current_seat TEXT;
BEGINSELECT day11_max_x(table_in) INTO max_x;
SELECT day11_max_y(table_in) INTO max_y;
-- make sure we have a couple rows and columns
IF max_x <2THENRAISEEXCEPTION'max_y is invalid: %', max_x;
ENDIF;
IF max_y <2THENRAISEEXCEPTION'max_y is invalid: %', max_y;
ENDIF;
LOOP
table_num := table_num +1;
RAISENOTICE'loop number: %', table_num;
table_out := table_in ||'_run'|| table_num;
IF table_num =1THEN
table_source := table_in;
ELSE
table_source := table_in ||'_run'|| (table_num -1);
ENDIF;
-- another run
PERFORM day11_calculate_seats_task1(table_source, table_out);
-- compare both tables
EXECUTE'SELECT COUNT(*) AS count
FROM '|| quote_ident(table_source) ||' d1,
'|| quote_ident(table_out) ||' d2
WHERE d1.id = d2.id
AND d1.value != d2.value'INTO num_diffs;
IF num_diffs.count =0THENRAISENOTICE'found same pattern';
EXIT;
ENDIF;
IF table_num >200THENRAISEEXCEPTION'too many loops, abort';
EXIT;
ENDIF;
ENDLOOP;
-- the last (and previous last) table have the number of seats
FOR loop_y IN1.. max_y LOOPFOR loop_x IN1.. max_x LOOP
current_seat := day11_seat(loop_x, loop_y, table_out);
IF current_seat ='#'THEN
occupied_seats := occupied_seats +1;
ENDIF;
ENDLOOP;
ENDLOOP;
RETURN occupied_seats;
END;
$$LANGUAGEplpgsqlVOLATILE;
When I run this function:
1
2
3
4
5
6
7
8
9
10
11
12
\timing ONSELECT day11_flip_seats_task1('day11');
psql:11-1.sql:404: NOTICE: loop number: 94
psql:11-1.sql:404: NOTICE: loop number: 95
psql:11-1.sql:404: NOTICE: found same pattern
day11_flip_seats_task1
------------------------
2424
(1row)
Time: 248244,227 ms (04:08,244)
Slow, really slow. But I have time, because no other passenger arrived so far. And I did not have ice cream. And the result is correct.
Task 2
Most of the functionality for this task is already in place. I need to modify day11_adjacent_seats_task1() into day11_adjacent_seats_task2(), and change the way the adjacent seats are found. No longer looking for the neighbours, but looping into the specific direction until either a seat is found, or the boundary of teh seating area is left. For this I need to lookup the actual type at the current position, which involves an additional call to day11_seat() each time. Did I say that this approach is slow?
CREATEORREPLACEFUNCTION day11_adjacent_seats_task2 (IN x_in INT, IN y_in INT, OUT x_out INT, OUT y_out INT, IN table_in TEXTDEFAULT'day11')
RETURNSSETOFRECORDAS$$DECLARE
max_x INT;
max_y INT;
num_steps INT;
adjacent_seat TEXT;
BEGINSELECT day11_max_x() INTO max_x;
SELECT day11_max_y() INTO max_y;
-- make sure we have a couple rows and columns
IF max_x <2THENRAISEEXCEPTION'max_y is invalid: %', max_x;
ENDIF;
IF max_y <2THENRAISEEXCEPTION'max_y is invalid: %', max_y;
ENDIF;
-- make sure the input stays inside the boundaries
IF x_in <1OR x_in > max_x THENRAISEEXCEPTION'x_in is invalid: %', x_in;
ENDIF;
IF y_in <1OR y_in > max_y THENRAISEEXCEPTION'y_in is invalid: %', y_in;
ENDIF;
-- loop over from -1 to 1, and calculate all seats
FOR loop_x IN-1..1LOOPFOR loop_y IN-1..1LOOP-- do not calculate the same seat as the input seat
CONTINUEWHEN loop_x =0AND loop_y =0;
-- problem is two-fold
-- * go into each direction until a non-floor field is found
-- * but abort if the step goes outside the field
num_steps :=0;
LOOP
num_steps := num_steps +1;
x_out := x_in + (loop_x * num_steps);
y_out := y_in + (loop_y * num_steps);
-- check if still inside the boundaries
IF x_out <=0OR x_out > max_x OR y_out <=0OR y_out > max_y THENEXIT;
ENDIF;
IF x_out >0AND x_out <= max_x AND y_out >0AND y_out <= max_y THEN-- get the seat plan
SELECT day11_seat(x_out, y_out, table_in) INTO adjacent_seat;
IF adjacent_seat !='.'THENRETURNNEXT;
EXIT;
ENDIF;
ENDIF;
ENDLOOP;
ENDLOOP;
ENDLOOP;
END;
$$LANGUAGEplpgsqlIMMUTABLE;
Then day11_calculate_seats_task2() is a copy of day11_calculate_seats_task1(), using the new day11_adjacent_seats_task2() function, and applying the 5 seats rule instead of the 4 seats rule.
-- floor: .
-- empty: L
-- occupied: #
CREATEORREPLACEFUNCTION day11_calculate_seats_task2 (IN table_in TEXT, IN table_out TEXT)
RETURNSVOIDAS$$DECLARE
max_x INT;
max_y INT;
lineTEXT;
current_seat TEXT;
new_seat TEXT;
adjacent_seat TEXT;
occupied_seats INT;
adjacent_seats RECORD;
BEGINSELECT day11_max_x(table_in) INTO max_x;
SELECT day11_max_y(table_in) INTO max_y;
-- make sure we have a couple rows and columns
IF max_x <2THENRAISEEXCEPTION'max_y is invalid: %', max_x;
ENDIF;
IF max_y <2THENRAISEEXCEPTION'max_y is invalid: %', max_y;
ENDIF;
-- drop old output table if exists
EXECUTE'DROP TABLE IF EXISTS '|| quote_ident(table_out);
EXECUTE'CREATE TABLE '|| quote_ident(table_out) ||' (LIKE '|| quote_ident(table_in) ||' INCLUDING ALL)';
FOR loop_y IN1.. max_y LOOPline:='';
FOR loop_x IN1.. max_x LOOP
current_seat := day11_seat(loop_x, loop_y, table_in);
IF current_seat ='.'THENline:=line|| current_seat;
ELSE
occupied_seats :=0;
FOR adjacent_seats INSELECT*FROM day11_adjacent_seats_task2(loop_x, loop_y, table_in) LOOPSELECT day11_seat(adjacent_seats.x_out, adjacent_seats.y_out, table_in) INTO adjacent_seat;
IF adjacent_seat ='#'THEN
occupied_seats := occupied_seats +1;
ENDIF;
ENDLOOP;
-- here I know how many adjacent seats are occupied
-- If a seat is empty (L) and there are no occupied seats adjacent to it, the seat becomes occupied.
-- If a seat is occupied (#) and four or more seats adjacent to it are also occupied, the seat becomes empty.
-- new rule: it now takes five or more visible occupied seats for an occupied seat to become empty
-- Otherwise, the seat's state does not change.
-- Floor (.) never changes; seats don't move, and nobody sits on the floor.
IF current_seat ='L'AND occupied_seats =0THEN
new_seat :='#';
ELSIF current_seat ='#'AND occupied_seats >=5THEN
new_seat :='L';
ELSE
new_seat := current_seat;
ENDIF;
line:=line|| new_seat;
ENDIF;
ENDLOOP;
EXECUTE'INSERT INTO '|| quote_ident(table_out) ||' (value) VALUES ($1)'USINGline;
ENDLOOP;
RETURN;
END;
$$LANGUAGEplpgsqlVOLATILE;
And finally run the seat flips until the pattern is stable:
CREATEORREPLACEFUNCTION day11_flip_seats_task2 (IN table_in TEXT)
RETURNSINTEGERAS$$DECLARE
max_x INT;
max_y INT;
num_diffs RECORD;
table_source TEXT;
table_out TEXT;
table_num INTDEFAULT0;
occupied_seats INTDEFAULT0;
current_seat TEXT;
BEGINSELECT day11_max_x(table_in) INTO max_x;
SELECT day11_max_y(table_in) INTO max_y;
-- make sure we have a couple rows and columns
IF max_x <2THENRAISEEXCEPTION'max_y is invalid: %', max_x;
ENDIF;
IF max_y <2THENRAISEEXCEPTION'max_y is invalid: %', max_y;
ENDIF;
LOOP
table_num := table_num +1;
RAISENOTICE'loop number: %', table_num;
table_out := table_in ||'_run'|| table_num;
IF table_num =1THEN
table_source := table_in;
ELSE
table_source := table_in ||'_run'|| (table_num -1);
ENDIF;
-- another run
PERFORM day11_calculate_seats_task2(table_source, table_out);
-- compare both tables
EXECUTE'SELECT COUNT(*) AS count
FROM '|| quote_ident(table_source) ||' d1,
'|| quote_ident(table_out) ||' d2
WHERE d1.id = d2.id
AND d1.value != d2.value'INTO num_diffs;
IF num_diffs.count =0THENRAISENOTICE'found same pattern';
EXIT;
ENDIF;
IF table_num >200THENRAISEEXCEPTION'too many loops, abort';
EXIT;
ENDIF;
ENDLOOP;
-- the last (and previous last) table have the number of seats
FOR loop_y IN1.. max_y LOOPFOR loop_x IN1.. max_x LOOP
current_seat := day11_seat(loop_x, loop_y, table_out);
IF current_seat ='#'THEN
occupied_seats := occupied_seats +1;
ENDIF;
ENDLOOP;
ENDLOOP;
RETURN occupied_seats;
END;
$$LANGUAGEplpgsqlVOLATILE;
Running this takes even more time:
1
2
3
4
5
6
7
8
9
10
11
12
\timing ONSELECT day11_flip_seats_task2('day11');
psql:11-2.sql:257: NOTICE: loop number: 88
psql:11-2.sql:257: NOTICE: loop number: 89
psql:11-2.sql:257: NOTICE: found same pattern
day11_flip_seats_task2
------------------------
2208
(1row)
Time: 450791,399 ms (07:30,791)
Now that everyone is seated, the gate agent calls for boarding the ferry.
Cleanup
The above approach left behind a couple of intermediate tables. I could have made them temporary tables, but in the beginning I needed them to figure out if my calculation was wrong. Now it’s time to get rid of them:
1
2
3
4
5
6
7
DOLANGUAGEplpgsql$$BEGINFOR t IN1..200LOOPEXECUTE'DROP TABLE IF EXISTS day11_run'||CAST(t ASTEXT);
ENDLOOP;
END$$;