Advent of Code 2020: "Handy Haversacks" - Day 7

Posted by ads' corner on Monday, 2020-12-07
Posted in [Sql]

They really trust you a lot … Now all flights are delayed, that’s quite usual in advent times. Except it’s 2020 and no one is traveling anyway. But they also impose more regulations, and now bags need to have colours, and bags need to go into other bags. And someone has to figure it out. That’s right, it’s you! I hope the stopover is long enough to at least grab some ice cream!

Task 1: There is a huge list of rules which colores bag can hold how many other colored bags. You have a shiny golden bag, but apparently you are not allowed to carry it around, it has to go into another bag. Find out which ones are allowed, and how many of them. Oh, and bags can go into bags, which means: recursion.

Task 2: As usual someone made a mistake. You are, after all, allowed to carry your shiny golden bag, but now you also have to have other bags in it. Calculate the possibilities, based on the rules. Yes, recursion again, and tree climbing.

And if your head is swirling: so is mine. Why on North Pole Earth does anyone even remotely care about your shiny golden bag?

Preparation

Let’s start with data loading. The method of just loading the raw data into a single TEXT field becomes quite useful. So far all the processing can be done in the database.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
DROP TABLE IF EXISTS day7;

CREATE TABLE day7 (
    id      SERIAL NOT NULL PRIMARY KEY,
    value   TEXT
);

COPY day7 (value) FROM STDIN;
bright olive bags contain 5 dotted white bags, 2 wavy lavender bags.
plaid brown bags contain 3 bright lime bags, 5 plaid coral bags.
dim coral bags contain 1 shiny maroon bag, 2 bright orange bags, 3 clear lavender bags.
dull olive bags contain 5 wavy purple bags, 2 plaid white bags, 4 vibrant magenta bags, 1 light brown bag.
dim yellow bags contain 1 bright cyan bag, 3 striped green bags, 5 dim white bags.
...
\.

I haveĀ 594 rows in the table:

1
2
3
4
5
SELECT COUNT(*) FROM day7;
 count
-------
   594
(1 row)

This time this means it’s 594 rules. Why on earth are there 594 rules to follow for the luggage? From here it gets a bit more complicated. First I want to verify that all rules follow the same pattern:

1
2
3
4
5
6
SELECT (SELECT COUNT(*) FROM day7 WHERE value LIKE '% bags contain %')
     - (SELECT COUNT(*) FROM day7) AS diff;
 diff
------
    0
(1 row)

Good, at least all rules follow the bags contain pattern, and I don’t need to care about different formats. Then the format is color name of bag, followed by bags contain, followed by one or multiple rules. That’s easy to split with a regexp.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
WITH RECURSIVE _day7_split_data(split) AS (
    SELECT REGEXP_MATCHES(value, '^([\w ]+) bags contain (.*)')
      FROM day7
),
day7_split_data AS (
    SELECT split[1] AS c1,
           split[2] AS c2
      FROM _day7_split_data
)
SELECT * FROM day7_split_data;

This query uses two CTE, although I could have wrapped this into one little bit more complicated one. However this way it is more descriptive what the query is doing. The first part _day7_split_data just splits by left and right from bags contain. The second part day7_split_data (no underscore) selects the array fields into columns. This results in a data set which looks like:

1
2
3
4
5
6
7
8
9
         c1          |                                               c2
---------------------+-------------------------------------------------------------------------------------------------
 bright olive        | 5 dotted white bags, 2 wavy lavender bags.
 plaid brown         | 3 bright lime bags, 5 plaid coral bags.
 dim coral           | 1 shiny maroon bag, 2 bright orange bags, 3 clear lavender bags.
 dull olive          | 5 wavy purple bags, 2 plaid white bags, 4 vibrant magenta bags, 1 light brown bag.
 dim yellow          | 1 bright cyan bag, 3 striped green bags, 5 dim white bags.
 drab chartreuse     | 3 shiny lime bags, 2 bright indigo bags, 2 muted yellow bags, 5 dim tan bags.
 dim cyan            | 4 drab tomato bags, 3 dotted teal bags, 1 posh purple bag, 2 faded brown bags.

c1 is the bag colour, c2 is the rule(s). Let’s run another check:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
WITH RECURSIVE _day7_split_data(split) AS (
    SELECT REGEXP_MATCHES(value, '^([\w ]+) bags contain (.*)')
      FROM day7
),
day7_split_data AS (
    SELECT split[1] AS c1,
           split[2] AS c2
      FROM _day7_split_data
)
SELECT (SELECT COUNT(DISTINCT(c1)) FROM day7_split_data)
     - (SELECT COUNT(*) FROM day7) AS diff;

 diff
------
    0
(1 row)

The number of distinct bag colours is equal the number of rules. This means there rules do not contain multiple lines for a single colour.

Next step is breaking up the rule(s) into multiple lines, one per rule. A “regexp_split_to_table()” is quite useful to get one line per rule:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
WITH RECURSIVE _day7_split_data(split) AS (
    SELECT REGEXP_MATCHES(value, '^([\w ]+) bags contain (.*)')
      FROM day7
),
day7_split_data AS (
    SELECT split[1] AS c1,
           split[2] AS c2
      FROM _day7_split_data
),
day7_splitup AS (
    SELECT c1 wherein,
           split[2] AS what,
           CAST(split[1] AS INT4) AS howmany
      FROM day7_split_data,
   LATERAL (
            SELECT regexp_match(rstt, '^(\d+) ([\w ]+) bag[s]?\.?$') split
              FROM regexp_split_to_table(c2, ', ') rstt
           ) split
)
SELECT * FROM day7_splitup;

This query returns the following data:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
       wherein       |        what
---------------------+--------------------
 bright olive        | dotted white
 bright olive        | wavy lavender
 plaid brown         | bright lime
 plaid brown         | plaid coral
 dim coral           | shiny maroon
 dim coral           | bright orange
 dim coral           | clear lavender
 dull olive          | wavy purple
 dull olive          | plaid white
 dull olive          | vibrant magenta
 dull olive          | light brown
 dim yellow          | bright cyan
 dim yellow          | striped green
 dim yellow          | dim white

Task 1

To find out how many rules apply for which colored bag, and how many bags can be in that bag, it needs a recursive tree over the rules. This works by joining day7_build_tree with itself, and with day7_splitup.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
WITH RECURSIVE _day7_split_data(split) AS (
    SELECT REGEXP_MATCHES(value, '^([\w ]+) bags contain (.*)')
      FROM day7
),
day7_split_data AS (
    SELECT split[1] AS c1,
           split[2] AS c2
      FROM _day7_split_data
),
day7_splitup AS (
    SELECT c1 wherein,
           split[2] AS what,
           CAST(split[1] AS INT4) AS howmany
      FROM day7_split_data,
   LATERAL (
            SELECT regexp_match(rstt, '^(\d+) ([\w ]+) bag[s]?\.?$') split
              FROM regexp_split_to_table(c2, ', ') rstt
           ) split
),
day7_build_tree AS (
    SELECT s.wherein AS wherein,
           s.what AS what
      FROM day7_splitup s
     UNION
    SELECT s.wherein AS wherein,
           bt.what AS what
      FROM day7_build_tree bt,
           day7_splitup s
     WHERE bt.wherein = s.what
)
SELECT * FROM day7_build_tree;

From here the answer for the first task is to find out how many distinct wherein values are in the result set for the golden shiny bag:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
WITH RECURSIVE _day7_split_data(split) AS (
    SELECT REGEXP_MATCHES(value, '^([\w ]+) bags contain (.*)')
      FROM day7
),
day7_split_data AS (
    SELECT split[1] AS c1,
           split[2] AS c2
      FROM _day7_split_data
),
day7_splitup AS (
    SELECT c1 wherein,
           split[2] AS what,
           CAST(split[1] AS INT4) AS howmany
      FROM day7_split_data,
   LATERAL (
            SELECT regexp_match(rstt, '^(\d+) ([\w ]+) bag[s]?\.?$') split
              FROM regexp_split_to_table(c2, ', ') rstt
           ) split
),
day7_build_tree AS (
    SELECT s.wherein AS wherein,
           s.what AS what
      FROM day7_splitup s
     UNION
    SELECT s.wherein AS wherein,
           bt.what AS what
      FROM day7_build_tree bt,
           day7_splitup s
     WHERE bt.wherein = s.what
)
SELECT COUNT(DISTINCT(wherein)) AS count
  FROM day7_build_tree
 WHERE what = 'shiny gold';

The result is:

1
2
3
4
 count
-------
   235
(1 row)

235 different color combinations where I can hide the golden cow bag … Good luck enforcing that.

Task 2

Task 2 is only slightly different, as the calculation climbs the same recursive tree, but now since the golden bag can contain any of the other bags, we need the product of those. This is calculated with:

1
CAST(bt.howmany AS INT4) * CAST(s.howmany AS INT4) AS howmany

in day7_build_tree. Much to my surprise this ended up in an error message:

ERROR:  integer out of range

Hopefully airport security got a good calculator to figure out the endless bag fashion options. To make the calcuation fit I changed the initial [INT4](https://www.postgresql.org/docs/current/datatype-numeric.html) into an INT8.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
WITH RECURSIVE _day7_split_data(split) AS (
    SELECT REGEXP_MATCHES(value, '^([\w ]+) bags contain (.*)')
      FROM day7
),
day7_split_data AS (
    SELECT split[1] AS c1,
           split[2] AS c2
      FROM _day7_split_data
),
day7_splitup AS (
    SELECT c1 wherein,
           split[2] AS what,
           CAST(split[1] AS INT4) AS howmany
      FROM day7_split_data,
   LATERAL (
            SELECT regexp_match(rstt, '^(\d+) ([\w ]+) bag[s]?\.?$') split
              FROM regexp_split_to_table(c2, ', ') rstt
           ) split
),
day7_build_tree AS (
    SELECT s.wherein AS wherein,
           s.what AS what,
           CAST(s.howmany AS INT8) AS howmany
      FROM day7_splitup s
     UNION ALL
    SELECT s.wherein AS wherein,
           bt.what AS what,
           CAST(bt.howmany AS INT8) * CAST(s.howmany AS INT8) AS howmany
      FROM day7_build_tree bt,
           day7_splitup s
     WHERE bt.wherein = s.what
)
SELECT * FROM day7_build_tree;

After all this producesĀ 787602 fashion rows. All I need is the sum() of howmany for the golden shiny bag:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
WITH RECURSIVE _day7_split_data(split) AS (
    SELECT REGEXP_MATCHES(value, '^([\w ]+) bags contain (.*)')
      FROM day7
),
day7_split_data AS (
    SELECT split[1] AS c1,
           split[2] AS c2
      FROM _day7_split_data
),
day7_splitup AS (
    SELECT c1 wherein,
           split[2] AS what,
           CAST(split[1] AS INT4) AS howmany
      FROM day7_split_data,
   LATERAL (
            SELECT regexp_match(rstt, '^(\d+) ([\w ]+) bag[s]?\.?$') split
              FROM regexp_split_to_table(c2, ', ') rstt
           ) split
),
day7_build_tree AS (
    SELECT s.wherein AS wherein,
           s.what AS what,
           CAST(s.howmany AS INT8) AS howmany
      FROM day7_splitup s
     UNION ALL
    SELECT s.wherein AS wherein,
           bt.what AS what,
           CAST(bt.howmany AS INT8) * CAST(s.howmany AS INT8) AS howmany
      FROM day7_build_tree bt,
           day7_splitup s
     WHERE bt.wherein = s.what
)
SELECT SUM(howmany) AS count
  FROM day7_build_tree
 WHERE wherein = 'shiny gold';

Oh, and we are looking at wherein now, because we want to know what can be in the bag, no longer where the bag can be inside.

1
2
3
4
 count
--------
 158493
(1 row)

So many options …


Categories: [Sql]