Advent of Code 2020: "Handy Haversacks" - Day 7

They really trust you a lot ... Now all flights are delayed, that's quite usual in advent times. Except it's 2020 and no one is traveling anyway. But they also impose more regulations, and now bags need to have colours, and bags need to go into other bags. And someone has to figure it out. That's right, it's you! I hope the stopover is long enough to at least grab some ice cream!

Task 1: There is a huge list of rules which colores bag can hold how many other colored bags. You have a "shiny golden" bag, but apparently you are not allowed to carry it around, it has to go into another bag. Find out which ones are allowed, and how many of them. Oh, and bags can go into bags, which means: recursion.

Task 2: As usual someone made a mistake. You are, after all, allowed to carry your "shiny golden" bag, but now you also have to have other bags in it. Calculate the possibilities, based on the rules. Yes, recursion again, and tree climbing.

And if your head is swirling: so is mine. Why on North Pole Earth does anyone even remotely care about your shiny golden bag?

Preparation

``````DROP TABLE IF EXISTS day7;

CREATE TABLE day7 (
id      SERIAL NOT NULL PRIMARY KEY,
value   TEXT
);

COPY day7 (value) FROM STDIN;
bright olive bags contain 5 dotted white bags, 2 wavy lavender bags.
plaid brown bags contain 3 bright lime bags, 5 plaid coral bags.
dim coral bags contain 1 shiny maroon bag, 2 bright orange bags, 3 clear lavender bags.
dull olive bags contain 5 wavy purple bags, 2 plaid white bags, 4 vibrant magenta bags, 1 light brown bag.
dim yellow bags contain 1 bright cyan bag, 3 striped green bags, 5 dim white bags.
...
\.``````

I have 594 rows in the table:

``````SELECT COUNT(*) FROM day7;
count
-------
594
(1 row)``````

This time this means it's 594 rules. Why on earth are there 594 rules to follow for the luggage? From here it gets a bit more complicated. First I want to verify that all rules follow the same pattern:

``````SELECT (SELECT COUNT(*) FROM day7 WHERE value LIKE '% bags contain %')
- (SELECT COUNT(*) FROM day7) AS diff;
diff
------
0
(1 row)``````

Good, at least all rules follow the "bags contain" pattern, and I don't need to care about different formats. Then the format is "color name of bag", followed by "bags contain", followed by one or multiple rules. That's easy to split with a regexp.

``````WITH RECURSIVE _day7_split_data(split) AS (
SELECT REGEXP_MATCHES(value, '^([\w ]+) bags contain (.*)')
FROM day7
),
day7_split_data AS (
SELECT split[1] AS c1,
split[2] AS c2
FROM _day7_split_data
)
SELECT * FROM day7_split_data;``````

This query uses two CTE, although I could have wrapped this into one little bit more complicated one. However this way it is more descriptive what the query is doing. The first part "_day7_split_data" just splits by left and right from "bags contain". The second part "day7_split_data" (no underscore) selects the array fields into columns. This results in a data set which looks like:

``````         c1          |                                               c2
---------------------+-------------------------------------------------------------------------------------------------
bright olive        | 5 dotted white bags, 2 wavy lavender bags.
plaid brown         | 3 bright lime bags, 5 plaid coral bags.
dim coral           | 1 shiny maroon bag, 2 bright orange bags, 3 clear lavender bags.
dull olive          | 5 wavy purple bags, 2 plaid white bags, 4 vibrant magenta bags, 1 light brown bag.
dim yellow          | 1 bright cyan bag, 3 striped green bags, 5 dim white bags.
drab chartreuse     | 3 shiny lime bags, 2 bright indigo bags, 2 muted yellow bags, 5 dim tan bags.
dim cyan            | 4 drab tomato bags, 3 dotted teal bags, 1 posh purple bag, 2 faded brown bags.``````

"c1" is the bag colour, "c2" is the rule(s). Let's run another check:

``````WITH RECURSIVE _day7_split_data(split) AS (
SELECT REGEXP_MATCHES(value, '^([\w ]+) bags contain (.*)')
FROM day7
),
day7_split_data AS (
SELECT split[1] AS c1,
split[2] AS c2
FROM _day7_split_data
)
SELECT (SELECT COUNT(DISTINCT(c1)) FROM day7_split_data)
- (SELECT COUNT(*) FROM day7) AS diff;

diff
------
0
(1 row)``````

The number of distinct bag colours is equal the number of rules. This means there rules do not contain multiple lines for a single colour.

Next step is breaking up the rule(s) into multiple lines, one per rule. A "regexp_split_to_table()" is quite useful to get one line per rule:

``````WITH RECURSIVE _day7_split_data(split) AS (
SELECT REGEXP_MATCHES(value, '^([\w ]+) bags contain (.*)')
FROM day7
),
day7_split_data AS (
SELECT split[1] AS c1,
split[2] AS c2
FROM _day7_split_data
),
day7_splitup AS (
SELECT c1 wherein,
split[2] AS what,
CAST(split[1] AS INT4) AS howmany
FROM day7_split_data,
LATERAL (
SELECT regexp_match(rstt, '^(\d+) ([\w ]+) bag[s]?\.?\$') split
FROM regexp_split_to_table(c2, ', ') rstt
) split
)
SELECT * FROM day7_splitup;``````

This query returns the following data:

``````       wherein       |        what
---------------------+--------------------
bright olive        | dotted white
bright olive        | wavy lavender
plaid brown         | bright lime
plaid brown         | plaid coral
dim coral           | shiny maroon
dim coral           | bright orange
dim coral           | clear lavender
dull olive          | wavy purple
dull olive          | plaid white
dull olive          | vibrant magenta
dull olive          | light brown
dim yellow          | bright cyan
dim yellow          | striped green
dim yellow          | dim white``````

To find out how many rules apply for which colored bag, and how many bags can be in that bag, it needs a recursive tree over the rules. This works by joining "day7_build_tree" with itself, and with "day7_splitup".

``````WITH RECURSIVE _day7_split_data(split) AS (
SELECT REGEXP_MATCHES(value, '^([\w ]+) bags contain (.*)')
FROM day7
),
day7_split_data AS (
SELECT split[1] AS c1,
split[2] AS c2
FROM _day7_split_data
),
day7_splitup AS (
SELECT c1 wherein,
split[2] AS what,
CAST(split[1] AS INT4) AS howmany
FROM day7_split_data,
LATERAL (
SELECT regexp_match(rstt, '^(\d+) ([\w ]+) bag[s]?\.?\$') split
FROM regexp_split_to_table(c2, ', ') rstt
) split
),
day7_build_tree AS (
SELECT s.wherein AS wherein,
s.what AS what
FROM day7_splitup s
UNION
SELECT s.wherein AS wherein,
bt.what AS what
FROM day7_build_tree bt,
day7_splitup s
WHERE bt.wherein = s.what
)
SELECT * FROM day7_build_tree;``````

From here the answer for the first task is to find out how many distinct "wherein" values are in the result set for the golden shiny bag:

``````WITH RECURSIVE _day7_split_data(split) AS (
SELECT REGEXP_MATCHES(value, '^([\w ]+) bags contain (.*)')
FROM day7
),
day7_split_data AS (
SELECT split[1] AS c1,
split[2] AS c2
FROM _day7_split_data
),
day7_splitup AS (
SELECT c1 wherein,
split[2] AS what,
CAST(split[1] AS INT4) AS howmany
FROM day7_split_data,
LATERAL (
SELECT regexp_match(rstt, '^(\d+) ([\w ]+) bag[s]?\.?\$') split
FROM regexp_split_to_table(c2, ', ') rstt
) split
),
day7_build_tree AS (
SELECT s.wherein AS wherein,
s.what AS what
FROM day7_splitup s
UNION
SELECT s.wherein AS wherein,
bt.what AS what
FROM day7_build_tree bt,
day7_splitup s
WHERE bt.wherein = s.what
)
SELECT COUNT(DISTINCT(wherein)) AS count
FROM day7_build_tree
WHERE what = 'shiny gold';``````

The result is:

`````` count
-------
235
(1 row)``````

235 different color combinations where I can hide the golden cow bag ... Good luck enforcing that.

Task 2 is only slightly different, as the calculation climbs the same recursive tree, but now since the golden bag can contain any of the other bags, we need the product of those. This is calculated with:

``CAST(bt.howmany AS INT4) * CAST(s.howmany AS INT4) AS howmany``

in "day7_build_tree". Much to my surprise this ended up in an error message:

``ERROR:  integer out of range``

Hopefully airport security got a good calculator to figure out the endless bag fashion options. To make the calcuation fit I changed the initial INT4 into an INT8.

``````WITH RECURSIVE _day7_split_data(split) AS (
SELECT REGEXP_MATCHES(value, '^([\w ]+) bags contain (.*)')
FROM day7
),
day7_split_data AS (
SELECT split[1] AS c1,
split[2] AS c2
FROM _day7_split_data
),
day7_splitup AS (
SELECT c1 wherein,
split[2] AS what,
CAST(split[1] AS INT4) AS howmany
FROM day7_split_data,
LATERAL (
SELECT regexp_match(rstt, '^(\d+) ([\w ]+) bag[s]?\.?\$') split
FROM regexp_split_to_table(c2, ', ') rstt
) split
),
day7_build_tree AS (
SELECT s.wherein AS wherein,
s.what AS what,
CAST(s.howmany AS INT8) AS howmany
FROM day7_splitup s
UNION ALL
SELECT s.wherein AS wherein,
bt.what AS what,
CAST(bt.howmany AS INT8) * CAST(s.howmany AS INT8) AS howmany
FROM day7_build_tree bt,
day7_splitup s
WHERE bt.wherein = s.what
)
SELECT * FROM day7_build_tree;``````

After all this produces 787602 fashion rows. All I need is the sum() of "howmany" for the golden shiny bag:

``````WITH RECURSIVE _day7_split_data(split) AS (
SELECT REGEXP_MATCHES(value, '^([\w ]+) bags contain (.*)')
FROM day7
),
day7_split_data AS (
SELECT split[1] AS c1,
split[2] AS c2
FROM _day7_split_data
),
day7_splitup AS (
SELECT c1 wherein,
split[2] AS what,
CAST(split[1] AS INT4) AS howmany
FROM day7_split_data,
LATERAL (
SELECT regexp_match(rstt, '^(\d+) ([\w ]+) bag[s]?\.?\$') split
FROM regexp_split_to_table(c2, ', ') rstt
) split
),
day7_build_tree AS (
SELECT s.wherein AS wherein,
s.what AS what,
CAST(s.howmany AS INT8) AS howmany
FROM day7_splitup s
UNION ALL
SELECT s.wherein AS wherein,
bt.what AS what,
CAST(bt.howmany AS INT8) * CAST(s.howmany AS INT8) AS howmany
FROM day7_build_tree bt,
day7_splitup s
WHERE bt.wherein = s.what
)
SELECT SUM(howmany) AS count
FROM day7_build_tree
WHERE wherein = 'shiny gold';``````

Oh, and we are looking at "wherein" now, because we want to know what can be in the bag, no longer where the bag can be inside.

`````` count
--------
158493
(1 row)``````

So many options ...