Skip to content

Advent of Code 2020: "Custom Customs" - Day 6

The flight is approaching a regional airport, but you still have to fill out a customs form. Since you are good with hacking things together, you walk around the plane and gather all the answers from all groups on the plane. There are 26 yes-or-no questions, marked from a to z.

Task 1: Count how many questions are answeres "yes" in each group. Multiple "yes" answers for the same question do not count. The data is provided as blocks, kind of like two days before from the passport scanners. I can re-use some of the code - which will turn out to be a major headache!

Task 2: Of course someone wrote the wrong instructions down! It's up to you to fix the mess now. You have to find out how many groups have the same question answered by anyone in the group.

 

Preparation

Load the data, same table structure as day 4:

DROP TABLE IF EXISTS day6;

CREATE TABLE day6 (
    id      SERIAL NOT NULL PRIMARY KEY,
    value   TEXT
);

And load the actual data:

COPY day6 (value) FROM STDIN;
qtmdwspah
sqwdamhpt
phwdaqsmt
stmdqwhap
pqawdhtms

bgsickuztovfwa
yiozauvgfsbtkwc
zygijavmtfkcuwobs
fvsuwtiadzrncboklg

...


frhdaek
fpuwosv

qm
m
smg
\.

This time I included a part from the end as well, which will become important later.

I have 2148 rows in the table, but again that does not mean anything as the data is grouped:

SELECT COUNT(*) FROM day6;
 count 
-------
  2148
(1 row)

I also will copy the CTE from day 4 which parse the data into groups, and clean it up.

WITH day6_data AS (
    SELECT id, value
      FROM day6
     UNION ALL
    SELECT (SELECT MAX(id) FROM day6) + 1, ''::TEXT
  ORDER BY id
),
day6_data_groups AS (
    SELECT COALESCE(LAG(id) OVER (ORDER BY id), 0) + 1 AS dg_begin,
           id - 1 AS dg_end
      FROM day6_data
     WHERE value = ''
),
day6_data_elements AS (
    SELECT dg.dg_begin,
           dg.dg_end,
           particle
      FROM day6_data_groups dg
      JOIN day6_data ON day6_data.id BETWEEN dg.dg_begin AND dg.dg_end
CROSS JOIN regexp_split_to_table(day6_data.value, '') as _(particle)
)
SELECT * FROM day6_data_elements;

I made a mistake in this CTE when I copied it from day 4: the "SELECT MAX(id) FROM day6" part to find out the last row and add an empty row I forgot to change. It stayed "SELECT MAX(id) FROM day4" for a while - I had the entire calculation for the number of answers figured out, but kept getting a result which the website reported as "too low".

When I peeked into "day6_data" I couldn't see much, and missed that the empty line at the end was missing. Only when I debugged "day6_data_elements" I found that it ends at the second last group: "frhdaek" + "fpuwosv" and is not showing the last group of data. What happened is that the wrong query selected a number from "day4", which is lower than the number of entries in the table for "day6". Which means I got two rows with the same ID, and the ORDER BY in "day6_data" sorted the empty row somewhere in the middle. Not helpful!

Once I fixed that problem I got the correct result.

The "day6_data" and "day6_data_groups" CTEs are the same as for day4 (with changed names), the "day6_data_elements" CTE is a bit different:

    SELECT dg.dg_begin,
           dg.dg_end,
           particle
      FROM day6_data_groups dg
      JOIN day6_data ON day6_data.id BETWEEN dg.dg_begin AND dg.dg_end
CROSS JOIN regexp_split_to_table(day6_data.value, '') as _(particle)

I'm no longer selecting a SUBSTRING(), but just "particle", that is the one character. And the regexp_split_to_table() plits on '' instead of ' '.

Task 1

Using the single characters in "particle" and the position in "dg_begin" (or "dg_end", doesn't matter) I have to figure out a count over all the distinct characters for one group. PostgreSQL offers DISTINCT ON which can be used to return the first row among equal rows. For this I specify "dg_begin" and "particle" as match, and then the ORDER BY has to include the same columns as well - otherwise the same rows are not ordered together and can't be eliminated.

  SELECT DISTINCT ON (dg_begin, particle) dg_begin, particle
    FROM day6_data_elements
ORDER BY dg_begin, particle

This produces a list of rows where each character in a group only appears once:

 dg_begin | particle 
----------+----------
        1 | a
        1 | d
        1 | h
        1 | m
        1 | p
        1 | q
        1 | s
        1 | t
        1 | w
        7 | a
        7 | b
        7 | c
        7 | d
        7 | f
        7 | g
        7 | i
        7 | j
        7 | k
        7 | l
        7 | m
        7 | n
        7 | o
        7 | r
        7 | s
        7 | t
        7 | u
        7 | v
        7 | w
        7 | y
        7 | z

At this point I don't need the "dg_begin" column anymore, and can just count  the "c" column:

WITH day6_data AS (
    SELECT id, value
      FROM day6
     UNION ALL
    SELECT (SELECT MAX(id) FROM day6) + 1, ''::TEXT
  ORDER BY id
),
day6_data_groups AS (
    SELECT COALESCE(LAG(id) OVER (ORDER BY id), 0) + 1 AS dg_begin,
           id - 1 AS dg_end
      FROM day6_data
     WHERE value = ''
),
day6_data_elements AS (
    SELECT dg.dg_begin,
           dg.dg_end,
           particle
      FROM day6_data_groups dg
      JOIN day6_data ON day6_data.id BETWEEN dg.dg_begin AND dg.dg_end
CROSS JOIN regexp_split_to_table(day6_data.value, '') as _(particle)
)
  SELECT COUNT(particle) c
    FROM (
  SELECT DISTINCT ON (dg_begin, particle) dg_begin, particle
    FROM day6_data_elements
ORDER BY dg_begin, particle
         ) AS count;
  c   
------
 6532
(1 row)

That's already my final result of "yes" answers on the custom forms. With the wrong subselect in "day6_data" I ended up with "6528", which is too low.

Task 2

To solve this question I need to know the number of questions which are answered in each group, and the number of people which answered questions.

Can re-use the first three CTE from task 1, and need to further aggregate "day6_data_elements": the number of groups tells me the number of questions.

day6_element_count AS (
    SELECT de.dg_begin,
           de.dg_end,
           particle,
           COUNT(*) AS count
      FROM day6_data_elements de
  GROUP BY de.dg_begin, de.dg_end, particle
)

The number of "day6_data_groups" shows how many people answered questions. This grouped together answers the question of how many questions got answered by all people:

SELECT COUNT(*) AS count
  FROM day6_element_count ec,
       day6_data_groups dg
 WHERE ec.dg_end = dg.dg_end
   AND ec.count = 1 + dg.dg_end - dg.dg_begin;

The full query for task 2:

WITH day6_data AS (
    SELECT id, value
      FROM day6
     UNION ALL
    SELECT (SELECT MAX(id) FROM day6) + 1, ''::TEXT
  ORDER BY id
),
day6_data_groups AS (
    SELECT COALESCE(LAG(id) OVER (ORDER BY id), 0) + 1 AS dg_begin,
           id - 1 AS dg_end
      FROM day6_data
     WHERE value = ''
),
day6_data_elements AS (
    SELECT dg.dg_begin,
           dg.dg_end,
           particle
      FROM day6_data_groups dg
      JOIN day6_data ON day6_data.id BETWEEN dg.dg_begin AND dg.dg_end
CROSS JOIN regexp_split_to_table(day6_data.value, '') as _(particle)
),
day6_element_count AS (
    SELECT de.dg_begin,
           de.dg_end,
           particle,
           COUNT(*) AS count
      FROM day6_data_elements de
  GROUP BY de.dg_begin, de.dg_end, particle
)
SELECT COUNT(*) AS count
  FROM day6_element_count ec,
       day6_data_groups dg
 WHERE ec.dg_end = dg.dg_end
   AND ec.count = 1 + dg.dg_end - dg.dg_begin;

Obviously the result can only be as high as the result from task 1, in the best case. More likely it is lower:

 count 
-------
  3427
(1 row)

 

  • Twitter
  • Bookmark Advent of Code 2020: "Custom Customs" - Day 6 at del.icio.us
  • Facebook
  • Google Bookmarks
  • FriendFeed
  • Digg Advent of Code 2020: "Custom Customs" - Day 6
  • Bloglines Advent of Code 2020: "Custom Customs" - Day 6
  • Technorati Advent of Code 2020: "Custom Customs" - Day 6
  • Fark this: Advent of Code 2020: "Custom Customs" - Day 6
  • Bookmark Advent of Code 2020: "Custom Customs" - Day 6 at YahooMyWeb
  • Bookmark Advent of Code 2020: "Custom Customs" - Day 6 at Furl.net
  • Bookmark Advent of Code 2020: "Custom Customs" - Day 6 at reddit.com
  • Bookmark Advent of Code 2020: "Custom Customs" - Day 6 at blinklist.com
  • Bookmark Advent of Code 2020: "Custom Customs" - Day 6 at Spurl.net
  • Bookmark Advent of Code 2020: "Custom Customs" - Day 6 at Simpy.com
  • Bookmark Advent of Code 2020: "Custom Customs" - Day 6 at blogmarks
  • Bookmark Advent of Code 2020: "Custom Customs" - Day 6 with wists
  • wong it!
  • Bookmark using any bookmark manager!
  • Stumble It!
  • Identi.ca

Trackbacks

No Trackbacks

Comments

Display comments as Linear | Threaded

No comments

Add Comment

Enclosing asterisks marks text as bold (*word*), underscore are made via _word_.
E-Mail addresses will not be displayed and will only be used for E-Mail notifications.
To leave a comment you must approve it via e-mail, which will be sent to your address after submission.
Form options