Now let me start with a note: Twitter totally screws this challenge up.
How so? Although the two strings are different in the original, when posting this to Twitter the strings are made equal. Where is the fun in that?
I asked Christophe for the original query:
And you end up with the following texts in the table:
If you translate the UTF-8 strings into hex, you get
0x5a 0x6f 0x65 0xcc 0x81 and
0x5a 0x6f 0xc3 0xa9. Clearly they are different.
However if you convert the two strings from the Tweet, you get
0x5a 0x6f 0xc3 0xa9 and
0x5a 0x6f 0xc3 0xa9. Same string. Poor Twitter.
Checking the hex values was actually one of my first ideas when I spotted this challenge. But nevertheless based on my experience from my “Data Types in PostgreSQL” and “Advanced Data Types in PostgreSQL” talks, I figured it should be possible to “solve” this puzzle even if the strings are in fact equal.
Buckle up! We are about to dive deep into how extendible PostgreSQL really is!
In my “Advanced Data Types in PostgreSQL” talk in the last chapter I explain that it is possible to extend PostgreSQL with your own data types, and corresponding functions, operators and operator classes. Let’s put this knowledge to good use and make equal strings not equal.
Looking at the query it seems easy, all I need is:
- A new data type (similar to string)
- A new operator for
=, which in fact makes the strings not equal
How hard can it be …
In order to save my regular databases and not accidentally screw up the catalog, I create a new database where all the following tests take place:
I name the new database
Zoé - because I can. And to make it look nice, I also create a
Zoé schema. Because why not. This also helps to show where I use my own functions and objects, because they are always schema-qualified with
New data type
A new data type in PostgreSQL needs a couple support functions, input and output, receive and send. That’s the bare minimum.
Lucky me, I’m mimicking the TEXT type, so I can re-use all the text functions for this.
Time to create the new type. Let me know if you recognize the name …
As you can see, this uses the newly created support functions, which I placed into my dedicated schema.
At this point I can already use this type, create the table, and insert data:
Checking the content:
However I can not yet compare the two columns:
PostgreSQL has no idea how to compare two strings of my new type. I need to create the
= operator for my new type.
In fact, I not only need to create the
= operator, but also the
<> operator, because the
= operator needs a negator.
In order to create these two operators, I also need two functions which can compare two of my
varchar2 values, and tell PostgreSQL if they are equal or not. This is where the change happens.
The original functions for
My functions for the new data type:
Do you spot the difference?
Now that I have the support functions, I can create the operators:
Since I just switched the operators, I can turn the query around and ask if the strings are not equal:
textne functions for my new type tells PostgreSQL that the strings are not equal, when in fact they are. Another option which works equally “well” is that I create the
nextne functions with the “correct” internal functions, but switch these functions when I create the operators. This will give the same result.
The extensibility of PostgreSQL gave me the freedom to design the results I wanted to see, switching true and false around, and making you believe that strings are not equal when in fact they are.
That of course is just a quick example developed as an answer to the challenge. However it shows that PostgreSQL can really be expanded to handle any kind of needs you have for your applications.