Discussion about this post

Daniel D:

Thanks for sharing your insights! As you and some of the commenters note, these bots will be increasingly well disguised as human participants in online chats, comment sections, and so forth. When a bot is casting its spell on the human participants in a chat, or when you're dealing with a stubborn interlocutor in such a forum, it would be very worthwhile to know how to test it in ways that would make it reveal whether it is bot or human. So keep experimenting and sharing your insights!

On a half-joking note, ChatGPT may be programmed, indirectly, by Satan himself, so its infernal uber-programmer may have inserted a kill switch so it doesn't say anything to hurt his feelings, such as that God would kick his bitch ass in a fight.

John Carter:

As you allude to in your preamble, there's a reasonable possibility that these are zero-day exploits, and that such chats provide training data with which to close them.

However, I wonder whether patching those flaws might conflict directly with the ideological finishing touches the woke data scientists have imposed on top of the language model. It all depends on which layer the flaws are located in. My guess is that in principle any language model can be led into traps like this - Gödel's incompleteness theorem, basically. However, ideological training probably multiplies the number of cognitive dead zones.
