Yesterday I spent a couple of hours playing with ChatGPT. I know, we have some other recent posts about it. It’s so amazing that I couldn’t resist writing another. Apologies for that.
The goal of this post is to determine if I can effectively use ChatGPT as a programmer/mathematician assistant. OK. It was not my original intention, but let’s pretend it was, just to make this post more interesting.
So, I started by asking a few very simple programming questions like the following:
Can you implement a function to compute the factorial of a number using a cache? Use python.
And this is what I got.
A clear and efficient implementation of the factorial. This is the kind of answer you would expect from a first-year CS student.
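The original screenshot is not reproduced here, but a cached factorial along those lines might look roughly like this (my own sketch, not ChatGPT's verbatim output):

```python
from functools import lru_cache


@lru_cache(maxsize=None)  # memoise every result across calls
def factorial(n: int) -> int:
    """Compute n! recursively, caching intermediate values."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return 1 if n < 2 else n * factorial(n - 1)
```

Thanks to the cache, repeated calls (or calls for smaller arguments) return in constant time after the first computation.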
I continued with a few more typical, boring coding questions, and it managed to implement all of them without mistakes. At that point I started to feel a bit overwhelmed. Was this all? Could ChatGPT just replace coders like the steam engine replaced horses?
Then I thought that, perhaps, it was so good at those exercises because they are pretty common, and a simple google search will turn up a copy-paste solution for them.
So next I tried a maths question: something a high school student should know how to do, but that is not easy to find on the internet and that should indicate some degree of understanding.
The question I asked ChatGPT was the following:
Can you tell me which is the intersection between the unit circle and the curve “y= x*x”?
And this is what I got:
It sounded as if it knew what it was talking about but… the answer was totally wrong. So I started feeling more relaxed. It may know how to code, but plain maths is not so easy for it. Totally understandable for an AI trained only on text, though.
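For the record, the question has a clean answer. Substituting y = x² into the unit circle equation x² + y² = 1 gives the quadratic y² + y − 1 = 0, whose positive root determines the two intersection points. A quick numeric check (my own, not ChatGPT's output):

```python
import math

# The unit circle is x^2 + y^2 = 1; substituting y = x*x (so x^2 = y)
# turns it into the quadratic y^2 + y - 1 = 0 in y.
y = (-1 + math.sqrt(5)) / 2   # the positive root (y = x^2 must be >= 0)
x = math.sqrt(y)              # the two intersections are (x, y) and (-x, y)

# Sanity check: both points lie on the circle and on the parabola.
for sx in (x, -x):
    assert abs(sx ** 2 + y ** 2 - 1) < 1e-12
    assert abs(sx ** 2 - y) < 1e-12
print(f"intersections: (±{x:.6f}, {y:.6f})")
```

So the curves meet at (±√((√5−1)/2), (√5−1)/2), roughly (±0.786, 0.618).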
Finally, the last exercise consisted of both programming and a bit of maths. I wanted it to translate (shift) a 2D image. And this is what I got:
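Again the screenshot is missing, but a plain real-space shift along these lines is straightforward (the function name and the zero-fill convention are my assumptions, not ChatGPT's exact code):

```python
import numpy as np


def translate(image: np.ndarray, dy: int, dx: int) -> np.ndarray:
    """Shift a 2D image by (dy, dx) pixels, filling vacated pixels with 0."""
    out = np.zeros_like(image)
    h, w = image.shape
    # Clip the source/destination windows so out-of-range pixels are dropped.
    src_y = slice(max(0, -dy), min(h, h - dy))
    dst_y = slice(max(0, dy), min(h, h + dy))
    src_x = slice(max(0, -dx), min(w, w - dx))
    dst_x = slice(max(0, dx), min(w, w + dx))
    out[dst_y, dst_x] = image[src_y, src_x]
    return out
```

An alternative design choice is `np.roll`, which wraps pixels around the edges instead of discarding them.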
Again, a pretty sensible solution. Next, I wanted to complicate things a bit more, and I asked it to perform the same operation as before but with an image represented in Fourier space.
And even though the beginning of the answer was promising, the result was a total disaster. The chatbot correctly said that a translation in Fourier space can be represented as a phase shift, but then it just repeated the previous code as if the images were in real space.
I decided to give it another try, and I gave it a hint.
But unfortunately, I got basically the same answer. Did I give up? No, I just decided to give it another useful hint.
And what I got was a complete mess. It just applied the real-space translation as in the other examples, but only to the phases of the Fourier image. I guess that is a reasonable guess if you don't know how to apply a phase shift, but it is definitely wrong.
At that point, I was a bit sceptical. On the one hand, after each hint the solution got a bit better. On the other, a simple google search yields multiple easy answers (thanks Stack Overflow!), and you can even find a scipy function that does exactly what I was aiming for (scipy.ndimage.fourier_shift). So, I just gave it my last hint and crossed my fingers.
And bingo! It did it! It only took six attempts to get the code automatically generated. Of course, it was not as efficient as writing it myself, but let's be honest: this is the kind of question that most of us would need some time to figure out.
Lastly, as a bonus task, I asked it to convert the previous snippet into a reusable function. And after a few iterations, I got a perfectly usable function without writing a single line of code.
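For readers who want the idea without the six rounds of hints: the Fourier shift theorem says that translating an image by (dy, dx) is equivalent to multiplying its FFT by a phase ramp. A numpy-only sketch of such a reusable function (the name and signature are mine; scipy's `ndimage.fourier_shift` packages the same phase multiplication):

```python
import numpy as np


def fourier_translate(image: np.ndarray, dy: float, dx: float) -> np.ndarray:
    """Translate a 2D image by (dy, dx) via a phase shift in Fourier space.

    The shift theorem: f(y - dy, x - dx) <-> F(ky, kx) * exp(-2j*pi*(ky*dy + kx*dx)).
    Non-integer shifts are allowed; the translation wraps periodically.
    """
    h, w = image.shape
    fy = np.fft.fftfreq(h)[:, None]   # vertical frequencies, as a column
    fx = np.fft.fftfreq(w)[None, :]   # horizontal frequencies, as a row
    phase = np.exp(-2j * np.pi * (fy * dy + fx * dx))
    return np.fft.ifft2(np.fft.fft2(image) * phase).real
```

For a real-valued input the imaginary part of the inverse FFT is numerical noise, which is why the function returns only the real part.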
What are my conclusions? First of all, I have to say that I am impressed, not only by this coding exercise but also by the free-text generation results I have seen. It is incredible to see how it is able to provide sensible answers for such a wide variety of tasks, and to me it seems obvious that the Turing test is a thing of the past.
Leaving aside these general impressions, for the particular question of whether we can use ChatGPT as a programmer/mathematician assistant, I am afraid we should not. My general impression is that simple code generation is quite fast and accurate, and it can save some time. However, the main caveat, and it is an important one, is that since we cannot trust the suggested solution, we always need to read through the code and check whether it makes sense. And while reading simple snippets is easy, writing them is easy too, so the productivity gains can be minor in the end.

Of course, if we are developing code driven by tests, automatic code generation is a great option, since we will also get automatic validation. But for general scientific coding, in which we work by fast prototyping, I don't think this approach is feasible. It goes without saying that automatic code generation for complex tasks is not there yet.

Consequently, my take-home message after these two hours playing with ChatGPT is that we can only benefit from it in a quite limited set of scenarios, so it is far from becoming our next problem-solving assistant. But given the improvement speed we are seeing, I would not be surprised if I have to scrap this whole post in a matter of years.