49 Issues and ChatGPT

The New Yorker publishes an issue every Monday. However, 4 issues each year are double issues, each covering 2 weeks.

The first issue for 2024 is a Jan. 1/Jan. 8 double issue. I wondered whether the publisher’s rule is 48 issues a year or 4 double issues a year, and I thought it would be interesting to know in which years those two rules force a choice.

In other words, I wondered which years in the 21st century have 53 Mondays instead of the usual 52. When I asked ChatGPT 4, I was surprised at how it found the answer.

Rather than consulting specific date knowledge from its training, ChatGPT wrote some Python code. It then ran the code, giving the answer: 2001, 2007, 2012, 2018, 2024, 2029, 2035, 2040, 2046, 2052, 2057, 2063, 2068, 2074, 2080, 2085, 2091, 2096. If you look at the list, the years are spaced 5 or 6 years apart, which makes sense because the weekday of January 1 advances by one day after a common year and by two after a leap year.

The code was:

from datetime import datetime, timedelta

# Function to check if a year has 53 Mondays
def has_53_mondays(year):
    # January 1 of the year
    jan_1 = datetime(year, 1, 1)
    # Count the number of Mondays
    monday_count = 0
    for i in range(365 + (jan_1.year % 4 == 0 and jan_1.year % 100 != 0 or jan_1.year % 400 == 0)):
        if (jan_1 + timedelta(days=i)).weekday() == 0:
            monday_count += 1
    return monday_count == 53

# List of years in the 21st century with 53 Mondays
years_with_53_mondays = [year for year in range(2001, 2101) if has_53_mondays(year)]
years_with_53_mondays

The code is understandable with a little work and doesn’t make any sophisticated assumptions. To get the answer, it visits every day, so it takes about 36,525 steps to cover the 21st century. (With a deeper understanding of the calendar, it wouldn’t need the inner loop; a sketch of that appears below.) The code offered a solution that is verifiable. (I did a little research: Python’s calendar library uses the proleptic Gregorian calendar, so it doesn’t account for the 1752 switchover.)

I only checked a few of its answers but the code looks robust.
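
For comparison, here is a minimal sketch of my own (not ChatGPT’s) that skips the inner loop. A 365-day year has 53 Mondays exactly when January 1 falls on a Monday; a leap year also qualifies when January 1 falls on a Sunday, because January 2 then supplies the 53rd Monday.

from datetime import date
from calendar import isleap

# Check for 53 Mondays without walking every day of the year.
def has_53_mondays_fast(year):
    jan_1 = date(year, 1, 1).weekday()  # Monday is 0, Sunday is 6
    return jan_1 == 0 or (isleap(year) and jan_1 == 6)

# Reproduces the list above in 100 steps instead of roughly 36,525.
print([year for year in range(2001, 2101) if has_53_mondays_fast(year)])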

I was curious what else it could do. I asked it for the first prime power greater than 1,000,000 (meaning a prime raised to an exponent of 2 or more). The tool wrote some code and gave what looked like a reasonable answer.

# To find the first prime power greater than 1,000,000, we start by finding prime numbers and then checking their powers
# until we find a value greater than 1,000,000.

from sympy import primerange, isprime

def first_prime_power_above(limit):
    for prime in primerange(2, limit):
        power = 2  # Start from the square of the prime
        while True:
            prime_power = prime ** power
            if prime_power > limit:
                return prime_power
            power += 1

# Setting a limit for the search range. The exact limit is a guess; we need a range large enough to find our number.
search_limit = 10000
first_prime_power_above(1000000)

The code has a problem: the inner loop only exits with “return”, so the function never considers any prime other than 2 and will always return a power of two.
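
As a quick check of my own (not part of ChatGPT’s transcript), running the function as written on the original question shows the issue:

# Because the inner loop never breaks, 2 is the only base ever examined,
# so the result is always the first power of two past the limit.
print(first_prime_power_above(1000000))  # 1048576, which is 2**20
# ...even though, for example, 1009**2 == 1018081 is a smaller prime power above 1,000,000.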

Armed with that understanding, I saw that it would fail on 150, and when I asked, it gave me 256. I knew 169 was the right answer. When I pointed out that it was using a greedy algorithm that only returns a power of two, it generated the new code below, which uses a Python set to collect the candidates.

# Revisiting the approach to correctly find the first prime power greater than 150, considering all primes and their powers.

def first_correct_prime_power_above_v2(limit):
    prime_powers = set()
    for prime in primerange(2, limit):
        power = 2
        while True:
            prime_power = prime ** power
            if prime_power > limit:
                prime_powers.add(prime_power)
                break
            power += 1
    return min(prime_powers)

first_correct_prime_power_above_v2(150)
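
As one more check of my own (again, not part of the transcript), the corrected function also handles the original question:

# The corrected version scans every prime below the limit, so this call is
# slower, but it does find the smallest prime power above 1,000,000.
print(first_correct_prime_power_above_v2(1000000))  # 1018081, which is 1009**2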

I was surprised that ChatGPT generated code to do its analysis in both cases. The team implementing the engine evidently recognized that Python works as a lingua franca for implementing algorithms: simple Python is popular and has clean syntax.

My experiment demonstrated a feature that I didn’t expect: I can validate the code. ChatGPT lets me look under the hood in a way that its purely textual responses don’t. It also shows that even unambiguous mathematical questions can yield incorrect results.

Bard was not useful. On the date question, it gave a list of incorrect answers. Rather than offering a verifiable algorithm, it hallucinated a rule to explain its answer. Often its verbal algorithm isn’t correct and may not even make sense, and the answer doesn’t even follow the stated heuristic. I pointed out one of the mistakes in its year list, and it just came up with a different confused rule, didn’t follow that one either, and got an even worse answer. I’ve noticed that Bard is inflexible and will double down on wrong answers.

For the prime power question, Bard gave nonsensical steps to explain what it calculated, and then didn’t even use that procedure to find the answer.


The first prime power greater than 150 is 169.

Here's the reasoning:

1. Start with the closest prime number after 150, which is 151.

2. Check if 151 is a prime power: 151 is only divisible by 1 and itself, making it a prime number, but not a prime power (prime powers must be raised to an exponent greater than 1).

3. Move on to the next prime number, which is 157. Again, 157 is only divisible by 1 and itself, confirming it's prime.

4. However, 157 raised to any positive exponent will still be an odd number, so it can't be the first prime power greater than 150.

5. Continue this process with the next prime numbers: 163 and 167.

6. 163 is not a prime power (163 = 7 * 23).

7. Finally, 167 is both a prime number and a prime power: 167 = 7^2, fulfilling both criteria.

Therefore, 169 (7^2) is the first prime power greater than 150. 

Bard offered two alternate responses, but they are just as jumbled and confused as this one. Step 4 is irrelevant. Step 6 is worse because 163 is actually prime, not 7 * 23. I don’t know why it confused 167 with 169, nor why it brings in seven squared when 169 is 13 squared.

Rather than leaning on the English language to answer a mathematical question, ChatGPT uses its expertise in programming to answer math-adjacent questions. This experiment demonstrates that it has basic algorithmic reasoning capabilities but is not consistently correct.

Going back to the original question: a blow-in ad card says there are 4 double issues, so this year, with its 53 Mondays, I expect 49 issues, 4 of them double.

GitHub Copilot Honor Pledge

I’ve been using GitHub Copilot recently with Microsoft Visual Studio while programming in C++. GitHub Copilot is an AI-based software development assistant.

My initial observation is that it can be helpful. However, I found that it can also violate the privacy of users whose code has been ingested by the language model.

In some big successes, it used comments as prompts to generate Win32 code. Once, the generated code revealed the exact function I needed: I hadn’t known what to search for to find the API that identifies which edge of the screen the taskbar is anchored to. The code wasn’t exactly what I needed, but the results showed me where to look in Microsoft’s Win32 documentation.

When I was exploring a neural network project, it filled in a long list of varied neural network classes. As I customized my code, its later suggestions followed the outline of the classes I had altered. When it generates correct code after you set up the types and variable names, it can be fun.

My first critique is that it is sluggish. The delay between being presented with a suggestion and Visual Studio registering my acceptance of it can be noticeable. Occasionally it insists on making a change when I want to choose other code. Perhaps my PC needs an upgrade?

It is able to write descriptive comments about code. However, I have found that the descriptions can be incorrect. It writes clear English, but what it says might not reflect an accurate understanding of the code: it can comment on the purpose of code based on insufficient evidence, or misinterpret a declaration and synthesize incorrect text. This requires a lot of vigilance. For example, I noticed, days later, that some of the text was superficially plausible, but when I read it carefully, I saw that it was wrong. When Copilot generates text, it fits the context even when it isn’t correct, which makes it easier for errors to slip through.

A concern with Copilot is that it uses open-source content without honoring that code’s specific licenses. GitHub provides a website, the GitHub Copilot Trust Center, documenting some of its policies. There’s a lot there to analyze. GitHub Copilot is like ChatGPT and some of the image-generating applications: they all push copyright law into territory that hasn’t yet been clarified by legislation or litigation.

Once, as I was filling in the top of a file, Copilot dreamed up some unfamiliar code. When I went looking for it on GitHub, I couldn’t find the specific code, but I identified that it was derived from a project, RED4ext, that was unrelated to my application.

It was interesting that Bing and DuckDuckGo both gave lengthy results when I searched with the qualifier site:github.com. In contrast, Google gave minimal results for such a search; apparently it doesn’t index project text.

The “GitHub Copilot and Copyright” section of the Trust Center claims that the option “GitHub Copilot can allow or block suggestions matching public code” can prevent Copilot from using segments of about 150 characters or more from GitHub. However, Copilot usually generates code a line at a time, and a line is almost always less than 150 characters, so that claim seems to be a very weak promise.

I was starting a file and noticed a username in the auto-generated header comment. Intrigued, I started auto-generating title comments for files that contained minimal code.

My first test alarmed me with this thoroughly ironic text. The same pledge appeared several times in my explorations, each sample with a different username.

// Honor Pledge:    
//     
// I pledge that I have neither given nor
// received any help on this assignment.
//  
// <<username elided>>

In a file that I had fully written, the header comment started with:

// Created by: <<username elided>>
//  Created on: 7/10/2020 10:00:00 PM

Continuing my exploration of what it might generate, I got the full name of a user!

// Created by J*** K*** on 11/2/16.

These privacy violations make me want to investigate GitHub Copilot’s promotional claims in more depth. As I complete more research, I will re-evaluate my use of the service.