Challenge 3

This contest gave an interesting challenge, as the number given to factor was very large. In fact, it takes 40 bits to express an integer that big. This is an issue, because many implementations of programming languages only use 32 bits for standard signed or unsigned integers, which is not enough to hold the value. This large size did hinder one of our entries (and probably thwarted potential contestants who did not submit a program). As such, the testing this week will be done with the original number (600,851,475,143) and a shorter 30-bit number (600,851,475).

One non-obvious issue with the smaller number, is that it has a much bigger largest prime factor. This both allowed code limited to 32-bit limitations to run, and exposed major differences in execution times. Whereas code that was able to factor the 40-bit number was able to run 100,000 times in a reasonable amount of time, the number of repetitions was reduced to only 1,000 for the 30-bit number. This highlights the importance of optimizations in code, reducing the amount of work that needs to be done. As prime numbers get larger, they are more sparse, and therefore take significantly more time to find.
  
The prime factors of 13195 are 5, 7, 13, and 29.

What is the largest prime factor of the number 600851475143?.

Construct a program that can find the largest prime factor of any positive integer. (We will test your program with more than just the two integers identified above). 

Solutions:

LabVIEW - Joe

Joe_3.vi

Joe_3.vi

Joe_3.vi

Joe_3.vi

Joe_3.vi

Joe_3.vi

Our first LabVIEW entry is from Joe. He implemented an interesting multi-threaded solution that employed one thread to to test potential factors, and then another thread to test the primeness of the found factors. He correctly used 64-bit unsigned integers for all of his numerical data, so there was no implicit casting (little red dots) or issues in factoring the 40-bit number. Joe’s program used threads and a queue structure to implement his solution. Threads enable spinning off computations to idle CPU cores, to do more work at the same time. Queue’s are a nice way to share data in a multi-threaded environment. This program effectively found the largest prime factor, and displayed all found prime factors, while showing some neat LabVIEW features. Unfortunately, as a whole this program was not very efficient. Establishing primeness is a very computationally expensive task. This expense gets worse as primes get bigger, and that showed.

For the 40-bit number, Joe’s code was 4.26 times slower than the fastest code. For the 30-bit number, his code was 174.61 times slower than the fastest at that task.

LabVIEW - Justin

Justin_3.vi

Justin_3.vi

Justin_3.vi

Justin_3.vi

Our second fastest LabVIEW entry was from Justin. His code is clean, neat, and succinctly commented. He used 64-bit signed integers for all of his numerical data, so there was no implicit casting (little red dots) or issues in factoring the 40-bit number. His solution recognized that two is the only even prime number and the lowest, so he treated it as a special case. By not testing the factors for primeness and only testing odd numbers as factors after two, he was able to shave a significant amount of time from his execution.

For the 40-bit number, Justin’s code was 1.79 times slower than the fastest code. For the 30-bit number, his code was 75.81 times slower than the fastest at that task.

LabVIEW - Len

Len_3.vi

Len_3.vi

Len_3.vi

Len_3.vi

Len had the fastest LabVIEW code for this contest. His approach is substantially similar to the one that Justin used, with one exception to both Justin and Joe’s code. Both of the other entries iterate until the last attempted factor is equal to the remaining portion left to be factored. At this point, the remaining portion should be a prime number. Len’s code differs in that it only iterates up to the integer cast of the square root of the remaining portion plus one. This greatly reduces execution time, as there won’t be an integer factor greater than the square root. The key to speed is figuring out what work you don’t need to do. At this point, the remaining portion IS the largest prime factor. This really helped for the 30-bit number which had a much bigger largest prime factor.

Here is the graph of relative performance for the 40-bit number of the LabVIEW code (100,000 cycles):

For completeness, here’s the graph of relative performance for the 30-bit number (1,000 cycles):

Python - Wes

Moving into Python, the situation is a bit more complex. All of the LabVIEW entries used 64-bit integers as base data types. While Python uses 64-bit integers by default, one of the entries this week ran into a 32-bit limitation on a different type of data structure, which prevented it from finding the solution.

That entry was provided by Wes. He used a technique called the Sieve of Eratosthenes to essentially find all of the primes below half of the number to be factored. As he acknowledged in his comments, this was not an efficient algorithm, and the limitation on the size of lists in Python was indeed a limiting factor. This code could not be run on the original 40-bit number, due to the limitation on lists.

After finding all of the primes below half of the input number, Wes’s program started at the top of the list of primes to check if it was a factor of the initial number. As soon as it found one, it stopped as this would by definition be the largest prime factor. This worked, but required an awful lot of work to generate the list of primes.

As mentioned, this code was not able to find the largest factor of the 40-bit number. For the 30-bit number, this code was over 1.9 million times slower than the fastest code. Yes, that is a very large number. As mentioned in the code comments, this is not a very efficient method.

/Users/len/Programming_Contests/Week_3/Entries/Wes_3.py


   1    #!/usr/bin/python
   2    
   3    ################################################################################
   4    # Author   : Wes Kelly
   5    # Date     : 9 June 2017
   6    # Name     : gprime_wes.py
   7    # Objective: Construct a program that can find the largest prime factor of any 
   8    #            positive integer. 
   9    #
  10    # This solution uses the Sieve of Aritosthenes to find primes which are less 
  11    # than the square root of the input, then checks each prime to see if it is a 
  12    # factor of the input. This is not an efficient algorithm for many reasons, and 
  13    # sieve algorithms are almost always written to utilize many parallel 
  14    # processors. I chose this algorithm because I don't have a PhD in mathematics,
  15    # nor endless time to implement a better algorithm such as GNFS:
  16    #   (http://www.ramk.fi/loader.aspx?id=84041118-a526-414f-847f-5ceb0afec3b1)
  17    #
  18    # NOTE: this solution also does not take arbitrary positive integers, but 
  19    # fails if the integers are larger than ~2**32/2. This is due to the maximum
  20    # length of a list in Python. I didn't have time to implement a solution which
  21    # accounts for this.
  22    ################################################################################
  23    
  24    import sys
  25    from math import sqrt
  26    
  27    # Store the arg (capitalized because it's constant)
  28    INTEGER = long(sys.argv[1])
  29    
  30    # Sieve or Aritosthenes implementation to find the largest prime factor of a 
  31    # given integer.
  32    #
  33    # @param  long the integer for which to find primes which are possible factors
  34    # @return list a list of elements which are possible prime factors, any empty
  35    #              list if 1 was given.
  36    def greatest_prime(INTEGER):
  37        # 1 has no prime factors
  38        if INTEGER == 1:
  39            return "1 has no prime factors"
  40        # 2 is a special case because it is prime and even
  41        elif INTEGER == 2:
  42            return 2
  43    
  44        # The Sieve of Aritosthenes uses a simple concept to find primes: mark all 
  45        # numbers below a given value which are not prime. The result is a list of 
  46        # only prime numbers. This is accomplished by finding the first unmarked 
  47        # value in a list and marking all of its multiples as not prime. The next 
  48        # unmarked value is guaranteed to be the next prime, and the process is 
  49        # repeated. This Sieve implementation is modified to exclude the last 
  50        # (integer / 2) elements and all even elements because none of these have a 
  51        # possibility of being prime factors.
  52    
  53        # Make a list of all odd numbers from 3 to integer / 2 + 1. Any number
  54        # larger than integer / 2 cannot be a factor, by definition. This range
  55        # also skips all even numbers which are guaranteed to never be prime. One
  56        # is added to the sqrt result to force [3] to be the result when 9 is the
  57        # input.
  58        odds = range(3, long(INTEGER / 2) + 1, 2)
  59        
  60        # assign this value now so that it is not calculated
  61        # many times in the doubly-nested loop
  62        length = len(odds) 
  63    
  64        # For each number, mark the multiples in the array. Numbers are marked by
  65        # being set equal to None
  66        for i, number in enumerate(odds):
  67            # if the number is not marked, it is prime
  68            if number != None:
  69                # A while-loop is used here instead of a for-loop with a range
  70                # because Python iterates over lists in its for-loops. The extra
  71                # overhead of generating these lists is unacceptable for large prime
  72                # searches.
  73    
  74                # the index of the first multiple of the found prime
  75                starting_index = i + number
  76                
  77                # the current index to be marked as not prime
  78                current_index = starting_index
  79    
  80                # mark the multiples of the current prime number
  81                while current_index < length:
  82                    # mark the current index
  83                    odds[current_index] = None
  84                    # increment the index to the next multiple
  85                    current_index = current_index + number
  86    
  87        # Create a list which contains all primes below the given integer value
  88        primes = [2] + list(filter(lambda x: x != None, odds))
  89    
  90        # For each candidate prime, test to see if it is a factor starting with the 
  91        # greatest prime.
  92        for prime in reversed(primes):
  93            if INTEGER % prime == 0:
  94                return prime
  95    
  96        # If none of the candidate primes were a factor, the integer is a prime
  97        # itself
  98        return INTEGER
  99    
 100    print greatest_prime(INTEGER)
 101    exit(0)

Python - Wes, a different way

Wes submitted another entry, with absolutely the shortest program imaginable to do the job. While admitting that this may be considered a bit weak, it was able to find the largest prime factor of the 40-bit number. This contest is generally about developing algorithms to solve a problem. If you are stuck, especially with a language like Python with an extensive library of contributed modules, you should look for something that will do the job. The most surprising thing about this entry, is that it was not the fastest code.

For the 40-bit number, the primefac() library function was 1.95 times slower than the fastest code. For the 30-bit number, the primefac() library function was 1.62 times slower than the fastest code. This ran faster than all but one of the LabVIEW codes for the 30-bit number.

/Users/len/Programming_Contests/Week_3/Entries/gprime_lib.py


   1    #!/usr/bin/python
   2    
   3    ################################################################################
   4    # Author   : Wes Kelly
   5    # Date     : 9 June 2017
   6    # Name     : gprime_lib.py
   7    # Objective: Construct a program that can find the largest prime factor of any 
   8    #            positive integer. 
   9    #
  10    # This solution uses the primefac library to find the largest prime factor of a
  11    # given integer. This solution was way too easy to be fun.
  12    ################################################################################
  13    
  14    # You may need to "pip install primefac" for this to work
  15    from primefac import primefac
  16    import sys
  17    
  18    # Return the maximum prime factor
  19    
  20    print max(primefac(long(sys.argv[1])))
  21    
  22    exit(0)

Python - Abby

The next fastest code, at least for factoring the 40-bit number in the contest problem, was from seven-year-old Abby. Abby's code is conceptually similar to Justin's LabVIEW code, attempting to factor with two, and then marching through the odd numbers. Many inline comments let us know what she was thinking, but it makes it dense to read through without color syntax highlighting.

For the 40-bit number, Abby's code was 1.16 times slower than the fastest code, pretty snappy! It was a different story for the 30-bit number where Abby's code was 55.11 times slower than the fastest code.

/Users/len/Programming_Contests/Week_3/Entries/Abby_3.py


   1    # Abby's program
   2    #
   3    # find the largest prime factor
   4    
   5    #
   6    # function definitions
   7    #
   8    def findFactor (number):
   9        # print (number)
  10        # loop where we divide by two till you can't divide it by two any more
  11        while number % 2 == 0:  # while dividing by two with no remainders
  12            number = number / 2 # replace "number" with "number" divided by two
  13            largestFactor = 2 # until proven otherwise, the largest factor is 2
  14            # print (number) this was for troubleshooting
  15        # if there is a factor greater than two, we find it in the next loop
  16        
  17        # divide by odds greater than two because two is the only even prime number
  18        factor = 3
  19        while number != 1: # at this point, number is the largest odd factor
  20            while number % factor == 0: # while dividing by 3 (or another odd number) with no remainders
  21                number = number / factor # replace "number" with "number" divided by 3 (or another odd number)
  22                largestFactor = factor # until proven otherwise, the largest factor is 3 (or another odd number)
  23                # print (number) this was for troubleshooting
  24                # if there is a factor greater than three, we will find it as we cycle through the factors
  25            factor = factor + 2 # now we are trying the next odd number to go through the same loop again
  26        return largestFactor # once "number" is equal to "factor," "number" becomes 1 so the "while" loop ends
  27    
  28    
  29    # main program
  30    #
  31    answer = findFactor (600851475143) # to find the answer, we call the "findFactor" program 
  32    print ("this program is so great")
  33    print ("largest factor is "+ str(answer))
  34     

Python - Len

The fastest Python code this contest for both the 40 and 30-bit numbers comes from Len. His Python code holds close to the structure of his LabVIEW program, but includes more comments and commentary. What is interesting to note is the inclusion of a commented out library (numba), that can greatly increase the speed by turning the interpreted Python code into Just In Time (JIT) compiled code. In testing, enabling the use of this library made this already quick code run 88 times faster, which was twice as fast as the fastest LabVIEW code! The performance numbers here were recorded with this library disabled. Like Wes’s primefac() submission, this code ran faster than all but the fastest LabVIEW submissions on the 30-bit number.

/Users/len/Programming_Contests/Week_3/Entries/Len_3.py


   1    # Team 585 Programming Contest - Week 3
   2    #
   3    #==============================================================================
   4    #
   5    # Problem:
   6    # The prime factors of 13195 are 5, 7, 13, and 29. 
   7    
   8    # What is the largest prime factor of the number 600851475143?.
   9    
  10    # Construct a program that can find the largest prime factor of any positive
  11    # integer.  (We will test your program with more than just the two integers
  12    # identified above). 
  13    #
  14    #==============================================================================
  15    
  16    # It would be optimal to march through a series of increasing prime numbers to
  17    # find the largest prime factor, but it is computationally intensive to find
  18    # prime numbers.  Marching through every number until the potential divisor
  19    # bigger than the remaining portion.  The remaining amount should be the
  20    # largest prime factor.
  21    
  22    # Two is the only even prime number, so if you first test for all occurrances
  23    # of two as a factor, then you can step the later divisor by odd numbers, and
  24    # reduce the time to march through potential divisors by a factor of two.
  25    #
  26    # For any given number, it's square root would be the largest a prime factor
  27    # could be.  As such, using the square root of the remaining portion to be
  28    # factored can exponentially cut the time required to iterate through possible
  29    # prime factors, as you wouldn't needlessly test the numbers between the square
  30    # root and the amount remaining to be factored.  The remaining part would be
  31    # the largest prime if a factor was not found below its square root.
  32    
  33    #==============================================================================
  34    
  35    # numba is a package that enables Just In Time (JIT) compilation of Python code
  36    # with the LLVM compiler.  To use it, numba has to be installed from pip, and
  37    # the LLVM compiler has to be installed on your computer. Uncomment the next
  38    # two lines for a massive speed up.
  39    
  40    # from numba import jit
  41    
  42    # @jit
  43    def largestFactor(bigNumber):
  44        # Start off the result at the bigNumber, in case the bigNumber is a prime
  45        result = bigNumber
  46        
  47        # Check to see if the number is even
  48        while (result % 2) == 0:
  49            # The result is even, so 2 is a factor
  50            # Divide the number by 2 to get the next portion to factor
  51            result = result / 2
  52        
  53        # Number should be odd.  Check odd numbers for factors
  54        currentCandidate = 3
  55        
  56        # Put an upper bound at the square root of the remaining number rounded up
  57        # to the next higher integer
  58        maxFactor = int(result**.5) + 1
  59    
  60        # Check increasing currentCandidate factors until they reach the limit or
  61        # the remaining result is a prime number
  62        while (result > 1) and (currentCandidate < maxFactor):
  63            # Check to see if the currentCandidateis a factor
  64            while (result % currentCandidate) == 0:
  65                # It is!
  66                # Divide the number by the currentCandidate to find the next
  67                # portion to factor
  68                result = result / currentCandidate
  69                
  70                # Check to see if the currentCandidate is the same as the remaining
  71                # amount to factor.  If it is, then stop.
  72                if result == currentCandidate:
  73                    # We're done!
  74                    break
  75            
  76            # Index currentCandidate to the next odd number
  77            currentCandidate += 2
  78            # Reset the limit based on the current amount that remains to be
  79            # factored
  80            maxFactor = int(result**.5) + 1
  81    
  82        if result == 1:
  83            # bigNumber was a prime
  84            return bigNumber
  85        else:
  86            return result
  87    
  88    # Find the largest factor of a number.
  89    # We are lucky that Python uses 64-bits for ints, as the number exceeds 2^31 - 1
  90    answer = largestFactor(600851475143)    # 600,851,475,143 needs 40 bits to be expressed
  91    
  92    # The next set of code is used as test cases to make sure that 
  93    # answer = largestFactor(3481)      # answer should be 59
  94    # answer = largestFactor(59)        # answer should be 59
  95    # answer = largestFactor(4000000)   # answer should be 5
  96    
  97    print(answer)

Here is the graph of relative performance for the 40-bit number for the Python codes (100,000 cycles):

For completeness, here’s the graph of relative performance for the 30-bit number of the Python code. Unlike the other graphs, this one is in logarithmic scale, so each line represents 10x the amount of previous. The two negative bars represent times lower than 1 second (1,000 cycles):