Investigating Patterns in the Formation of Weakly Complete Number Sequences

INVESTIGATING PATTERNS IN THE FORMATION OF WEAKLY COMPLETE NUMBER SEQUENCES

Michael D. Paul

Loyola College in Maryland, Mathematical Sciences ’10

Advisor

Dr. Michael Knapp

Department of Mathematical Sciences

Loyola College in Maryland

Sponsored by the Hauber Fellowship Fund

Loyola College in Maryland

Summer 2008

Abstract

This research focused on finding how to determine the terms of a weakly complete sequence with a given set of initial terms. For the sequences beginning with {1, n,...}, {n, n + 1,...}, or {n, 2n,...} for most values of n, specific formulae to determine the next terms of these sequences were found and proven. A conjecture formula for the terms of the sequence beginning with {n, n + m} has also been found, but has yet to be proven.

Background

A sequence of numbers is considered complete if every natural number can be written as a sum of distinct terms of the sequence. Perhaps the most famous example of a complete sequence is the Fibonacci sequence, {1, 1, 2, 3, 5, 8, 13, 21, 34,...}[1]. Any positive integer can be obtained by adding any combination of numbers in this sequence; for example, 17 = 13 + 3 + 1 and 33 = 21 + 8 + 3 + 1. A sequence is considered to be weakly complete if every integer greater than a given natural number (called the threshold of completeness) can be written as a sum of distinct terms of the sequence[2]. In this paper, the term number will henceforth mean any natural number.

Phase I: Investigating {1, n,...} Sequences

My research with Dr. Knapp this summer focused on finding ways to create a weakly complete sequence beginning with a given set of natural numbers in ascending order, using the last initial term as the threshold of completeness and having as few terms as possible. If we are given the initial terms of the sequence, we can obtain the next term by examining every number greater than the threshold of completeness in ascending order, and searching for the least number that cannot be formed by adding any distinct previous terms of the sequence.

To illustrate how a weakly complete sequence is formed, consider the sequence with initial terms 1 and 5 (this was the first sequence which Dr. Knapp and I investigated). We know that 6 is not in the sequence because 6 = 1 + 5, but 7 cannot be formed by adding any distinct combination of 1 and 5, so 7 is the next term in the sequence, which is now S = {1, 5, 7}. 8 = 7 + 1, so 8 is not in the sequence, but 9 cannot be formed from any combination of 1, 5, or 7, so we add 9 to the sequence to get S = {1, 5, 7, 9}. 10 = 11 + 1, so 10 is not in the sequence, but 11 cannot be formed by adding any set of distinct elements in S, so 11 must be the next term in the sequence, and we have S = {1, 5, 7, 9, 11}. 12 = 11 + 1 = 7 + 5, so 12 is not in the sequence, and neither is 13 (because 13 = 7 + 5). Continuing on in this fashion, we now have the sequence

S = {1, 5, 7, 9, 11, 29, 31, 89, 91, 269, 271, 809, 811,...}

Note that this sequence consists of pairs of numbers that are two apart from each other. Starting with 9 and 11, if one examines the number between each number in one of these pairs (10, 30, 90, 270, 810, etc.), one notices that each of these numbers is three times the one before it. Therefore, each element of the sequence is equal to 10 * 3i ± 1 for any natural number i.

The first phase of our research was to investigate other sequences beginning with 1 and another number n and determine if there were similar patterns in the terms of these sequences. The sequences beginning with 1 and n for each n from 3 to 10 are shown in Figure 1. The terms of sequence for beginning with 1 and 6 appear to be 8 followed by 11 * 3i ± 1 for i = 0, 1, 2,... The sequence beginning with 1 and 7 has 9 as its next term, but the next terms are sets of three numbers that are 2 apart from each other. Notice that the average of the terms in each set (i.e., the number around which set is centered) is 4 times the average of the terms of the previous set. Each of these sets is equal to 13 * 4i plus or minus 2 or 0. Using mathematical notation, we can write this sequence as

, where .

In each subsequent sequence that I investigated, I grouped the terms into sets of similar numbers, found the number about which each of these set was centered and the range of the terms in each set, and looked for patterns in these numbers. Through investigation, I found that the sequence beginning with 1 and 8 is

The sequence beginning with 1 and 9 is

The sequence beginning with 1 and 10 is

Using as a guide a proof of the conjecture for the {1, 5,...} sequence which Dr. Knapp had already written beforehand[3], I then wrote proofs of the conjectures for the {1, 6,...}, {1, 4,...}, {1, 7,...}, and {1, 9,...}sequences, given in figures in Figures 2 through 5.

Our next goal was to find a formula for the general formula for a sequence beginning with {1, n,...} for any value of n. Finding the general formula consisted basically of looking at each formula for a specific value of n and determining a pattern in each aspect of the formula. I arrived at the conclusion that for the sequence beginning with {1, n,...} for any n ≥ 3, the sequence is

, where ,

and si,j = axi – b + 2j, where , a = n + x + 2, and b = x – 2.

One we had found the formula, our next step was to prove it. In writing this proof, I used the previous proofs that I had written as a guide. The proof for the general case followed the same line of reasoning as the previous proofs, except that some statements of the general proof required more rigorous proof than their counterparts in the proofs for the more specific examples, which one could verify by simple calculation. The actual proof for the general case is given in Figure 6; the steps of the proof are outlined below:

Starting with 1 and n, prove that n + 2 and the elements of S0 are the next terms of the sequence. Then we know that we can represent every number from n to s0, b as a sum of distinct previous terms of the sequence.
Let, and assume that the sequence begins with Pk for some k ≥ 0. We then need to show that the next terms of the sequence are the elements of Sk+1.
First, we need to show that every number less than sk+1, 0 can be represented as a sum of distinct terms of the sequence. Define the set R as the set of all numbers represented as a sum of one or more elements of. Then by definition of the sequence, we know that R contains 1 and every number from n to axk – b – 1 inclusive.
Let Cm be the set whose elements are the all the possible sums of terms of Sk. We can represent every number of the same parity (even or odd) from the minimum value of Cm to the maximum value of Cm. By adding 1 to each element of Cm, we can represent every number from the minimum value of Cm plus 1 to the maximum value of Cm plus 1. By combining these two sets, we have representations for every number from the minimum of Cm to the maximum value of Cm plus 1, and we will call this interval Am.
We can also add each element cj of Cm to each of the remaining elements of R, so we now have a representation for every number from cj + n to cj + axk – b – 1. Once we show that the intervals of representation for consecutive values of j intersect or are consecutive, we know that we can represent every number from the smallest value of cj plus n to the largest value cj plus axk – b – 1. (Note: To prove continuous representation between two intervals of integers, we show that the difference when we subtract the smallest element of one interval from the greatest element of the other interval, the difference is non-negative (meaning the two intervals intersect) or -1 (meaning the two intervals are disjoint but consecutive).
Find the intervals of representation for A1 and B1, Ab and Bb, and Ab+1 and Bb+1 (note that there are b + 1 terms in each Si). We can find representations for any number between these intervals by adding one or more elements of Sk to either the sum of all of the elements of or the sum of all of the elements of except 1.
If n ≥ 9, then b ≥ 3, so there exists at least one integer between 1 and b – 1, so we need to show that there is continuous representation from Am to Bm for all values of m such that 2 ≤ m ≤ b – 1.
Prove that there is continuous representation fromto for all values of m such that 0 ≤ m < b. Then we have proven that we can represent every integer from n to sk+1, 0.
We now need to show that the elements of Sk+1 cannot be represented as sums of previous terms of the sequence. Note that because 1 is in the sequence, once we show that an element of Sk+1 is in the sequence, we will know that that element plus 1 is not in the sequence. We first show that any representation of an element of Sk+1 cannot be represented using previous elements of Sk+1.
Since each element of Sk+1 is larger than the sum of all the elements of , we know that any representation of an element of Sk+1 must use at least one element of Sk. We then show that unless m = b + 1 (i.e., the number of terms in Sk), the difference between any element of Sk+1 and any element of Cm is too large to be represented using only the elements of .
We then show that if we subtract all of the elements of Sk+1, the difference is an element of Sk, which by definition of the sequence cannot be represented as a sum of elements of . Therefore, the elements of Sk+1 cannot be represented as sums of previous terms of S, so these elements must be the next terms of S, and thus we have shown by induction that the sequence follows the formula stated in the conjecture. QED.

Phase II: Developing a Program

The next phase of our research was to find formulas for weakly complete sequences that began with any set of initial terms, such as {1, 2, n,...}, {2, n,...}, {n, n + 1}, and {n, 2n}. However, calculating these sequences by hand would be quite tedious. Therefore, we needed a computer program that would generate sequences quickly and efficiently. To calculate the {1,n,...} sequences (see Figure 1), Dr. Knapp had used a program that he had created using the Maple software package, which he had to modify manually every time he wanted to test a different sequence. Using the programming skills that I had learned in my freshman computer science class, I created a Java version of Dr. Knapp’s program, and modified it so that it could calculate any {1, n,...} sequence. This was version 1 of the program, and the Java code for it is given in Figure 7. Version 2 of the program calculates the sequence beginning with any set of initial terms, and the code for its most up-to-date incarnation, version 2.2, is given in Figure 8.

Unfortunately, version 2 had several serious flaws in the design. First, the user had to indicate at the beginning of the program’s execution how many initial terms he/she was entering. Secondly, the initial terms had to be entered in ascending order for the program to run properly. Finally, every time the user wanted to change even one initial term in the sequence, he would have to execute the program and re-enter every initial term all over again. I concluded that the best way to resolve these issues would be to create a version of the program that used a graphical user interface. The only problem was that I did not yet know how to create graphics using Java. I solved this problem by teaching myself using the Java All-in-One Desk Reference for Dummies by Doug Lowe and Barry Burd (Wiley, 2007), and within a few weeks was able to create version 3, the graphical version of the Sequence Program. The code for version 3.12 is given in Figure 9, and a screen shot of the program is shown in Figure 10. A description of the features of the program follows:

• Initial Terms List: Shows the numbers that the user has defined to be at the beginning of the sequence.

• Adding a Term: The user enters the term to be added to the list of initial terms in the Next Term box and clicks the Add Term button. The program is set up so that the user need not enter the initial terms in ascending order.

• Changing an Initial Term: The user selects the term in the Initial Terms list that he wants to change, enters the new value of the selected term into the Current Term box, and clicks the Update Term button. The program automatically places the new term in its appropriate place in the list.

• Removing Terms from the List: The user selects the term that he wants to remove from the list of initial terms and clicks the Remove Term button to remove the term from the list. Additionally, the user can remove all of the terms from the list by clicking the Clear All Initial Terms button.

• Calculation Range: The program only displays the terms of the sequence that are less than the number in this box.

• Display Numbers with Multiple Representations: The user can toggle this option on or off. This indicates if any number in the range of calculation can be written as a sum of distinct previous terms of the sequence in more than one way. The user can set minimum number of representations that a number must have to be displayed.

• Calculate Sequence: Calculates the sequence beginning with the numbers in the Initial Terms list, and displays them in the text frame below. Each sequence that the user generates is added to the text box after the previously calculated sequences. There are also buttons available to clear the output display and to print the outputs directly from the program.

Phase III: Investigating Other Weakly Complete Sequences

Now that I had a program to efficiently generate series of sequences, I could now investigate different types of sequences. The first set of sequences that I looked at were the sequences beginning with {1, 2, n,...}. These sequences are given in Figure 11. Using the same investigation techniques that I had used earlier, I found that for n ≥ 5, the sequence is

S = {1, 2, n, n + 4} ,

where Si = , and si,j = axi – 2 (x – 2) + 4j, where x = , and a = n + 2x + 4.

Though I have not yet written a proof of this conjecture, the techniques for proving it should be similar to those used to prove the {1, n,...} formula.

Next, I generated sequences beginning with {2, n,...}, and these are shown in Figure 12. Notice that while each of these sequences still has groups of similar numbers, there is an unusual pattern in each of these subsets – some numbers are 1 apart from each other, some are 3 apart, and if n is congruent to 3 modulo 4, there are some numbers that are 4 apart from each other. If we calculate the averages of the terms in each subset of a sequence, we find that these averages form a geometric sequence; i.e., there is a constant ratio of the average of one subset to the average of the previous subset. The sequences seem to have the following pattern:

, where , and si,,j = axi – b + j

However, I was unable to determine formulae for a, x, or b.

I then decided to investigate the {n, n + 1,...} sequences. These are shown in Figure 13. Once again, I compared the averages of each subset of a sequence and found that they formed a geometric sequence in which the ratio was equal to n. Using quadratic curve-fitting with the aid of the MATLAB software package, I was able to determine a value for a in terms of n. The formula that I found was: S = {n, n + 1}, where, and, whereand . The proof is given in Figure 16.

Using similar techniques, I found the formulae for the following sequences:

 Start with n, n + 2: S =, where, and, where. This works for n ≥ 2.

 Start with n, n + 3: S =, where, and, where. This works for n ≥ 4.

 Start with n, n + 4: S =, where, and, where. This works for n ≥ 5.

 Start with n, n + 4: S =, where, and, where. This works for n ≥ 6.

The sequences themselves are shown in Figures 15 through 18. Notice that these formulae seem to hold only when n > m. When n < m or n = m, these formulae are no longer valid, and the patterns when n < m and when n = m are different as well. With this in mind, I decided to categorize all sequences that begin with n and n + m in terms of the value of n compared to m. The three categories are alpha sequences (denoted by ), in which n > m; beta sequences (denoted by ), in which n = m; and gamma sequences (denoted by ), in which n < m. After analyzing the patterns in formulae that I had obtained for specific alpha sequences, I determined that the general formula for an alpha sequence is:

where, and,

where, and

I also investigated the set of beta sequences (i.e., the sequences that start with {n, 2n}), which are shown in Figure 19, and determined that the formula for a beta sequence is:

where, and, where, and

The final week of my research this summer was spent in writing the proof for this formula, which is given in Figure 20.

Conclusions and Future Research

Throughout my investigation of weakly complete sequences, I found that the formula for a particular sequence tends to follow a specific paradigm. A sequence beginning with {n, n + m} starts with the initial terms, followed by an infinite series of subsets containing similar numbers. Between the initial terms and the infinite series, there may exist other terms that do not belong to a particular subset, and the values of these terms depend on n and m. Each subset Si of the sequence is an indexed set of terms si, j. The range of Si and the terms that are omitted from it are dependent upon the values of n and m. The formula si, j = axi – b + j seems to be consistent for all sequences, but the values of a, x, and b are all determined by the values of n and m.

We have found and proven a formula for sequences beginning with {1, n,...}, and our categorization of sequences with two given initial terms into alpha, beta, and gamma sequences should set the tone for future research on weakly complete sequences. We have found and proven a formula for all beta sequences, and we have a conjecture for the general formula for all alpha sequences that needs only to be proven. However, though we did investigate some gamma sequences and conjectured on parts of the formulae for some, the patterns in the formulae for gamma sequences have so far eluded us. The subsets of these sequences increase in size as both n and m increase, which makes many calculations involving these sequences rather tedious. However, anyone with access to powerful computing machinery might be able to process these sequences more efficiently and perhaps determine what the pattern is, if there even is one.