Project Euler 40: Finding the nth digit of the fractional part of an irrational number
Project Euler 40 Problem Description
Project Euler 40: An irrational decimal fraction is created by concatenating the positive integers:
0.123456789101112131415161718192021…
It can be seen that the 12th digit of the fractional part is 1.
If dn represents the nth digit of the fractional part, find the value of the following expression.
d1 × d10 × d100 × d1000 × d10000 × d100000 × d1000000
Analysis
A simple solution is to create a string one million digits long by concatenating the counting numbers starting from 1 to 185,185 (known as Champernowne's constant). Then multiply together the single digits at the positions specified in the problem statememt. We know from the problem description that the first two terms d1 & d10 will evaluate to 1 and, therefore, have no effect on the product. Here’s my early perl implementation of this brute force method:
$s = '.'; $n = 1; $s.= $n++ while(length($s) <= 1_000_000); print "Answer to PE40 = ", substr($s,100,1)*substr($s,1e3,1)*substr($s,1e4,1)*substr($s,1e5,1)*substr($s,1e6,1);
Of course this will not handle the more assailing parameters of the HackerRank Project Euler 40 version with 7 positions in a 1018 digit number run with 100,000 consecutive trials. A better approach must be devised.
You can think of Champernowne's constant, hereinafter known simply as "constant," as a concatenation of many series of concatenated counting numbers. Each series represents an order of magnitude or, more succinctly, a power of 10 as follows:
Series (k) | Range | Number of terms | Number of characters in series | Number of characters total |
---|---|---|---|---|
1 | 1-9 | 9 | 9 | 9 |
2 | 10-99 | 90 | 180 | 189 |
3 | 100-999 | 900 | 2700 | 2889 |
4 | 1000-9999 | 9000 | 36000 | 38889 |
5 | 10000-99999 | 90000 | 450000 | 488889 |
We can make a few observations:
- The size of each term in the series is the series number, k. That is, series 4 is composed of 4-digit numbers.
- The range for each series, k, is 10k-1 to 10k-1 and has 9·10k-1 terms.
- The number of digits in each series is k * number of terms. So, series 4 is 4 * 9000 = 36000 digits
Making single digits in the constant accessible
We can reimagine this series of concatenated numbers as an array with 3 indexes. One for the series, one for the term in the series, and one for the decimal position inside the term. All of these indexes will be derived from a single index that describes the digit inside the constant. This is made possible by the organized construction of the constant.
A simple example
Let's say we want the 27th digit inside the constant. We would decode 27 into the 3 indexes as follows:
Series 1: 1,2,3,4,5,6,7,8,9
Series 2: 10,11,12,13,14,15,16,17,18,19,20,21,22,…,99
- All indexes, except the series, start with a base of zero. The 18th term is index 17; always one less.
- We would first determine the 27th digit is in the 2nd series as 9 < 27 ≤ 9+180.
- Since all terms in the 2nd series are 2-digits we find our term index 27-9 = 18th term in the constant and (18-1)/2 = 8 (remainder 1) as the series index or 9th in the series.
- Adding 10 (the start of the 2nd series) to 8 gives us our value as 18.
- The remainder is our last index and, again, starting from 0, points to the digit in the term. In this example the 2nd (index 1) character of 18 is the number 8.
The 27th digit in the constant is 8.
A more robust example
Let's take the 37371st digit in the constant as an example. We need first to find the series the digit resides:
#bc is the character length of each series: 9, 180, 2700, ..., 9·x·10^(x-1) sc = [pow(10, x-1) * 9*x for x in xrange(1, 50)] n = 37371 i = 0 while n>sc[i]: n-= sc[i]; i+= 1
At the end of this loop we find our digit, 37371, is in the 4th series which follows our understanding: 9+180+2700 < 37371 ≤ 9+180+2700+36000. Since all the terms in the 4th series are 4-digits we take the residual value of n, the 34482nd term in the constant or (34482-1)/4 = 8620 (remainder 1) in the series. The index is (34482-1) because our series starts with an index base of 0.
We add the starting range, 1,000, for series 4 and 9620 is the term's value. Since the remainder is 1, and we base our substring selection starting from 0, we take the second digit of this number for our answer.
The 37371st digit in the constant is 6.
Project Euler 40 Solution
Runs < 0.001 seconds in Python 2.7.Use this link to get the Project Euler 40 Solution Python 2.7 source.
Afterthoughts
- Reference: The On-Line Encyclopedia of Integer Sequences (OEIS) A033307: Decimal expansion of Champernowne constant (or Mahler's number), formed by concatenating the positive integers.
- Mathematica:
Times @@ Flatten[IntegerDigits@Range@2*^5][[10^Range@6]]
Generalizing the problem in this way won’t work so well. Better: there are 9 one-digit numbers, 90 two-digit numbers, etc. So writing all the numbers of N or fewer digits results in a 9*1 + 90*2 + 900*3 + … + (10^N – 10^(N-1))*N digit string; that sums to N*10^N – (10^N-1)/9 . So for example the billionth digit d_i comes after all the 788888889 digits obtained from all the 1, 2, …, 8 digit numbers. Taking the next 211111111 digits in sets of 9 takes 23456790 sets before starting the next set; those sets would be the numbers 100000000 through 123456789. Thus the billionth digit occurs as we write the first digit of the number 123456790, i.e. it’s “1”. Obviously this method can be generalized much faster than a program could compute e.g. the quadrillionth digit d_i !
The initial brute force perl solution has no scale beyond the problem’s parameters so your solution would work well for bigger placements.
Thanks.