Faffing round the edges Its easier than getting stuck in

28Nov/080

Problem 1 again

So I read a comment on reddit about how bad my maths was (and how badly I abuse Erlang too). I started reading Algorithms in a Nutshell as I figured I might have enough maths now to understand it. Well I don't really but I did understand the Big O notation stuff at the start enough to understand the comment on reddit:

Ah, what a misuse of Erlang, especially considering that all the 3 programs presented are all O(n) far from the optimal O(1). The guy should learn some math rather than experimenting with 'creating a process for every number'. His next post about problem 2 is not much better (he does 'find the sum of fibs below 4mln' by evaluating fib(n) from scratch until it becomes bigger than 4mln), but at least there he confesses: 'I am really let down by my lack of maths at school.'

So I read the PDF again from Project Euler solution (which I earned the right to read by misusing myself to a solution).

I think I understand the optimal solution well enough, and I was certainly able to implement it:

problem1(Target) ->
	sumDivBy(3, Target) + sumDivBy(5, Target) - sumDivBy(15, Target).

sumDivBy(N, Target) ->
	P = Target div N,
	N * (P * (P+1)) div 2.

And that is much better.

Now I'm not tooooo cut up by the comments (not ready yet). I didn't get any maths education beyond 16, and what I got up to then was rubbish but at least I am trying to rectify it. My parallel processing attempt was educational for me. I can never do anything without a practical motivation.

The problem is: I understand how the above solution works and I can see that it is optimal but without being taught it I doubt I would have got anywhere near it on my own.

Tagged as: , No Comments
24Nov/080

Problem 9: Pythagorean triplets.

Project Euler problem 9 is back in proper maths mode with this beauty:

A Pythagorean triplet is a set of three natural numbers, a b c, for which,
a2 + b2 = c2
For example, 32 + 42 = 9 + 16 = 25 = 52.

There exists exactly one Pythagorean triplet for which a + b + c = 1000.
Find the product abc.

Wow. Numbers are strange. Their relationships inscrutable and powerful. Euclid sorted this one out some time ago so I just stood on his giant shoulders (well, Wikipedia's actually) and had this for generating triplets:

generate_triple(M, N) ->
	A = 2 * (M*N),
	B = (M*M) - (N*N),
	C = (M*M) + (N*N),
	{A, B, C}.

Now, my friend Nick at work managed to express C in terms of A and B and save himself the trouble of the above but he is smarter and better educated than me (though I doubt he can do a short 21 for a 10 mile TT so I can comfort myself there).

The rest of the work is more brute force but Erlang does the heavy lifting for me:

prob9() ->
	L = [generate_triple(M, N) || N <- lists:seq(1,100), M <- lists:seq(2, 100), M > N],
	[{A,B,C}] = lists:filter(fun({A,B,C}) -> A+B+C =:= 1000 end, L),
	A*B*C.

I love the second line. The power of pattern matching is clear and it creates a readable, uncluttered way to get at the values of A, B and C. The question "...There exists exactly one..." assures me that [{A,B,C}] will match.

Again, Nick's Java effort is more optimal (he doesn't do the 2 loops (he needs a blog I guess)).

24Nov/080

Euler 8: Text handling is not a strong suit

Problem 8 in Project Euler is not so much a maths problem as a text manipulation one (isn't it?):

Find the greatest product of five consecutive digits in the 1000-digit number.

73167176531330624919225119674426574742355349194934
96983520312774506326239578318016984801869478851843
85861560789112949495459501737958331952853208805511
12540698747158523863050715693290963295227443043557
66896648950445244523161731856403098711121722383113
62229893423380308135336276614282806444486645238749
30358907296290491560440772390713810515859307960866
70172427121883998797908792274921901699720888093776
65727333001053367881220235421809751254540594752243
52584907711670556013604839586446706324415722155397
53697817977846174064955149290862569321978468622482
83972241375657056057490261407972968652414535100474
82166370484403199890008895243450658541227588666881
16427171479924442928230863465674813919123162824586
17866458359124566529476545682848912883142607690042
24219022671055626321111109370544217506941658960408
07198403850962455444362981230987879927244284909188
84580156166097919133875499200524063689912560717606
05886116467109405077541002256983155200055935729725
71636269561882670428252483600823257530420752963450

Again it just makes sense to me to brute force it. That is try every sequence of 5 digits numbers and chose the highest product. To whit:

prob8() ->
	BigNum = "7316717653133062491922511967442657474235534....(elided)",
	calculate_prod(5, BigNum, 1, []).

calculate_prod(Bite, BigNum, Start, Products) when Start > (length(BigNum) - Bite)
	-> lists:max(Products);
calculate_prod(Bite, BigNum, Start, Products) ->
	Prod = lists:foldl(fun(X, Acc) -> Acc * list_to_integer([X]) end, 1, lists:sublist(BigNum, Start, Bite)),
	calculate_prod(Bite, BigNum, Start+1,[Prod|Products]).

I like thinking in Erlang. It is nice to know that the number is fine as a number in Elrang so that there is no need for hacks or special classes (Java's BigDecimal) to manipulate these big numbers.

I like recursion, I like "overloading" functions with pattern matching and guards. I know there is a lot of hype around Erlang right now but it is an ideal balance, in my mind, of functional programming and practicality. It is, after all, a language of industry not research and it shows.

24Nov/080

Euler 7: No sieve ‘cos I can’t see how.

Project Euler's Problem 7 is back to the meaty business of generating primes. Now I know that lots of languages can do this for you but where is the fun of that? Likewise I could probably look up an algorithm on Wikipedia but only after I have an answer.

Answer three gave me a prime checker that I just reused here. I need to find a faster way to do this later but for now it seems adequate.

I tidied my prime checker to make it easier (for me) to read:

%%
%test for primality
%%
is_prime2(N) when N =:= 2 -> true;
is_prime2(N) when N rem 2 =:= 0 -> false;
is_prime2(N) ->
	Limit = round(math:sqrt(N) + 1),
	is_prime_tester(N, Limit, 3).

is_prime_tester(N, Limit, CandidateFactor) when CandidateFactor < Limit ->
	case N rem CandidateFactor =:= 0 of
		true -> false;
		false ->
			is_prime_tester(N, Limit, CandidateFactor+2)
	end;
is_prime_tester(_N, _Limit, _CandidateFactor) ->
	true.

With that to reuse the rest of problem 7 is just:

prob7(N) ->
	gen_primes(N, [2], 3).

gen_primes(N, Primes, _Candidate) when N =:= length(Primes) -> Primes;
gen_primes(N, Primes, Candidate) ->
	case is_prime2(Candidate) of
		true ->
			gen_primes(N, [Candidate|Primes], Candidate+2);
		false ->
			gen_primes(N, Primes, Candidate+2)
		end.

A colleague at work has a faster Java version (and he is my new hero) but I love hwo quick (and easy) it is for me to knock up an Erlang one. It is such a readable, expressive language.

24Nov/080

Euler 6: At last I find it easy!

Project Euler problem 6:

Find the difference between the sum of the squares of the first one hundred natural numbers and the square of the sum.

Ok...this makes sense to me. No special, magic formulae in a foreign script, no head twisting numbers that remain inscrutable even with 1000s of digits (primes, WTF?) Just something that lends itself to learning the functional programming paradigm of programming (I said paradigm, in a blog, check me).

Enough. My solution (for what it is worth):

prob6(N) ->
	L = lists:seq(1, N),
	square_of_sum(L) - sum_of_squares(L).

square_of_sum(L) ->
	Sum = lists:sum(L),
	Sum * Sum.

sum_of_squares(L) ->
	lists:foldl(fun(X, Acc) -> Acc + (X*X) end, 0, L).

I am sure there is a better way and if I spot it I will post it but I'm pretty pleased with that for a beginner in Erlang and maths.

24Nov/080

Euler 5. Cop out

Probelm 5 from Project Euler didn't float my boat.

2520 is the smallest number that can be divided by each of the numbers from 1 to 10 without any remainder.

What is the smallest number that is evenly divisible by all of the numbers from 1 to 20?

I just fiddled with my calculator for a bit and this is what I ended up with:


prob5() ->
	N = (2 * 3 * 4 * 5 * 6 * 7 * 11 * 13 * 17 * 19),
	Factors = euler_too:factorise( N ),
	io:format("~p has factors ~p~n", [N, Factors]).

So...I learned nothing...Or I learned about the Lowest Common Multiple and Prime Factors but I have yet had the time to go back and create a function that can calculate the lcm for any set of numbers. The Euler solution pdf that one gains access to with a correct answer is going to help me with that I am sure.

Like all software, one you have a working solution it is very hard to go back and make it better since you move ever forward so maybe I should take the time to get it right next time. I'm doing it for fun after all.

24Nov/080

Euler 4: The long and the short

I have been quiet as I have been very busy at work and my garage got broken into and my bikes stolen but I have also done the Euler's up to 10 now. Here is two versions of problem 4:

A palindromic number reads the same both ways. The largest palindrome made from the product of two 2-digit numbers is 9009 = 91 99.

Find the largest palindrome made from the product of two 3-digit numbers.

Sounds simple? Well my first attempt I made a proper meal of it. Here is the long listing:

prob4() ->
	Candidates = lists:seq(998001, 1000, -1),
	C = lists:filter(fun(X) -> is_palindrome(X) end, Candidates),
	PalinFacs = lists:foldl(fun(X, Acc) -> [{X,factorise(X)}|Acc] end, [], C),
	highest_palindrome(lists:reverse(PalinFacs)).

highest_palindrome([{Palindrome, Factors}|T]) ->
	ThreeDigitFactors = lists:filter(fun(X) -> is_three_digit(X) end, Factors),
	TwoThreeDigitFactors = lists:filter(fun(X) -> is_three_digit(Palindrome div X) end, ThreeDigitFactors),
	case length(TwoThreeDigitFactors) =/= 0 of
		true ->
			{Palindrome, lists:zipwith(fun(X, Y) -> {Palindrome div X, Palindrome div Y} end, TwoThreeDigitFactors, lists:reverse(TwoThreeDigitFactors))};
		false ->
			highest_palindrome(T)
	end.

is_three_digit(N) when N > 99, N < 1000 -> true;
is_three_digit(_N) -> false.

is_palindrome(N) ->
	A = integer_to_list(N),
	B = lists:reverse(A),
	(list_to_integer(A) =:= list_to_integer(B)).

I mean what was I thinking? What is all that "TwoThreeDigitFactors" stuff? I must have been thinking about something 'cos there is code there. I even use the factorising function from Euler 3 which is handy. So it got the answer so it "works". And fast enough too (well under second). I was trying to avoid the brute force of just looping around 100-999 twice and then looking for palindromes.

The right answer gave me access to the forum where a Haskell solution led me to this c'n'p erlang version:

prob4_alt() ->
	lists:max([A*B || A <- lists:seq(100,999), B <- lists:seq(100,999), is_palindrome(A*B)]).

Which is quicker too. Proves the Knuth maxim:

Premature optimization is the root of all evil

Ah. You live and learn. And you learn best through experience, innit?!

8Nov/080

Euler Problem 3

Project Euler problem 3 was a real challenge for me. I don't have much maths education (though I am getting more all the time) so the factoring of a number to find it's prime factors meant learning a lot (from wikipedia mainly).

One thing I noticed in my code compared to a lot of the solution on Project Euler is that it is longer. It is also less specific. I have a Prime test and a factorization function that I use (and have re-used on euler 4 and 5).

Anyway the code.

problem_3(N) ->
	lists:max(lists:filter(fun(X) -> is_prime2(X) end, factorise(N))).

This gets the largest prime number from the list of factors for N (in the case of the problem 600851475143)

This function gets the factors for N:-

factorise(N) ->
	Limit = round(math:sqrt(N)+1),
	factorise(N, Limit, 1, []).

factorise(N, Limit, Limit, Factors) ->
	case length(Factors) =:= 0 of
		true -> [1,N];
		false ->
			lists:sort(lists:merge([N div A || A <- Factors], Factors))
	end;
factorise(N, Limit, Candidate, Factors) ->
	case N rem Candidate =:= 0 of
		true -> factorise(N, Limit, Candidate +1, [Candidate|Factors]);
		false -> factorise(N, Limit, Candidate +1, Factors)
	end.

I'm sure the mathematicians know a more efficient factoring algorithm and I hope to learn one from the solution pdf on euler when I have the time (after this weekends Bug-o-rama maybe).

This code is my prime test:-

is_prime2(N) ->
	case N =:= 2 of
		true -> true;
		false ->
			case N rem 2 =:= 0 of
				true -> false;
				false->
					Limit = round(math:sqrt(N) + 1),
					is_prime_tester(N, Limit, 3)
			end
		end.

is_prime_tester(N, Limit, CandidateFactor) when CandidateFactor < Limit ->
	case N rem CandidateFactor =:= 0 of
		true -> false;
		false ->
			is_prime_tester(N, Limit, CandidateFactor+2)
	end;
is_prime_tester(_N, _Limit, _CandidateFactor) ->
	true.

This is like the factoring algorithm really. It is recursive and tests if a number can be divided. It misses out all the even numbers as there is no point in testing against them after testing against 2. Although, to be fair I could just factor the number and if it only has two factors (itself and 1) then it is prime but this was a tad faster. I think I need to look at refactoring my code in the light of how short some of my colleagues code is (see Chris's python solution on his blog) although this erlang effort still performs well for numbers 3 orders of magnitude larger.

I really enjoyed this problem and i am sure there are more optimal ways to test for primality and to factor. This example on the interweb is blazingly fast for even biggish numbers.

1Nov/080

Project Euler problem 1 parallel processing

Ok, so I have managed problem 3 finally but before I write about that I thought I'd write about my parallel version of problem 1.

This was just an exercise for me to learn Elrang so I am sure it is sub-optimal in many ways. The bottleneck in my solution is the generation of the terms in the line:

lists:sum(lists:seq(0, N, 3)) +
    lists:sum(lists:seq(0,N, 5)) -
    lists:sum(lists:seq(0, N, 15)).

So I thought the best thing to do was parallelize the term generation.

gen_term() ->
	receive
		{From, {gen, Limit, Incr, Action}} ->
			From ! {self(), lists:sum(lists:seq(0, Limit, Incr)), Action}
	end.

The process dies as soon as it has done its job. All this process does is calculated a sequence from 0 to Limit in steps of Incr (EG the sequence from 0, 999 in steps of 3). It passes through the value of Action so that another process I use for collecting and acting on the terms can process the result. The Pid of the collecting process is passed as From.

This process:

prob_1_accumulater(Sum) ->
	receive
		{From, Amount, Action} ->
			case Action of
		 		add -> prob_1_accumulater(Sum + Amount);
				sub -> prob_1_accumulater(Sum - Amount)
			end
	end.

Is the process that performs the calculation on the terms provided by the gen_term process. When the process is started Sum is set to zero. Then depending on the value of Action the process restarts itself with the result of Sum [+/-] Amount. The atoms add and sub are the keys to the action.

The function that controls all this is:

prob_1_para(N) ->
	AccumPid = spawn(?MODULE, prob_1_accumulater,[0]),
	Pid3 = spawn(fun gen_term/0),
	Pid5 = spawn(fun gen_term/0),
	Pid15 = spawn(fun gen_term/0),
	Pid3 ! {AccumPid, {gen, N, 3, add}},
	Pid5 ! {AccumPid, {gen, N, 5, add}},
	Pid15 ! {AccumPid, {gen, N, 15, sub}}.

All that happens here is that a process is created for calculating the terms and a process is created for each term we wish to generate. The prob_1_accumulater process prints the final total when all the terms have been gathered.

I am sure that there is a more elegant solution to this but the code above is a revelation to a poor old Java developer. To create what amounts to a 3 server instances for generating terms and one server instance to collect them in so few lines of code is amazing. The code above does run faster when I start Erlang's beam in SMP mode:

erl -smp

Running the sequential solution for 99999999 takes about 27-30 seconds. Running the parallel function takes 18-20 seconds. Not a massive overall improvement but still good. It is great to see the monitors for both cpu cores spring to life. With 3 cpus I guess it would be faster still. Still very much enjoying Erlang.

   

 

November 2008
M T W T F S S
« Oct   Dec »
 12
3456789
10111213141516
17181920212223
24252627282930

Archives