Technical Difficulties from on Top of the Mountain
2020-11-29
 
Quotes for the day:

However beautiful the strategy, you should occasionally look at the results.
Americans can always be trusted to do the right thing, once all other possibilities have been exhausted.

Both of these are attributed to Churchill, but as with everything on the internet, YMMV.

 
2020-11-14
  It was the mean() of times, it was the rms() of times.
I continue to poke and prod at Julia, learning more of both the good parts and the bad parts. Julia is very powerful, but it is extremely architecturally immature. Kind of like cars built by Tesla: you get some innovations, but also rookie mistakes and poor quality in things like paint, where there's really no excuse. One of the downsides of this is that you really don't want to install Julia on Windows, because installing modules and compiling will be slower than mud. Why? Files. Lots and lots of files. Hundreds of thousands of files once you install just a few packages like Plots and ODE. If you want to use it on Windows, then hopefully you're on Windows 10 and can install WSL 2 (Windows Subsystem for Linux), and then install Julia on a native Linux filesystem. If not, Windows Defender is going to want to scan every one of those 400+ thousand files, and you might as well take the afternoon off and let it do its thing.

But that's not what this is about. This is about performance.

Once you get all the bad module and dependency stuff behind you, there's some interesting speed available for getting work done. Even with this small experiment, the data answers a few questions and raises some more. But let's set up the problem first.

In Jane Herriman's video (mentioned in the last post), she pretends to be introducing us to Julia, but instead leads us on a whirlwind tour through all kinds of interesting work, including benchmarking some functions: built-in, hand-written, and external C libraries. Because a few bits were clipped off the edge of her screen shots, I tried to re-create her work from that section, but with my own twist. She was benchmarking mean(), the average; I decided to do rms(), the root mean square. This is probably because I'm both a programming nerd and a power electronics nerd, and rms is one of those useful things you do from time to time to figure out how much energy a changing voltage actually delivers. The name basically describes the operations involved, but the TeX version would be:
$\sqrt{ \frac{1}{N} \sum_{i=1}^N x_i^2}$
(I looked into putting a pretty figure here, but Chrome doesn't support most of MathML yet, and I didn't figure there would be that many people using Firefox circa 2020.)
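A quick sanity check of the definition, and my own aside rather than anything from the video: for a sine wave the rms comes out to the peak divided by √2, which is why US mains that peaks around 170 V reads as 120 V rms on a meter. Easy to confirm numerically:

t = range(0, 2pi, length=10_001)         # one full cycle, densely sampled
sqrt( sum(sin.(t) .^ 2) / length(t) )    # ≈ 0.7071, i.e. 1/sqrt(2)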

Julia doesn't have a built-in for rms(), but the forums have a number of suggestions:


sqrt(mean(A .^ 2.))               # squares the whole array into a temporary
sqrt(sum(x->x^2, A)/length(A))    # squares each element as sum() consumes it
norm(A)/sqrt(length(A))           # norm() is the root of the sum of squares
The first version is kind of slow (3.5 times slower than the baseline). The second one saves generating a temporary copy of the entire array by squaring each term one by one as it's consumed by sum(); it comes in only four percent slower than the baseline. The third one I expected great things from, since norm() is basically the root of the sum of the squares, but it was actually slower than the second version, at thirteen percent over the baseline. So what was the baseline? Well, I wrote it out:

function rms(A)
  s = zero(eltype(A))   # start the accumulator in the element type so the loop stays type-stable
  @simd for e in A
    s += e * e
  end
  sqrt( s / length(A) )
end
That little bit of magic dust before the for is there to promise the compiler that no iteration depends on any other, so it can vectorize to its heart's content. I actually wrote this in C as well, using two different styles (traditional and performant), and while there was a small variation in the timings (one was a smidge faster, and the other a sliver slower), the Julia version and the clang-compiled C versions were practically identical. The gcc version didn't do as well. But the whole ordeal with compile flags and such is a story for another time.
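If you want to reproduce those percentages, here's a minimal sketch using BenchmarkTools (the same package used in the post below); the exact ratios will vary with machine, compiler, and Julia version:

using BenchmarkTools, Statistics, LinearAlgebra

A = 2 * rand(10^7)
t0 = @belapsed rms($A)                               # the hand-written baseline above
t1 = @belapsed sqrt(mean($A .^ 2.))                  # temporary-array version
t2 = @belapsed sqrt(sum(x->x^2, $A) / length($A))    # one-pass version
t3 = @belapsed norm($A) / sqrt(length($A))           # norm() version
( t1/t0, t2/t0, t3/t0 )                              # slowdown factors vs the baseline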

Reference: benchmark_rms.jl Pluto notebook.


 
2020-11-06
  Julia ups and downs.
Still struggling with Julia, which is both a criticism and how it should be. A criticism because certain things don't make sense or work poorly, and that's an area where the language could improve. And how it should be, because learning a sufficiently interesting language requires understanding new idioms, different syntax, and large libraries of expertise.

Since I'm doing this all on a whim, I'm still mostly using Steve Brunton's classes as the exercises. I've downshifted to the Beginning Scientific Computing series, which is mostly review, so I can zip through the lectures faster. Unfortunately the videos are not at all organized, and it's somewhat of a puzzle to work out the order. I'm doing my best to document what I think is the progression in the comments of each of my Julia files here:

As I've been going along, one thing that has reared its head is the mismatch between the form of data plot() likes for time series and the form the linear algebra solvers produce. The best I've come up with to pull one row of data out of a vector of vectors is [e[1] for e in vu], but I fear this makes a copy of the data, which makes me sad.
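For the record, here are a couple of alternatives I've run across, written against a made-up vu so the snippet stands alone: broadcasting getindex does the same job as the comprehension, and reduce(hcat, ...) converts the whole vector of vectors into a matrix, which is closer to what plot() wants. All of them copy, though; with this data layout the copy seems unavoidable.

using Plots

vu = [ rand(3) for _ in 1:100 ]    # stand-in for solver output: 100 state vectors of length 3
row1  = [ e[1] for e in vu ]       # comprehension (copies)
row1b = getindex.(vu, 1)           # broadcast form, same result (also copies)
U = reduce(hcat, vu)               # 3x100 Matrix; reduce(hcat, ...) has a fast specialization
plot( U' )                         # each state component becomes its own series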


To brush up on my basics, I found this interesting introduction, which wasn't so much an introduction as a tour through a lot of interesting topics like benchmarking, multiple plots in one pane, inline C code, and even SIMD. It's also just fun watching someone get excited by the ternary operator.

using BenchmarkTools
using Random
using Statistics
using LinearAlgebra
using Plots

A = 2 * rand(10^7)   # ten million uniform samples on [0, 2)
T_bench= @benchmark sqrt(mean($A .^ 2.))                   # temporary-array version
T_bench2= @benchmark sqrt(sum(x->x*x, $A)/ length( $A ))   # one-pass version
T_bench3= @benchmark norm($A) / sqrt(length($A))           # norm() version
histogram( T_bench.times )   # distribution of the sample times, in nanoseconds

Inlining C code, though the video cut the right edge off and I had to guess at what was missing,

using Libdl
C_code = """
#include <stddef.h>
#include <math.h>     
double c_rms(size_t n, double * X) {   /* pointer-walking style */
  double s= 0.0 ;
  for ( size_t i= n ; ( i -- ) ; X ++ ) { s += ( *X * *X ) ; }
  return sqrt( s / n ) ;
}
double c_rmse(size_t n, double * X) {  /* plain indexed style */
  double s= 0.0 ;
  for ( size_t i= 0 ; ( i < n ) ; i ++ ) { s += X[i] * X[i] ; }
  return sqrt( s / n ) ;
}
"""
const Clib = tempname()   # path for the temporary shared library
open( `gcc -fPIC -O3 -msse3 -xc -shared -ffast-math -o $(Clib * "." * Libdl.dlext) -`, "w" ) do f
  print(f, C_code)        # pipe the C source to gcc on stdin
end
c_rms( X::Array{Float64}) = ccall((:c_rms, Clib), Float64, (Csize_t, Ptr{Float64},), length(X), X )
c_rmse( X::Array{Float64}) = ccall((:c_rmse, Clib), Float64, (Csize_t, Ptr{Float64},), length(X), X )
c_rms( A )
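With the wrappers defined, the C versions can be benchmarked just like the Julia ones; this bit is my addition, mirroring what the video does for its sum() example:

T_c_bench = @benchmark c_rms($A)    # interpolating $A keeps global lookup out of the timing
T_c_bench2 = @benchmark c_rmse($A)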

And finally, some SIMD parallelism in Julia,

function rms(A)
  s = zero(eltype(A))  # generic version: accumulate in the array's element type
  @simd for e in A
    s += e * e
  end
  sqrt( s / length(A) )
end
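One caveat before comparing any of these (my aside, not from the video): @simd and -ffast-math both give the compiler license to reorder the floating-point sum, so the variants only agree up to rounding, which is what ≈ (isapprox) is for:

rms(A) ≈ sqrt(mean(A .^ 2.))   # true: same quantity, possibly different rounding
rms(A) ≈ c_rms(A)              # true
rms(A) ≈ c_rmse(A)             # true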

To try these Pluto notebooks out without having to have Julia running locally, there's a Binder transform here, but I think I may eventually set up a Pluto instance on my server.


 