Technical Difficulties from on Top of the Mountain
  It was the mean() of times, it was the rms() of times.
I continue to poke and prod at Julia, learning more of both the good parts and the bad parts. Julia is very powerful, but it is extremely architecturally immature. Kind of like cars built by Tesla: you get some innovations, but also rookie mistakes and poor quality in things like paint, where there's really no excuse. One of the downsides of this is that you really don't want to install Julia on Windows, because installing modules and compiling will be slower than mud. Why? Files. Lots and lots of files. Hundreds of thousands of files once you install just a few packages like Plots and ODE. If you're going to want to use it on Windows, then hopefully you're on Windows 10 and can install WSL 2 (Windows Subsystem for Linux), and then install Julia on a native Linux filesystem. If not, Windows Defender is going to want to scan every one of those 400+ thousand files, and you might as well take the afternoon off and let it do its thing.

But that's not what this is about. This is about performance.

Once you get all the bad module and dependency stuff behind you, there's some interesting speed available for getting work done. Even with this experiment, though, the data answers a few questions and raises some more. But let's set up the problem first.

In Jane Herriman's video (mentioned in the last post), she pretends to be introducing us to Julia, but instead leads us on a whirlwind tour through all kinds of interesting work, including benchmarking some functions: built-in, hand-written, and external C libraries. Because a few bits were clipped off the edge of her screenshots, I tried to re-create her work from that section, but with my own twist. She was benchmarking mean(), or taking the average. I decided to do rms(), or the root mean square. This is probably because I'm both a programming nerd and a power electronics nerd, and rms is one of those useful things you do from time to time to figure out how much delivered energy you're getting from a changing voltage. The name basically describes the operations involved, but the TeX version would be:
$\sqrt{ \frac{1}{N} \sum_{i=1}^N x_i^2}$
(I looked into putting a pretty figure here, but Chrome doesn't support most of MathML yet, and I didn't figure there would be that many people using Firefox circa 2020.)

Julia doesn't have a built-in for rms(), but the forums have a number of suggestions:

sqrt(mean(A .^ 2.))
sqrt(sum(x->x^2, A)/length(A))
norm(A) / sqrt(length(A))
The first version is kind of slow (3.5 times slower than the baseline). The second one saves generating a temporary copy of the entire array by squaring each term one by one as it's consumed by sum(); it comes in only four percent slower than the baseline. The third one I expected great things from, as the norm() operation is basically the root of the sum of the squares, but it was actually slower than the second version, at thirteen percent over the baseline. So what was the baseline? Well, I wrote it out:

function rms(A)
  s= 0
  @simd for e in A
    s += e * e
  end
  sqrt( s / length(A) )
end
That little bit of magic dust before the for is required, probably to specify that none of the iterations depend on any other, and to let the compiler go crazy. I actually wrote this in C as well, using two different styles (traditional and performant), and while there was a small variation in the timings (one was a smidge faster, and the other a sliver slower), the Julia version and the C versions were practically identical for the clang build. The gcc build didn't do as well. But the whole ordeal with compile flags and such is a story for another time.

Reference: benchmark_rms.jl Pluto notebook.


  Julia ups and downs.
Still struggling with Julia, which is both a criticism and how it should be. A criticism because certain things don't make sense, or work poorly, and that's an area where the language could improve. And how it should be, because learning a sufficiently interesting language requires understanding new idioms, different syntax, and large libraries of expertise.

Since I'm doing this all on a whim, I'm still mostly using Steve Brunton's classes as the exercises. I've downshifted to the Beginning Scientific Computing series, which is kind of review, so I can zip through the lectures faster. Unfortunately the videos are not at all organized, and it's somewhat of a puzzle to work out the order. I'm doing my best to document what I think is the progression in the comments of each of my Julia files here:

As I've been going along, one thing that has raised its head is the difference between the form of data plot() likes for time series, and the form the linear algebra solvers use. The best I've come up with to pull one row of data out of a vector of vectors is [e[1] for e in vu], but I fear this makes a copy of the data, which makes me sad.
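For the record, a couple of spellings of that pull, using a tiny made-up vu (the kind of thing an ODE solver might hand back); all of these copy, but reduce(hcat, ...) pays it once and gives you a plain matrix to slice:

```julia
vu = [ [1.0, 10.0], [2.0, 20.0], [3.0, 30.0] ]   # hypothetical solver output

row1  = [ e[1] for e in vu ]   # comprehension: new Vector, data copied
row1b = getindex.(vu, 1)       # broadcast getindex: same result, same copy

M = reduce(hcat, vu)           # 2x3 Matrix, one copy up front
row1c = M[1, :]                # rows then slice out directly
                               # (@view M[1, :] avoids even that copy)
```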

To brush up on my basics, I found this interesting introduction, which wasn't so much an introduction as a tour through a lot of interesting topics like benchmarking, multiple plots in one pane, inline C code, and even simd. It's also just fun watching someone get excited by the ternary operator.

using BenchmarkTools
using Random
using Statistics
using LinearAlgebra
using Plots

A = 2 * rand(10^7)
T_bench= @benchmark sqrt(mean(A .^ 2.))
T_bench2= @benchmark sqrt(sum(x->x*x, A)/ length( A ))
T_bench3= @benchmark norm(A) / sqrt(length(A))
histogram( T_bench.times )

Inlining C code, though the video cut the right edge off and I had to guess at what was missing:

using Libdl
C_code = """
#include <stddef.h>
#include <math.h>     
double c_rms(size_t n, double * X) {
  double s= 0.0 ;
  for ( size_t i= n ; ( i -- ) ; X ++ ) { s += ( *X * *X ) ; }
  return sqrt( s / n ) ;
}
double c_rmse(size_t n, double * X) {
  double s= 0.0 ;
  for ( size_t i= 0 ; ( i < n ) ; i ++ ) { s += X[i] * X[i] ; }
  return sqrt( s / n ) ;
}
"""
const Clib = tempname()
open( `gcc -fPIC -O3 -msse3 -xc -shared -ffast-math -o $(Clib * "." * Libdl.dlext) -`, "w" ) do f
  print(f, C_code)
end
c_rms( X::Array{Float64}) = ccall((:c_rms, Clib), Float64, (Csize_t, Ptr{Float64},), length(X), X )
c_rmse( X::Array{Float64}) = ccall((:c_rmse, Clib), Float64, (Csize_t, Ptr{Float64},), length(X), X )
c_rms( A )

And finally, some parallel coding in Julia:

function rms(A)
  s = zero(eltype(A))  # generic version
  @simd for e in A
    s += e * e
  end
  sqrt( s / length(A) )
end

To try these Pluto notebooks out without having to have Julia running locally, there's a Binder transform here, but I think I may eventually set up a Pluto instance on my server.


  Learning Maths
I decided that I needed to brush up on my math (probably from a glancing blow with the Navier-Stokes equations while watching videos about airplane design), and I ran across some excellent lectures on Fourier transforms, which I understood conceptually, but I had never learned how fast Fourier transforms were constructed. Now I know that the formula $e^{2\pi i} = 1$ tells you everything you need to know, since you need the n-th roots of 1.
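For reference, that identity and the n-th roots of unity it generates (standard notation, nothing fancy):

```latex
e^{2\pi i} = 1
\quad\Longrightarrow\quad
\omega_n = e^{2\pi i/n}, \qquad
\omega_n^k = e^{2\pi i k/n}, \quad k = 0, 1, \ldots, n-1
```

The FFT's divide-and-conquer step leans on the half-turn symmetry $\omega_n^{k+n/2} = -\omega_n^k$, which is what lets an n-point transform be split into two n/2-point ones.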

But beyond that, I had a lot of fun with the explanations of Laplace transforms by Steve Brunton at the University of Washington:

After digging through all his published videos, I found he taught a series of graduate classes in Engineering Mathematics, which for some reason is a Mechanical Engineering course, but I didn't let that stop me. I queued up the beginning of the series and got started.

The first two lectures were just review (even for me), and I breezed right through them. The third one was on Taylor series. Now I did Taylor series in college, and never gave them another thought, so it still felt a bit like review; but then the professor jumped into Matlab to do some calculations and graphing. I've seen Matlab before, heck I know people that use it, but it's never looked fun, and more importantly it's a commercial piece of software. Not that there's anything wrong with commercial software, heck that's what I do for a living, but I'm not going to go out and buy Matlab just because I'm watching some videos on the internet. The last time I bumped into this problem, I found there was an open-source version called Scilab, but that was more than ten years ago, which is forever in computer years. Time to brush up on what's available out there.

Asking the keeper of all knowledge for "matlab alternative free" (the free is both implied and auto-suggested), it turns out that Scilab is still kicking, but that GNU is trying to run it over with Octave. (You would think that open source people would get along better than commercial people, but you would be wrong. Apparently when you're no longer in it for the money, all that's left is honor and glory, and history has taught us that that never ends well.) There are also a couple of Python-based options, like NumPy and Sage, which are wedded to Python grammar. And there's Julia, something nebulous thrown together by MIT.

The path of least resistance probably would have been the clones, as I could copy and paste the examples from the lectures and run them with minimal rework. But when I have bumped into Matlab before, its grammar has always seemed about as close to an actual programming language as PHP, and I just couldn't bring myself to do that. Something Python-based would have been practical, since it's a very popular dynamic language at work, and used by the ML groups; but I made the mistake of learning Perl earlier in my career, and if you ever read transition guides from Perl to Python, it's a lot of putting the training wheels back on; plus the whitespace-sensitive syntax can go wrong in horrible, opaque ways (holds up hands to show the scars). So of course I chose Julia.

The example from the ME564 Lecture 3 video was to plot sin(x) and then plot the partial Taylor expansions of it to the 1st, 2nd, 3rd ... terms.

	using Plots
	using TaylorSeries
	using Random
	x= -5π/4:.01:5π/4
	sin_ish2= Taylor1([0,1,0,-1//(3*2)])
	sin_ish3= Taylor1([0,1,0,-1//(3*2),0,1//(5*4*3*2)])
	sin_ish4= Taylor1([0,1,0,-1//(3*2),0,1//(5*4*3*2),0,-1//(7*6*5*4*3*2)])
	taylors= [ x-> x             ## taylor of sin 𝒪(t)
	           x-> sin_ish2( x ) ## taylor of sin 𝒪(t^3)
	           x-> sin_ish3( x ) ## taylor of sin 𝒪(t^5)
	           x-> sin_ish4( x ) ## taylor of sin 𝒪(t^7)
	         ]
	labels = [ "taylor1" "taylor2" "taylor3" "taylor4" ]
	pl_= plot( x, x-> sin(x), title= "approximations", label="sine", linewidth= 4)
	plot!( x, taylors, label= labels )
	plot!( x, x-> rand()/2-1/4, label= "noise" )
	savefig( pl_, "/tmp/julia_sin.pdf" )
This is run in Pluto, which is an HTML notebook system, kind of like Jupyter, which makes pretty pictures as you go. There are some things I like so far, and things I don't. There's no compact operator for factorial, and since 7*6*5*4*3*2 is shorter than factorial(7), I just used that. I used rational representations for the fractions just on a whim, no good reason. And the ability to dump out a PDF as you go is kind of cool.
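For what it's worth, those hand-multiplied denominators can be generated too; a sketch, where sin_coeffs is my own name and not anything from TaylorSeries:

```julia
# Rational coefficients of sin's Taylor series through order n:
#   sin(x) = x - x^3/3! + x^5/5! - ...
sin_coeffs(n) = [ isodd(k) ? (-1)^((k - 1) ÷ 2) // factorial(k) : 0//1
                 for k in 0:n ]

sin_coeffs(7)   # 0, 1, 0, -1//6, 0, 1//120, 0, -1//5040
```

Taylor1(sin_coeffs(7)) should come out the same as sin_ish4 above.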

I threw the noise in there just because I was cribbing off some examples of scatter plots of random values, and was trying to get an understanding of its use through osmosis. (rand(3) is not a random value between 0 and 3, like it would be in some other languages, but a vector of three random numbers.)
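A quick sketch of that distinction, using just Base and the Random stdlib:

```julia
using Random
Random.seed!(42)      # only to make the draws repeatable

v = rand(3)           # a Vector{Float64} of three samples from [0, 1)
x = 3 * rand()        # the single "random value between 0 and 3"
r = rand(1:3)         # a random integer from 1 to 3 is spelled with a range
```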

There still seems to be overlap with Matlab syntax, so I'm partially doomed there. My biggest gripe is that I have to call the Taylor1 generator first, and then create a function out of its result (x -> sin_ish2(x)) in taylors, since putting the generator call inside the function, even if I could figure out how to dereference it, would have it be called for every plot point. There's probably a way, but I didn't get anywhere close to finding it groping through the various getting-started examples. I'm probably doing a disservice to Taylor1 as well, as I really can't tell the difference between that and Poly().

  The internet and struggles of discovery.
If anyone thinks that all the problems in search have been solved, well let me tell you a quick story.

I was reading yet another article about the depressing future of open plan offices: Even the Pandemic Can’t Kill the Open-Plan Office (disclaimer citylab is owned by my employer).

And part way down, there was a picture of an open plan office spaced out more, at the Oakland offices of Gensler, an architectural and design firm. What was interesting to me was not the spacing, but chairs pictured.


I am kind of a chair junkie. I've owned five or six different kinds of Herman Miller chairs I picked up on eBay over the years. The Mirra is actually my favorite, and I just can't stand the Envoy, though it could be I don't have it adjusted right. Now that all my co-workers are working from home, I've recommended the chairs to them, but like cheap web-cams, they're a little hard to come by at the moment. So I'm always on the lookout for other interesting options.

These chairs in the Gensler office looked interesting. But what were they?

Several searches on common office supply sites (and the borg of e-commerce, Amazon) were useless. They kept circling back to cheap hundred-dollar chairs I wouldn't be caught dead in. There's no good way to search for "high end office chair where the arm rests attach to the seat back". I started looking for the most expensive chair I could find, and then searching that catalog's entire inventory for something that matched. I searched for mesh chairs, task chairs, white chairs, executive chairs, and all sorts of combinations of the above.

Useless. Hours wasted.

Finally, in frustration I searched for anything that mentioned chairs in regards to that office:

"Gensler Oakland chair"
And I got back a breadcrumb:

located in Oakland, California. Gensler's Oakland office is characterized by. ...
Haworth Collection by Forest Side Chair · Forest Side Chair
Now what I did not realize was that the page this result came from had the picture of the chair I was looking at, annotated with its name. Instead I focused on the brand mentioned in the quoted text, hoping that the humans at that office were as predictable as most humans are: if a site carried the Haworth collection, then it might also carry the other chair.

That landed me on a site where, under office::task chairs, I started getting pretty close. There were a bunch of chairs from Knoll called ReGeneration that had what looked like the right back, but the arm rests were wrong. Then near the end of page two (who ever goes past page one of search results?), there it was: the "Knoll Generation Chair". At a modest price of $635, though when you add all the bells and whistles, it's more like $930.

They even have a few on eBay, though the selection of colors is a little limited, and the most interesting one is local pickup in Dallas. Now who do I know that I could bug in Dallas ...

So once again, it comes as no surprise (at least to me) that the internet is terrible at helping you discover things you didn't know were there. Discovery is a second-class citizen. Even the tools we had, we've lost. My college library back in the day was modernizing, adding the ability to search for books on the mainframe (ya, I'm that old), and one of its features was the ability to see which books sat next to the one you looked up on the shelf. I've never seen that feature since. Part of the wonder of the human brain is its ability to make connections between things, and through sharing those connections, create discovery. We have not even scratched the surface of how technology could support and strengthen that process.


  Strong curiosity
We were driving in the car a few weeks back, taking Zak back to the airport. Economic activity had started to decrease, so traffic was light, but there were plenty of trucks on the road. As I passed one, I noticed a diamond hazmat symbol on the side:

Which is a bit specific. Now I didn't remember how to read the hazards directly off this form, because I'm more familiar with the generic form:

which explains that whatever the truck is carrying can react somewhat with other chemicals, and can cause a significant health issue.

Never one to pass up an educational opportunity, I had the kids look up that particular hazmat number, to find out it's Calcium Hypochlorite. Now I know what Calcium Chloride is, it's a salt (CaCl2). You can use it for ice melt because it will reduce the freezing temperature of water down below -40°, and I also use it in my pool to increase the hardness. It's one of like eight crazy things you have to keep track of in order to keep your water neutral so it won't start dissolving the walls of the pool or leaving scale on the ladders. The crazy thing is if you buy it from the pool supply store in their fancy packaging, you'll pay $3-5/pound, but if you just buy the pro ice melt, which is 100% the same thing, it's fifty cents a pound.

Anyways, Calcium Hypochlorite has some oxygen mixed in, so it's Ca(ClO)2, and it turns out you could add it to the pool to increase the calcium as well, while getting lots of free chlorine for sanitizing. It's a regular ingredient in pool shock, which luckily I never have to use. I have a salt pool, with an electrolyzer that constantly splits the salt into sodium and chlorine, and it's a beautiful thing. While the neighbor has to check and add chlorine to his pool a couple of times a week, this thing runs for weeks at a time unattended.

But usually you don't use a huge amount of Calcium Hypochlorite, because the calcium will build up and make the water hard. However, it's good in water treatment plants, where the water is eventually consumed and replaced with new stock from underground or lakes or wherever. So now I know, now my kids know; and I can move on to wondering about the next thing that wanders by.

  Working remotely.
I've been working remotely for a very long time. I decided to read The Year Without Pants, which is free right now. Some of it is kind of cheerleading and fluff, but as the author explores the clash between his experience and how the new company worked, he realizes a few key things. "The problem with modern work is how loaded workplaces are with cultural baggage. We faithfully follow practices we can't explain rationally." "All traditions are inventions; it's just a question of how old the invention is. There is nothing wrong with tradition until you want progress: progress demands change, and change demands a reevaluation of what the traditions are for and how they are practiced." "Anyone who eliminates superfluous traditions takes a step towards progress." "Once you take responsibility for your own future, you must work to continually eliminate useless traditions and introduce valuable ones. An organization where nothing ever changes is not a workplace but a living museum." Now I've changed the wording some here, because the original writing put this responsibility only on "managers" and "leaders". But anyone who works to be good at what they do and passionate about how it's done is a leader. Maybe not on their business card, but those things don't always tell the full story.
  Surviving in Florida.
So life in central Florida is pretty quiet now. With the spring-breakers having spread contagion across the country, we've gone back to getting by as best we can. Co-workers from up North ask me how we all cope, and I explain that we usually have a natural disaster come through here every other year or so. This is kind of like a hurricane, only:
Now thankfully, I'm able to work from home, and I know that's not the case for everyone. And people in the medical field are dealing with a lot of stress. But for the time being, we're getting by, cooking with the BBQ not because we have to, but because it's nice out, watching the rocket launches, and just keep'n on.
  Chemists help the world go round.
In high school chemistry, there wasn't a lot of time spent on how much chemists have actually done to advance the bounds of knowledge when it comes to various molecules. How was superconductivity discovered? Chemists basically took every single element they could get their hands on, cooled them down to the coldest temperatures they could achieve, and then measured the various properties (including resistance) under those conditions. The book "Ignition!" is full of lots of crazy chemists who studied the behavior of various "fuels" under less than optimum conditions. Spontaneous disassembly was a frequent outcome.

Even in modern times, as we work to find ways to store and transport energy that don't involve carbon, chemists provide the map of the landscape. In a paper discussing the use of ammonia for hydrogen storage, they present a graph of all sorts of compounds I had never even heard of, plotting out their hydrogen capacity and density:

hydrogen carriers

It does a good job of showing that N is the next best thing to carbon if you just want to keep a lot of H around. My hat goes off to the chemists who sat down and classified all these various things over the years. Sure, there are probably other uses for things like aluminium borohydride, so they needed to know its properties anyway, but it's probably not the kind of work that leads to a Nobel prize. Just filling in the details so that others who come after can build atop it, or avoid what's going to be a dead end.

Doesn't mean that this is the end of the story though. Just look at what Acetylene went through.

Acetylene (C2H2) is a welding gas, but is unstable in its pure form (i.e., if you try to cram a bunch into a pressurized tank, it just explodes). Originally, if you wanted a supply for welding, you used a gas generator that combined water and calcium carbide. Eventually, chemists figured out you could safely dissolve acetylene into acetone. So they started filling canisters with acetone, and then pushing acetylene into that. Unfortunately, as you drew the acetylene out, you also got acetone vapor, which eventually depleted the acetone in the bottle, and again *boom*. Finally, they figured out you could lock the acetone in porous chalk, where it wouldn't evaporate; so a modern acetylene tank is actually full of chalk, which is full of acetone, which is full of acetylene.

