Technical Difficulties from on Top of the Mountain
  Program to program communication
Communicating between programs usually means across computers, especially when it's a client program (or web browser) talking to a server somewhere else. For that you almost universally use an inet socket (TCP, or UDP if you're a glutton for punishment).

There used to be a bunch of other protocols, but IP pretty much crushed them all. Ungermann-Bass had its own stack back when the internet was starting to form, Digital Equipment Corp had DECnet, IBM had SNA, and Novell NetWare used IPX. But TCP/IP was good enough, pretty darn simple, and as we went from 1 Mbit to 10,000 Mbit, other things became the bottleneck. Sure, if you run a telco you may still wax on about ATM, but even the ATM network is carrying TCP traffic.

On a server, however, you have a fair amount of traffic staying on the same machine, passing back and forth between different programs, usually as a result of dividing a problem into smaller parts that are hopefully harder to mess up. So you need a mechanism for setting up connections and passing messages.

Even in this case you could use INET sockets, but that's not your only option, nor is it the best choice, for a couple of reasons. First, there's a lot of overhead to INET sockets. Even if your packet isn't going to cross great distances, the operating system still does all the packet processing as if it would. This puts a limit on the number of packets you can read and write, which is especially a problem if your communication is a bunch of short messages. Second, when you create an INET service, it is visible to the network beyond your computer unless you take care to bind it only to the loopback address, so anyone running a portscan can find it. So speed and security are both good reasons to look elsewhere.

As the workstation market began to grow and AT&T decided to wade back into the unix market, they added new kernel services to System V, called IPC (inter-process communication). There were three parts: semaphores, shared memory and message queues. Semaphores let processes hand access to a shared resource back and forth, shared memory seemed like a good way to avoid passing around large data sets when disk access was expensive, and message queues gave you both an orderly mechanism for passing data and atomic handling of each message. Unfortunately, at the time all these data structures existed in kernel memory, and they were fixed in size (originally compiled into the kernel settings), so on a typical machine they were ridiculously small. On one HP machine with 64MB, the limits shown were 64 semaphores, a 4k message queue, and 64k of shared memory. Even today on a machine with 2GB of RAM, ipcs -l shows a queue size limit of 16k.

Moving on.

On unix, creating the raw socket() itself is protocol neutral. It's just that almost everyone in the universe uses INET. You can also use sockets for RAW packets, ATM, AppleTalk, IPX and X.25, plus one more family called AF_UNIX (although it's now referred to as AF_LOCAL for POSIX reasons, the address struct is still sockaddr_un). An AF_UNIX socket is for communicating locally on the machine. Originally, like INET sockets, there was a private namespace, using 32-bit numbers which you used for the "port" number, but then they expanded it to also allow you to map the socket into the filesystem, so you would get a file that showed up like this:

% ls -lF
total 0
srw------- 1 woolstar users 0 2009-06-26 23:46 agent.16975=
ssh-agent uses this to allow ssh processes to authenticate against a stored key. Back when this first showed up, on some OSes you could actually just talk to this entry like it was an ordinary device. That was sort of the spirit of unix: everything was a file. You could open up /dev/serial0 the same way you would open up foo.txt. So back in the day, you could open this file-mapped socket, send it some data, and then close it, and the interaction on the server side would look just like you had telnetted in. Sadly, AF_UNIX sockets don't work like this any more. Even though they're sitting right there in the filesystem, you can't just echo "hi" > mytest.sock. You have to use socket functions to connect to it and read and write to it.
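Here is a rough sketch of that difference in Python, not anything taken from ssh-agent. The socket name mytest.sock and the function name are made up for illustration, and the exact error you get from open() is an assumption about Linux behaviour (other systems may report something different).

```python
import os
import socket
import stat
import tempfile


def demo_socket_file():
    """Bind an AF_UNIX socket to a path, then try talking to it both ways."""
    workdir = tempfile.mkdtemp()
    path = os.path.join(workdir, "mytest.sock")  # hypothetical name

    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(path)    # bind() is what creates the filesystem entry
    server.listen(1)

    # The entry is a socket (the leading "s" in ls -l), not a regular file.
    is_socket = stat.S_ISSOCK(os.stat(path).st_mode)

    # Treating it like an ordinary file is refused by the kernel.
    # On Linux this is ENXIO; assume other systems may differ.
    try:
        with open(path, "w") as f:
            f.write("hi\n")
        open_failed = False
    except OSError:
        open_failed = True

    # The socket API works: connect, then send and receive.
    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    client.connect(path)
    conn, _ = server.accept()
    client.sendall(b"hi\n")
    received = conn.recv(16)

    for s in (client, conn, server):
        s.close()
    os.unlink(path)      # the entry outlives the socket unless removed
    os.rmdir(workdir)
    return is_socket, open_failed, received


if __name__ == "__main__":
    print(demo_socket_file())
```

Note that the filesystem entry does not disappear when the server closes the socket, which is why the sketch unlinks it by hand.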

Still, if you control both sides, the upside is definitely worth it. In some tests I did around 2001, for smaller packets on a single-processor machine, AF_UNIX connections could out-run AF_INET by over ten to one. AF_UNIX also allows some funky *magic* data to be passed between processes. For example, one process can pass an open file descriptor over to another process. You can also send your user id and group across an AF_UNIX connection and the kernel will validate them for the recipient. And as I mentioned before, AF_UNIX connections are only available on the machine, so nobody outside can hack into these services over the network.
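The descriptor passing can be sketched in a few lines of Python. This assumes Python 3.9 or later, where socket.send_fds() and socket.recv_fds() wrap the underlying SCM_RIGHTS ancillary message. To keep it self-contained, both ends live in one process and talk over a socketpair, and the descriptor being handed over is the read end of a pipe; in real use the two ends would be separate processes. The function name is made up.

```python
import os
import socket


def pass_descriptor(message: bytes) -> bytes:
    """Send a pipe's read end across an AF_UNIX socket, then read through it."""
    left, right = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    read_end, write_end = os.pipe()
    try:
        # "Sender" side: at least one byte of ordinary data must go along
        # with the descriptor.
        socket.send_fds(left, [b"fd"], [read_end])

        # "Receiver" side: the kernel installs a new descriptor that refers
        # to the same open pipe.
        _data, fds, _flags, _addr = socket.recv_fds(right, 16, 1)
        received_fd = fds[0]

        # Write on the original pipe; closing the write end gives EOF.
        os.write(write_end, message)
        os.close(write_end)
        write_end = -1

        # The bytes come out of the descriptor that arrived over the socket.
        with os.fdopen(received_fd, "rb") as f:
            return f.read()
    finally:
        if write_end != -1:
            os.close(write_end)
        os.close(read_end)
        left.close()
        right.close()


if __name__ == "__main__":
    print(pass_descriptor(b"hello through a passed descriptor"))
```

The received descriptor usually has a different number than the one that was sent; what is shared is the open file behind it.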

But for my current project, not being able to use AF_UNIX sockets transparently was a bummer. I have a process that runs and opens up an external file to write log entries to. I want those entries to go immediately into another process for processing, so I wanted to throw a named socket into the file system and have the first program log to that "file". The first process is a big third-party server that I didn't want to have to mess with, so spoofing a file would have been ideal. Luckily there's something else available now that will do it.

fifo(7) or named pipes

A fifo is like a queue, where you put several things in, and then pull them out. In this case the first thing you put in is the first thing that comes out (First In First Out ... fifo). There's also FILO, but it doesn't make as good an acronym in my opinion. So the modern Linux kernel allows you to create a named pipe in the file system, and as an improvement, you don't even have to have any active processes attached to either end. It could just be sitting there. Then when you want, you attach and try to read from it, which doesn't do much because there's nothing in it. When someone else comes along and writes to it, the message shows up at the reader.
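A minimal sketch of that in Python, assuming a unix-like system where os.mkfifo() is available. The writer opens the fifo with plain open(), just as a program that thinks it is writing a log file would. The file name app.log and the function name are invented for the example, and the reader runs in a thread only so that everything fits in one script.

```python
import os
import tempfile
import threading


def fifo_roundtrip(lines):
    """Write lines into a named pipe and collect them from the other end."""
    workdir = tempfile.mkdtemp()
    path = os.path.join(workdir, "app.log")  # hypothetical log file name
    os.mkfifo(path)                          # the entry exists with nobody attached
    received = []

    def reader():
        # Opening for read blocks until a writer opens the other end.
        with open(path, "r") as fifo:
            for line in fifo:                # ends when the writer closes
                received.append(line.rstrip("\n"))

    t = threading.Thread(target=reader)
    t.start()

    # The writer treats the fifo like any other file.
    with open(path, "w") as log:
        for line in lines:
            log.write(line + "\n")

    t.join()
    os.unlink(path)
    os.rmdir(workdir)
    return received


if __name__ == "__main__":
    print(fifo_roundtrip(["first", "second", "third"]))
```

The lines come out in the order they went in, which is the whole point of the name.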

There are some caveats, of course. If no one is hanging around waiting for the message, you can't write to the named pipe. The kernel isn't going to save things up for you. Also, if several processes are writing to a named pipe, the reader can't tell them apart. It's not like sockets, where each connection has its own file handle and you can tell the lifespan of each client. But for the purposes of log processing it will do just fine. At least I hope it will. I will have to get back to you on that.
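The first caveat is easy to see for yourself. In this sketch (Python again, with made-up names) the fifo is opened for writing with O_NONBLOCK while nobody has it open for reading, and the kernel refuses with ENXIO. Without O_NONBLOCK the open would simply sit there until a reader showed up, which is what a logging program using an ordinary open would experience.

```python
import errno
import os
import tempfile


def try_write_without_reader():
    """Return the errno from opening a reader-less fifo for writing (0 if it worked)."""
    workdir = tempfile.mkdtemp()
    path = os.path.join(workdir, "orphan.fifo")  # hypothetical name
    os.mkfifo(path)
    try:
        # Non-blocking open for write: fails at once if there is no reader.
        fd = os.open(path, os.O_WRONLY | os.O_NONBLOCK)
    except OSError as exc:
        return exc.errno
    else:
        os.close(fd)
        return 0
    finally:
        os.unlink(path)
        os.rmdir(workdir)


if __name__ == "__main__":
    code = try_write_without_reader()
    print(errno.errorcode.get(code, code))
```

So the reading process has to be up and attached before the writer starts, or the writer has to be prepared to wait.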
